While statistics may sound like scary, advanced math, you have likely been doing statistics since you were in elementary school! Statistics involves observing and analyzing data, often in the form of tables and graphs. One of the basic concepts in statistics is how to describe a set of data using measures of central tendency (mean, median, and mode) and spread (range), which you are likely familiar with. Probability is also an important statistical concept. Since these ideas are covered in other lessons, in this lesson we will focus on more advanced statistics topics, such as percentiles, expected value, and standard deviation/variance.
The word percentile comes from the word percent, meaning out of 100. Percentiles divide a group of data into 100 equal parts, and demonstrate where certain data points fall relative to other data points. The ACT uses percentiles to show you how your score compares to other scores.
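A percentile shows where a value falls relative to the other values in a set of data. For example, if a value is in the 20th percentile, it is greater than or equal to 20% of the values in the set. We can calculate the rank of a percentile using the formula:
$$n = \frac{P}{100} \times N$$
where n is the rank (position) of the value in the ordered data, P is the percentile, and N is the number of values in the set.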
Example
Given an ordered set of 20 test scores, {40, 42, 56, 60, 65, 65, 68, 72, 73, 74, 76, 80, 82, 86, 90, 92, 94, 95, 96, 97}, find the 85th percentile.
The data is already in order, so no sorting is needed. We are trying to find n, the rank that marks the 85th percentile, so P = 85. We know that N = 20, since the set contains 20 values.
Plugging into the equation, we find $n = \frac{85}{100} \times 20 = 17$.
Now we count up to the 17th value in the ordered set, which is 94. This means that a score of 94 is at the 85th percentile: it is greater than or equal to 85% of the scores in the set.
If n is not a whole number, round up to the nearest whole number.
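The rank calculation and the round-up rule can be sketched in Python. This is a minimal sketch, assuming the rank is n = (P / 100) × N, which is what the worked example implies (0.85 × 20 = 17); the function name `percentile_value` is hypothetical and chosen only for illustration.

```python
import math


def percentile_value(ordered_values, p):
    """Return the value at the p-th percentile of an already ordered list.

    Assumes the rank is n = (P / 100) * N, rounded up when it is not whole.
    """
    total = len(ordered_values)          # N, the number of values
    rank = math.ceil(p * total / 100)    # n, rounded up to a whole number
    rank = max(rank, 1)                  # guard so p = 0 still picks the first value
    return ordered_values[rank - 1]      # ranks start at 1, list indexes at 0


scores = [40, 42, 56, 60, 65, 65, 68, 72, 73, 74,
          76, 80, 82, 86, 90, 92, 94, 95, 96, 97]

print(percentile_value(scores, 85))  # 94
```

For a percentile whose rank is not whole, such as the 33rd (0.33 × 20 = 6.6), the sketch rounds up to the 7th value, which is 68.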
We previously defined a percentile as the lowest value that is greater than or equal to a certain percentage of the values. However, some methodologies define it as the lowest value that is strictly greater than that percentage of the values. The difference is subtle (≥ vs >), but it can change your answer. Any question about percentiles will make this distinction for you.
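To see how the two definitions can disagree, here is a small Python sketch. It reads "greater than or equal to" as counting the values at or below a candidate, and "greater than" as counting only the values strictly below it. Both function names are hypothetical, and this is one plausible reading of the distinction, not an official method.

```python
def percentile_inclusive(values, p):
    """Lowest value that is >= at least p% of the values."""
    ordered = sorted(values)
    total = len(ordered)
    for candidate in ordered:
        at_or_below = sum(1 for v in ordered if v <= candidate)
        if at_or_below * 100 >= p * total:
            return candidate
    return None  # only reached if the data set is empty


def percentile_exclusive(values, p):
    """Lowest value that is strictly > at least p% of the values."""
    ordered = sorted(values)
    total = len(ordered)
    for candidate in ordered:
        strictly_below = sum(1 for v in ordered if v < candidate)
        if strictly_below * 100 >= p * total:
            return candidate
    return None  # no value is strictly greater than p% of the data


scores = [40, 42, 56, 60, 65, 65, 68, 72, 73, 74,
          76, 80, 82, 86, 90, 92, 94, 95, 96, 97]

print(percentile_inclusive(scores, 85))  # 94
print(percentile_exclusive(scores, 85))  # 95
```

Under this reading, 94 is greater than or equal to 17 of the 20 scores (85%), but strictly greater than only 16 of them (80%), so the stricter definition moves up to 95.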
Expected value is a concept used in probability that shows the expected outcome of an event, given the probabilities of the different outcomes.
The expected value is like a weighted average of the possible outcomes, where each outcome is weighted by its probability. To find the expected value, we multiply each possible outcome by its probability and add the products together.
Expected value can be expressed as
$$E(x) = \sum x \cdot P(x)$$
where $x$ represents the different possible outcomes, and $P(x)$ represents the probability of each of those outcomes.
Example
The probability distribution of x is shown in the table below.
x | P(x) |
1 | 0.2 |
2 | 0.3 |
3 | 0.4 |
4 | 0.1 |
What is the expected value of x?
To find the expected value, multiply each value of x by its probability and add the products together:
$$E(x) = 1(0.2) + 2(0.3) + 3(0.4) + 4(0.1) = 0.2 + 0.6 + 1.2 + 0.4 = 2.4$$
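The same calculation can be sketched in Python, using the values from the table. The function name `expected_value` is hypothetical, and the check that the probabilities add up to 1 is an added assumption about what counts as a valid table.

```python
import math


def expected_value(outcomes, probabilities):
    """Multiply each outcome by its probability and add the products."""
    if not math.isclose(sum(probabilities), 1.0):
        raise ValueError("probabilities should add up to 1")
    return sum(x * p for x, p in zip(outcomes, probabilities))


outcomes = [1, 2, 3, 4]
probabilities = [0.2, 0.3, 0.4, 0.1]

result = expected_value(outcomes, probabilities)
print(round(result, 2))  # 2.4
```

The result is rounded before printing because decimal probabilities are stored approximately on a computer, so the raw sum can differ from 2.4 by a tiny amount.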
The standard deviation is how much a set of data varies from its mean. The variance is the standard deviation squared.
The variance is the average of the squared deviations from the mean. One reason we square the deviations is to eliminate negative values. To calculate the variance, first calculate the mean of the data, then subtract the mean from each individual value, square each difference, and add the squared differences together. Lastly, divide the sum by the number of values in the set of data. As a formula:
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
where $x_i$ is each value, $\mu$ is the mean, and $N$ is the number of values.
Example
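Five friends have the following heights: 60 inches, 66 inches, 72 inches, 68 inches, and 59 inches. Find the variance of the heights.
First, find the mean: $(60 + 66 + 72 + 68 + 59) / 5 = 325 / 5 = 65$.
Next, subtract the mean from each height and square the differences: $(-5)^2 = 25$, $1^2 = 1$, $7^2 = 49$, $3^2 = 9$, and $(-6)^2 = 36$.
Finally, add the squared differences and divide by the number of values: $(25 + 1 + 49 + 9 + 36) / 5 = 120 / 5 = 24$. The variance is 24 (in square inches).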
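The steps for calculating the variance can be sketched in Python with the five heights from this example. The variable names are chosen only for illustration, and the data is treated as a complete set, so the sum is divided by the number of values.

```python
heights = [60, 66, 72, 68, 59]

# Step 1: calculate the mean of the data.
mean = sum(heights) / len(heights)

# Step 2: subtract the mean from each value and square the difference.
squared_differences = [(h - mean) ** 2 for h in heights]

# Step 3: add the squared differences together.
total = sum(squared_differences)

# Step 4: divide the sum by the number of values.
variance = total / len(heights)

print(mean)                 # 65.0
print(squared_differences)  # [25.0, 1.0, 49.0, 9.0, 36.0]
print(variance)             # 24.0
```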
A population contains every value in a set of data.
A sample contains only some of the values in a set of data.
For example, if we were trying to calculate the variance of the weights of the entire elephant population, we would probably only have time to find the weights of a few elephants, not every single elephant. Therefore, our data would be a sample of elephants, since those few elephants represent a larger set of data.
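When calculating variance, if your set of data represents an entire population, you will use our original equation, $\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$. However, if your data is a sample from a larger set of data, you will want to divide by N - 1, making your equation $s^2 = \frac{\sum (x_i - \bar{x})^2}{N - 1}$, where $\bar{x}$ is the mean of the sample.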
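The difference between the two divisors can be sketched in Python. The function `variance` and its `sample` flag are hypothetical names for illustration. The five heights from the earlier example are reused, first treated as a whole population and then, purely for comparison, as if they were a sample.

```python
def variance(values, sample=False):
    """Divide by N for a population, or by N - 1 when sample=True."""
    count = len(values)
    mean = sum(values) / count
    squared_sum = sum((v - mean) ** 2 for v in values)
    divisor = count - 1 if sample else count
    return squared_sum / divisor


heights = [60, 66, 72, 68, 59]

print(variance(heights))               # 24.0
print(variance(heights, sample=True))  # 30.0
```

Python's standard `statistics` module provides `pvariance` and `variance` for these same two conventions, so a hand-written function like this one is mainly useful for seeing where the divisor changes.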
To calculate the standard deviation, simply take the square root of the variance.
In our previous example, we found the variance to be 24. The standard deviation of that example would be $\sqrt{24} \approx 4.9$. This means that, on average, the heights of the friends deviated about 4.9 inches from the mean.
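As a quick check, the standard deviation can be computed in Python by taking the square root of the variance. This sketch recomputes the variance from the five heights, treating them as a whole population; the variable names are illustrative.

```python
import math

heights = [60, 66, 72, 68, 59]

mean = sum(heights) / len(heights)
variance = sum((h - mean) ** 2 for h in heights) / len(heights)

# The standard deviation is the square root of the variance.
standard_deviation = math.sqrt(variance)

print(round(standard_deviation, 1))  # 4.9
```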
Practice Problems
Answers