Here are my data for the Indianapolis 500: the winner's average speed and starting position for each race. I'm going to click on the starting position column, go over to my data, and create a filter, and the filter allows me to sort these values. So I just did that, sorting from lowest to highest. When I do that, I see I get a lot of ones, then twos, then threes, and so on. Now, I've got a count of 69, so 69 races fall within the set of years that I'm looking at. If I divide 69 by two, I get 34.5, which tells me the 35th observation is smack dab in the middle. So when looking at the median, there are 34 observations on this side and 34 observations on that side, and I want to look at the observation right in the middle, because 34 plus one plus 34 is the 69. So when I have an odd number of observations, I take the count, divide it by two, and round up. The lower half ends at the 34th position and the upper half starts at the 36th position, so I want to identify the 35th position because that's the absolute middle of my observations. Since the spreadsheet rows are off by one from the observation numbers, I'm going to scroll down to row 36. In row 36 is one of my favorite drivers, A. J. Foyt, who won this race in 1977. With the winners ranked by starting position from lowest to highest, he was in the fourth position. So the median starting position of the winner of the Indianapolis 500 during this particular sample was the fourth position. So in this case, we have a mean of 6.188, a median of four, and a mode of one. The fourth position driver has really only won this race four times. The first position driver has won this race 17 times. I would still be much more comfortable saying that the mode is the better measure of the central tendency of these data than either the mean or the median.
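To make the middle-position arithmetic concrete, here is a minimal Python sketch. The `middle_position` helper and the short list of starting positions are assumptions made up for illustration; they are not the actual Indianapolis 500 data from the spreadsheet.

```python
import statistics

def middle_position(n):
    """Return the 1-based position of the median for an odd count n."""
    if n % 2 == 0:
        raise ValueError("an even count has no single middle observation")
    return (n + 1) // 2

# With 69 observations, 34 fall on each side of the middle one.
print(middle_position(69))  # 35

# Hypothetical starting positions of race winners (illustration only).
starting_positions = [1, 1, 1, 2, 3, 4, 5, 7, 13]

# Sort from lowest to highest, then pick the middle observation by hand.
ranked = sorted(starting_positions)
pos = middle_position(len(ranked))      # 5th of 9 observations
median_by_hand = ranked[pos - 1]        # Python lists are 0-indexed

# The standard library computes the same three measures directly.
mean_value = statistics.mean(starting_positions)
median_value = statistics.median(starting_positions)
mode_value = statistics.mode(starting_positions)

print(median_by_hand, median_value, mode_value)
```

In this made-up list the mode sits at the first position while a single large value pulls the mean upward, which is the same kind of gap between the three measures described above.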
We're going to give you some additional examples to work on, and then we're going to debrief these as well in the next video.

In order to calculate the mean or the median or the mode, we need some data. Here are some data about the Cubs. We've got the game duration and the attendance at Wrigley Field for each game during the Chicago Cubs 2016 season, downloaded from Major League Baseball websites. Calculating the mean, the median, or the mode is fairly simple in Excel with a couple of keystrokes, and I'll show you what those are. Here at the bottom of this column, we can type in =AVERAGE. This is literally going to calculate my mean. When I type in =AVERAGE, Excel asks for the range, that is, the population or the sample that I would like to calculate the average for. In this case, I have all my data from game one through game 162. If I highlight the entire population of these data and hit "Enter", that will give me the mean, the arithmetic average. So I'm summing up the game time for each observation, one through 162, dividing by 162, and I get my average game time. Now, what I can also do is calculate the mean of a sample drawn from this same set. So if I type =AVERAGE, select a sample of, let's say, games 140 through 150, and hit "Enter", now I get my average game time for those games. That's a sample, and that's a sample mean. Now, these two are different: three hours and four minutes for the population and two hours and 50 minutes for the sample. So the sample that I drew had a quicker game time than my entire population. So which one is more accurate, the population or the sample? Well, obviously, if we have an opportunity to grab the entire population, it's going to tell you what the average of that population is. A lot of the time, though, it's very difficult for researchers to grab an entire population.
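The same calculation can be sketched outside Excel. The Python below is a rough equivalent of averaging a cell range, under stated assumptions: the twelve durations are made-up values in minutes, not the actual Cubs data, and the helper names `average_range` and `to_h_mm` are hypothetical.

```python
from statistics import mean

# Hypothetical game durations in minutes for games 1 through 12
# (made-up values for illustration, not the real 2016 season).
durations = [184, 172, 201, 165, 190, 178, 210, 169, 175, 188, 182, 170]

def average_range(values, first_game, last_game):
    """Mean over games first_game..last_game (1-based, inclusive),
    similar in spirit to averaging a highlighted range of cells."""
    return mean(values[first_game - 1:last_game])

def to_h_mm(minutes):
    """Format a duration in minutes as hours:minutes, e.g. 184 -> '3:04'."""
    total = round(minutes)
    return f"{total // 60}:{total % 60:02d}"

# Population mean: every game in the list.
population_mean = average_range(durations, 1, len(durations))

# Sample mean: only a late stretch of games (games 9 through 12 here).
sample_mean = average_range(durations, 9, 12)

print(to_h_mm(population_mean))  # 3:02
print(to_h_mm(sample_mean))      # 2:59
```

As in the spreadsheet, the only thing that changes between the two results is which range of observations goes into the average.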
So if I were to ask what the average income is for the city of Atlanta, it would be difficult for me to identify what the income is for every single citizen in the entire city of Atlanta. In this case, I would be drawing from a sample. So if it was the Current Population Survey or the census that asked a question like this, now I've got a sample statistic. How well does that sample reflect the population at large? That depends on how well I have drawn from the population: the number of observations that I've drawn, and whether it's a random draw or some kind of biased draw. In this example, I have the games ranked by duration, from the slowest game to the fastest game or vice versa. So I have a rained-out game here that only lasted one hour and 15 minutes, and I have a marathon game here, five hours and three minutes, with multiple extra innings. The average of these games is three hours and four minutes, just like I showed in the previous tab. Let's suppose I draw a sample from this ranked population, and I'm drawing it from all of the longest games. Well, then my average is going to be over an hour and 15 minutes longer than that of the entire population. Ranking your data and then just grabbing one set that you like or dislike is a very bad way to draw your sample from your population. It's not really giving you an accurate representation. In this case, the sample mean would be significantly different from the population mean. Going back to my game data here, from games one through 162, you'll notice that for the 10 games towards the end of the season, the average game time was a little faster than the average game time of the entire population. If I draw a larger sample, I am more likely to get closer to the mean of the entire population.
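The contrast between a biased draw and a random draw can be sketched in Python. This is an illustration under assumptions: the season is simulated with made-up durations in minutes, apart from the 75-minute rainout and the 303-minute marathon mentioned above, and the sample size of 10 is arbitrary.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simulated season of 162 games, durations in minutes. The 160 ordinary
# games are made-up values; 75 (1:15) and 303 (5:03) stand in for the
# rainout and the extra-innings marathon.
population = [rng.randint(150, 220) for _ in range(160)] + [75, 303]
population_mean = mean(population)

# Biased draw: rank the games and grab only the 10 longest.
ranked = sorted(population, reverse=True)
biased_sample = ranked[:10]
biased_mean = mean(biased_sample)

# Random draw: every game has the same chance of being picked.
random_sample = rng.sample(population, 10)
random_mean = mean(random_sample)

print(f"population mean: {population_mean:.1f} minutes")
print(f"biased sample:   {biased_mean:.1f} minutes")
print(f"random sample:   {random_mean:.1f} minutes")
```

Taking the top of a ranked list always pushes the sample mean above the population mean, whereas the random sample has no built-in direction to its error.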
So if I grab data from, let's say, games 60 to 110, you'll notice that my average game time for that sample was 3:09, three hours and nine minutes, which is very close to the average game time for the population. Whether you're working with a population or a sample, how you draw your sample might give you a slightly different mean for your sample than for your population. The goal, if you're going to use a sample, is to have a random sample, or the biggest sample you can, that accurately represents what's going on in the population. Otherwise, your sample mean is not going to be very reflective of what's going on in your population.
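The effect of sample size can be checked with a small simulation. This sketch rests on assumptions: the durations are simulated values in minutes rather than the Cubs data, the samples are drawn at random rather than as a stretch of consecutive games, and `average_error` is a hypothetical helper.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Simulated population of 162 game durations in minutes (made-up values).
population = [rng.randint(150, 220) for _ in range(162)]
population_mean = mean(population)

def average_error(sample_size, trials=2000):
    """Average distance between the sample mean and the population mean
    over many random samples of the given size."""
    errors = []
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        errors.append(abs(mean(sample) - population_mean))
    return mean(errors)

error_small = average_error(10)   # samples of 10 games
error_large = average_error(50)   # samples of 50 games

print(f"typical error with 10 games: {error_small:.2f} minutes")
print(f"typical error with 50 games: {error_large:.2f} minutes")
```

Any single small sample can still land close to the population mean by luck; the comparison here is about how far off samples of each size tend to be on average.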