Hello everybody, welcome to Module 4. In this module, we're going to be removing the assumption of normality that we've been using throughout this entire course so far, and we're going to extend hypothesis testing to parameters that don't have an obvious interpretation in the sample. For example, we've done the population mean, and we've estimated that using the sample mean; we've done the population variance, and we've estimated that using the sample variance. But there are some pretty crazy PDFs out there that depend on a parameter, call it Lambda or Gamma or Beta, and we want to do hypothesis tests on those parameters. So: no normality, and parameters that may not have an obvious interpretation in the data.

This particular video is going to be very short. I just want to talk about a property of the exponential distribution that we're going to need in the next video, when we get back to hypothesis testing. I have X_1 through X_n IID, or a random sample, from the exponential distribution with rate Lambda. We already know that the sum of exponentials has a Gamma distribution. In particular, the first parameter for that Gamma distribution is the number of exponentials that you added up, in this case n, and the second parameter matches the rate parameter of the exponential distribution. Remember that in this course we're using a particular parameterization to describe the exponential and also the Gamma. This can be a little different in other books or websites; you might see the second parameter of the Gamma flipped to its reciprocal when you add exponentials, so do be careful.

Now, I've mentioned before that it is of interest to us to be able to transform a lot of the things we're working with into chi-squared random variables, because the chi-squared distribution is so central to statistics. I think it's as important as the normal distribution. If we were making a hypothesis test for an exponential distribution, and we wanted our result to be in terms of a chi-squared random variable, we would have to make a transformation. Now, the chi-squared(n) random variable is defined to be a Gamma(n/2, 1/2), and our sum of exponentials is a Gamma(n, Lambda). The fact that we have n and not n/2 is not a big deal, because we can rewrite that n as 2n/2; if the other parameter were 1/2, we would have a chi-squared(2n). So let's focus on that second parameter. We know that if you take a Gamma random variable and multiply it by a constant, that constant ends up underneath the second parameter of the Gamma distribution. I want to get rid of that Lambda, so I'm going to multiply by Lambda, and that gives me a Gamma(n, 1). To get the 1/2 in there, I also want to multiply by 2, and that gives me a Gamma(n, 1/2), which is a Gamma(2n/2, 1/2), which is a chi-squared(2n). That's how we can get a chi-squared out of an exponential sample.

Now, along these same lines, we also know that the sample mean of exponentials has a Gamma distribution. Why? Because we know the sum of exponentials is Gamma, and the sample mean is just that sum multiplied by the constant 1/n. It's not constant in the sequence, in the sense that it changes if you change the sample size, but it is constant in the sense that it is not random. That 1/n is supposed to end up under the second parameter, so if the sum of n exponentials with rate Lambda has a Gamma distribution with parameters n and Lambda, then 1/n times that sum has a Gamma distribution with parameters n and n Lambda. That n Lambda comes from Lambda divided by 1/n.
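Everything so far can be checked by simulation. Here is a minimal sketch of such a check, not something from the lecture itself, assuming NumPy and SciPy are available; the rate lam, sample size n, and repetition count are arbitrary values I've chosen for illustration. It verifies that 2 Lambda times the sum looks chi-squared(2n), and that the sample mean looks Gamma(n, n Lambda).

```python
# Simulation sketch (my own check, not from the lecture): verify that
# 2*lam times a sum of n Exp(lam) variables looks chi-squared(2n), and
# that the sample mean of the n variables looks Gamma(n, n*lam).
import numpy as np
from scipy import stats

lam, n, reps = 1.5, 8, 100_000            # arbitrary illustrative values
rng = np.random.default_rng(0)

# NumPy parameterizes the exponential by its mean, so scale = 1/lam
samples = rng.exponential(scale=1/lam, size=(reps, n))
sums = samples.sum(axis=1)                # each sum is Gamma(n, lam)
xbars = samples.mean(axis=1)              # each mean is Gamma(n, n*lam)

# Kolmogorov-Smirnov tests: large p-values are consistent with the claims.
# SciPy's chi2/gamma take (shape, loc, scale); a rate r means scale = 1/r.
print(stats.kstest(2 * lam * sums, "chi2", args=(2 * n,)))
print(stats.kstest(xbars, "gamma", args=(n, 0, 1 / (n * lam))))
```

Note that NumPy and SciPy use the scale (mean) parameterization of the exponential and the Gamma, which is exactly the "flipped" convention the lecture warns about.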
If we were doing a hypothesis test involving exponentials and we had things written in terms of the sample mean, we could then, if it were convenient for us, turn that into a chi-squared, because now we're starting with a Gamma(n, n Lambda). If I multiply by 2n Lambda, that ends up underneath the n Lambda in the Gamma distribution, so I have a Gamma(n, 1/2), which is a Gamma(2n/2, 1/2), which is a chi-squared(2n). So to change the sum of exponentials into a chi-squared, we multiply by 2 Lambda, and to change the sample mean into a chi-squared, we multiply by 2n Lambda.

So far, all I've done is summarize things we've already talked about in this course. But here's something we haven't talked about that you may or may not know. That is: if you have exponential random variables, what is the distribution of the minimum of n of them? What does that even mean? Let's look at grabbing a sample of exponentials. I'm going to take a sample of size 6, and the values are going to show up somewhere on the x-axis. They should be more concentrated down towards zero, because that's where the majority of the area under the PDF is. In this sample of size 6, there is a minimum value, right there. That minimum value is what it is, but I want to know its distribution. Meaning: if you change the sample and get a new sample of size 6, you're going to end up with a new minimum. This red X and the red X on the previous slide are two points that I have now sampled from the distribution derived from the minimum of exponentials. If you keep grabbing minimums, they're going to change, but they're going to tend to be down towards zero. It's going to be really hard to take an exponential sample, especially a large one, and have the minimum way up here. What is that exact distribution? If I were to do this again and again and again, and make a red histogram corresponding to the red X's, what would that look like? What PDF would it match?
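Here is a sketch of that repeated-sampling experiment in code, again a check of my own rather than anything from the lecture, assuming NumPy and Matplotlib; the rate and repetition count are illustrative values. It draws many samples of size 6, records each sample's minimum (the red X), and histograms those minima.

```python
# Sketch of the "red X" experiment: repeatedly sample 6 exponentials,
# record the minimum of each sample, and histogram the minima.
import numpy as np
import matplotlib.pyplot as plt

lam, n, reps = 1.5, 6, 100_000            # illustrative values
rng = np.random.default_rng(1)

minima = rng.exponential(scale=1/lam, size=(reps, n)).min(axis=1)

plt.hist(minima, bins=100, density=True, color="red")
plt.xlabel("minimum of a sample of size 6")
plt.ylabel("density")
plt.title("Histogram of sample minima")
plt.show()                                # piles up near zero
```

The histogram piles up hard against zero, as the lecture predicts; the derivation that follows identifies exactly which PDF it matches.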
Let's find the distribution of the minimum. I'm going to let Y_n be the minimum of X_1 through X_n. When finding the distribution of a minimum or a maximum, I think it's really convenient to use CDFs, or cumulative distribution functions. First of all, the CDF for any one exponential, we'll call it F(x), is the probability that any one of the X's in the sample is less than or equal to x, and we know this is 1 minus e to the minus Lambda x. That is obtained by taking the exponential PDF and integrating from zero to x.

Knowing that, let's go for the CDF of the minimum. I'm going to call it F_{Y_n}(y). That's a lot of stuff in a subscript; you might just call it F_n, as long as you know it's the CDF for Y_n. In either case, this is the probability that Y_n is less than or equal to y. If I plug in what Y_n is, namely the minimum, we're computing the probability that the minimum of the X's is less than or equal to y. So the game to play here is to say: I know the behavior of the individual X's, I know they follow an exponential distribution with rate Lambda, so can I rephrase this probability in terms of the individual X's? That's what we're going to try to do.

On this number line, I've plotted a random sample of size 3 and put down the fixed value y, and we can see that this is an example where the minimum value in the sample is less than or equal to y. Here's another sample of size 3 on a number line, with the fixed value y, and this is, of course, another case where the minimum is less than or equal to y. A third case might be that every value in the sample is less than or equal to y. We can see that there are lots of ways for our sample of size 3 to fall so that the minimum is less than or equal to y. It's hard at this point to say that the event up here, the minimum being less than or equal to y, is equivalent to something like X_1 being less than y and X_2 being less than y and X_3 being above y; there are lots of cases to consider. But not if you turn it around. I'm going to write this probability for the minimum as 1 minus the probability that the minimum is greater than y, the complement of that event. Here is a typical picture of a sample where the minimum is greater than y. If you try to draw a sample where the minimum is greater than y, every value in the sample has to be greater than y. That means I can rewrite this probability statement, the minimum being greater than y, as the probability that X_1 is greater than y and X_2 is greater than y, all the way on up. Because these are independent random variables, I get to break that probability up into a product of probabilities, and because they are identically distributed, each of those probabilities is the same. So I get 1 minus the probability that X_1 is greater than y, with that entire probability raised to the nth power. We know that X_1 has an exponential distribution with rate Lambda, and we know its CDF, the less-than-or-equal-to probability, is 1 minus e to the minus Lambda y. What we want to plug in here is 1 minus the CDF; we want to unflip that and look at the complement of the event that X_1 is greater than y. So here I plug in the CDF, then I simplify by canceling the ones, and when you raise something to a power and then to another power, you multiply the exponents. We end up with the CDF of the minimum being 1 minus e to the minus n Lambda y.

You might recognize what distribution this CDF comes from, because we wrote it down one or two slides ago. In general, though, we don't tend to recognize CDFs as readily as PDFs, so if you don't recognize this, take the derivative with respect to y, and that gets you the PDF, which is n Lambda e to the minus n Lambda y. That is another exponential distribution, with rate n Lambda.

That was pretty easy to do. Let's look at it graphically, really quickly. Here I've graphed an exponential distribution with rate Lambda; it starts at Lambda and goes down. The minimum of n exponentials, we've already talked about how that's going to be more concentrated down towards the y-axis, towards x equals 0, and in fact, since we have an exponential with a larger rate, namely n Lambda, that PDF is going to start at n Lambda and go down. If it starts a lot higher than the original PDF, it has to come down faster, because the total area under both curves has to be one. Because it comes down faster, it looks here like it's actually stopping, but it really is just skimming the x-axis out to infinity. Most of its area is much closer to the origin than it was for the original PDF. We're seeing what we expect to see for the minimum of random variables that are non-negative. Perhaps the more interesting thing we're seeing is that it's another exponential.
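To tie the derivation back to the earlier simulation, here is one more hedged check of my own, with the same illustrative parameters as before: if the result is right, the simulated minima should pass a goodness-of-fit test against an exponential with rate n Lambda. SciPy parameterizes the exponential by scale = 1/rate.

```python
# Check (my own sketch): the minima from the experiment above should
# follow Exp(n*lam), i.e. an exponential with scale 1/(n*lam) in SciPy.
import numpy as np
from scipy import stats

lam, n, reps = 1.5, 6, 100_000
rng = np.random.default_rng(2)

minima = rng.exponential(scale=1/lam, size=(reps, n)).min(axis=1)

# args = (loc, scale) for scipy.stats.expon; rate n*lam -> scale 1/(n*lam)
print(stats.kstest(minima, "expon", args=(0, 1 / (n * lam))))
# a large p-value is consistent with the minimum being Exp(n*lam)
```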
There's no reason to think that if you take a random sample from some distribution and look at the sum of those random variables, or the minimum, or the maximum, the new random variable should come from the same family of distributions. But it did here, and that's going to be really useful to us. In the next video, let's get back to hypothesis testing. We're going to do two hypothesis tests for an exponential distribution: one is going to be based on the sample mean, and the other is going to be based on the minimum. Then we're going to see how to compare the two tests to figure out whether one is better than the other. I will see you in the next one.