We're about to start Session 2 of the third week of our course. One goal of this session is to examine a family of distributions and start thinking about how they fit our data. A distribution that fits our data best is a good model of reality.
Hi, welcome to Week 3 Session 2 of Modeling Risk and Realities. I'm Senthil Veeraraghavan. I'm an associate professor in the Operations, Information, and Decisions Department at the Walden School. In this session, we're going to be choosing among families of distributions, looking at some that are often used to model realities. First we examine discrete distributions, and then we will go into continuous distributions.
First, let's look at some simple examples of random variables, which will help us examine the more realistic examples that come later on. Example one: a coin is tossed. You'll either see heads or tails. This outcome is a random variable. Example two: a team plays against another team, which is slightly weaker. This team's winning probability is 60%, and the probability of losing is 40%. This outcome is a random variable. Note the similarity with the coin toss example, where it was 50/50. Example three: think of a fair die that is cast in a game of dice. The outcome can be 1, 2, 3, 4, 5, or 6. The probability of 6 turning up is 1/6. The probability of 1 turning up is the same, 1/6. Note that if the die is fair, the probability is the same for every outcome.
The first example distribution we're going to look at is the Bernoulli distribution. The Bernoulli distribution has only two outcomes, each occurring with some probability. The two probabilities, of course, add up to 1. It's similar to the coin toss example. Let's look at some realistic scenarios that can be modeled using a Bernoulli distribution.
For example, will a firm from Europe enter the market in Asia? Will a team ranked fourth in the middle of the season in the English Premier League win when the season concludes? Will a ride-share company buy its smaller startup competitor? All of these are examples with two possible outcomes, whether it will happen or not, and can be modeled using a Bernoulli distribution.
You might ask, what if the number of outcomes is more than two? Let's look at a distribution with three outcomes. Suppose a firm enters a new market. The managers who are looking into the new market see three possible scenarios to model with the distribution. The firm's market share next year could be low, medium, or high. They think the likelihood of high market share is 20%, of medium market share 70%, and of low market share 10%.
Let's look further into the three-outcome example and its probability distribution. Assume you have the following distribution for future market share, which is based on estimates from experts. Recall that only three outcomes are possible. Market share is going to be 80% with probability p1, let's say that's 0.2. Market share is going to be 50% with probability p2, and this is the most likely outcome, at 0.7. And market share could be very low, at 20%, with probability p3, that's 0.1. We use this simple example, which you might have seen in many scenarios before, to model the outcomes of market shares. Note that all the probabilities are greater than 0, and they sum up to 1.
Probability distributions like the one you just saw can be used to model a number of distinct scenarios, each scenario with an attached probability. Such probability distributions are called discrete probability distributions. To describe discrete probability distributions, we use a function called the probability density function, or simply the PDF. Let's look at a PDF in the next slide.
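The two conditions just mentioned (no negative probabilities, and probabilities that sum to 1) are easy to check mechanically. The following is a minimal Python sketch, not part of the lecture: the dictionary layout, the name `is_valid_distribution`, and the tolerance are illustrative assumptions. Note that what the lecture calls the PDF of a discrete distribution is often called a probability mass function elsewhere.

```python
# Three-outcome market share distribution from the lecture:
# market share (in %) mapped to its probability.
market_share_pmf = {80: 0.2, 50: 0.7, 20: 0.1}

# A Bernoulli distribution is the two-outcome special case
# (here: the team example with a 60% chance of winning).
team_result_pmf = {"win": 0.6, "lose": 0.4}


def is_valid_distribution(pmf, tol=1e-9):
    """Return True if no probability is negative and they sum to 1.

    `tol` (an assumed tolerance) absorbs floating-point rounding.
    """
    probabilities = list(pmf.values())
    all_non_negative = all(p >= 0 for p in probabilities)
    sums_to_one = abs(sum(probabilities) - 1.0) <= tol
    return all_non_negative and sums_to_one


print(is_valid_distribution(market_share_pmf))  # True
print(is_valid_distribution(team_result_pmf))   # True
print(is_valid_distribution({1: 0.6, 0: 0.5}))  # False: sums to 1.1
```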
Now let's look at the scenarios using a graphical representation, which is helpful for us. These three green bars show us the three possible market outcomes, at 20%, 50%, and 80% market share. You can see that the 20% market share happens with probability 0.1, the 50% market share happens with probability 0.7, and the 80% market share happens with probability 0.2. This is known as a probability density function, or simply PDF.
For any probability distribution, including the simple one that reflects our three scenarios, two useful descriptors are often calculated. One is the expected value, or mean; the other is the standard deviation. For the discrete example that we saw just now, we can calculate the mean using the following formula: p1 D1 + p2 D2 + p3 D3, that is, 0.2 × 80 + 0.7 × 50 + 0.1 × 20, and this totals up to 53. We can do this for any distribution. The mean is defined as the sum of the products of each scenario value and its probability. You might wonder what this signifies. The mean, in this case 53, reflects the average market share you would see if the firm had a chance to try the same action infinitely many times. In this slide, I'm just showing you where the mean sits in the probability density function, the PDF. We can see that the mean, 53, is slightly to the right of the middle scenario. That's because a high market share is more likely than a low one.
Now we are ready to describe the standard deviation. The standard deviation, roughly speaking, is how far away your random variable could be from the average. In other words, it tells you how spread out your distribution is around the mean. We can calculate the standard deviation by the following process. First, take the difference between the scenario you are interested in and the mean. Square the difference and then multiply by the corresponding probability. If you have three scenarios, do this for every scenario, add them up, and take the square root. Let's do it for our three-scenario example. We take the probability, 0.2, and multiply it by the square of the difference between the 80% market share and the mean.
Take the second scenario, with probability 0.7, and multiply it by the square of the difference from the average. And then take the last scenario, with probability 0.1, and multiply it by the square of the difference from the mean. Add all the scenario values you just calculated and take the square root. In this case, it comes out to be 16.16. This graph gives you a graphical interpretation of the mean and standard deviation. The mean value tells you the average value you're going to see when you enter the market. And the standard deviation tells you the spread of all the possible values around the average. In this case it's 16.16.
Now you'd be interested in what happens if there are multiple outcomes. How do we generate the PDF? Suppose there are n outcomes, more than three outcomes. So you have outcome D1 with probability P1, outcome D2 with probability P2, outcome D3 with probability P3, and so on up to n outcomes. Clearly, all the probabilities should add up to 1. The probability density function, or PDF, of this random variable is simply the following. The random variable takes a value equal to Dk. Let's say Dk is D3. It takes the value D3 with probability P3. In general, the probability that the random variable is equal to Dk is Pk.
The distributions we saw so far were described by the PDF, the probability density function. It tells you the precise probability of seeing an outcome. Sometimes they're also described by the cumulative distribution function, or CDF. The CDF serves the same purpose as the PDF. It's just another way of describing the random variable. The CDF is often represented by capital letters, and the PDF is often represented by small letters. What's the cumulative distribution function? The cumulative distribution function is nothing but the sum of all the little PDF values up to the point of interest. For example, you could ask what the cumulative probability value of D3 is, that is, F of D3. This is nothing but the probability that the random variable is less than or equal to D3.
To calculate the cumulative distribution function, all you have to do is add up the PDF values of every outcome up to the third, in this case. So P1 + P2 + P3 gives you the probability that the random variable is going to be less than or equal to D3.
How do we calculate the mean and the standard deviation of a distribution with n discrete outcomes? The mean is, again, the sum of each possible outcome multiplied by its corresponding probability. Multiply each outcome by its probability and then add up all possible scenarios, in this case n scenarios. For the standard deviation, again take the difference between the scenario value and the average, square it, and multiply it by the probability of seeing the scenario. Do this for every possible scenario, add them all up, and take the square root. Let me repeat that quickly. You take the difference between the scenario and the average, square it, and multiply it by the probability of seeing that scenario. For the first scenario, this is P1 times the square of D1 minus D bar. Then do this for each scenario, with P1, P2, P3, and so on up to Pn. Add them all up and then take the square root. That gets you the standard deviation.
Let's look at an example with multiple outcomes, the dice example. Suppose a die is cast. In this case the random variable is the number that shows up. Let's denote the random variable by X. There are six possible outcomes for X. It could be 1, 2, 3, 4, 5, or 6. If it's a fair die, all of these outcomes have the same probability of occurring. Therefore, we can write down the probability density function, or PDF, very quickly. It is f(n); remember, we always represent the probability density function by small letters. f(n) is the probability that the random variable takes the value n, and that's one over six, as long as the value we are interested in, n, is one, two, three, four, five, or six. The probability that X takes the value seven is zero. No other values are possible for this random variable.
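The recipes for the mean, the standard deviation, and the CDF of a discrete distribution can be sketched in a few lines of Python. This is an illustrative sketch rather than anything shown in the lecture; the function names `mean`, `std_dev`, and `cdf` are assumptions, and the numbers are the three-scenario market share example (80%, 50%, 20% with probabilities 0.2, 0.7, 0.1).

```python
import math


def mean(outcomes, probs):
    """Sum of each outcome multiplied by its probability."""
    return sum(p * d for d, p in zip(outcomes, probs))


def std_dev(outcomes, probs):
    """Square root of the probability-weighted squared differences from the mean."""
    d_bar = mean(outcomes, probs)
    variance = sum(p * (d - d_bar) ** 2 for d, p in zip(outcomes, probs))
    return math.sqrt(variance)


def cdf(outcomes, probs, x):
    """Probability that the random variable is less than or equal to x."""
    return sum(p for d, p in zip(outcomes, probs) if d <= x)


# Market share example from the lecture (values in percent).
shares = [80, 50, 20]
probs = [0.2, 0.7, 0.1]

print(round(mean(shares, probs), 2))     # 53.0
print(round(std_dev(shares, probs), 2))  # 16.16
print(round(cdf(shares, probs, 50), 2))  # 0.8  (share of 50% or less)
```

The same three functions work unchanged for any finite list of outcomes, for example the six faces of a die with probability 1/6 each.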
The distribution of this random variable is an example of a discrete uniform distribution. Why? Because it is discrete and countable: one, two, three, and so on. And it's uniform because every outcome has the same probability.
Now let's calculate the mean and the standard deviation for our dice example. For the mean, recall the formula that I wrote out before for the three-scenario case. We can write this out for the n-scenario case. For the six-scenario case here, you take each outcome, say one, two, three, or four, multiply it by its probability, and add them all up. When you do that for every scenario, you get 3.5. Clearly, you're not going to see 3.5 when you throw the die. What does this value represent? This value tells you that if you throw the die many, many times, an infinite number of times, on average you're going to see 3.5. This makes sense because your average has to lie somewhere in the middle, between one, two, three and four, five, six, because all outcomes are equally likely. Let's calculate the standard deviation using the formula that we saw before. We calculate this and it comes out to about 1.71. Again, the standard deviation measures how spread out your values could be around the mean.
Now let's calculate the probability that the random variable is less than or equal to some value, which is the cumulative distribution function, or CDF. The CDF is again represented by big letters. So we have big F(n), the probability that the random variable is going to be less than or equal to n. Again, we sum all the probability values up to that scenario, so we have P1 + P2 + P3 and so on up to Pn. Let's take this further for our dice example. Big F(1) is 1 over 6. If we are interested in F(4), what's the probability that you throw the die and see a value up to 4? That is, you see one, two, three, or four. That is 4/6, or about 0.667.
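The die calculations can be checked by direct enumeration. The sketch below is an illustration added to the transcript, not the lecturer's own method; it uses Python's `fractions.Fraction` so that the probabilities stay exact, and the names `faces`, `p`, and `F` are assumptions chosen to mirror the notation in the lecture.

```python
import math
from fractions import Fraction

faces = range(1, 7)   # possible outcomes 1..6
p = Fraction(1, 6)    # fair die: the same probability for every face

# Mean: each outcome multiplied by its probability, summed.
mean = sum(p * n for n in faces)                    # Fraction(7, 2)

# Standard deviation: weighted squared differences, then the square root.
variance = sum(p * (n - mean) ** 2 for n in faces)  # Fraction(35, 12)
std_dev = math.sqrt(variance)


def F(n):
    """CDF: probability that the die shows n or less."""
    return sum(p for face in faces if face <= n)


print(float(mean))        # 3.5
print(round(std_dev, 4))  # 1.7078
print(F(1), F(4), F(6))   # 1/6 2/3 1
```

Keeping the probabilities as fractions shows that F(4) is exactly 2/3, so any decimal such as 0.667 is a rounding.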
So four times out of six you're going to see a number that is four or less. Clearly, the CDF goes up to one. The probability of seeing six or less is one because the highest value possible is six. Hence, the cumulative distribution function always takes the value of one at the highest outcome.
We are now ready to write out the formulas for the discrete uniform distribution. Let's consider the random variable big X that follows a discrete uniform distribution. Let's say the possible values of this discrete uniform distribution go from small n equals 1 to big N. The probability density function is exactly the same for any possible n, and therefore it's one over big N, as long as small n is between 1 and big N. The cumulative distribution function is small n over big N. The mean and standard deviation can be calculated. We're going to skip the steps, but when you calculate them you will see that the mean is big N plus 1, divided by 2, and the standard deviation is the square root of the quantity big N squared minus 1, divided by 12, with the division inside the square root.
Having finished the uniform distribution, let's go through another example. This example is called the binomial distribution. Let's look at an example with multiple outcomes, a binomial distribution example. Let's look at a specific scenario where you're thinking about a financial investment in a new medical drug that's supposed to cure a difficult-to-cure ailment. The drug is currently being tested in the clinical trials phase. Based on the lab tests so far, the new drug has a 60% chance of curing the ailment and a 40% chance of failing or showing no effect. Let's say it's being tested on ten similar patients during the clinical trials. How many successes would there be after the trials? The number of successes is clearly a random variable. Maybe the drug succeeds on everybody it's tested on.
Then the number of successes is ten. Or it may not succeed on anybody, and then the number of successes is zero. So the random variable can be zero, one, two, three, four, or anything up to ten. So there are 11 possible outcomes. The random variable here is the number of successes, and it is distributed binomially. Every case is either a success or a failure. We can count the number of successes, and it follows the distribution with the PDF shown on the right. For example, if you're interested in the exact probability that there are six successes, we look at small f(6), which is 25.08%. Thus, the probability of seeing exactly six successes in our ten trials is about 0.25.
We can calculate the mean and the standard deviation for this distribution. I'm going to leave it to you as an exercise. You can use the steps I showed you in the slide titled Multiple Outcomes Mean and Standard Deviation to calculate the mean and standard deviation. You should find that the mean comes out to be six, and the standard deviation comes out to be 1.549.
In this slide we see a graphical representation of the binomial distribution for the drug trial example. It's called a binomial distribution because for every trial, that is, every patient, there's either a success or a failure. There are two outcomes for each patient, and therefore it's called a binomial distribution. There are multiple patients here, and the total number of successes is distributed binomially. We have zero successes with very low probability, and a slightly higher probability of having all ten successes. This is because the success probability for each patient is slightly higher than 50%. It's 60% for each patient.
In this table we calculate both the PDF and the cumulative distribution function, the CDF, for our binomial distribution. From the PDF, again, the probability of seeing exactly six successes is about 25%. Suppose you want to calculate the probability of seeing six or fewer successes.
That is, the number of successes is less than or equal to 6. This is about 61.8%. How do we calculate this 61.8%? We add up the PDF values for every outcome from zero up to six. So we add up the probability of zero, the probability of one, the probability of two, three, four, five, and six. The sum of all these probabilities is about 0.618. Clearly, the probability of total successes being less than or equal to ten is 100%, because there are only ten trials.
In this slide we see a graphical representation of the cumulative distribution function for the binomial distribution that we saw in the table before. This is the probability that the number of successes will be less than or equal to n, for each possible n. Clearly, at ten the cumulative distribution function value is one.
We are now ready to write down the formula for the binomial distribution. A random variable that is binomially distributed is completely characterized by two parameters: the total number of trials, denoted by big N, and the probability of success in each trial, denoted by small p. Note two things. The success probability p is the same for each trial, and the trial outcomes are independent. Clearly, the possible outcomes can be numbered by small n, which can take any value from zero, one, two, three, up to big N. We can now write the probability of seeing exactly small n successes. This is again given by the probability density function, denoted by small f(n). Small f(n) is the product of big N choose small n, multiplied by p raised to the power small n, multiplied by one minus p raised to the power big N minus small n. What's big N choose small n? It is the number of ways we can get small n successes from big N trials. It's simply big N factorial divided by the product of (big N minus small n) factorial and small n factorial. What's a factorial? N factorial is nothing but the product of all the whole numbers from 1 up to N. For example, four factorial is the product of one, two, three, and four, and we get 24.
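Written out in symbols, the spoken formula reads as follows. The notation (big N for the number of trials, small n for the number of successes, p for the per-trial success probability) follows the lecture; the typesetting is an added sketch, since the slide itself is not reproduced in the transcript.

```latex
f(n) = \Pr(X = n) = \binom{N}{n}\, p^{n}\, (1-p)^{N-n},
\qquad n = 0, 1, \dots, N,
\qquad\text{where}\qquad
\binom{N}{n} = \frac{N!}{(N-n)!\; n!}.

% Drug trial example: N = 10, p = 0.6, n = 6
f(6) = \binom{10}{6}\,(0.6)^{6}\,(0.4)^{4}
     = 210 \times 0.046656 \times 0.0256
     \approx 0.2508 .
```

The worked line reproduces the 25.08% quoted for exactly six successes in ten trials.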
So to summarize, the probability of seeing exactly small n successes is the number of ways we can choose small n successes from big N trials, multiplied by the probability of seeing small n successes, multiplied by the probability of seeing big N minus small n failures. We can similarly write down the cumulative distribution function by adding up all the PDF values up to the value we are interested in. Even though the formulas look complicated, you can calculate the mean and standard deviation very quickly using the following formulas. For the binomial distribution, the mean is given by the product of N times p, and the standard deviation is given by the square root of the product of N times p times 1 minus p.
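The binomial formulas can be sketched in Python and checked against the drug trial numbers (N = 10 trials, p = 0.6). This is a minimal illustration under stated assumptions, not the course's own material: the function names `binomial_pdf` and `binomial_cdf` are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
import math


def binomial_pdf(n, N, p):
    """Probability of exactly n successes in N independent trials."""
    return math.comb(N, n) * p ** n * (1 - p) ** (N - n)


def binomial_cdf(n, N, p):
    """Probability of n or fewer successes: sum of the PDF from 0 to n."""
    return sum(binomial_pdf(k, N, p) for k in range(n + 1))


# Drug trial example from the lecture.
N, p = 10, 0.6

print(round(binomial_pdf(6, N, p), 4))       # 0.2508  (exactly six successes)
print(round(binomial_cdf(6, N, p), 4))       # 0.6177  (six or fewer successes)
print(round(N * p, 3))                       # 6.0     (mean)
print(round(math.sqrt(N * p * (1 - p)), 3))  # 1.549   (standard deviation)
```

Summing `binomial_pdf` over all outcomes from 0 to N gives 1, which is a quick sanity check on any implementation of the formula.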