Have you ever used the word correlation? You probably have. Now I will explain what correlation means in the statistical sense and what it can be used for.
Let's start with the definition. Correlation is an index that expresses how strong the linear relationship between two numerical variables is. This index is always between minus 1 and plus 1. Note that a correlation of 0 means that there is no linear relationship.
Let's have a look at an example. You are studying two variables, A and B, and these are the measurements. A and B show a positive correlation. This means that higher values of one variable generally correspond to higher values of the other variable. So, what would a negative correlation look like? A negative correlation means that the variables are correlated in the opposite direction, so higher values of one variable generally correspond to lower values of the other variable.
Now, let's take a look at some examples. Can you guess the correlation in this graph? It is equal to 0.93. This is what we call a strong positive correlation. Because the line is upward sloping, the correlation is positive, and because the dots all lie very close to the line, the correlation is also close to 1.
Okay, what about this graph? In this example, you need a bit more imagination to see a straight line, but you can, and it is upward sloping. Thus, the correlation is positive, but it is relatively weak because the dots are farther away from the line. The correlation is 0.31, which we call a weak positive correlation.
And what about this third one? In this example there is no correlation; the coefficient is very close to 0. The observations are just scattered around. There is no relationship.
And now our last example. Can you see a relationship here? Well, it is a curved relationship, a so-called non-linear relationship. So we say that there is no correlation, even though the variables are clearly related, because correlation is a linear measure.
But what does a correlation tell us?
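As a brief aside before the interpretation, here is a minimal sketch of how the examples above could be checked numerically. It assumes that the index described here is the usual Pearson correlation coefficient. The function name `pearson_correlation` and all data values are made up for illustration; they are not the measurements shown in the graphs.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (numerator).
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (used in the denominator).
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_deviation / math.sqrt(ss_x * ss_y)


# Illustrative (invented) measurements of two variables A and B.
a = [1, 2, 3, 4, 5, 6, 7]
b_positive = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8]  # B rises as A rises
b_negative = [14.0, 12.1, 9.8, 8.2, 6.1, 3.9, 2.0]   # B falls as A rises

# A curved (non-linear) relationship: a symmetric U-shape.
x_curved = [-3, -2, -1, 0, 1, 2, 3]
y_curved = [x ** 2 for x in x_curved]

r_positive = pearson_correlation(a, b_positive)
r_negative = pearson_correlation(a, b_negative)
r_curved = pearson_correlation(x_curved, y_curved)

print("positive example:", round(r_positive, 3))
print("negative example:", round(r_negative, 3))
print("curved example:  ", round(r_curved, 3))
```

In this sketch the first two results come out close to plus 1 and minus 1, while the curved example gives a coefficient of 0 even though y is fully determined by x. If you use Python 3.10 or later, the standard library function `statistics.correlation` should give the same values.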
Let's look at the interpretation. In Amsterdam, many people take their bikes to work, and so do I. Now, I want to know whether the amount of rain is related to the number of people who travel to work by bike. After some research, I find a strong negative correlation. This means that the number of people who travel to work by bike decreases as the amount of rain increases.
It is important to realize that I just assumed a cause-and-effect relationship. I assumed that rain is the cause of people taking their bikes less. However, statistics alone will not tell you which variable is the cause and which is the effect. So be careful with recommendations like "more people should go by bike, because it will then rain less." In such a case, you have assumed the wrong cause-and-effect relationship. You have basically just turned it around.
Okay, you can also compute the correlation coefficient in Minitab under Stat > Basic Statistics > Correlation.
So, how do you interpret a correlation coefficient? Here are some rules of thumb. A correlation coefficient from 0 to 0.4 means that there is no or only a weak correlation between the variables. A coefficient from 0.4 to 0.8 means that there is a moderate correlation. A coefficient from 0.8 to 1 means that there is a very strong correlation. If the coefficient is exactly 1, we speak of a perfect correlation between the two variables. However, in reality, it rarely occurs that two variables correlate perfectly. And, as a final warning, always be careful not to mix up correlation with causality.
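The rules of thumb above can be written down as a small lookup. This is only a sketch under two assumptions that the lecture does not state: the same ranges are applied to negative coefficients by looking at the absolute value, and the boundary values 0.4 and 0.8 are counted in the higher category. The function name `describe_correlation` is hypothetical.

```python
def describe_correlation(r):
    """Map a correlation coefficient to a (strength, direction) pair of labels."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    # Assumption: the rules of thumb apply to the size of r, whatever its sign.
    size = abs(r)
    if size == 1.0:
        strength = "perfect"
    elif size >= 0.8:      # assumption: 0.8 itself counts as very strong
        strength = "very strong"
    elif size >= 0.4:      # assumption: 0.4 itself counts as moderate
        strength = "moderate"
    else:
        strength = "no or weak"

    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    return strength, direction


# The two coefficients mentioned for the example graphs.
print(describe_correlation(0.93))
print(describe_correlation(0.31))
```

Note that such a label only describes the strength and direction of the linear relationship. It says nothing about which variable, if any, is the cause.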