We have defined modular exponentiation and shown how it can be applied in our first security applications. Now we are going to show how it can be implemented efficiently, and whether such an efficient implementation introduces any security vulnerabilities that leak information.

These are the two simple ways we have seen before to implement a to the power e mod n. The first one is exponentiation then modular: we first do the exponentiation and then we do the modular reduction. The second one is what we called iterative exponentiation and modular: we keep multiplying by a, and whenever the product becomes larger than n, we do a modular operation to keep the value less than n.

Software engineers, or people who design crypto algorithms, care about the correctness of the algorithm. What they want is to say: I want to calculate a to the power e mod n. It is the job of hardware designers to implement this efficiently. Efficiency in terms of computation time, how fast you can find the result, and in terms of energy consumption, how much energy or power the operation is going to take. As hardware designers, we have been doing this job of optimizing designs all the time. This time, one question we want to ask is: can we do the multiplication fewer than e minus 1 times? Because in both of these approaches you have to multiply by a for e minus 1 times. The answer is yes, and I'm going to show you the motivation behind the algorithm here.

First, let's see how we compute a to the power 16. You could say, okay, I'll multiply 16 a's together, which uses 15 multiplication operations. But then you realize that a to the power 16 is actually a to the power 8, squared. And if you try to generalize this, you realize a to the power 16 is a squared, squared, squared, and squared again. So if you follow this approach, you are doing the square operation four times; that is, you multiply the number with itself four times. We have reduced the number of multiplications from 15 to 4 with this implementation.

However, you are going to say, okay, this only works because 16 is a power of 2. What happens if it is not? For example, what if I ask you to compute a to the power 23 instead of a to the power 16? In this case I can still make it more efficient, because I realize a to the power 23 equals a to the power 16, times a to the power 4, times a to the power 2, times a. So there is no need for me to do 22 multiplications. I can compute a to the power 16, a to the power 4, and a to the power 2, and multiply them together. We will count exactly how many multiplications this takes, but it is definitely fewer than 22.

This motivates the algorithm called square and multiply: as I showed you, compute all the squares, that is, a raised to the powers of 2, and then multiply the needed ones together. Depending on how you implement this, there are two ways. One is called the left-to-right implementation; the other is called the right-to-left implementation.
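Before going into those two implementations, here is a minimal Python sketch, my own and not from the slides, of the counting argument above: the naive approach uses e minus 1 multiplications, while the decomposition of a to the power 23 uses only seven. The function names are hypothetical.

```
def naive_modexp(a, e, n):
    """Multiply by a, e-1 times, reducing mod n whenever the product grows."""
    result = a % n
    for _ in range(e - 1):
        result = (result * a) % n   # e - 1 multiplications in total
    return result

def modexp_23(a, n):
    """a^23 = a^16 * a^4 * a^2 * a, built from repeated squaring."""
    a1 = a % n
    a2 = (a1 * a1) % n              # a^2   (1st squaring)
    a4 = (a2 * a2) % n              # a^4   (2nd squaring)
    a8 = (a4 * a4) % n              # a^8   (3rd squaring)
    a16 = (a8 * a8) % n             # a^16  (4th squaring)
    return (((a16 * a4) % n) * ((a2 * a1) % n)) % n   # 3 more multiplications

a, n = 7, 1000003
assert naive_modexp(a, 23, n) == modexp_23(a, n) == pow(a, 23, n)
```

So the decomposition uses four squarings plus three extra multiplications, seven in total, instead of twenty-two.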
Over the next two slides I'm going to show you both of these approaches. The first one is the left-to-right implementation. Our goal is to compute a to the power e mod n. The first step is to convert the exponent e to binary; in this case it is an s plus 1 bit number. We set the result b equal to 1, and then we run a loop. The loop starts from bit s, which is the left-most bit, or what people call the most significant bit, and goes down to the last bit, the right-most bit, which people call the least significant bit. What we do at each bit, inside the loop, is square b, and then check whether bit k sub i is 1. If it is 1, we do one more multiplication; if the bit is 0, we don't. We repeat the loop this many times, and at the end we return b as the result.

I'm going to demonstrate the execution of this algorithm with a small example: a to the power 23, as we mentioned on the previous slide. The first thing I do is convert 23 to binary, which is 16 plus 4 plus 2 plus 1, so 10111. Following the pseudocode, I start with b equal to 1, and I start i from the most significant bit, which is the 1 here. Inside the for loop I square b, so it is still 1. Because this bit is 1, I do a multiplication: a times 1 gives me a. So after the first bit, b becomes a.

I move on to the second bit. I square b, which gives a squared. Since this bit is 0, I stop here and move on to the next iteration. In the next iteration I square the current b, which is a squared, so it becomes a to the power 4. Because this bit is 1, I multiply by another a, so now I have a to the power 5. Then I move to the next bit: first I square, which gives a to the power 10, and because this bit is 1, I multiply by a again, so it becomes a to the power 11. When I reach the last bit, I square a to the power 11, which gives a to the power 22. This last bit is also a 1, so I multiply a to the power 22 by a again, which gives a to the power 23, exactly what we want to compute. At each of these steps we are working under the modular operation, similar to the iterative exponentiation-and-modular approach: whenever we get a number larger than n, we do the modular reduction.

Next, we show the other implementation, the right-to-left square-and-multiply algorithm. The first step is the same: convert e to a binary number of s plus 1 bits. The differences from the previous version are highlighted in green. In the next step, instead of setting b equal to 1, I set b equal to a to the power of the last bit, the least significant bit. So the starting value of b will be either 1 or a, depending on the value of k sub 0: if k sub 0 is 1, we start with a; if k sub 0 is 0, b starts with a to the power 0, which is 1. Okay? And I also introduce c, another intermediate variable I will need along the way, whose initial value is a.
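Before we continue with the right-to-left loop, here is a minimal Python sketch of the left-to-right version just walked through. The function name and test values are my own, but the structure follows the pseudocode on the slide: square in every iteration, multiply only when the bit is 1.

```
def square_and_multiply_ltr(a, e, n):
    bits = bin(e)[2:]               # exponent in binary, most significant bit first
    b = 1                           # start with the result b = 1
    for k_i in bits:                # scan from bit s down to bit 0
        b = (b * b) % n             # always square
        if k_i == '1':
            b = (b * a) % n         # extra multiply only when the bit is 1
    return b

# Worked example from the slide: a^23, where 23 = 10111 in binary.
assert square_and_multiply_ltr(5, 23, 1000003) == pow(5, 23, 1000003)
```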
Now I start i from 1, the second least significant bit from the right, and move all the way to the left; that is why it's called the right-to-left implementation. I move all the way to the most significant bit, okay? At each bit position I square c, the intermediate variable we initialized above, and then, if the bit value k sub i equals 1, I multiply b by c. Then I move on to the next iteration.

Let's see the same example again, a to the power 23, and how we calculate it. We know 23 is 10111 in binary. In this example, c starts with a, and b starts with a to the power of k sub 0; our k sub 0 is 1, so b also starts with a. Then I start from this bit here. First I set c equal to c squared, so c becomes a squared. Because this bit is 1, I multiply b, which is a, by c, which is a squared, so b becomes a cubed. I then move on to the next bit and square c, so a squared becomes a to the power 4. This bit is again a 1, so I multiply the current b, a cubed, by the current c, a to the power 4, and I get a to the power 7. The next bit is a 0, so I square the current value of c, which becomes a to the power 8, and I do nothing to b. The last bit is a 1, so I square the current value of c, and a to the power 8 becomes a to the power 16. Because this bit is 1, I multiply the current value of b, a to the power 7, by the current value of c, a to the power 16. Multiplying them gives a to the power 23, which is exactly what we are trying to compute. Okay? So this is a different approach, but both do the same thing, which we call square and multiply: inside the for loop, you have a square and you have a multiplication.

So how efficient is this? Consider the same case, a to the power 23. With the naive implementation, I need to do 22 multiplications. However, if I convert 23 into binary, I know this loop is going to run four times, and each time I do a square, so that is four squarings. How many multiplications do I need on top of that? That depends on how many 1s there are; in this case there are three 1s, so three multiplications. I did not count the least significant bit, because it has already been used in the initialization; in my for loop, I start from the second bit. So in this particular example, with the square-and-multiply algorithm, I need only seven multiplications, and compared to 22, that is less than one-third.

Now let's see whether there are any potential security vulnerabilities in such an implementation. One important thing we observe is that this particular operation, this block, is executed if and only if the key bit equals 1. The operation is a multiplication and a modular reduction, and we mentioned that the key size, the size of n, and all the other numbers used for Diffie-Hellman are very large, so this is a non-trivial operation. Okay. And this operation is only done when the key bit equals 1. That leads us to the following thinking, illustrated by the sketch below.
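Here is a minimal Python sketch of the right-to-left version just walked through; the function name is my own. Note the conditional multiply inside the loop, which is exactly the key-bit-dependent block just discussed.

```
def square_and_multiply_rtl(a, e, n):
    bits = bin(e)[2:][::-1]         # exponent bits, least significant first
    b = pow(a, int(bits[0]), n)     # b starts as a^(k_0): a if k_0 = 1, else 1
    c = a % n                       # intermediate variable, starts at a
    for k_i in bits[1:]:            # from the second least significant bit upward
        c = (c * c) % n             # always square c: a, a^2, a^4, a^8, ...
        if k_i == '1':
            b = (b * c) % n         # fold c into the result only when the bit is 1
    return b

# Worked example from the slide: a^23 = a * a^2 * a^4 * a^16.
assert square_and_multiply_rtl(5, 23, 1000003) == pow(5, 23, 1000003)
```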
So what happens if you implement this on your smart card, or on similar devices? Your attackers can observe information through side channels. They can observe, for example, execution information about this algorithm, and from it infer whether the key bit is 1 or 0. If it is 1, that iteration is probably going to take a longer time to compute; it is probably going to consume more power and draw a higher current. If it is 0, the extra multiplication is skipped, so the iteration takes less time and does not consume as much power. This leads us to what people call side-channel attacks.

What you have seen here is that we have this modular exponentiation operation, which is required for crypto algorithms, or crypto primitives, that are proven to be strong enough. However, when we implement it, we implement for performance, trying to make it faster and more energy efficient. And we realize the implementation behaves differently when the key bit equals 1 versus when it equals 0. This leaves a loophole, a vulnerability, that attackers can exploit through what people call side-channel attacks.
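As a rough illustration of the timing side of this leak, here is a small Python sketch, my own and not from the lecture, that times the left-to-right routine on two exponents of the same bit length, one with a single 1 bit and one with all bits set. The modulus, base, and repeat count are arbitrary demo values; on a real device an attacker would measure power or current per iteration rather than total run time.

```
import time

def modexp_left_to_right(a, e, n):
    b = 1
    for bit in bin(e)[2:]:          # most significant bit first
        b = (b * b) % n             # always square
        if bit == '1':
            b = (b * a) % n         # extra multiply only when the key bit is 1
    return b

def measure(a, e, n, repeats=200):
    start = time.perf_counter()
    for _ in range(repeats):
        modexp_left_to_right(a, e, n)
    return time.perf_counter() - start

n = (1 << 1024) - 159               # an arbitrary large odd modulus for the demo
a = 0x1234567890ABCDEF
e_light = 1 << 1023                 # 1024-bit exponent with a single 1 bit
e_heavy = (1 << 1024) - 1           # 1024-bit exponent with all bits set

print("few 1 bits :", measure(a, e_light, n))
print("many 1 bits:", measure(a, e_heavy, n))
```

The exponent with many 1 bits triggers the conditional multiply in every iteration and therefore runs measurably slower, which is exactly the kind of data-dependent behavior a side-channel attacker looks for.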