Introduction
Every cipher we have worked with up to this point has been what is called a symmetric key cipher, in that the key with which you encipher a plaintext message is the same as the key with which you decipher a ciphertext message. As we have discussed from time to time, this leads to several problems. One of these is that, somehow, two people who want to use such a system must privately and secretly agree on a secret key. This is quite difficult if they are a long distance apart (it requires either a trusted courier or an expensive trip), and is wholly impractical if there is a whole network of people (for example, an army) who need to communicate. Even the sophisticated Enigma machine required secret keys. In fact, it was exactly the key distribution problem that led to the initial successful attacks on the Enigma machine.
However, in the late 1970's, several people came up with a
remarkable new way to solve the key-distribution problem. This
allows two people to publicly exchange information that leads to a
shared secret without anyone else being able to figure out the secret.
The Diffie-Hellman key exchange
is based on some math that you may not have seen before.
Thus, before we get to the code, we discuss the
necessary mathematical background.
Prime Numbers and Modular Arithmetic
Recall that a prime number is an integer (a whole number) that has as its only factors 1 and itself (for example, 2, 17, 23, and 127 are prime). We'll be working a lot with prime numbers, since they have some special properties associated with them.
Modular arithmetic is basically doing addition (and other operations) not on a line, as you usually do, but on a circle -- the values "wrap around", always staying less than a fixed number called the modulus.
To find, for example, 39 modulo 7, you simply calculate 39/7 (= 5 4/7) and take the remainder. In this case, 7 divides into 39 with a remainder of 4. Thus, 39 modulo 7 = 4. Note that the remainder (when dividing by 7) is always less than 7. Thus, the values "wrap around," as you can see below:
0 mod 7=0 | 6 mod 7=6 |
1 mod 7=1 | 7 mod 7=0 |
2 mod 7=2 | 8 mod 7=1 |
3 mod 7=3 | 9 mod 7=2 |
4 mod 7=4 | 10 mod 7=3 |
5 mod 7=5 |
To do modular addition, you first add the two numbers normally, then divide by the modulus and take the remainder. Thus, (17+20) mod 7 = (37) mod 7 = 2.
Modular arithmetic is not unfamiliar to you; you've used it before when you want to calculate, for example, when you would have to get up in the morning if you want to get a certain number of hours of sleep. Say you're planning to go to bed at 10 PM and want to get 8 hours of sleep. To figure out when to set your alarm for, you count, starting at 10, the hours until midnight (in this case, two). At midnight (12), you reset to zero (you "wrap around" to 0) and keep counting until your total is 8. The result is 6 AM. What you just did is to solve (10+8) mod 12. As long as you don't want to sleep for more than 12 hours, you'll get the right answer using this technique. What happens if you slept more than 12 hours?
Here are some exercises for you to practice modular arithmetic on.Did you notice something funny about the last 5 exercises? While, usually, when we take powers of numbers, the answer gets systematically bigger and bigger, using modular arithmetic has the effect of scrambling the answers. This is, as you may guess, useful for cryptography!
Diffie-Hellman Key Exchange
The premise of the Diffie-Hellman key exchange is that two people, Alice and Bob, want to come up with a shared secret number. However, they're limited to using an insecure telephone line that their adversary, Eve (an eavesdropper), is sure to be listening to. Alice and Bob may use this secret number as their key to a Vigenere cipher, or as their key to some other cipher. If Eve gets the key, then she'll be able to read all of Alice and Bob's correspondence effortlessly. So, what are Alice and Bob to do? The amazing thing is that, using prime numbers and modular arithmetic, Alice and Bob can share their secret, right under Eve's nose! Here's how the key exchange works.
J = NA (mod P)
and sends J to Bob. Similarly, Bob computes the number
K = NB (mod P)
and sends K to Alice. Note that Eve now has both J and K in her possession.
Bob does the same step in his own way, computing
The number they get is the same! Why is this so? Well, remember that K = NB (mod P) and Alice computed KA (mod P) = (NB)A (mod P) = NBA (mod P). Also, Bob used J = NA (mod P), and computed JB (mod P) = (NA)B (mod P) = NAB (mod P).
Thus, without ever knowing Bob's secret exponent, B, Alice was able to compute NAB (mod P). With this number as a key, Alice and Bob can now start communicating privately using some other cipher.
Why Diffie-Hellman Works
At this point, you may be asking, "Why can't Eve break this?" This is indeed, a good question. Eve knows N, P, J, and K. Why can't she find A, B, or, most importantly, NAB(mod P)? Isn't there some sort of inverse process by which Eve can recover A from NA(mod P)?
Well, the thing Eve would most like to do, that is, take the logarithm (base N) of J, to get A, is confounded by the fact that Alice and Bob have done all of their math modulo P. The problem of finding A, given N, P, and NA (mod P) is called the discrete logarithm problem. As of now, there is no fast way known to do this, especially as P gets really large. One way for Eve to solve this is to make a table of all of the powers of N modulo P. However, Eve's table will have (P-1) entries in it. Thus, if P is enormous (say 100 digits long), the table Eve would have to make would have more entries in it than the number of atoms in the universe! Simply storing that table would be impossible, not to mention searching through it to find a particular number. Thus, if P is sufficiently large, Eve doesn't have a good way to recover Alice and Bob's secret.
"That's fine," you counter, "but if P is so huge, how in the world are Alice and Bob going to compute powers of numbers that big?" This is, again, a valid question. Certainly, raising a 100-digit-long number to the power 138239, for example, will produce a ridiculously large number. This is true, but since Alice and Bob are working modulo P, there is a shortcut called the repeated squaring method.
To illustrate the method, we'll use small numbers (it works the same for larger numbers, but it requires more paper to print!). Say we want to compute 729 (mod 17). It's actually possible to do this on a simple four-function calculator. Certainly, 729 is too large for the calculator to handle by itself, so we need to break the problem down into more manageable chunks. First, break the exponent (29) into a sum of powers of two. That is,
(all we're doing here is writing 29 in binary: 11101). Now, make a list of the repeated squares of the base (7) modulo 17:
71 (mod 17) = 7
72 (mod 17) = 49 (mod 17) = 15
74 (mod 17) =
72 * 72 (mod 17) = 15 * 15 (mod 17) = 4
78 (mod 17) = 74 * 74
(mod 17) = 4*4 (mod 17) = 16
716 (mod 17) = 78 * 78 (mod 17) =
16*16 (mod 17) = 1
Then, 729 (mod 17) = 716 * 78 * 74 * 71 (mod 17) = 1 * 16 * 4 * 7 (mod 17) = 448 (mod 17) = 6.
The neat thing is that the numbers in this whole process never got
bigger than 162 = 256 (except for the last step;
if those numbers get too big, you can reduce mod 17 again before multiplying).
Thus, even though P may be huge, Alice's and Bob's computers don't
need to deal with numbers bigger than (P-1)2 at any point.
Key Exchange Worksheet
Here's a worksheet
to try this whole process out on. It may help to work with a few friends!
Questions to Consider