Back to Number Theory and Cryptography


Primes, Modular Arithmetic, and Public Key Cryptography II
(April 22, 2004)

Introduction

Rounding out our study of cryptology, we'll finish with the most-used cipher today. The RSA cipher (named after its creators, Rivest, Shamir, and Adleman) is a public key cipher -- an amazing type of cipher invented in the late 1970’s that doesn’t require the sender and the receiver to have previously shared a secret key. Public key ciphers seem to defy all logic. In a public key cryptosystem, the encipherment step and the decipherment step are two rather different things. Consequently, the "key" to such a system is split in half, into a public key and a private key.

A person, let’s call her Alice, who wants people to be able to send messages to her, distributes her public key widely, possibly in a format like a phone book. She carefully guards the other half of her key, her private key, to ensure that it is kept secret. Her public key allows her correspondents to encipher a message so that only she can read it. If Bob wants to send a message to Alice, Bob looks up Alice’s public key in the book, writes Alice a message, enciphers it and sends it to her. Once Bob has enciphered the message, only Alice can decipher it. Bob can’t even undo what he has done! When Alice gets the message, she uses her private key to decipher it.


Here are two diagrams to show the difference between a public-key system and a symmetric-key system:

Figure 1: In the traditional model, Bob and Alice must agree secretly to the same key K before they can use their cipher. Eve, an eavesdropper, must not know K for the cipher to work. [The letters above each person’s box indicate what he or she knows. M represents a message being sent from Bob to Alice, and K(M) represents the enciphered message. To decipher the message, Alice simply applies K to K(M).]

 

Figure 2: With the public-key model, Alice freely distributes her public key PA, so both Bob and Eve know it. However she keeps her private key, PrA, secret (Bob doesn’t even know it). Even though Eve knows PA and PA(M), Eve can’t recover the message without Alice’s private key. [Again, letters near each box indicate what each person knows. M is the message Bob sends to Alice (he knows it since he wrote it, and Alice knows it since she has deciphered it). PA(M) represents the message enciphered with Alice’s public key. To get M from PA(M), Alice applies PrA to PA(M).]

 

It’s helpful to think of the following analogy for a public key cryptosystem. Alice’s public key is like an open (unlocked) padlock, to which only she has the key (her private key). Alice hands these open padlocks out to anyone who wants one. Then, when Bob wants to send Alice a message, he puts the message in a box, puts Alice’s lock (her public key) on there, and locks it. Once he has done that, only Alice can open the box, since she is the only one with the right key (her private key).

With that in mind, let’s turn to one such public-key system.

The RSA Cipher

The RSA cipher, like the Diffie-Hellman key exchange we have already worked with, is based on properties of prime numbers and modular arithmetic.

  1. Alice chooses two different prime numbers, P and Q, which she keeps secret (in practice, P and Q are enormous — usually about 100 digits long).

  2. She calculates her modulus N by multiplying P and Q together (N = P*Q).

  3. She calculates her Z by multiplying (P-1) and (Q-1) together (Z = (P-1)*(Q-1)). Once she has done this, she can forget what P and Q are.

  4. Alice then picks her encryption exponent E (1 < E < Z) so that E and Z have no factors in common.

  5. Then, she finds her decryption exponent D by solving for the number D such that the product of E and D is 1 modulo Z (E*D (mod Z) = 1). This can be done using the Euclidean algorithm on a computer (or by hand, if the numbers are small enough).

  6. Alice then publicizes the pair of numbers E and N (these numbers make up her public key), and she keeps D secret (this is her private key).

  7. When Bob wants to send a message to Alice, Bob turns his message into a number M (for example, he might use the ASCII code). This is not part of the security of the system, so he can publicize how he does this. In particular, Alice needs to know.

  8. Then, Bob computes C = ME (mod N) and sends C to Alice.

  9. When Alice receives C, she computes CD (mod N) = MED (mod N). By the special properties that E and D have, this yields M back. Thus, Alice gets Bob’s message!

Why this Works

Eve, our eavesdropper, seems rather well-equipped in this scenario. She knows E, N, and C (remember C= ME (mod N)). So, you might think that it would be easy for her to discover M from what she knows. However, this is really quite difficult. The only way to recover M from C is to know D, Alice’s private-key. In order to compute D, Eve needs to know what Z is, and, to do that, she needs to know what P and Q are. That is, Eve needs to be able to factor N (remember, N = P*Q). It’s important to know that the whole system breaks down if Eve can factor N.

What’s so hard about factoring? Well, the numbers used in RSA are typically huge (N is usually about 200 digits long), and there is no easy or fast way known to factor such large numbers. In fact, the fastest ways known are not too much faster than checking all of the prime numbers less than the square root of N, which would certainly take a while! Factoring quickly is a problem that mathematicians have worked on for thousands of years, and it seems like there is no easy answer. In fact, there are large cash prizes (up to $200,000) for people who can factor specific numbers (visit http://www.rsasecurity.com/rsalabs/challenges/factoring/index.html for details).

The beauty of this system is that, while it’s hard for Eve to factor N into P and Q, it’s not hard for Alice to compute N from P and Q. Basically, multiplying is easy, but factoring is hard. For a simple example of this, multiply 7 and 13 in your head. Now try to factor 1001 in your head. The factors of 1001 are fairly small numbers, but I bet it took you a while to find them.

Finally, we’ll address the issue of why MED (mod N) = M. This is obviously important for the cipher to work! However, the math behind this is a bit advanced, so if you had rather skip it, just go on to the next section.

There is a theorem called Fermat’s Little Theorem that states:

For any prime number p and any number a with a < p, ap (mod p) = a.

To convince yourself of this, you can try it out on a few different numbers with a few different primes (however, that won’t be a proof!). Now, to extend this to two primes (remember, in our case, N = P*Q), we’ll use an extension of this theorem that Euler proved:

If n = p*q is the product of two different primes, and a is any number such that a < n,

then a(p —1)*(q —1)+ 1 (mod n) = a.

Notice that the exponent in the conclusion of Euler’s theorem is (p -1)*(q -1) + 1. This number is equal to 1 modulo (p -1)*(q -1). Also, remember that Z = (P -1)*(Q -1) in our above discussion. It ends up that for any exponent equal to 1 modulo (p -1)*(q -1), a to that power is a modulo n. By the way we constructed D from E, E*D = 1 (mod (P -1)*(Q -1)), so MED (mod N) = M.

Try working with this cipher yourself, using the RSA secret-sharing worksheet.

Final Thoughts

Now, if you ever hear anything in the news about a large number being factored, you’ll know why it’s important. Nearly everything that is transmitted over the Internet securely is sent using the RSA cipher. This includes all of your credit card transactions, your banking information, and whatever else you send back and forth online.

Since, as you have experienced, the RSA cipher takes some rather hefty computation, it’s fairly slow to send large messages back and forth enciphered with this cipher. Thus, in practice what is done is that two entities (for example, amazon.com and your computer) will exchange public keys and will use RSA to send each other a new secret key which they will use for another, much faster cipher called DES (or, sometimes Triple-DES). Once they’ve agreed on that secret key, they abandon RSA for the much faster cipher.

In addition, RSA is used to solve the problems of authentication, non-repudiation, and many other Internet security issues. (To find out more about Internet security issues, helpful websites include