In this episode several different mathematical topics are mentioned, including conjoint analysis and onion routing, but we're going to focus on public key cryptography.
Cryptography is a branch of mathematics concerned with the study of hiding and revealing information and also of proving authorship of messages. A typical application of cryptography is one person wanting to send another person a private message, but they can only send messages over public channels. The original message is called the plain-text. An encryption cipher is an algorithm that transforms the plain-text into illegible data called the cipher-text, possibly using another piece of information called the encryption key. The decryption cipher is another algorithm which transforms the cipher-text back into plain-text, also possibly using another piece of information called the decryption key. Encryption and decryption ciphers always come in pairs, as a decryption cipher can only decrypt cipher-texts encrypted with a particular encryption cipher. If one of the two ciphers in the pair uses a key, then both do, and the keys are always intimately related, if not identical.
In symmetric key cryptography, the encryption and decryption keys are either identical, or knowing one, you can easily compute the other one. Almost all ciphers ever invented have been symmetric key ciphers.
The primary disadvantage of symmetric key cryptography is key exchange. If two people using symmetric key cryptography have never met each other and can only communicate over insecure channels (such as the internet), then how are they to exchange keys so that they can read each other's encrypted messages? If they have no other tools at their disposal, then there is no way that they can exchange keys securely. A spy can always listen in to their conversation at the beginning to capture the key exchange, and then the rest of their conversation is completely legible to the spy. The two people must arrange some secure channel beforehand to exchange keys, such as meeting in person, otherwise symmetric key cryptography is vulnerable.
Another disadvantage of symmetric key ciphers is that once an attacker figures out the decryption key (meaning they can read encrypted messages), then they also can encrypt messages to make the message appear as if it is coming from a friendly source. Conversely, once an attacker figures out the encryption key, they can also read all encrypted messages as ell.
The Caesar Cipher is an example of a symmetric key cipher. The key is some integer from 0 to 25.
The encryption cipher goes as follows. Every letter of the alphabet in the message is represented as an integer. 'A' is represented by 0, 'B' as 1, and so on to 'Z' being represented by 25. Non-alphabetic characters are unchanged. To every such integer add the key. If the resulting sum is 26 or greater, then subtract 26 from the result (this is the same as modulo 26 arithmetic). This will be an integer from 0 to 25, which corresponds to a given letter. The cipher-text will have that letter in the corresponding location. Capitalization of the letter in the cipher-text is the same as in the plain-text.
Decryption is much the same as encryption, with the exception that the key is subtracted from the number corresponding to a letter in the cipher-text. The resulting difference could be negative, and if it is, 26 is added to the difference. The result will be a number from 0 to 25, which corresponds to a letter of the plain-text. Again, capitalization is preserved, and non-alphabetic characters remain unchanged.
The Caesar Cipher is extraordinarily easy to break. There are variants that change the key every couple words, or even with every letter, according to some other defined pattern. These variants are somewhat harder to crack, but they all yield rather easily to more sophisticated mathematical analyses.
Asymmetric key ciphers are all much more computationally involved than most symmetric key ciphers, meaning that they are much slower to run. For most applications that involve the communication of more than a modicum of data a hybrid approach is usually used. First public keys are exchanged, and then a key for the symmetric key cipher is encrypted using public key cryptography and then exchanged.
This approach melds the advantage of public key cryptography of not needing a separate secure channel to exchange keys with the speed advantage of symmetric key encryption.
With public key cryptography is a form of asymmetric cryptography. The encryption key is still related to the decryption key, but the decryption key cannot be computed efficiently only knowing the encryption key. However, knowing the decryption key allows one to compute the encryption key with relative ease.
The tremendous advantage of public key cryptography is that a person can publish their public encryption key for all of the world to see. Anyone who sees this public encryption key can use it to encrypt a message to send a message back, and yet the person who publishes his public key can rest assured that it would be practically impossible for an eavesdropper to figure out the private encryption key and thus decode the messages sent encrypted with the published encryption key.
All of the messages in this scenario are over public channels, and yet the communications are secure against all eavesdropping attacks! This is a tremendous improvement over symmetric key cryptography.
Public key cryptography depends on the existence of a type of mathematical functions called trapdoor functions. Trapdoor functions have the property that they are very computationally easy to compute but are very difficult to invert. To invert a function means to figure out what you need to input into a function in order to get a desired output.
The following is a simple example of a function which is easy to compute, but more difficult to invert. Let's pick some prime number p = 19. The particular number doesn't matter so long as it's prime. The larger the prime number, the harder this function will be to invert. We will define our function as f(n) = 3n (mod) p . The (mod) p in this equations means the same thing as dividing by p and then taking only the remainder as your answer.
The first of these questions asked you to evaluate the function at a particular value. The second asked you to invert the function for a particular value. Which one of these two tasks was easier? Why do you think that is?
Trying to invert this function is known as the discrete-log problem, and no known efficient algorithm is known to solve this. If someone ever figures out a way to quickly compute discrete logs, public key cryptography depending on this trapdoor function will become insecure. However, it is thought by informed researchers to be an intractable problem, and it is unlikely that anyone will make such a mathematical break-through in the foreseeable future.
In 1977 Ron Rivest, Adi Shamir, and Leonard Adleman developed a public key encryption algorithm RSA, named after their initials. Theirs was not the first asymmetric key system, but it is the most widely used today.
First, a person publishing a public key in this algorithm must generate both their public and private keys. The following is how he does this:
The public encryption key is e along with n and the private decryption key is d along with n. In order to encrypt a message, it must be represented as a number between 0 and n . Long messages are often broken up into a series of smaller numbers in much the same way that in the Caesar Cipher, words made up of letters are broken up into sequences of numbers.
Suppose we have a plain-text message m which is an integer between 0 and n that we would like to encrypt using RSA. The following is how we would encrypt a message using the private key, which consists of d along with n.