Back to Number Theory and Cryptography


Polyalphabetic Substitution Ciphers (March 18, 2004)

About the Ciphers

Last week we worked on monoalphabetic substitution ciphers -- ones which were encoded using only one fixed alphabet (hence the Greek root "mono" meaning "one"). As you saw, especially when the spaces between words are still there, these are fairly easy to break. Given a few minutes and several people working on a message, the secret contents are revealed. So, how can you make this harder? Well, one way is to use more than one alphabet, switching between them systematically. This type of cipher is called a polyalphabetic substitution cipher ("poly" is the Greek root for "many"). The difference, as you will see, is that frequency analysis no longer works the same way to break these.

One such cipher is the famous Vigenere cipher, which was thought to be unbreakable for almost 300 years! The Vigenere cipher uses the power of 26 possible shift ciphers (which we met last week).

How this Cipher Works

  1. Pick a keyword (for our example, the keyword will be "MEC").

  2. Write your keyword across the top of the text you want to encipher, repeating it as many times as necessary.

  3. For each letter, look at the letter of the keyword above it (if it was 'M', then you would go to the row that starts with an 'M'), and find that row in the Vigenere table.

  4. Then find the column of your plaintext letter (for example, 'w', so the twenty-third column).

  5. Finally, trace down that column until you reach the row you found before and write down the letter in the cell where they intersect (in this case, you find an 'I' there).

Example:

Keyword: M E C M E C M E C M E C M E C M E C M E C M
Plaintext: w e n e e d m o r e s u p p l i e s f a s t
Ciphertext: I I P Q I F Y S T Q W W B T N U I U R E U F

Thus, the urgent message "We need more supplies fast!" comes out:

I I P Q I F Y S T Q W W B T N U I U R E U F

So, as you can see, the letter 'e' is enciphered sometimes as an 'I' and sometimes as a 'Q'. Not only that, but 'I' represents two different letters, sometimes a 'w' and sometimes an 'e'. This renders our favorite tool, frequency analysis, nearly useless. Even though 'e' is used very often in the plaintext, the letters that replace it ('I' and 'Q') don't show up as frequently. Also, now if we check doubled letters in the ciphertext (say 'II' or 'WW'), these are not doubled letters in the plaintext.

You may, then, ask yourself "is there any hope?" Fortunately, there is! Given a long enough piece of ciphertext, certain words or parts of words (like "the") will line up with the keyword several times, giving rise to a repeated string of letters in the ciphertext ("the" may be enciphered as "KPQ" more than once). This can give us a clue as to the length of the keyword. After that, we can use frequency analysis on each piece that was enciphered with the same letter to crack the code. Consequently, cracking these ciphers hinges on finding repeated strings of letters in the ciphertext.

How to crack this cipher:

  1. Search the ciphertext for repeated strings of letters; the longer strings you find the better (say you find the string "KPQ" four times). Note where they are by circling them or highlighting them in some manner.

  2. For each occurrence of a repeated string, count how many letters are between the first letters in the string and add one (for example, if our ciphertext contains KPQRE IIJKO KPQAE, we count that there are nine letters between the first 'K' in the first "KPQ" and the first 'K' in the second "KPQ"; adding one yields ten).

  3. Factor the number you got in the above computation (2 and 5 are factors of 10)

  4. Repeat this process with each repeated string you find and make a table of common factors. The most common factor is probably the length of the keyword that was used to encipher the ciphertext (in our case, assume it was five). Call this number 'n'.

  5. Do a frequency count on the ciphertext, on every nth letter. You should end up with n different frequency counts.

  6. Compare these counts to standard frequency tables to figure out how much each letter was shifted by.

  7. Undo the shifts and read off the message!

Examples

  1. Encipher the following message using the Vigenere cipher and the keyword "IHS":

    there is a secret passage behind the picture frame

    Answer

  2. There is an easier way to use the Vigenere cipher, using modular arithmetic. Discuss how you might do this (hint: represent each letter by a number, starting with 'a' = 0). Using this method, encipher the above message using the key 19, 15, 22

    Answer



  3. Decipher the following message (work as a team!):

    I C J E V A Q I P W B C I J R Q F V I F A Z C P Q Y M J A H N G F

    Y D H W E Q R N A R E L K B R Y G P C S P K W B U P G K B K Z W D

    S Z X S A F Z L O I W E T V P S I T Q I S O T F K K V T Q P S E O

    W K P V R L J I E C H O H I T F P S U D X X A R C L J S N L U B O

    I P R J H Y P I E F J E R B T V M U Q O I J Z A G Y L O H S E O H

    W J F C L J G G T W A C W E K E G K Z N A S G E K A I E T W A R J

    E D P S J Y H Q H I L O E B K S H A J V Y W K T K S L O B F E V Q

    Q T P H Z W E R Z A A R V H I S O T F K O G C R L C J L O K T R Y

    D H Z Z L Q Y S F Y W D S W Z O H C N T Q C P R D L O A R V H S O

    I E R C S K S H N A R V H L S R N H P C X P W D S I L P L Z V Q L

    J O E N L W Z J F S L C I E D J R R Y X J R V C V P O E O L J U F

    Y R Q F G L U P H Y L W I S O T F K W J E R N S T Z Q M I V C W D

    S C Z V P H V C U E H F C B E B K P A W G E P Z I S O T F K O E O

    D N W Q Z Q W H Y P V A H K W H I S E E G A H R T O E G C P I P H

    F J R Q

    Answer



Challenge Problems

After you have tried the examples above, try the ciphers on the challenge sheet.

Other Links*

* These links are for informational purposes only and are from sources outside MEC and Cornell University