In this lab, you will learn about the history of codes and ciphers, and write code for encoding and decoding messages. In doing so, you will be introduced to several new features of JavaScript strings. In particular, you will learn how to locate and access individual characters in a string, and even access entire substrings.
The use of codes (or ciphers) as a means of hiding the meaning of messages traces its roots to ancient history. The first documented use of codes was by Hebrew scribes in approximately 500 - 600 B.C. The Atbash cipher specified that each letter in a message would be encoded using the corresponding letter in the alphabet reversed. For example, 'A' would be encoded as 'Z', 'B' would be encoded as 'Y', 'C' would be encoded as 'X', and so on. The first known military use of codes was by Julius Caesar in 50 - 60 B.C. The Caesar cipher specified that each letter in the alphabet would be encoded using the letter three later in the alphabet. For example, 'A' would be encoded as 'D', 'B' would be encoded as 'E', 'C' would be encoded as 'F', and so on. The code wraps around at the end of the alphabet, so 'X', 'Y' and 'Z' would be encoded as 'A', 'B', and 'C', respectively.
Both the Atbash and Caesar ciphers are examples of substitution ciphers, codes in which one letter of the alphabet is substituted for another. A substitution cipher can be described succinctly by specifying its key, i.e., the sequence of letters to which the alphabet is mapped. For example, the keys for the Atbash and Caesar ciphers are listed below. To encode a specific letter using one of these ciphers, simply find the corresponding letter in the key below it.
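Written as JavaScript strings, the two keys line up under the alphabet as shown in the sketch below. The Atbash key is the alphabet reversed and the Caesar key is the alphabet shifted three places, as described above. The variable names are illustrative only, and the Mystery key is not given here.

```javascript
var alphabet  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
var atbashKey = "ZYXWVUTSRQPONMLKJIHGFEDCBA";   // alphabet reversed
var caesarKey = "DEFGHIJKLMNOPQRSTUVWXYZABC";   // alphabet shifted three places
```

Reading down a column gives the encoding of a letter: 'C' sits above 'X' in the Atbash key and above 'F' in the Caesar key.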
| message | Atbash | Caesar | Mystery |
| --- | --- | --- | --- |
| ABCDE | | | |
| FOO | | | |
| SECRET | | | |
As the previous section showed, encoding messages using a substitution cipher is relatively straightforward. The following steps must take place:
for as many letters as there are in the message
    get the next character in the message
    find its position in the alphabet
    find the corresponding letter in the key
    use that letter to encode the letter in the message
These steps can be encoded in JavaScript as a function which takes the key and a message as inputs and returns the encoded message. The Encode function below performs this encoding using the variable coded to accumulate the coded message. The variable is initially assigned to be the empty string, and as each letter in the message is processed, the appropriate code letter is concatenated onto the end of coded. After traversing the entire message, coded will contain the complete encoded version of the message. (Note: code displayed in color is explained below.)
function Encode(key, message)
// Given  : key is a string of the 26 letters in arbitrary order,
//          message is the string to be encoded using the key
// Returns: the coded version of message using the substitution key
{
  var alphabet, coded, i, ch, index;

  alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  coded = "";
  for (i = 0; i < message.length; i++) {       // for as many letters as there are
    ch = message.charAt(i);                    // access the letter in the message
    index = alphabet.indexOf(ch);              // find its position in alphabet
    if (index == -1) {                         // if it's not a letter,
      coded = coded + ch;                      //   then leave it as is & add
    }                                          // otherwise,
    else {                                     //   find the corresponding
      coded = coded + key.charAt(index);       //   letter in the key & add
    }
  }
  return coded;
}
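As a quick check of how Encode behaves, the sketch below calls it with the Caesar key. The function is repeated in condensed form only so that the snippet runs on its own; the variable names upper, lower, and fixed are illustrative. Note that the alphabet string contains capital letters only, so lowercase letters are treated as non-letters and left unchanged.

```javascript
// Encode is repeated here (condensed) so that the snippet runs on its own.
function Encode(key, message) {
    var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    var coded = "";
    for (var i = 0; i < message.length; i++) {
        var ch = message.charAt(i);
        var index = alphabet.indexOf(ch);
        coded = coded + (index == -1 ? ch : key.charAt(index));
    }
    return coded;
}

var caesarKey = "DEFGHIJKLMNOPQRSTUVWXYZABC";

var upper = Encode(caesarKey, "HELLO, WORLD!");        // "KHOOR, ZRUOG!"
var lower = Encode(caesarKey, "hello");                // "hello" (no capitals to encode)
var fixed = Encode(caesarKey, "hello".toUpperCase());  // "KHOOR"
```

One possible way to handle lowercase input, shown in the last call, is to convert the message with toUpperCase before encoding it.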
The Encode function introduces several new features of the JavaScript string type (shown in color). In JavaScript, a string can be treated as an object. As opposed to just storing a simple value, an object may encapsulate numerous attributes (i.e., variables) and operations on those attributes (i.e., functions) in a single entity. These attributes and operations can be accessed by specifying the string variable, followed by a period, followed by the name of the attribute or operation.
For example, a string has an attribute called length which specifies how many characters are stored in the string. In the Encode function, this attribute is accessed (message.length) and used to specify the number of loop repetitions. The charAt function returns the character stored at a particular index of a string (the first character in the string is considered to be at index 0, the second character at index 1, and so on). This function is used in two places in the Encode function: to access each character in the message (message.charAt(i)) and to find the corresponding character in the key (key.charAt(index)). Similarly, the indexOf function returns the index of the first occurrence of a character in the string (or -1 if not found). This function is used in Encode to find the location of each character in the alphabet (alphabet.indexOf(ch)).
The following HTML code uses the Encode function to encode messages entered by the user. The user specifies the substitution key in a text box and the message to be encoded in a text area. Then, when the user clicks a button, the Encode function is called to encode the message, and the resulting code is displayed in another text area.
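The string attribute and operations used by Encode (length, charAt, and indexOf) can also be tried in isolation. The following is a small sketch; the string "CIPHER" and the variable names are arbitrary choices for illustration.

```javascript
var word = "CIPHER";

var count   = word.length;                    // 6 characters in the string
var first   = word.charAt(0);                 // "C" (indexes start at 0)
var last    = word.charAt(word.length - 1);   // "R" (the last index is length - 1)
var where   = word.indexOf("P");              // 2
var missing = word.indexOf("Z");              // -1, since "Z" does not occur
```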
Given an encoded message and the key by which it was encoded, decoding a message is a straightforward process. The steps in the encoding must simply be performed in reverse. That is, each coded letter must be mapped back into the corresponding letter of the alphabet. Consider the Atbash cipher, for example:
for as many letters as there are in the encoded message
    get the next character in the encoded message
    find its position in the key
    find the corresponding letter in the alphabet
    use that letter to decode the letter in the encoded message
Note the similarities between the steps in encoding and decoding messages!
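The decoding steps above can be sketched in JavaScript by swapping the roles of the key and the alphabet in Encode. The function name Decode and its parameter names are assumptions made for this sketch, not code supplied with the lab.

```javascript
function Decode(key, coded)
// Given  : key is a string of the 26 letters in arbitrary order,
//          coded is a message that was encoded using that key
// Returns: the decoded version of coded
{
    var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    var message = "";
    for (var i = 0; i < coded.length; i++) {
        var ch = coded.charAt(i);              // next character of the coded message
        var index = key.indexOf(ch);           // find its position in the key
        if (index == -1) {
            message = message + ch;            // not a letter: leave it as is
        } else {
            message = message + alphabet.charAt(index);  // corresponding alphabet letter
        }
    }
    return message;
}

var caesarKey = "DEFGHIJKLMNOPQRSTUVWXYZABC";
var plain = Decode(caesarKey, "KHOOR, ZRUOG!");   // "HELLO, WORLD!"
```

This relies on the key containing each of the 26 letters exactly once, so that every coded letter has exactly one position in the key.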
In practice, substitution ciphers can often be broken (decoded without the key) using insight and computational power. As a good example of this, you may be familiar with Crypto-quotes puzzles that appear in many newspapers. These puzzles use a substitution cipher to encode a quotation, and the challenge is to decode the quotation. This can be done by analyzing the patterns of words and the frequency of letters in the quotation. For example, if the same three-letter sequence appears numerous times in the coded quotation, you might work under the assumption that it represents the common word "the".
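Counting letters is the kind of analysis a program can do quickly. The following is a minimal sketch of a letter counter; the function name LetterCounts and the sample text are made up for illustration, and the sketch uses an array, which is not otherwise covered in this lab.

```javascript
function LetterCounts(text)
// Given  : text is a string
// Returns: an array of 26 numbers, where entry 0 is the number of A's,
//          entry 1 the number of B's, and so on (non-letters are ignored)
{
    var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    var counts = [];
    for (var i = 0; i < alphabet.length; i++) {
        counts[i] = 0;
    }
    for (var j = 0; j < text.length; j++) {
        var index = alphabet.indexOf(text.charAt(j));
        if (index != -1) {
            counts[index]++;
        }
    }
    return counts;
}

// "THE CAT AND THE HAT" encoded with the Caesar cipher
var counts = LetterCounts("WKH FDW DQG WKH KDW");
var countOfW = counts["ABCDEFGHIJKLMNOPQRSTUVWXYZ".indexOf("W")];   // 4
```

In this sample, 'W' is the most frequent coded letter and the sequence "WKH" appears twice, which fits the guess that "WKH" stands for "THE".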
The weakness of substitution ciphers is that they always map the same letter to the same key letter. Thus, it is possible to examine the coded message for patterns. An interesting variation on substitution ciphers was adopted by Nazi Germany during World War II. The Germans used a machine called the Enigma to encode military messages. The Enigma machine utilized a series of interconnected rotors to encode letters. In essence, the rotors defined a substitution cipher, mapping one letter to another. What made the Enigma machine so effective, however, was that the rotors were rotated after each letter was encoded, essentially changing the substitution cipher after every letter! This added wrinkle made the Enigma codes virtually impossible to break by hand (until code-breaking machines were built).
This same effect can be obtained in a simple substitution cipher by rotating the key after each letter is encoded (ignoring non-letters). For example, suppose you are using the Atbash cipher to encode "AAA". After mapping 'A' to 'Z', the key "ZYXWVUTSRQPONMLKJIHGFEDCBA" would be rotated (say one letter to the left) to obtain "YXWVUTSRQPONMLKJIHGFEDCBAZ". Thus, the second 'A' would be mapped to 'Y'. Similarly, another rotation of the key would cause the third 'A' to be mapped to 'X'. Using a rotating key, repeated occurrences of a letter in the message are usually mapped to different letters, and so pattern analysis is more difficult.
The following JavaScript function can be used to rotate a string one letter to the left:
function Rotate(letters)
// Given  : letters is a string of letters
// Returns: a copy of that string, with characters rotated (to the left)
{
  return letters.substring(1, letters.length) + letters.charAt(0);
}
This function uses another string operation, substring, to access the part of the string starting at the second character (index 1) and going to the end. It then concatenates the first character of the string onto the end of that substring to obtain the desired rotation.
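One way to combine Rotate with the encoding loop is sketched below. The function name EncodeRotating is an assumption made for this sketch; Rotate is repeated only so that the snippet runs on its own. The key is rotated only after a letter has been encoded, so non-letters leave it unchanged.

```javascript
function Rotate(letters) {
    return letters.substring(1, letters.length) + letters.charAt(0);
}

function EncodeRotating(key, message)
// Given  : key is a string of the 26 letters in arbitrary order,
//          message is the string to be encoded
// Returns: the coded message, rotating the key after each encoded letter
{
    var alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    var coded = "";
    for (var i = 0; i < message.length; i++) {
        var ch = message.charAt(i);
        var index = alphabet.indexOf(ch);
        if (index == -1) {
            coded = coded + ch;                  // non-letter: copy it, leave the key alone
        } else {
            coded = coded + key.charAt(index);   // encode with the current key
            key = Rotate(key);                   // then rotate the key one letter left
        }
    }
    return coded;
}

var atbashKey = "ZYXWVUTSRQPONMLKJIHGFEDCBA";
var rotated = Rotate("ABCDE");                   // "BCDEA"
var coded = EncodeRotating(atbashKey, "AAA");    // "ZYX"
```

Because key is a parameter, assigning to it inside the function changes only the function's own copy; the caller's atbashKey is unaffected.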
Once you have written your code, you should test it thoroughly. List test values below that you used to convince yourself that encodings/decodings are being performed correctly.
Hand in a printout of the cipher.html document, attached to these sheets.