Cryptography involves sending messages that only certain intended people can receive and understand. This involves codes and ciphers. A code substitutes one string for a longer message; there is a code book in which the code strings are associated with their relevant message. The string “A76” could mean “retreat 100 meters.” Code books had to be changed regularly because eventually one would fall into the hands of someone who was not supposed to have one.
A cipher is an algorithm that converts one string of characters into another one of generally the same length. It can operate on bits, on characters, or on blocks of characters. A cipher does not have a code book but does have a key, which is a string of numbers or characters, that the algorithm uses to transform the original string (called the plaintext) into the encrypted string (called the ciphertext). The ciphertext can be transmitted safely because it cannot be understood without the key.
Cryptography has become much more important in the last 30 years or so. It’s not just that the world is an uncertain place. It is more that people wish to share private information across the Internet. If a purchase is made with a credit card, then the card number should be encrypted before sending it to the seller. Access to certain sites that have valuable services or information requires a password. Installing new software requires an access key. These are all examples where encryption is required.
It should be mentioned that the secure transfer of information depends on operational security as well as on encryption. Someone with a password can access all services and data associated with that password, so keys and passwords must be protected. This aspect is beyond the ability of a programmer to control, and is often the way security systems are broken.
There is some terminology that needs to be understood. A symmetric key system uses one key to encode and the same key to decode. Asymmetric systems like public key systems use one key to encrypt the message, a key that anyone can know, and a second, private key that only the recipient knows and is used to decrypt. A block cipher applies a key to a collection (block) of data, often a size of 64, 128, or 256 bits at a time. A stream cipher is usually a symmetric key cipher that encrypts a plain text character with a character from the key. It’s also called a state cipher because the encryption of the next character depends on what has happened before.
Knowing a little about encryption is important, but it is also important to understand that it is a very complex and highly mathematical subject, and requires a significant amount of study to become an expert.
1. One-Time Pad
Having just said how complex the field of cryptography is, the first algorithm to be examined is, in fact, rather old and perfectly secure, if difficult to use in practice. Suppose person A wishes to send person B the message “Meet you at nine pm at location alpha.” Encoding this requires a sequence of random characters at least as long as the message. In actual use, this cipher often used pages from books as keys, books that were easily accessible by both parties. In this case the following text is used as the key: “it was the best of times, it was the worst of times.” The encryption process, known to both, and in fact not really a secret, is to apply the exclusive OR operation to corresponding characters in the message and the key to produce the ciphertext:
The exclusive OR operation is a bit-by-bit logical operator that is 0 if the two bits are equal and is 1 otherwise. It is applied to the numerical representations of the characters. This is quite handy because it is very fast and can easily be accomplished using simple hardware. Consider the first character in the message “m” The first character in the key is “i” The ASCII codes are the numbers 109 and 105, respectively, or in binary, this is
0 1 1 0 1 1 0 1 109 “m”
0 1 1 0 1 0 0 1 105 “i”
0 0 0 0 0 1 0 0 4 Exclusive OR
One interesting observation here is that different characters can be encrypted to the same cipher text byte, as in the above string where “s” and “t” both encrypt to 26. Now, this ciphertext is transmitted to B and is decoded in exactly the same way that it was encoded: apply the exclusive OR between the ciphertext and the same key (symmetric key):
The Python code that can do the basic encryption is as follows:
pt = “meetyouatninepmatlocationalpha”
key = “itwasthebestoftimesitwastheworstoftimes”
ct = “”
xt = “”
for i in range(0,len(pt)):
v = ord(pt[i])^ord(key[i])
ct = ct + chr(v)
The exclusive-OR operator is “^,” and the expression ord(pt[i])^ord(key[i]) performs the XOR on the message and the key bytes, as numbers. Doing it again with the same key gets the message back.
The reason that this is called a one-time pad is that the key can only be used once, otherwise the cipher is not secure. The security lies in the randomness of the key, and reusing it reduces the randomness. Eventually, if the same key is used often enough, an observer, someone who can intercept all of the messages, can extract the pattern and determine the key. In practice, the keys were written on pads of paper and, once used, were destroyed. Keeping the pads synchronized between the sender and receiver can be a problem, especially if there are many of each. Hence, although the system is secure, it is not used very often.
2. Public Key Encryption (RSA)
A public key system is commonly used for secure communication across computer networks, and involves one key for encryption and another for decryption. There are many variations on the basic idea, some being much too complex to discuss in a few pages, but the RSA algorithm is relatively simple, quite popular, and very secure. It is named for its inventors Rivest, Shamir, and Adleman.
The mathematical idea that underlies RSA is that one can find three very large integers e, d, and n:
for any m, and that even knowing e and n or even m, it can be extremely difficult to find d. The values d and e are the keys, and m is the message.
Encrypting a message would work as follows: A sends message m to B using B’s publicly known encryption key e:
c = me mod n
The value of c is the ciphertext and can be transmitted to B. When B receives the message, it is decrypted using their private key d:
m = cd mod n
where n > m. This works because of the original assertion that (me)d mod n = m. The success of this method depends on a few other things: can cd mod n be calculated quickly enough for large numbers (i.e., 500 bits), and can the numbers d, e, and n be found to make this work?
The first step in determining the keys is to select two very large prime numbers, p and q. Let n = p*q. A large number in this context has hundreds of bits, but that creates a cumbersome example, so smaller numbers are used in this discussion.
Now calculate j(n) = (p-1)*(q-1) and find an integer e so that e and n are coprime; that is, the greatest common divisor between e and n is 1.
Let d = (e-1) mod 9(n) so that d*e mod n = 1. This can be found using a search, which may be infeasible due to the size of the numbers:
for i in range (e, n):
if (i*e)%j == 1:
d = i
A mathematical process that uses Euler’s theorem can give the answer faster, and code has been provided for this on the accompanying disk.
3. Example: Encrypt the Message “Depart at Dawn” Using RSA
The first step is to determine some keys to use and to distribute the public key. Using the prime numbers 73 and 83 (far too small for a real situation), the determination of the keys is as follows:
n is 6059 and j(n) is 5904
e is 17, chosen because it is prime. Now find d such that d*e mod n = 1. Searching for it is practical for numbers this size and one gets the following result:
d = 3473
The public key is ( 17,6059 ), and the private key is ( 3473 ).
The message is 14 characters long and is 112 bits; n is only 10 bits long, and the message has to be shorter than this. In this instance, the message can be sent one character at a time, but this is generally poor practice. Normally, larger blocks of data are encrypted at one time. The plaintext string is converted into integers using ord(), and each one is encrypted using the formula
c = me mod n
An example is
message = “Depart at dawn”
imessage = ()
cmessage = ()
for i in range (0, len(message)):
m = ord(message[i])
imessage = imessage +(ord(message[i]),)
c = (m**e) % n
cmessage = cmessage + (c,)
Now the message consists of 14 blocks of 1 character each. It can be transmitted to the recipient, who is normally named B or Bob, in this form. The sender, named A or Alice, had access to the public key only, which is all that is needed to encrypt the message. It cannot be decrypted using the public key.
d given d-e = 1 (mod φ(n))
Bob receives the ciphertext message, which is
(4652, 3518, 4274, 5770, 1663, 344, 2498, 5770, 344, 2498, 2144, 5770, 1725, 4601)
He takes each block and decrypts it using the following formula:
m = c d mod n
The Python code for this is
dmessage = ()
for i in range (0, len(cmessage)):
c = cmessage[i]
m = (c ** d) % n
dmessage = dmessage + (m, )
The resulting decrypted message is
(68, 101, 112, 97, 114, 116, 32, 97, 116, 32, 100, 97, 119, 110)
which is the original message. Notice that because only one block per character was encrypted, the effect is that of a substitution cipher, in which each letter has been replaced by another. This is very easy to decrypt by noting patterns of letters and frequencies of letters in the language; the letter “e” is usually the most commonly used letter in an English message. That is why the message is encrypted as blocks of characters. It is highly unlikely that a large block would be repeated exactly, and if it were it would be difficult to guess what it was anyway.
Source: Parker James R. (2021), Python: An Introduction to Programming, Mercury Learning and Information; Second edition.