Useful Notes: Encryption
the arts of codemaking and codebreaking advanced. Eventually, the algorithms became so complex that machines (such as the Enigma device used in World War II) were required to encrypt and decrypt messages with reasonable speed and accuracy.
Asymmetric encryption is a newer form of encryption, devised in The Seventies; in this form, the key used to encrypt the message and the one used to decrypt it are not the same. In an asymmetric cipher, each party has a pair of keys: a public key and a private key. If Alice wants to send Bob a message, she uses Bob's public key to encrypt the plaintext, and Bob uses his private key to decrypt it. Public keys, as the name indicates, are not required to be secret; private keys are. In short, encryption and decryption use different keys in asymmetric schemes, hence the name.
The advantage of asymmetric encryption is that there is no need for the sender and recipient to know a shared secret key. Suppose you wanted to send an encrypted message to somebody, and you tried to do so using a symmetric cipher. How would you send them the secret key if you're concerned that somebody might eavesdrop? To send them the key, you would need a special, secure channel that is resistant to eavesdropping (for example, an in-person meeting).
Another advantage of asymmetric encryption is that it can be used in reverse: encrypting a message with your private key creates a ciphertext that anyone can decrypt using your public key. This gives the message a digital signature that proves that the private key owner wrote it, because it was encrypted with a key nobody else knows. The two processes can be combined, so that Alice can send Bob a message that nobody else could have written and nobody else can read.
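The public/private key dance described above can be sketched with a toy version of RSA, one well-known asymmetric cipher (the text does not name a specific one, so this choice is an assumption). The primes, the exponent, and the helper names `make_keypair` and `apply_key` are made up for illustration, and numbers this small offer no security whatsoever.

```python
# Toy "textbook RSA" with tiny primes: for illustration only, NOT secure.
# The primes, exponent, and function names are assumptions for this example.

def make_keypair(p: int, q: int, e: int):
    """Build a (public, private) key pair from two primes and a public exponent."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e (needs Python 3.8+)
    return (e, n), (d, n)

def apply_key(key, number: int) -> int:
    """Raise a number to the key's exponent, modulo the key's n."""
    exponent, n = key
    return pow(number, exponent, n)

bob_public, bob_private = make_keypair(61, 53, 17)

# Alice encrypts with Bob's PUBLIC key; only Bob's PRIVATE key undoes it.
message = 65
ciphertext = apply_key(bob_public, message)
recovered = apply_key(bob_private, ciphertext)

# "In reverse": Bob signs with his PRIVATE key; anyone can check the
# signature with his PUBLIC key.
signature = apply_key(bob_private, message)
verified = apply_key(bob_public, signature) == message

print(recovered == message, verified)
```

Real systems use enormous primes, padding schemes, and well-tested libraries rather than anything like this.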
The biggest practical disadvantage of asymmetric encryption is that you need to "trust" that what you think is the recipient's public key really is theirs, and that their private key has not been disclosed. I can pretend to be the President of the United States and send you a public key, and if you mistakenly believe me, you might unwittingly send your top secret messages to me instead of the President and accept my digitally signed messages as if they came from the President. The normal method to verify this is for a third-party public key repository to digitally sign and store usernames and public keys. This does require a third party trusted to be impartial and an accurate record-keeper. Another alternative is a "web of trust", in which people sign each other's keys, so that (for example) Alice can verify that Carol and Dave have signed Bob's key, vouching for the fact that it actually belongs to Bob. Alice then decides whether or not to take their word for it (much as she would if Carol and Dave were vouching for Bob in person).
Another disadvantage of asymmetric encryption is that it is more computationally expensive than symmetric systems. Most secure encrypted channel schemes get around this problem by using the asymmetric encryption solely to transmit a randomly generated one-time symmetric encryption key, then switching to symmetric encryption for the bulk of the transaction.
Security certificates (the bits of data that tell us that individuals online are who they say they are) use the digital-signature technique described above to generate a "seal of approval" that can be read by everyone, but only manufactured by the issuing authority.
One-time pad
The one-time pad is a special kind of cipher that is completely unbreakable if used correctly, but very weak if used incorrectly, and also very impractical. The trick is that the secret key must be as long as the plaintext, must be completely random, and must never ever be reused.
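A minimal sketch of a one-time pad, assuming XOR as the way key and message are combined (the text doesn't say how; XOR is just the usual choice for bytes). The helper name `xor_bytes` and the sample messages are made up for illustration.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Combine data and key byte by byte; the key must match the data's length."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

plaintext = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(plaintext))  # random, as long as the message, used once

ciphertext = xor_bytes(plaintext, key)
recovered = xor_bytes(ciphertext, key)     # XOR with the same key undoes itself

# A different key of the same length turns the very same ciphertext
# into a completely different, equally plausible message.
decoy = b"RETREAT AT TEN"
fake_key = xor_bytes(ciphertext, decoy)
decoy_result = xor_bytes(ciphertext, fake_key)

print(recovered, decoy_result)
```

Note that `secrets.token_bytes` draws from the operating system's random source, which is good enough for a demonstration; whether it counts as "completely random" in the strict one-time-pad sense is a separate question.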
The reason one-time pads are unbreakable is that for any conceivable plaintext, there exists a possible key that would produce that plaintext from the encrypted message. This means that if you try to guess what the key is, there are exponentially many more false positives than the real message, and no way to tell a false positive from a true positive. But if the users of a one-time pad get sloppy and reuse a key for more than one message, it becomes trivial to break. If the keys are not truly randomly generated, it can be broken too. A number of historical codebreaking successes resulted because somebody tried to use one-time pads but either reused the keys or generated them in a non-random fashion. Then there is also the problem of communicating the keys, which is even harder than in the normal case because (a) you need as many keys as messages, and (b) the keys are as long as the messages.
A related idea is the cryptographic nonce ("number used once"). This is often used in challenge-response type authentication. Here, Alice gives Bob a nonce and asks for his password. Bob encrypts the password with the nonce and sends the result back. Alice then does the same thing on her end, and if the outputs match, she knows Bob is legitimate. This prevents an eavesdropper from replaying Bob's response later, since the next challenge will use a different nonce and the eavesdropper doesn't know Bob's password. However, the eavesdropper "Eve" can pretend to be Alice, send challenges of her own choosing to Bob, Charlie, and others, and build tables to find out their passwords. To circumvent that, Bob can generate his own nonce, encrypt the password with both his nonce and the one given to him, and send that. Since Bob is using a fresh random value with every response, it is harder for Eve to figure out what the password is.
Cryptanalysis
Cryptanalysis is the act of analyzing the cipher and the ciphertext in order to retrieve the original plaintext. It is not true that any ciphertext can be cracked.
Using a wrong key can sometimes result in a valid-looking plaintext that is in fact not the correct plaintext (one-time pads work this way). To recover a plaintext from a ciphertext, the key and the algorithm used are required. Having only the ciphertext is the hardest problem: the cryptanalyst must guess both the algorithm and the key. This is called a ciphertext-only attack, and it requires the experience and intuition of the analyst, plus knowledge of the circumstances, the sender, the receiver, current events, and so on. While statistical analysis of the ciphertexts can provide information about the algorithm, it requires plenty of ciphertexts; otherwise it doesn't give any meaningful information. With modern encryption algorithms, ciphertext-only cryptanalysis is basically impossible no matter how much data you have.
If the algorithm is known, the recovery can be easier: only the key (usually a password, though other things can be considered as keys) is required. When evaluating the security of an encryption system, it is prudent to assume that the attacker knows the algorithm (a dictum known as Kerckhoffs's principle, named after cryptographer Auguste Kerckhoffs).
The simplest method of cracking a password is known as brute force: trying every possible password. The problem with this is that it can take a very long time to find the right password. The number of possibilities grows with every character added to the length of the password and every character added to the range of options. For example, if you wanted to find a password that was six (uppercase only) letters long, you might have to try 26^6 = 308,915,776 possible passwords. At a rate of a thousand guesses per second, it would take about three and a half days to run through the list. Trying every seven-letter password at the same rate would take about three months.
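The arithmetic above is easy to check for yourself. This is a small sketch; the function names `keyspace` and `worst_case_days` are invented for the example, and the rate of a thousand guesses per second is the one assumed in the text.

```python
SECONDS_PER_DAY = 86_400

def keyspace(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length

def worst_case_days(alphabet_size: int, length: int,
                    guesses_per_second: int = 1000) -> float:
    """Days needed to try every password at the given guessing rate."""
    seconds = keyspace(alphabet_size, length) / guesses_per_second
    return seconds / SECONDS_PER_DAY

print(keyspace(26, 6))                   # 308915776
print(round(worst_case_days(26, 6), 1))  # 3.6
print(round(worst_case_days(26, 7)))     # 93
```

These are worst-case figures; on average an attacker finds the password after trying about half the list.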
If, instead of uppercase letters only, the passwords use lowercase letters, uppercase letters, and digits (26 + 26 + 10 = 62 options for each character), a six-character password requires 1.8 years to exhaustively search at this rate, and a seven-character one requires about 111.6 years.
The problem for the user is that memorizing a truly random string of characters is very difficult. It's easier to use actual words as passwords. However, this makes the password far more vulnerable: an attacker can simply try every word in a dictionary (a dictionary attack), and the number of words in the dictionary is much smaller than the number of random combinations of characters. Using odd spelling (such as "leetspeak" substitutions of other characters for letters) and using unusual words makes a dictionary attack more difficult; however, sophisticated attackers will use an exhaustive vocabulary and try a range of variations for each word. It is possible to combine randomness and easy memorization using tricks such as remembering a phrase and using the first letter of each word (e.g. "This website will ruin your life" becomes "Twwryl"). Another option is to use a password manager program to store an encrypted database of passwords; the user then only needs to remember one master password to access all the others.
Of course, if the encryption algorithm itself is weak, even an unguessable password won't help you. Cryptographers consider an algorithm broken if there is a way to figure out the key faster than brute forcing it. Sometimes, this is only of theoretical interest (for example, even with the speedup it would still take longer than the age of the universe). Other times the algorithm is so broken that the key can be recovered quickly and easily. There is a large variety of attack techniques using advanced math, and new cryptosystems are expected to show evidence of resistance to them. If after years of analysis by expert cryptographers no practical attacks have been discovered, then the algorithm is considered probably secure.
That little code you created yourself, however, doesn't stand a chance.
As mentioned above, the key doesn't have to be a password. For example, in Cryptonomicon, two people communicate using the "Solitaire cypher". The cypher uses a deck of cards; their initial arrangement is the key, leaving 54! (54 factorial, 54×53×52×...×2×1 = about 2.3 times 10 to the 71st power) possible keys and no dictionary to use.
Knowledge of the plaintext or parts of the plaintext (so-called "cribs") can make a cryptanalysis problem exponentially easier. The plaintext, or parts of it, could be acquired by old-fashioned spying or, more inventively, by feeding the mole. This is called a known-plaintext attack.
And then (as the xkcd comic at the top of the page illustrates) there's the age-old standby of rubber hose cryptanalysis: beating or torturing the key out of its holder. (The name comes from the rather vivid image of the keyholder being beaten across their bare feet with a rubber hose.) This does not have a direct counter, but many applications (such as VeraCrypt) allow a defense based on plausible deniability: an encrypted volume decrypts to a "decoy", which hides a second encrypted volume with a different key. Thus, someone coerced into giving up a key can reveal one secret while hiding a bigger one. The interrogator may suspect the presence of a hidden inner volume, but its existence cannot be proved or disproved.
Of course, no encryption could protect you from stupidity. If you ever find yourself in a situation where the secret service is digging through your trash and anything you say might spell your doom if it ever gets in the wrong hands (because, be honest, who doesn't get into situations like this?), remember the following:
- Use good passwords. Single words that can be easily guessed will easily fold under a dictionary attack, and short passwords are relatively easy to brute-force. There are lots of resources regarding strong password generation on the web.
- Keep the keys secret! This is pretty obvious: if someone knows the key, your encryption is fucked.
- Choose the algorithm carefully! Don't use any algorithm that has been cracked (such as the Enigma)! And whatever you do, NEVER make up your own encryption. For that matter, try to avoid writing your own code to implement existing cryptosystems too, and use existing protocols and libraries as much as possible. Encryption is notoriously difficult to get right, and you almost certainly won't.
- Be wary of tells, habits, and other repeated phrases you use. What allowed code breakers to defeat Enigma (among other things) was that the German military always sent the same type of message at specific times and ended each message the same way.
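For the first tip in the list above, here is one possible way to generate a random password using Python's standard `secrets` module. The 62-symbol alphabet, the default length of 16, and the function name `random_password` are assumptions for the sake of the example, not recommendations from the text.

```python
import secrets
import string

# Lowercase letters, uppercase letters, and digits: 62 symbols in total.
ALPHABET = string.ascii_letters + string.digits

def random_password(length: int = 16) -> str:
    """Pick each character independently from a cryptographically strong source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

password = random_password()
print(password)
```

A password like this is hard to memorize, which is where the password manager mentioned earlier comes in.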