Cryptanalysis
The goal of cryptanalysis is to find some weakness or insecurity in a cryptographic scheme, thus permitting its subversion. It is an essential part of communications intelligence. Cryptanalysis might be undertaken by a malicious attacker, attempting to subvert a system, or by the system's designer (or others) attempting to evaluate whether a system has vulnerabilities. In modern practice, however, quality cryptographic algorithms and protocols have usually been carefully examined and many have been proved that establish practical security of the system (at least, under clear -- and hopefully reasonable -- assumptions).
It is a commonly held misconception that every encryption method can be broken. In connection with his WWII work at Bell Labs, Claude Shannon proved that the one-time pad cipher is unbreakable, provided the key material is truly random, never reused, kept secret from all possible attackers, and of equal or greater length than the message[1]. That is, an enemy who intercepts an encrypted message has provably no better chance of guessing the contents than an enemy who only knows the length of the message.
Even two-time use of the keys can lead to compromise, as shown by the VENONA project that allowed cryptanalysis of Soviet espionage traffic, in which a one-time pad was used more than once. [2]
Any cipher except a one-time pad can be broken with enough computational effort (by brute force attack if nothing else), but the amount of effort needed to break a cipher may be exponentially dependent on the key size, as compared to the effort needed to use the cipher. In such cases, effective security can still be achieved if some conditions (e.g., key size) are such that the effort ('work factor' in Shannon's terms) is beyond the ability of any adversary.
Before discussing classic cryptanalyis, be aware that mathematical cryptanalysis is not the only way to access protected content.
Practical cryptanalysis
"Practical cryptanalysis" is a euphemism for using physical or social means to compromise the cryptosystem, such as clandestinely breaking into a communications center and copying the keys.
And, of course, social engineering, and other attacks against personnel who work with cryptosystems or the messages they handle (e.g., bribery, extortion, blackmail, espionage, ...) may be most productive attacks of all. One variant is referred to as "rubber hose cryptanalysis" — beating, torturing or threatening someone to get him or her to reveal keys.
Side channel attacks
While pure cryptanalysis uses weaknesses in the algorithms themselves, other attacks on cryptosystems are based on actual use of the algorithms in real devices, known as side-channel attack. If a cryptanalyst has access to, say, the amount of time the device took to encrypt a number of plaintexts or report an error in a password or PIN character, he may be able to use a timing attack to break a cipher that is otherwise resistant to analysis. An attacker might also study the pattern and length of messages to derive valuable information; this is known as traffic analysis[3], and can be quite useful to an alert adversary.
Mathematical cryptanalysis
General types of cryptanalytic attacks
There are a wide variety of cryptanalytic attacks, and they can be classified in any of several ways. One distinction turns on what an attacker knows and can do.
Ciphertext-only
The ciphertext-only attack is the most common case, where the cryptanalyst has access only to the ciphertext (modern cryptosystems are usually effectively immune to ciphertext-only attacks).
Known plaintext
In a known-plaintext attack, the cryptanalyst has access to a ciphertext and its corresponding plaintext (or to many such pairs).
Sometimes it is enough for the attacker to have partial knowledge of the plaintext — perhaps that it is ASCII text with the top bit of every byte zero, or that it is radar data in a known format. This gives him something to go on, a way to check if a decryption or partial decryption is correct; that may be all he needs.
Often the attacker can guess some plaintext. In British World War II ULTRA codebreaking, such guesses were known as "cribs". Many messages contain fixed text like dates or formal phrases like "your humble and obedient servant", and various system such as compression algorithms or email handlers insert fixed-format headers; all these are free gifts to the cryptananlyst. In war, names of enemy officers, bases, ships or units (or their codenames) are good guesses, also perhaps words like "order" and "ammunition".
Language structure may also provide cribs. Consider ordinary English text, where about one in seven characters are spaces, and "the" and "of" are the commonest words. Suppose the cipher uses 64-bit (8 character) blocks. The chance that one of them encodes the 8-character string " of the " is significant. If the cipher user (foolishly) sends large volumes of data with the same key and the attacker has the determination and the (huge) resources to test them all, this crib is almost certain to break the cipher eventually.
Generally, if a true known plaintext attack (where the attacker actually knows some plaintext) is feasible, then variants based on guessed plaintext or on partial knowledge of plaintext will be more difficult, but not prohibitively so. Suppose there is a known plaintext attack that breaks the cipher at reasonable cost, but the attacker has only some guessed plaintext that has a 1% chance of being right. That gives him a 1% chance of solving the cipher on the first try. If he has many guessed cribs available, he is quite likely to solve it eventually at some cost near 100 times the cost of a pure known plaintext attack.
Or suppose he only knows the plaintext is ASCII; the top bit of every byte is zero. Suppose we are dealing with a block cipher that has 64-bit blocks. If there is a feasible known plaintext attack, then an enemy who knows only 64 bits of plaintext in a single block can break the cipher. However, if the data is known ASCII and the enemy has intercepted N blocks, then he knows that 8N bits of the plaintext are zero. Whether this lets him break the cipher or not is an extremely complex question depending on all the details of the cipher. However, if you are trying to keep the data secure, you should guess "yes" and choose a cipher that is secure against known plaintext attacks..
In general, if there is an effective known plaintext on the cipher. then the cipher should be considered insecure.
A number of attacks require known (including guessed) plaintext to work:
- a brute force search tries all possible keys; you need to know one block of plaintext so you can tell when you have found the right key
- a meet-in-the-middle attack finds a middle value in two ways, by half-encrypting a block of known plaintext and by half-decrypting the matching cyphertext
- an algebraic attack writes the cipher operations as equations in some algebraic system, usually Boolean, then plugs in known values fpr plaintext and ciphertext and solves for the key
All these should be completely impractical against any well-designed cipher.
Chosen plaintext
In a chosen-plaintext attack, the cryptanalyst may choose a plaintext and learn its corresponding ciphertext (perhaps many times); an example is the gardening used by the British during WWII.
Chosen ciphertext
In a chosen-ciphertext attack, the cryptanalyst may choose ciphertexts and learn their corresponding plaintexts[4]. Also important, often overwhelmingly so, are mistakes (generally in the design or use of one of the protocols involved; see ULTRA for some historical examples of this).
Strategies against symmetric cryptosystems
Cryptanalysis of symmetric-key techniques typically involves looking for attacks against the block ciphers or stream ciphers that are more efficient than any attack that could be against a perfect cipher. For example, a simple brute force attack against DES requires one known plaintext and 255 decryptions, trying approximately half of the possible keys, before chances are better than even the key will have been found. But this may not be enough assurance; a linear cryptanalysis attack against DES requires 243 known plaintexts and approximately 243 DES operations[5]. This is a considerable improvement on brute force attacks.
Strategies against asymmetric cryptosystems
Public-key algorithms are based on the computational difficulty of various problems. The most famous of these is integer factorization (the RSA cryptosystem is based on a problem related to factoring), but the discrete logarithm problem is also important. Much public-key cryptanalysis concerns numerical algorithms for solving these computational problems, or some of them, efficiently. For instance, the best algorithms for solving the elliptic curve-based version of discrete logarithm are much more time-consuming than the best known algorithms for factoring, at least for problems of equivalent size. Thus, other things being equal, to achieve an equivalent strength of attack resistance, factoring-based encryption techniques must use larger keys than elliptic curve techniques. For this reason, public-key cryptosystems based on elliptic curves have become popular since their invention in the mid-1990s.
Vulnerabilities of cryptographic primitives
Much of the theoretical work in cryptography concerns cryptographic primitives — algorithms with basic cryptographic properties — and their relationship to other cryptographic problems. For example, a one-way function is a function intended to be easy to compute but hard to invert. In a very general sense, for any cryptographic application to be secure (if based on such computational feasibility assumptions), one-way functions must exist. However, if one-way functions exist, this implies that P ≠ NP.[6]. Since the P versus NP problem is currently unsolved, we don't know if one-way functions exist. If they do, however, we can build other cryptographic tools from them. For instance, if one-way functions exist, then secure pseudorandom generators and secure pseudorandom functions exist[7].
Other cryptographic primitives include cipher algorithms themselves, one-way permutations, trapdoor permutations, etc.
Vulnerabilities of cryptographic protocols
In many cases, cryptographic techniques involve back and forth communication among two or more parties in space or across time (e.g., cryptographically protected backup data). The term cryptographic protocol captures this general idea. Cryptographic protocols have been developed for a wide range of problems, including relatively simple ones like interactive proofs[8], secret sharing[9][10], and zero-knowledge[11], and much more complex ones like electronic cash[12] and secure multiparty computation[13].
When the security of a cryptographic system fails, it is rare that the vulnerabilty leading to the breach will have been in a quality cryptographic primitive. Instead, weaknesses are often mistakes in the protocol design (often due to inadequate design procedures or less than thoroughly informed designers), in the implementation (e.g., a software bug), in a failure of the assumptions on which the design was based (e.g., proper training of those who will be using the system), or some other human error. Many cryptographic protocols have been designed and analyzed using ad hoc methods. Methods for formally analyzing the security of protocols, based on techniques from mathematical logic (see for example BAN logic), and more recently from concrete security principles, have been the subject of research for the past few decades[14][15][16]. Unfortunately, these tools are cumbersome and not widely used for complex designs.
References
- ↑ "Shannon": Claude Shannon and Warren Weaver, "The Mathematical Theory of Communication", University of Illinois Press, 1963, ISBN 0-252-72548-4
- ↑ National Security Agency, VENONA
- ↑ Dawn Song, David Wagner, and Xuqing Tian, "Timing Analysis of Keystrokes and Timing Attacks on SSH", In Tenth USENIX Security Symposium, 2001.
- ↑ Menezes, AJ; PC van Oorschot & SA Vanstone (Fifth Edition, 2001), Handbook of Applied Cryptography, ISBN 0-8493-8523-7
- ↑ Pascal Junod, "On the Complexity of Matsui's Attack", SAC 2001.
- ↑ Goldreich, Oded (2001), Foundations of Cryptography, Volume 1: Basic Tools, Cambridge University Press, ISBN 0-521-79172-3
- ↑ J. Håstad, R. Impagliazzo, L.A. Levin, and M. Luby, "A Pseudorandom Generator From Any One-Way Function", SIAM J. Computing, vol. 28 num. 4, pp 1364–1396, 1999.
- ↑ László Babai. "Trading group theory for randomness". Proceedings of the Seventeenth Annual Symposium on the Theory of Computing, ACM, 1985.
- ↑ G. Blakley. "Safeguarding cryptographic keys." In Proceedings of AFIPS 1979, volume 48, pp. 313-317, June 1979.
- ↑ A. Shamir. "How to share a secret." In Communications of the ACM, volume 22, pp. 612-613, ACM, 1979.
- ↑ S. Goldwasser, S. Micali, and C. Rackoff, "The Knowledge Complexity of Interactive Proof Systems", SIAM J. Computing, vol. 18, num. 1, pp. 186-208, 1989.
- ↑ S. Brands, "Untraceable Off-line Cash in Wallets with Observers", In Advances in Cryptology -- Proceedings of CRYPTO, Springer-Verlag, 1994.
- ↑ R. Canetti, "Universally composable security: a new paradigm for cryptographic protocols", In Proceedings of the 42nd annual Symposium on the Foundations of Computer Science (FOCS), pp. 136-154, IEEE, 2001.
- ↑ D. Dolev and A. Yao, "On the security of public key protocols", IEEE transactions on information theory, vol. 29 num. 2, pp. 198-208, IEEE, 1983.
- ↑ M. Abadi and P. Rogaway, "Reconciling two views of cryptography (the computational soundness of formal encryption)." In IFIP International Conference on Theoretical Computer Science (IFIP TCS 2000), Springer-Verlag, 2000.
- ↑ D. Song, "Athena, an automatic checker for security protocol analysis", In Proceedings of the 12th IEEE Computer Security Foundations Workshop (CSFW), IEEE, 1999.