CRYPTOREVERSING
Courtesy of fravia's page of reverse engineering
This was orginally at http://ns.ssl.stu.neva.ru/psw/publications/crypto_eng.html

On cryptosystems untrustworthiness

Pavel V. Semjanov

Information security centre, St. Petersburg Technical University

psw@ssl.stu.neva.ru

Cryptalgorithms are widely used in modern software not only for data encryption but also for authentication and integrity. To date there exist well-known and approved cryptalgorithms (both symmetric and asymmetric), whose strength is either mathematically proved or based on mathematically hard computational problem (factorization, discrete logarithm, etc.). The most famous ones are DES, GOST, RSA. These algorithms can be breaked only by solving this problem or by brute force.

On the other hand, information on errors and flaws in one or other program or on cracking some programs (including those using cryptalgorithms) appears in computer world and about all the time. It causes distrust to both specific programs and ability to protect anything with cryptography in general, not only against secret services, but also against hackers.

Thus, when designing secure systems it is necessary to learn history of attacks and flaws in cryptosystems and to understand the reasons of their occurrence. The promising research branch in this field is the analysis of successful attacks or detected cryptosystem vulnerabilities in order to generalize, classify and detect the reasons and regularity of their appearance and existence. This is the item of this paper.

By analogy with computer systems security violation reasons taxonomy [1], let us pick out the reasons for which cryptographic programs cannot be considered secure (see figure 1.):

    1. Impossibility of strong cryptalgorithms use;
    2. Cryptalgorithm implementation errors;
    3. Cryptalgorithm wrong application;
    4. Human factor.

Note that reasons in concern cover only two potential threats, namely confidentiality and integrity ones, leaving aside denial of service, which becomes more and more significant with the development of distributed cryptosystems.

Figure 1. Cryptosystems untrustworthiness reasons.

1. Impossibility of applying strong cryptalgorithms

This kind of reasons is the most prevalent because of following factors.

Low rate of strong cryptalgorithms

It is one of the main factors hampering good algorithms usage in total ciphering or "on-the-fly" ciphering systems. In particular, while Norton DiskReet program is DES-implemented, it may not encrypt all the disk when the user changes his key, as it will take too much time. Just so, there is the option of compressed data password protection in Stac Electronicsí Stacker on fly compression program. But it is physically unable to encipher its own file (usually of hundreds megabytes) with this password, thatís why it contents with very weak algorithm and keeps hash-function of password along with data being protected. Cryptographic strength1 of this function was examined and appeared to be 28, i.e. a password can be breaked trivially.

Export restrictions

This reason is connected with cryptalgorithm export or with necessity of getting patent or rights. In particular, U.S. cryptalgorithms export is limited to 40-bit key length2. Such strength evidently cannot be secure because of modern computer power. Even for personal computer, setting exhaustive search rate to 50000 passwords per second, we get exhaustive search time of about 4 months at average.

Among known export restricted software last versions of Internet browsers, Netscape Communicationsí Netscape Navigator and Microsoftís Internet Explorer, can be mentioned. They represent encryption with 128-bit key for U.S. users and with 40-bit key for the rest. The last version of ARJ 2.60 archiver, known for its weak archive enciphering algorithm, also can be referred to this group. Today U.S. users can use cryptographically strong GOST. The funny side of situation is that while algorithm is Russian, due to U.S. law even Russians still cannot use it in ARJ program.

Own cryptalgorithms usage

Paradoxical situation, when developers ignore known algorithms, often occurs, especially with freeware and shareware, in archivers or file encryption tools, for instance.

As we have already mentioned, ARJ archiver (up to version 2.60 inclusive) uses very weak encryption algorithm, simple XOR with password. It may seem that it could be used, because archived text must not absolutely be redundant and statistical cryptanalysis methods would not do here. But after detailed consideration it appeared that archived text possesses some nonrandom information (and it holds for any archiver), for example, Huffman table and other housekeeping information. Thatís why, if one knows exactly or can predict with certain probability the values of these housekeeping variables, he can also determine corresponding password characters with the same probability.

Further usage of such a weak algorithm leads to successful plaintext attack: if an adversary knows at least one file from encrypted archive he could easily determine archive password and extract all the rest files (for given plaintext, ARJ strength is 20 !). Even if all the files are encrypted, simple XOR allows to get exhaustive search rate of 350000 password per second on Pentium.

The same we have in the case of popular programs from Microsoft Office 95, where to determine a password one must know only 16 plain bytes of .doc or .xls file which are to be present there, whereupon it is sufficient to look through only 24 variants. Microsoft Office 97 contains improved encryption algorithms, allowing exhaustive search only, but... not everywhere! MS Access 97 uses absolutely primitive algorithm, while XOR-encrypting with fixed constant not data but password!

Novell Netware network operating system (3.x and 4.x versions) also uses its own hash algorithm. Hash-function has 32-byte value, derived from user original password by truncation a password with the length of more than 32 characters with XOR operation or by replication a password with the length of less than 32 characters, in the input and 16-byte hash-value (Hash16) in the output. It is this value (for Novell Netware 3.x) to be stored in bindery data base as "PASSWORD" property.

One of the main features of cryptographically strong hash-function is that it should not allow easy collisions (such as crypt() function, used in UNIX and based on mathematically strong cryptalgorithm DES). It is this feature that is violated in Novell Netwareís hash function.

The procedure was built, which for given hash-value after some search (about few seconds on 80386+ computer) generates 32-byte sequence, which is not, of course, a true password, but is perceived by Novell Netware as such, since being processed by hash algorithm it has exactly available hash value in the output.

The same hash algorithm remained also in Novell Netware 4 version.

Microsoft, in turn, also has significant shortcomings in its main hash algorithm, applied for local (NetBIOS protocol) and global (CIFS and http protocols) networks authentication in all of its operating systems since Windows 3.11, so called LM-hash (Lan Manager-hash) [4]. (However, Microsoft pleads that it remained since OS/2 and was designed by IBM).

It runs as follows:

  1. Password is converted into 14-character string by truncating longer passwords or by complementing shorter passwords with zeros.
  2. All the lowercase characters are changed into uppercase ones. Numeric and special characters are stayed unchanged.
  3. 14-byte string is divided into two 7-byte halves.
  4. Fixed constant is encrypted on each of these halves as on DES-key, resulting in two 8-byte strings.
  5. These strings are merged to form 16-bit hash-value.

It is obvious that attacks against LM-hash are easily crowned with success due to the following:

2. Cryptalgorithm implementation errors

Although cryptographically strong or certified algorithms are used in this case, this group of reasons leads to cryptosystem security violation because of algorithms wrong implementation.

Decrease of cryptographic strength while key generation

This reason has various examples, when cryptosystem truncates user password or generates data from it of the bit length less than a password one is. For example:

  1. A lot of (old) UNIX versions truncate user password up to 8 byte before hashing. Itís interesting, that Linux 2.0, for example, while keeping users to enter alphanumerical passwords, does not test if 8-character password beginning is also alphanumerical. That is why the user, having set the highly safe passwordIsgood19 password, will be surprised at hacker logging in on his behalf with a trivial password password.
  2. Novell Netware allows user passwords up to 128 byte, i.e. 68128 ª 2779 combinations (including case-insensitive Latin alphabetic, numeric and special characters). But here hash-function (vide supra) has only 32-byte value in the input, limiting effective password length to the same value. Moreover, hash output has only 128 bit length, i.e. 2128 combinations. It causes further decrease of effective length to log68(2^128)=21 characters3, i.e. 6 times as against original.
  3. Entirely the same situation is with RAR-archiver of 1.5x version, where password of 10 characters length does not increase time for its disclosure.

While the upper bound of password length is formed by cryptalgorithms use, the lower limitation is connected with bit or entropy notation. In case with Novell Netware to get hash-value with 128-bit entropy one must use a password of at least 128*8/log2(68)=69 bit4 length or at least 22 characters5. Unlimiting minimal password length in a lot of cryptosystems results in successful exhaustive search when searching not for keys but for passwords.

Lack of weak key testing

Some cryptalgorithms (in particular, DES, IDEA), when ciphering on specific keys, cannot guarantee proper cryptographic strength level. Such keys are called weak. It is known 4 weak and 12 semi-weak keys for DES. And while the probability of meeting them is 16/2^56ª 2× 10-16, we should not neglect it in serious cryptographic systems.

The cardinality of weak keys set for IDEA is neither more nor less than 251 (however, since there are 2128 keys on the whole, the probability of meeting them is 3× 107 less than in DES).

Insufficient protection against malicious software

Malicious software includes computer viruses, Trojan horses and other software, able to eavesdrop secret key or plaintext and to substitute strong algorithm for weak one. If the programmer does not provide for sufficient protection against malicious software, it can violate cryptosystem security. It is mostly topical for operating systems without built-in security or access control services, such as MS DOS or Windows 95:

  1. Password eavesdropping. Let us recall the oldest way of password eavesdropping, known since mainframe, when the ìphantomî program emulates operating system prompt, suggesting a user to enter his name and password, stores it in certain file and terminates with the message ìInvalid passwordî. In MS DOS and Windows there exists a lot of logic bombs for reading and storing keyed-in passwords (by eavesdropping suitable interrupt), for example, when DiskReet v. 6.0 utility is running.
  2. Cryptalgorithm substitution. As an example we may consider a logic bomb, posed as accelerator application, Turbo Krypton, for instance. This logic bomb substitutes GOST 28147-89 encryption algorithm, implemented by ìKrypton-3î board (demo version), for another simple and easy decryptable algorithm [1].
  3. E-mail Trojan horse. The last example is the attempt of ìTrojan horseî to percolate through electronic mail, occurred in June 1998. The letter contained a pornographic picture and FREECD.EXE file, which decrypted provider connection passwords (Dial-Up) and sent them to ispp@usa.net, while user was amusing himself by the letter.

Dependence on time of key processing

This is relatively new aspect of cryptalgorithms incorrect implementation, called "Timing attacks" and discussed in [2], where it is shown that a lot of cryptosystems process different inputs for different time interval. The reasons for this are both hardware (different number of cycles per operation, hitting into processor cash, etc.) and software (especially while time optimization of the program). Time interval may depend on both encryption key and encrypted data.

Hence, an adversary, possessing detailed information on cryptalgorithm implementation or encrypted data and being able to measure the time of data processing somehow (for example, by analysis of data packet sending time), can try to fit a secret key. This paper [2] describes in detail attacks against RSA, Diffie-Hellman and DSS implementing systems, while the key can be obtained by elaboration bit by bit, a number of necessary time measuring is proportionate to key length.

While these research has not been brought to specific result (to compute a secret key), considered example demonstrates that crucial systems (including cryptosystems) programming is to be extremely careful and perhaps require special secure programming approaches and special design tools (especially compilers).

Software implementation errors

It is clear that this factor will always be present until programs are written by human beings. Novell Netware 3.12 operating system can serve a good example, where in spite of highly considered authentication system, in which, as Novell says, ìunencrypted password is never transmittedî, a bug in SYSCON program v. 3.76 was detected, due to which just plain password finds itself in one of network packets. This is not the case in neither earlier nor later versions of this program, so we may call it pure programmerís error. This error can be revealed only if supervisor changes oneís password (including himself). Apparently, keyboard buffer gets somehow into network packet.

Trapdoors

Trapdoors occur in cryptosystems for obvious reasons: the designer wishes to control information, being processed in his system, and reserves the right to decrypt it without user key. Perhaps, they are used for debugging too and are left in finished products for certain reasons. Naturally it becomes known to sufficiently wide circle of persons sooner or later and the value of such cryptosystem becomes next to nothing. The most wide-known ones are AWARD BIOS (up to 4.51PG version) with its universal ìAWARD_SWî password and Paradox database management system from Borland International, also including ìsuperpasswordsî ìjIGGAeî and ìnx66ppxî.

In real earnest of implementation trapdoors (in this case they evidently use explicitly weak algorithms or store a key along with data) are those algorithms, which allow a third party to read encrypted message, as in caused a sensation CLIPPER project, where the state, known for snooping its citizensí secrets, had played a role of a third party.

Random number generator (RNG) shortcomings

Strong, mathematically tested and correctly implemented RNG is of great importance for cryptosystem as well as mathematically strong and correct cryptalgorithm, otherwise its shortcomings may influence overall cryptographic strength of the system. Moreover, pseudorandom number generator (PRNG) is generally used for RNG computer modeling, characterized by period, dispersion and seed. PRNG application can be scarcely called a happy choice for cryptosystems, so strong cryptosystems use physical RNG (special board) for these purposes, or at least generate a number for PRNG initialization, with the use of physical values (time of userís keystroke, for instance).

Short period and bad dispersion are among mathematical shortcomings of RNG and appear when you take your own RNG for some reasons. In other words, it is dangerous to use your own RNG, as well as your own cryptalgorithm.

In a case of short period (when a number of pseudorandom values, produced by generator, is less than a number of possible key values) an adversary can cut down time overhead for key searching, by looking through not keys, but pseudorandom values and generating keys from these values.

Given generator with a bad dispersion, an adversary can also cut down average search overhead, if he looks through the most likely pseudorandom numbers.

The most common error, even for tested PRNG, is its wrong initialization. Here initialization number either has less information bits than generator itself or is being chosen from nonrandom numbers and can be predicted with one or other probability.

Such situation occurred with Netscape Navigator of version 1.1. It initiated PRNG with current time in seconds (sec) and microseconds (usec) and process identificators (pid and ppid). The researches have cleared up that such scenario gives 47 significant bits of information at most (note, that this generator was used for getting 40- or 128(!)-bit keys). But when adversary

  1. was able to intercept network transferred packets, and
  2. had an account at the computer with running program,

he could easily learn sec, pid and ppid with high probability. When the second condition was not met, an adversary could try to set time with network demon time; pid could be obtained through SMTP demon (generally it was located in Message-ID field). Furthermore, ppid was either a few different from pid, or just equal to 1.

The unssl program was written, which found 40-bit secret key in a minute at average, searching for microseconds.

3. Cryptalgorithms wrong application

Because of this group of reasons cryptographically strong and correctly implemented algorithms appear to be untrustworthy.

Short key

This is the most obvious reason. The question arises: why strong cryptalgorithms may use short keys? Perhaps, because of following:

  1. some algorithms can work with variable-length key, providing different cryptographic strength; it is the designer to choose necessary length for desirable strength and efficiency. Sometimes this desire is limited by other circumstances, such as export restrictions. (An example here is Norton Secret Stuff program that can be breaked in a few days because of 32-bit Blowfish key use).