Title: Hash Functions -- Python Cryptography Toolkit
Next: Encryption Algorithms Prev: Introduction Up: Top Top: Top
2. Hash FunctionsHash functions take arbitrary strings as input, and produce an output of fixed size that is dependent on the input; it should never be possible to derive the input data given only the hash function's output. One simple hash function consists of simply adding together all the bytes of the input, and taking the result modulo 256. For a hash function to be cryptographically secure, it must be very difficult to find two messages with the same hash value, or to find a message with a given hash value. The simple additive hash function fails this criterion miserably; the hash functions described below do not. Examples of cryptographically secure hash functions include MD2, MD5, SHA, and Snefru.Hash functions can be used simply as a checksum, or, in association with a public-key algorithm, can be used to implement digital signatures. The hashing algorithms currently implemented are listed in the following table:
new() function to create a new hashing
object. (In older versions of the Python interpreter, the md5()
was used to perform this task. If you're modifying an old script, you
should change any calls to md5() to use new() instead.) You
can now feed arbitrary strings into the object, and can ask for the
hash value at any time. The new() function can also be passed an
optional string parameter, which will be hashed immediately.
Hash function modules define one variable:
Or, more compactly:
2.1. Security NotesHashing algorithms are considered to be broken if it's easy to compute a string that produces a given hash value, or if it's easy to find two messages that produce the same hash value. Consider an example where Alison and Bob are using digital signatures to sign a contract. Alison computes the hash value of the text of the contract and signs it. Bob could then compute a different contract that has the same hash value, and it would appear that Alison has signed that bogus contract; she'd have no way to prove otherwise. Finding such a message by brute force takespow(2, b-1) operations, where the hash function produces
b-bit hashes.
If Bob can only find two messages with the same hash value but can't
choose the resulting hash value, he can look for two messages with
different meanings, such as "I will mow Bob's lawn for $10" and "I owe
Bob $1,000,000", and ask Alison to sign the first, innocuous contract.
This attack is easier for Bob, since finding two such messages by brute
force will take None of these algorithms have been completely broken. There are no attacks on MD2, but it's rather slow at 18 K/sec. MD4 is faster at 677 K/sec but there have been some partial attacks. MD4 operates in three iterations of a basic mixing operation; two of the three rounds have been cryptanalyzed, but the attack can't be extended to the full algorithm. MD5 is a strengthened version of MD4 with four rounds; an attack against one round has been found. Because MD5 is more commonly used, the implementation is better optimized and thus faster on Intel processors (893 K/sec). MD4 may be faster than MD5 when other processors and compilers are used. All the MD algorithms produce 128-bit hashes; SHA produces a larger 160-bit hash, and there are no known attacks against it. The first version of SHA had a weakness which was later corrected; the code used here implements the second, corrected, version. It operates at 336 K/sec.
2.2. CreditsThe MD2 and MD4 code was written by A.M. Kuchling, and the MD5 code was implemented by Colin Plumb. The SHA code was originally written by Peter Gutmann.
Next: Encryption Algorithms Prev: Introduction Up: Top Top: Top
|