Annotation of pgp/src/md4.doc, revision 1.1.1.1

1.1       root        1: License to copy and use this document and the software described
                      2: herein is granted provided it is identified as the "RSA Data
                      3: Security, Inc. MD4 Message Digest Algorithm" in all materials
                      4: mentioning or referencing this software, function, or document.
                      5: 
                      6: License is also granted to make derivative works provided that such
                      7: works are identified as "derived from the RSA Data Security, Inc. MD4
                      8: Message Digest Algorithm" in all material mentioning or referencing
                      9: the derived work.
                     10: 
                     11: RSA Data Security, Inc. makes no representations concerning the
                     12: merchantability of this algorithm or software or their suitability
                     13: for any specific purpose.  It is provided "as is" without express or
                     14: implied warranty of any kind.
                     15: 
                     16: These notices must be retained in any copies of any part of this
                     17: documentation and/or software.
                     18: ------------------------------------------------------------------------
                     19: 
                     20:                 The MD4 Message Digest Algorithm
                     21:                 --------------------------------
                     22:                        by Ronald L. Rivest
                     23:        MIT Laboratory for Computer Science, Cambridge, Mass. 02139
                     24:                                and
                     25:        RSA Data Security, Inc., Redwood City, California 94065
                     26:                    (Version 2/17/90 -- Revised)
                     27: 
                     28: 
                     29: Abstract:
                     30: ---------
                     31:                            
                     32: This note describes the MD4 message digest algorithm.  The algorithm
                     33: takes as input an input message of arbitrary length and produces as
                     34: output a 128-bit ``fingerprint'' or ``message digest'' of the input.
                     35: It is conjectured that it is computationally infeasible to produce two
                     36: messages having the same message digest, or to produce any message
                     37: having a given prespecified target message digest.  The MD4 algorithm
                     38: is thus ideal for digital signature applications, where a large file
                     39: must be ``compressed'' in a secure manner before being signed with the
                     40: RSA public-key cryptosystem.
                     41: 
                     42: The MD4 algorithm is designed to be quite fast on 32-bit machines.  On
                     43: a SUN Sparc station, MD4 runs at 1,450,000 bytes/second.  On a DEC
                     44: MicroVax II, MD4 runs at approximately 70,000 bytes/second.  On a 20MHz
                     45: 80286, MD4 runs at approximately 32,000 bytes/second.  In addition, the
                     46: MD4 algorithm does not require any large substitution tables; the
                     47: algorithm can be coded quite compactly.
                     48: 
                     49: The MD4 algorithm is being placed in the public domain for review and
                     50: possible adoption as a standard.  
                     51: 
                     52: (Note: The document supersedes an earlier draft.  The algorithm described
                     53:        here is a slight modification of the one described in the draft.)
                     54: 
                     55: 
                     56: I. Terminology and Notation
                     57: ---------------------------
                     58: 
                     59: In this note a ``word'' is a 32-bit quantity and a byte is an 8-byte
                     60: quantity.  A sequence of bits can be interpreted in a natural manner
                     61: as a sequence of bytes, where each consecutive group of 8 bits is
                     62: interpreted as a byte with the high-order (most significant) bit of
                     63: each byte listed first.  Similarly, a sequence of bytes can be
                     64: interpreted as a sequence of 32-bit words, where each consecutive
                     65: group of 4 bytes is interpreted as a word with the low-order (least
                     66: significant) byte given first.
                     67: 
                     68: Let x_i denote ``x sub i''.  If the subscript is an expression, we
                     69: surround it in braces, as in x_{i+1}.  Similarly, we use ^ for
                     70: superscripts (exponentiation), so that x^i denotes x to the i-th
                     71: power.
                     72: 
                     73: Let the symbol ``+'' denote addition of words (i.e., modulo-2^32
                     74: addition). Let X <<< s denote the 32-bit value obtained by circularly
                     75: shifting (rotating) X left by s bit positions.  Let not(X) denote the
                     76: bit-wise complement of X, and let X v Y denote the bit-wise OR of X
                     77: and Y.  Let X xor Y denote the bit-wise XOR of X and Y, and let XY
                     78: denote the bit-wise AND of X and Y.
                     79: 
                     80: 
                     81: II. MD4 Algorithm Description
                     82: -----------------------------
                     83: 
                     84: We begin by supposing that we have a b-bit message as input, and
                     85: that we wish to find its message digest.  Here b is an arbitrary
                     86: nonnegative integer; b may be zero, it need not be a multiple of 8,
                     87: and it may be arbitrarily large. We imagine the bits of the message
                     88: written down as follows:
                     89: 
                     90:        m_0 m_1 ... m_{b-1} .
                     91: 
                     92: The following five steps are performed to compute the message digest of the
                     93: message.
                     94: 
                     95: 
                     96: Step 1. Append padding bits
                     97: ---------------------------
                     98: 
                     99: The message is ``padded'' (extended) so that its length (in bits) is
                    100: congruent to 448, modulo 512.  That is, the message is extended so
                    101: that it is just 64 bits shy of being a multiple of 512 bits long.
                    102: Padding is always performed, even if the length of the message is
                    103: already congruent to 448, modulo 512 (in which case 512 bits of
                    104: padding are added).
                    105: 
                    106: Padding is performed as follows: a single ``1'' bit is appended to the
                    107: message, and then enough zero bits are appended so that the length in bits
                    108: of the padded message becomes congruent to 448, modulo 512.
                    109: 
                    110: 
                    111: Step 2. Append length
                    112: ---------------------
                    113: 
                    114: A 64-bit representation of b (the length of the message before the
                    115: padding bits were added) is appended to the result of the previous
                    116: step.  In the unlikely event that b is greater than 2^64, then only
                    117: the low-order 64 bits of b are used.  (These bits are appended as two
                    118: 32-bit words and appended low-order word first in accordance with the
                    119: previous conventions.)
                    120: 
                    121: At this point the resulting message (after padding with bits and with
                    122: b) has a length that is an exact multiple of 512 bits.  Equivalently,
                    123: this message has a length that is an exact multiple of 16 (32-bit)
                    124: words.  Let M[0 ... N-1] denote the words of the resulting message,
                    125: where N is a multiple of 16.
                    126: 
                    127: 
                    128: Step 3. Initialize MD buffer
                    129: ----------------------------
                    130: 
                    131: A 4-word buffer (A,B,C,D) is used to compute the message digest.  Here
                    132: each of A,B,C,D are 32-bit registers.  These registers are initialized
                    133: to the following values (in hexadecimal, low-order bytes first):
                    134: 
                    135:        word A:    01 23 45 67
                    136:        word B:    89 ab cd ef
                    137:        word C:    fe dc ba 98
                    138:        word D:    76 54 32 10
                    139: 
                    140: 
                    141: Step 4. Process message in 16-word blocks
                    142: -----------------------------------------
                    143: 
                    144: We first define three auxiliary functions that each take as input
                    145: three 32-bit words and produce as output one 32-bit word.
                    146: 
                    147:        f(X,Y,Z)  =  XY v not(X)Z 
                    148:        g(X,Y,Z)  =  XY v XZ v YZ 
                    149:        h(X,Y,Z)  =  X xor Y xor Z 
                    150: 
                    151: In each bit position f acts as a conditional: if x then y else z.
                    152: (The function f could have been defined using + instead of v since XY
                    153: and not(X)Z will never have 1's in the same bit position.)  In each
                    154: bit position g acts as a majority function: if at least two of x, y, z
                    155: are on, then g has a one in that bit position, else g has a zero. It
                    156: is interesting to note that if the bits of X, Y, and Z are independent
                    157: and unbiased, the each bit of f(X,Y,Z) will be independent and
                    158: unbiased, and similarly each bit of g(X,Y,Z) will be independent and
                    159: unbiased.  The function h is the bit-wise ``xor'' or ``parity'' function;
                    160: it has properties similar to those of f and g.
                    161: 
                    162: Do the following:
                    163: 
                    164: For i = 0 to N/16-1 do /* process each 16-word block */
                    165:        For j = 0 to 15 do: /* copy block i into X */
                    166:          Set X[j] to M[i*16+j].
                    167:        end /* of loop on j */
                    168:        Save A as AA, B as BB, C as CC, and D as DD.
                    169: 
                    170:        [Round 1]
                    171:          Let [A B C D i s] denote the operation
                    172:                A = (A + f(B,C,D) + X[i]) <<< s  .
                    173:          Do the following 16 operations:
                    174:                [A B C D 0 3] 
                    175:                [D A B C 1 7] 
                    176:                [C D A B 2 11] 
                    177:                [B C D A 3 19] 
                    178:                [A B C D 4 3] 
                    179:                [D A B C 5 7] 
                    180:                [C D A B 6 11] 
                    181:                [B C D A 7 19] 
                    182:                [A B C D 8 3] 
                    183:                [D A B C 9 7] 
                    184:                [C D A B 10 11] 
                    185:                [B C D A 11 19] 
                    186:                [A B C D 12 3] 
                    187:                [D A B C 13 7] 
                    188:                [C D A B 14 11] 
                    189:                [B C D A 15 19] 
                    190: 
                    191:        [Round 2]
                    192:          Let [A B C D i s] denote the operation
                    193:                A = (A + g(B,C,D) + X[i] + 5A827999) <<< s .
                    194:          (The value 5A..99 is a hexadecimal 32-bit constant, written with
                    195:          the high-order digit first. This constant represents the square
                    196:          root of 2.  The octal value of this constant is 013240474631.
                    197:           See Knuth, The Art of Programming, Volume 2
                    198:          (Seminumerical Algorithms), Second Edition (1981), Addison-Wesley.
                    199:          Table 2, page 660.)
                    200:          Do the following 16 operations:
                    201:                [A B C D 0  3] 
                    202:                [D A B C 4  5] 
                    203:                [C D A B 8  9] 
                    204:                [B C D A 12 13] 
                    205:                [A B C D 1  3] 
                    206:                [D A B C 5  5] 
                    207:                [C D A B 9  9] 
                    208:                [B C D A 13 13] 
                    209:                [A B C D 2  3] 
                    210:                [D A B C 6  5] 
                    211:                [C D A B 10 9] 
                    212:                [B C D A 14 13] 
                    213:                [A B C D 3  3] 
                    214:                [D A B C 7  5] 
                    215:                [C D A B 11 9] 
                    216:                [B C D A 15 13] 
                    217: 
                    218:        [Round 3]
                    219:          Let [A B C D i s] denote the operation
                    220:                A = (A + h(B,C,D) + X[i] + 6ED9EBA1) <<< s
                    221:          (The value 6E..A1 is a hexadecimal 32-bit constant, written with
                    222:          the high-order digit first. This constant represents the square
                    223:          root of 3.  The octal value of this constant is 015666365641.
                    224:           See Knuth, The Art of Programming, Volume 2
                    225:          (Seminumerical Algorithms), Second Edition (1981), Addison-Wesley.
                    226:          Table 2, page 660.)
                    227:          Do the following 16 operations:
                    228:                [A B C D 0  3] 
                    229:                [D A B C 8  9] 
                    230:                [C D A B 4  11] 
                    231:                [B C D A 12 15] 
                    232:                [A B C D 2  3] 
                    233:                [D A B C 10 9] 
                    234:                [C D A B 6  11] 
                    235:                [B C D A 14 15] 
                    236:                [A B C D 1  3] 
                    237:                [D A B C 9  9] 
                    238:                [C D A B 5  11] 
                    239:                [B C D A 13 15] 
                    240:                [A B C D 3  3] 
                    241:                [D A B C 11 9] 
                    242:                [C D A B 7  11] 
                    243:                [B C D A 15 15] 
                    244: 
                    245: Then perform the following additions:
                    246:                A = A + AA
                    247:                B = B + BB
                    248:                C = C + CC
                    249:                D = D + DD
                    250: (That is, each of the four registers is incremented by the value it had
                    251: before this block was started.)
                    252: 
                    253: end /* of loop on i */
                    254: 
                    255: 
                    256: Step 5. Output
                    257: --------------
                    258: 
                    259: The message digest produced as output is A,B,C,D.
                    260: That is, we begin with the low-order byte of A, and end with the
                    261: high-order byte of D.
                    262: 
                    263: This completes the description of MD4.  A reference implementation in
                    264: C is given in the Appendix.
                    265: 
                    266: 
                    267: III. Extensions
                    268: ---------------
                    269: 
                    270: If more than 128 bits of output are required, then the following
                    271: procedure is recommended to obtain a 256-bit output.  (There is no
                    272: provision made for obtaining more than 256 bits.)
                    273: 
                    274: Two copies of MD4 are run in parallel over the input.  The first copy
                    275: is standard as described above.  The second copy is modified as follows.
                    276: 
                    277: The initial state of the second copy is:
                    278:        word A:    00 11 22 33
                    279:        word B:    44 55 66 77
                    280:        word C:    88 99 aa bb
                    281:        word D:    cc dd ee ff
                    282: 
                    283: The magic constants in rounds 2 and 3 for the second copy of MD4 are
                    284: changed from sqrt(2) and sqrt(3) to cuberoot(2) and cuberoot(3):
                    285:                                Octal           Hex
                    286:        Round 2 constant        012050505746    50a28be6 
                    287:        Round 3 constant        013423350444    5c4dd124
                    288: 
                    289: Finally, after every 16-word block is processed (including the last
                    290: block), the values of the A registers in the two copies are exchanged.
                    291: 
                    292: The final message digest is obtaining by appending the result of the
                    293: second copy of MD4 to the end of the result of the first copy of MD4.
                    294: 
                    295: 
                    296: IV. Summary
                    297: ------------
                    298: 
                    299: The MD4 message digest algorithm is simple to implement, and provides
                    300: a ``fingerprint'' or message digest of a message of arbitrary length.
                    301: 
                    302: It is conjectured that the difficulty of coming up with two messages
                    303: having the same message digest is on the order of 2^64 operations, and
                    304: that the difficulty of coming up with any message having a given
                    305: message digest is on the order of 2^128 operations.  The MD4 algorithm
                    306: has been carefully scrutinized for weaknesses.  It is, however, a
                    307: relatively new algorithm and further security analysis is of course
                    308: justified, as is the case with any new proposal of this sort.  The
                    309: level of security provided by MD4 should be sufficient for
                    310: implementing very high security hybrid digital signature schemes based
                    311: on MD4 and the RSA public-key cryptosystem.
                    312: 
                    313: V. Acknowledgements
                    314: -------------------
                    315: 
                    316: I'd like to thank Don Coppersmith, Burt Kaliski, Ralph Merkle, and
                    317: Noam Nisan for numerous helpful comments and suggestions.
                    318: 
                    319: 
                    320: APPENDIX. Reference Implementation
                    321: ----------------------------------
                    322: 
                    323: This appendix contains the following files:
                    324:        md4.h           -- a header file for using the MD4 implementation
                    325:        md4.c           -- the source code for the MD4 routines
                    326:        md4driver.c     -- a sample ``user'' routine
                    327:        session         -- sample results of running md4driver
                    328: 
                    329: 

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.