Taken from alt.security.pgp 
   
                                Passphrase FAQ
                                       
   V. 1.0
   2 October 1993
   
   '"PGP," warns Dorothy Denning, a Georgetown University professor who
   has worked closely with the National Security Agency, "could
   potentially become a widespread problem.' -- (E. Dexheimer)
   
   Comments to: Grady Ward, grady@netcom.com
   
   Contributors:
   John Kelsey, c445585@mizzou1.missouri.edu (Appendix A)
   RSA Data Security (Appendix C. The MD5 Algorithm)
   Jim Gillogly (Appendix D. The Secure Hash Algorithm)
   
FAQ: How do I choose a good password or phrase?

   ANS: Shocking nonsense makes the most sense
   
   With the intrinsic strength of some of the modern encryption,
   authentication, and message digest algorithms such as RSA, MD5, SHS
   and IDEA the user password or phrase is becoming more and more the
   focus of vulnerability.
   
   For example, Deputy Ponder with the Los Angeles County Sheriff's
   Department admitted in early 1993 that both they and the FBI despaired
   of breaking the PGP 1.0 system except through a successful dictionary
   attack (trying many possible passwords or phrases from lists of
   probable choices and their variations) rather than "breaking" the
   underlying cryptographic algorithm mathematically.
   
   The fundamental reason why attacking or trying to guess the user's
   password or phrase will increasingly be the focus of cryptanalysis is
   that the user's choice of password may represent a much simpler
   cryptographic key than optimal for the encryption algorithm being
   used. This weakness of the user's password choice provides the
   potential cryptanalytic wedge.
   
   For example, suppose a user chooses the password 'david.' On the
   surface the entropy of this key (or the number of different
   equiprobable key states) appears to be five characters chosen from a
   set of twenty-six with replacements: 26^5 or 1.188 x 10^7. But since
   the user is apparently biased toward common given names, which a
   majority appear in lists numbering only 6,000-7,000 entries, the true
   entropy is undoubtedly much closer to 6.5 x 10^3, or about four orders
   of magnitude smaller than the raw length might suggest. (In fact this
   password probably possesses a much smaller entropy than even this for
   the very common name "david" would be one of the first names to be
   checked by an optimized dictionary attack program.)
   
   In other words the "entropy" of a keyspace is not a fixed physical
   quantity: the cryptanalyst can exploit whole cultural biases and
   contexts, not just byte frequencies, digraphs, or even whole-word
   correlations to reduce the key space he or she is trying to explore.
   
   To thwart this avenue of attack we would like to discover a method of
   selecting passwords or phrases that have at least as many bits of
   entropy (or "hard-to-guessness") as the entropy of the cryptographic
   key of the underlying algorithm being used.
   
   To compare, DES (Data Encryption Standard) is believed to have about
   54-55 bits (~4 x 10 ^16) of entropy while the IDEA algorithm is
   believed to have about 128 bits (~3.5 x 10^38) of entropy. The closer
   the entropy of the user's password or phrase is to the intrinsic
   entropy of the cryptographic key of the underlying algorithm being
   used, the more likely an attacker would need to search a substantially
   larger portion of the algorithm's key space in order to rediscover the
   key.
   
   Unfortunately many documents suggest choosing passwords or phrases
   that are distinctly inferior to the latest method. For example, one
   white paper widely archived on the internet suggests selecting an
   original password by constructing an acronym from a popular song lyric
   or from a line of script from, for example, the SF movie "Star Wars".
   Both of these ideas turn out to be weak because both the entire script
   to Stars Wars and entire sets of song lyrics to thousands of popular
   songs are available on-line to everyone and, in some cases, are
   already embedded into "crack" dictionary attack programs (See
   ftp.uwp.edu).
   
   However, the conflict between choosing an easy-to-remember key and
   choosing a key with a high level of entropy is not a hopeless task if
   we exploit mnemonic devices that have been used for a long time
   outside the field of cryptography. With the goal of making up a
   passphrase not included in any existing corpus yet very easy to
   remember, an effective technique is one known as "shocking nonsense."
   
   "Shocking nonsense" means to make up a short phrase or sentence that
   is both nonsensical and shocking in the culture of the user, that is,
   it contains grossly obscene, racist, impossible or other extreme
   juxtaposition of ideas. This technique is permissable because the
   passphrase, by its nature, is never revealed to anyone with
   sensibilities to be offended.
   
   Shocking nonsense is unlikely to be duplicated anywhere because it
   does not describe a matter-of-fact that could be accidentally
   rediscovered by anyone else and the emotional evocation makes it
   difficult for the creator to forget. A mild example of such shocking
   nonsense might be: "mollusks peck my galloping genitals ." The reader
   can undoubtedly make up many far more shocking or entertaining
   examples for himself or herself.
   
   Even relatively short phrases offer acceptable entropy because the far
   larger "alphabet" pool of word symbols that may be chosen than the 26
   characters that form the Roman alphabet. Even choosing from a
   vocabulary of a few thousand words a five word phrase might have on
   the order of 58 to 60 bits of entropy -- more than what is needed for
   the DES algorithm, for example.
   
   When you are permitted to use passphrases of arbitrary length (in PGP
   for example) it is not necessary to further perturb your 'shocking
   nonsense' passphrase to include numbers or special symbols because the
   pool of word choices is already very high. Not needing those special
   symbols or numbers (that are not intrinsically meaningful) makes the
   shocking nonsense passphrase that much easier to remember.
   
   If you are forced to use, say, a Unix password utility that permits
   only passwords of restricted length, one good strategy is to process a
   your secret passphrase using MD5 or SHA, then UUENCODE the result and
   select your shorter key from the output. See Appendix C and D for
   actual MD5 and SHA source implmentations.
   
Appendix A. For software developers

   For software developers designing "front-ends" or user interfaces to
   conventional short-password applications, very good results will come
   from permitting the user arbitrary length passphrases that are then
   "crunched" or processed using a strong digest algorithm such as the
   160-bit SHS (Secure Hash Standard) or the 128-bit MD5 (Message Digest
   rev.5).[See following Appendices] The interface program then chooses
   the appropriate number of bits from the digest and supplies them to
   the engine enforcing a short password. This 'key crunching' technique
   will assure the developer that even the short password key space will
   have a far greater opportunity of being fully exploited by the user.
   
   John Kelsey writes:
   "I think it's a really good idea to use a randomly-generated salt to
   generate a key from a password, and that this salt should be as large
   as possible. Basically, this is to keep an attacker from spending lots
   of computer power once to generate a dictionary of likely keys. If
   users use good techniques to choose passwords, this won't matter much,
   but if they don't, this may save them from having their encrypted
   files or transmissions routinely read. The simplest scheme I can see
   for this is simply to prepend a 128-bit salt (generated as strongly as
   possible) to each encrypted file. Generate the key from the password
   by pre-filling a buffer with the 128-bit salt, then XORing in the
   keyed-in password, or by appending the key to the keyed-in password.
   Then, run SHA or MD5 or whatever to get the key.
   A secondary point: Adding a random salt ensures that people who use
   the same password/passphrase for lots of files/transmissions don't get
   the same key every time. Since most successful attacks against modern
   encryption schemes use lots of ciphertext from the same key, this
   might add some practical security, at relatively low cost."
   --John Kelsey, c445585@mizzou1.missouri.edu
   
Appendix B. A tool to experimentally investigate entropy

   A practical Unix tool for investigating the entropy of typical user
   keys can be found in Wu and Manber's 'agrep' (approximate grep)
   similarity pattern matching tool available in C source from
   cs.arizona.edu [192.12.69.5]. This tool can determine the "edit
   distance," that is, the number of insertions, substitutions, or
   deletions that would be required of an arbitrary pattern in order for
   it to match any of a large corpus of words or phrases, say the
   usr/dict word list, or over the set of Star Trek trivia archives. The
   user can then adjust the pattern to give an arbitrary high threshold
   difference between it and common words and phrases in the corpus to
   make crack programs that systematically vary known strings less likely
   to succeed. It is often surprising to discover that a substring
   pattern like "hxirtes" is only of edit distance two from as many as
   forty separate words ranging from "bushfires" to "whitest." Certainly
   no password or phrase ought to be chosen as a working password or
   phrase that is within two or fewer edit distance from a known string
   or substring in any on-line collection.
   
Appendix C. An implementation in C of the Message Digest Rev.5 Algorithm

/*

md5.h -- header file for implementation of MD5
RSA Data Security, Inc. MD5 Message-Digest Algorithm
Created: 2/17/90 RLR
Revised: 12/27/90 SRD,AJ,BSK,JT Reference C version
Revised (for MD5): RLR 4/27/91
-- G modified to have y&~z instead of y&z                       --
FF, GG, HH modified to add in last register done             --
Access pattern: round 2 works mod 5, round 3 works mod 3     --
distinct additive constant for each step                     --
round 4 added, working mod 7
*/

/*

Copyright (C) 1990, RSA Data Security, Inc. All rights reserved.
License to copy and use this software is granted provided that it
is identified as the "RSA Data Security, Inc. MD5 Message-Digest
Algorithm" in all material mentioning or referencing this software
or this function.

License is also granted to make and use derivative works
provided that such works are identified as "derived from the RSA
Data Security, Inc. MD5 Message-Digest Algorithm" in all
material mentioning or referencing the derived work.
RSA Data Security, Inc. makes no representations concerning
either the merchantability of this software or the suitability of
this software for any particular purpose.  It is provided "as is"
without express or implied warranty of any kind.

These notices must be retained in any copies of any part of this
documentation and/or software.

*/

typedef unsigned int UINT4;

/* Data structure for MD5 (Message-Digest) computation */
typedef struct {
  UINT4 i[2]; /* number of _bits_ handled mod 2^64 */
  UINT4 buf[4]; /* scratch buffer */
  unsigned char in[64]; /* input buffer */
  unsigned char digest[16]; /* actual digest after MD5Final call
*/
} MD5_CTX;

void MD5Init(MD5_CTX *);
void MD5Update(MD5_CTX *,unsigned char *,unsigned int);
void MD5Final(MD5_CTX *);
void Transform(UINT4 *,UINT4 *);

#include <stdio.h>
#include <stdlib.h>

#include "md5.h"

void
findhash( FILE *fp )
{
        MD5_CTX         ctx;
        char            buffer[ BUFSIZ ];
        size_t          count;
        int             i,j;

        MD5Init( &ctx );
        while ( (count=fread(buffer,1,sizeof(buffer),fp)) > 0 )
                MD5Update( &ctx, buffer, count );
        MD5Final( &ctx );
                j = 0;
        for ( i=0; i<sizeof(ctx.digest); i++ )
        {
                printf( "%02x",
                        (unsigned int) ctx.digest[i],
                        i == (sizeof(ctx.digest)-1) ? '\n' : ' '
);
                                j++;
                                if ( j == 4 || j == 8 || j == 12)
{printf(" ");}
                                if ( j == 16 ) printf("\n");
        }
}




int
main( int argc, char **argv )
{
        int     i;

        if ( argc <= 1 )
        {
                fprintf( stderr, "Usage: %s filename\n", argv[0]
);
                exit( 1 );
        }

        for ( i=1; i<argc; i++ )
        {
                FILE    *fp;

                fp = fopen( argv[i], "rb" );
                if ( fp == NULL )
                {
                        perror( argv[i] );
                        exit( 1 );
                }
                printf( "%s:\t\t", argv[i] );
                findhash( fp );
                fclose( fp );
        }
        return 0;
}




/*

md5.c -- the source code for MD5 routines

RSA Data Security, Inc. MD5 Message-Digest Algorithm
Created: 2/17/90 RLR
Revised: 1/91 SRD,AJ,BSK,JT Reference C Version
*/



/*

Message-digest routines:

To form the message digest for a message M

(1) Initialize a context buffer mdContext using MD5Init
(2) Call MD5Update on mdContext and M
(3) Call MD5Final on mdContext
The message digest is now in mdContext->digest[0...15]

*/

static unsigned char PADDING[64] = {
  0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
  0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
};

/* F, G, H and I are basic MD5 functions */
#define F(x, y, z) (((x) & (y)) | ((~x) & (z)))
#define G(x, y, z) (((x) & (z)) | ((y) & (~z)))
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define I(x, y, z) ((y) ^ ((x) | (~z)))

/* ROTATE_LEFT rotates x left n bits */
#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32-(n))))

/* FF, GG, HH, and II transformations for rounds 1, 2, 3, and 4 */
/* Rotation is separate from addition to prevent recomputation */
#define FF(a, b, c, d, x, s, ac) \
  {(a) += F ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }
#define GG(a, b, c, d, x, s, ac) \
  {(a) += G ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }
#define HH(a, b, c, d, x, s, ac) \
  {(a) += H ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }
#define II(a, b, c, d, x, s, ac) \
  {(a) += I ((b), (c), (d)) + (x) + (UINT4)(ac); \
   (a) = ROTATE_LEFT ((a), (s)); \
   (a) += (b); \
  }

/* The routine MD5Init initializes the message-digest context
   mdContext. All fields are set to zero.
 */
void MD5Init ( MD5_CTX *mdContext)
{
  mdContext->i[0] = mdContext->i[1] = (UINT4)0;

  /* Load magic initialization constants.
   */
  mdContext->buf[0] = (UINT4)0x67452301L;
  mdContext->buf[1] = (UINT4)0xefcdab89L;
  mdContext->buf[2] = (UINT4)0x98badcfeL;
  mdContext->buf[3] = (UINT4)0x10325476L;
}

/*
The routine MD5Update updates the message-digest context to
account for the presence of each of the characters
inBuf[0..inLen-1]
in the message whose digest is being computed.
*/

void MD5Update (register MD5_CTX *mdContext, unsigned char *inBuf,
                 unsigned int inLen)
{
  register int i, ii;
  int mdi;
  UINT4 in[16];

  /* compute number of bytes mod 64 */
  mdi = (int)((mdContext->i[0] >> 3) & 0x3F);

  /* update number of bits */
  if ((mdContext->i[0] + ((UINT4)inLen << 3)) < mdContext->i[0])
    mdContext->i[1]++;
  mdContext->i[0] += ((UINT4)inLen << 3);
  mdContext->i[1] += ((UINT4)inLen >> 29);

  while (inLen--) {
    /* add new character to buffer, increment mdi */
    mdContext->in[mdi++] = *inBuf++;

    /* transform if necessary */
    if (mdi == 0x40) {
      for (i = 0, ii = 0; i < 16; i++, ii += 4)
        in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
                (((UINT4)mdContext->in[ii+2]) << 16) |
                (((UINT4)mdContext->in[ii+1]) << 8) |
                ((UINT4)mdContext->in[ii]);
      Transform (mdContext->buf, in);
      mdi = 0;
    }
  }
}

/*
The routine MD5Final terminates the message-digest computation and
ends with the desired message digest in mdContext->digest[0...15].
*/
void MD5Final (MD5_CTX *mdContext)
{
  UINT4 in[16];
  int mdi;
  unsigned int i, ii;
  unsigned int padLen;

  /* save number of bits */
  in[14] = mdContext->i[0];
  in[15] = mdContext->i[1];

  /* compute number of bytes mod 64 */
  mdi = (int)((mdContext->i[0] >> 3) & 0x3F);

  /* pad out to 56 mod 64 */
  padLen = (mdi < 56) ? (56 - mdi) : (120 - mdi);
  MD5Update (mdContext, PADDING, padLen);

  /* append length in bits and transform */
  for (i = 0, ii = 0; i < 14; i++, ii += 4)
    in[i] = (((UINT4)mdContext->in[ii+3]) << 24) |
            (((UINT4)mdContext->in[ii+2]) << 16) |
            (((UINT4)mdContext->in[ii+1]) << 8) |
            ((UINT4)mdContext->in[ii]);
  Transform (mdContext->buf, in);

  /* store buffer in digest */
  for (i = 0, ii = 0; i < 4; i++, ii += 4) {
    mdContext->digest[ii] = (unsigned char)(mdContext->buf[i] &
0xFF);
    mdContext->digest[ii+1] =
      (unsigned char)((mdContext->buf[i] >> 8) & 0xFF);
    mdContext->digest[ii+2] =
      (unsigned char)((mdContext->buf[i] >> 16) & 0xFF);
    mdContext->digest[ii+3] =
      (unsigned char)((mdContext->buf[i] >> 24) & 0xFF);
  }
}

/*
Basic MD5 step. Transforms buf based on in.  Note that if the
Mysterious Constants are arranged backwards in little-endian order
and decrypted with the DES they produce OCCULT MESSAGES!
*/
void Transform(register UINT4 *buf,register UINT4 *in)
{
register UINT4 a = buf[0], b = buf[1], c = buf[2], d = buf[3];

  /* Round 1 */
#define S11 7
#define S12 12
#define S13 17
#define S14 22
  FF ( a, b, c, d, in[ 0], S11, 0xD76AA478L); /* 1 */
  FF ( d, a, b, c, in[ 1], S12, 0xE8C7B756L); /* 2 */
  FF ( c, d, a, b, in[ 2], S13, 0x242070DBL); /* 3 */
  FF ( b, c, d, a, in[ 3], S14, 0xC1BDCEEEL); /* 4 */
  FF ( a, b, c, d, in[ 4], S11, 0xF57C0FAFL); /* 5 */
  FF ( d, a, b, c, in[ 5], S12, 0x4787C62AL); /* 6 */
  FF ( c, d, a, b, in[ 6], S13, 0xA8304613L); /* 7 */
  FF ( b, c, d, a, in[ 7], S14, 0xFD469501L); /* 8 */
  FF ( a, b, c, d, in[ 8], S11, 0x698098D8L); /* 9 */
  FF ( d, a, b, c, in[ 9], S12, 0x8B44F7AFL); /* 10 */
  FF ( c, d, a, b, in[10], S13, 0xFFFF5BB1L); /* 11 */
  FF ( b, c, d, a, in[11], S14, 0x895CD7BEL); /* 12 */
  FF ( a, b, c, d, in[12], S11, 0x6B901122L); /* 13 */
  FF ( d, a, b, c, in[13], S12, 0xFD987193L); /* 14 */
  FF ( c, d, a, b, in[14], S13, 0xA679438EL); /* 15 */
  FF ( b, c, d, a, in[15], S14, 0x49B40821L); /* 16 */

  /* Round 2 */
#define S21 5
#define S22 9
#define S23 14
#define S24 20
  GG ( a, b, c, d, in[ 1], S21, 0xF61E2562L); /* 17 */
  GG ( d, a, b, c, in[ 6], S22, 0xC040B340L); /* 18 */
  GG ( c, d, a, b, in[11], S23, 0x265E5A51L); /* 19 */
  GG ( b, c, d, a, in[ 0], S24, 0xE9B6C7AAL); /* 20 */
  GG ( a, b, c, d, in[ 5], S21, 0xD62F105DL); /* 21 */
  GG ( d, a, b, c, in[10], S22, 0x02441453L); /* 22 */
  GG ( c, d, a, b, in[15], S23, 0xD8A1E681L); /* 23 */
  GG ( b, c, d, a, in[ 4], S24, 0xE7D3FBC8L); /* 24 */
  GG ( a, b, c, d, in[ 9], S21, 0x21E1CDE6L); /* 25 */
  GG ( d, a, b, c, in[14], S22, 0xC33707D6L); /* 26 */
  GG ( c, d, a, b, in[ 3], S23, 0xF4D50D87L); /* 27 */
  GG ( b, c, d, a, in[ 8], S24, 0x455A14EDL); /* 28 */
  GG ( a, b, c, d, in[13], S21, 0xA9E3E905L); /* 29 */
  GG ( d, a, b, c, in[ 2], S22, 0xFCEFA3F8L); /* 30 */
  GG ( c, d, a, b, in[ 7], S23, 0x676F02D9L); /* 31 */
  GG ( b, c, d, a, in[12], S24, 0x8D2A4C8AL); /* 32 */

  /* Round 3 */
#define S31 4
#define S32 11
#define S33 16
#define S34 23
  HH ( a, b, c, d, in[ 5], S31, 0xFFFA3942L); /* 33 */
  HH ( d, a, b, c, in[ 8], S32, 0x8771F681L); /* 34 */
  HH ( c, d, a, b, in[11], S33, 0x6D9D6122L); /* 35 */
  HH ( b, c, d, a, in[14], S34, 0xFDE5380CL); /* 36 */
  HH ( a, b, c, d, in[ 1], S31, 0xA4BEEA44L); /* 37 */
  HH ( d, a, b, c, in[ 4], S32, 0x4BDECFA9L); /* 38 */
  HH ( c, d, a, b, in[ 7], S33, 0xF6BB4B60L); /* 39 */
  HH ( b, c, d, a, in[10], S34, 0xBEBFBC70L); /* 40 */
  HH ( a, b, c, d, in[13], S31, 0x289B7EC6L); /* 41 */
  HH ( d, a, b, c, in[ 0], S32, 0xEAA127FAL); /* 42 */
  HH ( c, d, a, b, in[ 3], S33, 0xD4EF3085L); /* 43 */
  HH ( b, c, d, a, in[ 6], S34, 0x04881D05L); /* 44 */
  HH ( a, b, c, d, in[ 9], S31, 0xD9D4D039L); /* 45 */
  HH ( d, a, b, c, in[12], S32, 0xE6DB99E5L); /* 46 */
  HH ( c, d, a, b, in[15], S33, 0x1FA27CF8L); /* 47 */
  HH ( b, c, d, a, in[ 2], S34, 0xC4AC5665L); /* 48 */

  /* Round 4 */
#define S41 6
#define S42 10
#define S43 15
#define S44 21
  II ( a, b, c, d, in[ 0], S41, 0xF4292244L); /* 49 */
  II ( d, a, b, c, in[ 7], S42, 0x432AFF97L); /* 50 */
  II ( c, d, a, b, in[14], S43, 0xAB9423A7L); /* 51 */
  II ( b, c, d, a, in[ 5], S44, 0xFC93A039L); /* 52 */
  II ( a, b, c, d, in[12], S41, 0x655B59C3L); /* 53 */
  II ( d, a, b, c, in[ 3], S42, 0x8F0CCC92L); /* 54 */
  II ( c, d, a, b, in[10], S43, 0xFFEFF47DL); /* 55 */
  II ( b, c, d, a, in[ 1], S44, 0x85845DD1L); /* 56 */
  II ( a, b, c, d, in[ 8], S41, 0x6FA87E4FL); /* 57 */
  II ( d, a, b, c, in[15], S42, 0xFE2CE6E0L); /* 58 */
  II ( c, d, a, b, in[ 6], S43, 0xA3014314L); /* 59 */
  II ( b, c, d, a, in[13], S44, 0x4E0811A1L); /* 60 */
  II ( a, b, c, d, in[ 4], S41, 0xF7537E82L); /* 61 */
  II ( d, a, b, c, in[11], S42, 0xBD3AF235L); /* 62 */
  II ( c, d, a, b, in[ 2], S43, 0x2AD7D2BBL); /* 63 */
  II ( b, c, d, a, in[ 9], S44, 0xEB86D391L); /* 64 */

  buf[0] += a;
  buf[1] += b;
  buf[2] += c;
  buf[3] += d;
}


/*

MD5 test vectors:
d41d8cd98f00b204e9800998ecf8427e ""
0cc175b9c0f1b6a831c399e269772661 "a"
900150983cd24fb0d6963f7d28e17f72 "abc"
f96b697d7cb7938d525a2f31aaf161d0 "message digest"
c3fcd3d76192e4007dfb496cca67e13b "abcdefghijklmnopqrstuvwxyz"
d174ab98d277d9f5a5611c2c9f419d9f
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijk\
lmnopqrstuvwxyz0123456789"
57edf4a22be3c955ac49da2e2107b67a
"1234567890123456789012345678901234567\
8901234567890123456789012345678901234567890"
900150983cd24fb0d6963f7d28e17f72 foo

*/

Appendix D. The Secure Hash Algorithm

/* Implementation of NIST's Secure Hash Algorithm (FIPS 180)
 * Lightly bummed for execution efficiency.
 *
 * Jim Gillogly 3 May 1993
 *
 * Compile: cc -O -o sha sha.c
 *
 * To remove the test wrapper and use just the nist_hash()
routine,
 * compile with -DONT_WRAP
 *
 * Usage: sha [-vt] [filename ...]
 *
 *      -v switch: output the filename as well
 *      -t switch: suppress spaces between 32-bit blocks
 *
 *      If no input files are specified, process standard input.
 *
 * Output: 40-hex-digit digest of each file specified (160 bits)
 *
 * Synopsis of the function calls:
 *
 *   sha_file(char *filename, unsigned long *buffer)
 *      Filename is a file to be opened and processed.
 *      buffer is a user-supplied array of 5 or more longs.
 *      The 5-word buffer is filled with 160 bits of non-
terminated hash.
 *      Returns 0 if successful, non-zero if bad file.
 *
 *   void sha_stream(FILE *stream, unsigned long *buffer)
 *      Input is from already-opened stream, not file.
 *
 * void sha_memory(char *mem, long length, unsigned long *buffer)
 *      Input is a memory block "length" bytes long.
 *
 * Caveat:
 * Not tested for case that requires the high word of the length,
 *      which would be files larger than 1/2 gig or so.
 *
 * Limitation:
 * sha_memory (the memory block function) will deal with blocks no
longer
 * than 4 gigabytes; for longer samples, the stream version will
 * probably be most convenient (e.g. perl moby_data.pl | sha).
 *
 * Bugs:
 *      The standard is defined for bit strings; I assume bytes.
 *
 * test vector:
"abcdbcdecdefdefgefghfghighijhijkijkljklmklmnlmnomnopnopq"
 * should compute:
 *                  0xd2516ee1L
 *                  0xacfa5bafL
 *                  0x33dfc1c4L
 *                  0x71e43844L
 *                  0x9ef134c8L
 *
 * test vector: "abc"
 * should compute:
 *                  0x0164b8a9L
 *                  0x14cd2a5eL
 *                  0x74c4f7ffL
 *                  0x082c4d97L
 *                  0xf1edf880L
 *
 * Copyright 1993, Dr. James J. Gillogly
 * This code may be freely used in any application.
 */

#include <stdio.h>
#include <memory.h>

#define TRUE  1
#define FALSE 0

#define SUCCESS 0
#define FAILURE -1

int sha_file();                         /* External entries */
void sha_stream(), sha_memory();

static void nist_guts();

#ifndef ONT_WRAP        /* Using just the hash routine itself */

#define HASH_SIZE 5     /* Produces 160-bit digest of the message
*/

main(argc, argv)
int argc;
char **argv;
{
    unsigned long hbuf[HASH_SIZE];
    char *s;
    int file_args = FALSE;  /* If no files, take it from stdin */
    int verbose = FALSE;
    int terse = FALSE;

#ifdef MEMTEST
    sha_memory("abc", 3l, hbuf);         /* NIST test value from
appendix A */
    if (verbose) printf("Memory:");
    if (terse) printf("%08x%08x%08x%08x%08x\n",
        hbuf[0], hbuf[1], hbuf[2], hbuf[3], hbuf[4]);
    else printf("%08x %08x %08x %08x %08x\n",
        hbuf[0], hbuf[1], hbuf[2], hbuf[3], hbuf[4]);
#endif

    for (++argv; --argc; ++argv)  /* March down the arg list */
    {
        verbose = TRUE;
        if (**argv == '-')         /* Process one or more flags */
            for (s = &(*argv)[1]; *s; s++) /* Obfuscated C contest
entry */
                switch(*s)
                {
                    case 'v': case 'V':
                        verbose = TRUE;
                        break;
                    case 't': case 'T':
                        terse = TRUE;
                        break;
                    default:
                        fprintf(stderr, "Unrecognized flag: %c\n", *s);
                        return FALSE;
                }
        else                            /* Process a file */
        {
            if (verbose) printf("%s:\t\t", *argv);
            file_args = TRUE;           /* Whether or not we could
read it */

            if (sha_file(*argv, hbuf) == FAILURE)
                printf("Can't open file %s.\n", *argv);
            else
                if (terse) printf("%08x%08x%08x%08x%08x\n",
                    hbuf[0], hbuf[1], hbuf[2], hbuf[3], hbuf[4]);
                else printf("%08x %08x %08x %08x %08x\n",
                    hbuf[0], hbuf[1], hbuf[2], hbuf[3], hbuf[4]);
        }
    }
    if (! file_args)    /* No file specified */
    {
        if (verbose) printf("%s:\t\t", *argv);
        sha_stream(stdin, hbuf);

        if (terse) printf("%08x%08x%08x%08x%08x\n",
            hbuf[0], hbuf[1], hbuf[2], hbuf[3], hbuf[4]);
        else printf("%08x %08x %08x %08x %08x\n",
            hbuf[0], hbuf[1], hbuf[2], hbuf[3], hbuf[4]);
    }
    return TRUE;
}

#endif ONT_WRAP

union longbyte
{
    unsigned long W[80];        /* Process 16 32-bit words at a
time */
    char B[320];                /* But read them as bytes for
counting */
};

sha_file(filename, buffer)      /* Hash a file */
char *filename;
unsigned long *buffer;
{
    FILE *infile;

    if ((infile = fopen(filename, "rb")) == NULL)
    {
        int i;

        for (i = 0; i < 5; i++)
            buffer[i] = 0xdeadbeef;
        return FAILURE;
    }
    (void) sha_stream(infile, buffer);
    fclose(infile);
    return SUCCESS;
}

void sha_memory(mem, length, buffer)    /* Hash a memory block */
char *mem;
unsigned long length;
unsigned long *buffer;
{
    nist_guts(FALSE, (FILE *) NULL, mem, length, buffer);
}

void sha_stream(stream, buffer)
FILE *stream;
unsigned long *buffer;
{
    nist_guts(TRUE, stream, (char *) NULL, 0l, buffer);
}

#define f0(x,y,z) (z ^ (x & (y ^ z)))           /* Magic functions
*/
#define f1(x,y,z) (x ^ y ^ z)
#define f2(x,y,z) ((x & y) | (z & (x | y)))
#define f3(x,y,z) (x ^ y ^ z)

#define K0 0x5a827999                           /* Magic constants
*/
#define K1 0x6ed9eba1
#define K2 0x8f1bbcdc
#define K3 0xca62c1d6

#define S(n, X) ((X << n) | (X >> (32 - n)))    /* Barrel roll */

#define r0(f, K) \
    temp = S(5, A) + f(B, C, D) + E + *p0++ + K; \
    E = D;  \
    D = C;  \
    C = S(30, B); \
    B = A;  \
    A = temp

#define r1(f, K) \
    temp = S(5, A) + f(B, C, D) + E + \
           (*p0++ = *p1++ ^ *p2++ ^ *p3++ ^ *p4++) + K; \
    E = D;  \
    D = C;  \
    C = S(30, B); \
    B = A;  \
    A = temp

static void nist_guts(file_flag, stream, mem, length, buf)
int file_flag;                  /* Input from memory, or from
stream? */
FILE *stream;
char *mem;
unsigned long length;
unsigned long *buf;
{
    int i, nread, nbits;
    union longbyte d;
    unsigned long hi_length, lo_length;
    int padded;
    char *s;

    register unsigned long *p0, *p1, *p2, *p3, *p4;
    unsigned long A, B, C, D, E, temp;

    unsigned long h0, h1, h2, h3, h4;

    h0 = 0x67452301;                            /* Accumulators */
    h1 = 0xefcdab89;
    h2 = 0x98badcfe;
    h3 = 0x10325476;
    h4 = 0xc3d2e1f0;

    padded = FALSE;
    s = mem;
    for (hi_length = lo_length = 0; ;)  /* Process 16 longs at a
time */
    {
        if (file_flag)
        {
                nread = fread(d.B, 1, 64, stream);  /* Read as 64
bytes */
        }
        else
        {
                if (length < 64) nread = length;
                else             nread = 64;
                length -= nread;
                memcpy(d.B, s, nread);
                s += nread;
        }
        if (nread < 64)   /* Partial block? */
        {
                nbits = nread << 3;               /* Length: bits */
                if ((lo_length += nbits) < nbits)
                        hi_length++;              /* 64-bit integer */

                if (nread < 64 && ! padded)  /* Append a single bit */
                {
                        d.B[nread++] = 0x80; /* Using up next byte */
                        padded = TRUE;       /* Single bit once */
                }
                for (i = nread; i < 64; i++) /* Pad with nulls */
                        d.B[i] = 0;
                if (nread <= 56)   /* Room for length in this block */
                {
                        d.W[14] = hi_length;
                        d.W[15] = lo_length;
                }
        }
        else    /* Full block -- get efficient */
        {
                if ((lo_length += 512) < 512)
                        hi_length++;    /* 64-bit integer */
        }

        p0 = d.W;
        A = h0; B = h1; C = h2; D = h3; E = h4;

        r0(f0,K0); r0(f0,K0); r0(f0,K0); r0(f0,K0); r0(f0,K0);
        r0(f0,K0); r0(f0,K0); r0(f0,K0); r0(f0,K0); r0(f0,K0);
        r0(f0,K0); r0(f0,K0); r0(f0,K0); r0(f0,K0); r0(f0,K0);
        r0(f0,K0);

        p1 = &d.W[13]; p2 = &d.W[8]; p3 = &d.W[2]; p4 = &d.W[0];

                   r1(f0,K0); r1(f0,K0); r1(f0,K0); r1(f0,K0);
        r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1);
        r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1);
        r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1);
        r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1); r1(f1,K1);
        r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2);
        r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2);
        r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2);
        r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2); r1(f2,K2);
        r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3);
        r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3);
        r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3);
        r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3); r1(f3,K3);

        h0 += A; h1 += B; h2 += C; h3 += D; h4 += E;

        if (nread <= 56) break; /* If it's greater, length in next
block */
    }
    buf[0] = h0; buf[1] = h1; buf[2] = h2; buf[3] = h3; buf[4] =
h4;
}


Appendix E. Select References

   Selection and of Passwords in Differing Threat Environments,
   Department of Defense Password Management Guideline, CSC-STD-002-85,
   published by the Computer Security Center of the Department of Defense
   Fort George G. Meade, MD 20755
   
   Discovering Weak Passwords, The COPS Security Checker System by D.
   Farmer, E. Spafford, Purdue University Technical Report CSD-TR-993,
   West Lafayette, IN 47907
   
   An Example of Automated Key Cracking, With Microscope and Tweezers: An
   Analysis of the Internet Virus of 1988, by M. Eichin, J. Rochlis,
   Massachusetts Institute of Technology, Cambridge, MA 02139
   
   Password Vulnerabilities in Distributed Systems, Computer Emergency
   Response - An International Problem, by R. Pethia, K. van Wyk
   CERT/Software Engineering Institute, Carnegie Mellon University,
   Pittsburgh, PA 15213
   
   Key Metrics and the MD5 Message Digest Algorithm, Answers to
   Frequently Asked Questions About Today's Cryptography, Second edition,
   by Paul Fahn, RSA Laboratories, Redwood City, CA 94065
   (available through anonymous FTP from rsa.com)
   
   Implementation Details of the MD5 Message Digest Algorithm, RFC-1321
   ('request for comments') The MD5 algorithm, by R. Rivest, MIT Center
   for Computer Science
   (available on the internet from gatekeeper.dec.com)
   
   Implementation Details of the NIST Secure Hash Standard, The Secure
   Hash Standard (SHS) Specification, Jan 1992 DRAFT Federal Information
   Processing Standards Publication YY Director, Computer Systems
   Laboratory, National Institute of Standards and Technology,
   Gaithersburg, MD 20899
   (The SHS was approved as a Federal Standard in May, 1993)
   
   Other Possible Approaches to Password Generation, Automated Password
   Generator, NIST publication ????, Director, Computer Systems
   Laboratory, National Institute of Standards and Technology,
   Gaithersburg, MD 20899
   (a pronounceable password algorithm using DES)
     _________________________________________________________________
   
   PGP main page
   Meine Homepage 
   UNIX-AG Homepage
     _________________________________________________________________
   
   
    conrad@unix-ag.uni-kl.de