Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C++

Applied Crypto++: Pseudo Random Number Generators

4.77/5 (35 votes)
10 Apr 2008CPOL17 min read 3   5.9K  
How to Choose a Crypto++ Pseudo Random Number Generator

Random numbers should not be generated
with a method chosen at random

- Donald Knuth
The Art of Computer Programming
Volume II: Seminumerical Algorithms

Introduction

Random numbers are a primitive element not only to cryptographers, but Computer Scientist and Programmers in general. What is desired is a method which produces a pseudo random stream of numbers fast which are cryptographically secure. However, the goals are diametrically opposed - pseudo random sequences can be produced quickly, or can be produced strongly; but usually not quickly with properties which resist Cryptanalysis.

This article will examine the applications of Pseudo Random Number Generators. Topics covered will include:

  • Usefulness of Random Numbers
  • Random Number Generator Classes
  • Types of Generators
  • Random Number Sources
  • Testing Sequences
  • Seeding
  • Key Derivation Function
  • Standard C++ rand( )
  • Linear Congruential Generator
  • Minimal Standard Generator
  • Non Linear Congruential Generators
  • Win32 API CryptGenRandom( )
  • Compiling and Integrating Crypto++
  • Crypto++ Random Number Generators
    • LC_RNG
    • RandomPool
    • AutoSeededRandomPool
    • AutoSeeded917RNG
  • Application Table
  • Additional CSPRNGs
  • Poor Generators
    • Middle Square Method
    • IBM Mainframe RANDU

Downloads

There are seven downloads available with this article. The downloads are presented at the end of the article.

Usefulness of Random Numbers

In his prolific work <a href="http: en.wikipedia.org="" wiki="" the_art_of_computer_programming"="">The Art of Computer Programming, Volume II: Seminumerical Algorithms, Donald Knuth identifies the following common situations in which one may require random numbers:

  • Simulation
  • Sampling
  • Numerical Analysis
  • Computer Programming
  • Decision Making
  • Aesthetics
  • Recreation

Simulation and sampling are self explanatory. Numerical Analysis uses random numbers to solve complicated problems. For example, a Predator/Prey model using differential equations. Four is equally obvious. Taking from Knuth on item five, Decision Making:

There are reports that many executives make their decisions by flipping a coin or by throwing darts, etc. It is also rumored that some college professors prepare their grades on such a basis. Sometimes it is important to make a completely "unbiased" decision. Randomness is also an essential part of optimal strategies in the theory of matrix games.

The human ear can distinguish between synthesized music (generated from a computer) versus that of an artist. The artist will have small deviations in rhythm which makes the music more pleasing. Additionally, a small bit of randomness makes computer generated graphics appear softer. Note the rigid structure below of the left image versus the softness of the right image. Below the right image was produced with a filter which added a small random amount noise. In Signal Processing, this is known as <a href="http: en.wikipedia.org="" wiki="" anti-aliasing"="">Anti-Aliasing.

Image 1

Image 2

No Random Noise

Random Noise

To further impress aesthetics upon the reader, Adobe Photoshop employs random numbers in many of its filters, including Blur, Distort, and Noise.

Image 3

Adobe Photoshop Filters

Finally, recreation would include shuffling cards or rolling dice.

Random Number Generator Classes

Taking frm NIST, FIPS 140-2:

Random Number Generators fall into one of two classes: deterministic and nondeterministic. A deterministic RNG consists of an algorithm that produces a sequence of bits from an initial value called a seed. A nondeterministic RNG produces output that is dependent on some unpredictable physical source that is outside human control.

Note that layman generally refer to the nondeterministic generator as a 'true random number generator'.

Types of Random Number Generators

There are two types of deterministic random number generartors from which a programmer may choose:

  • Pseudo Random Number Generators
  • Cryptographically Secure Pseudo Random Number Generators

Pseudo Random Number Generators would include generators such as Linear Congruential Generators and Mersenne Twisters. They are generally good at quickly providing a uniformly distributed stream over the interval [0, 1). They offer little to no cryptographic security.

Cryptographically Secure Pseudo Random Number Generators have additional properties which make them suitable for use in Cryptography. Cryptographic uses would include:

  • Key Generation
  • Nonces
  • Salts
  • One Time Pads

A Nonce is a number or bit string used only once. It is a pseudo random number issued in an authentication protocol to ensure that old communications cannot be reused in replay attacks. To ensure that a nonce is used only once, it should be time-variant (including a suitably granular timestamp in its value), or generated with enough random bits to ensure a probabilistically insignificant chance of repeating a previously generated value.

The One Time Pad (invented circa 1917) is an encryption algorithm where the plaintext is combined with a random key or "pad" that is as long as the plaintext and used only once. A modular addition is used to combine the plaintext with the pad. For binary data, the operation XOR amounts is the equivalent. If the key is truly random, never reused, and kept secret, the one time pad provides perfect secrecy.

Random Number Sources

Though NIST does not currently recognize any nondeterministic RNGs, one may use a deterministic RNG to seed a cryptographically secure pseudo random number generator. Taking from FIPS 140-2:

Until such time as an Approved nondeterministic RNG standard exists, nondeterministic RNGs approved for use in classified applications may be used for key generation or to seed Approved deterministic RNGs used in key generation

NIST provides tests which allows one to develop heuristics for determining the of the quality of the sequence from the generator inquestion. Included for download are NIST Validation Suite, FIPS 140-2, and and FIPS 186. Additionally, the reader may want to examine ANSI 9.17, Appendix C (Approved Random Number Generators for FIPS 140-2, Security Requirements for Cryptographic Modules).

Also of interset is NIST SP800-90, <a href="http: csrc.nist.gov="" publications="" nistpubs="" 800-90="" sp800-90revised_march2007.pdf" target="_blank">Recommendation for Random Number Generation Using Deterministic Random Bit Generators. Some conservative cryptographers do not recommend using Dual_EC-DRBG (the elliptic curve variant) but does recommend using CTR_DRBG or Hash_DRBG. Schneier's article can be found at <a href="http: www.wired.com="" politics="" security="" commentary="" securitymatters="" 2007="" 11="" securitymatters_1115"="">Did NSA Put a Secret Backdoor in New Encryption Standard?

Background on Tests

In a random sequence, on would expect each of the ten decimal digits to occur approximately 1/10 times. Should the radix be 2 (digits of either 0 or 1), each digit should represent approximately 50% of the sequence.

Testing involves differentiating good sources from poor choices. For example, the binary stream ...1111111110000000000... would perform ideally with a simple Frequency test, but fail at advanced tests such as Runs, Longest Runs in a Block, and Cumulative Sums. A counter intuitive point with the presented binary stream is that it is a valid sequence of random numbers. Each stream is as equally likely to occur as any other in an unbiased generator.

Testing Sequences

Below the reader will find various tests which should be used when evaluating the effectiveness of a generator. The reader should refer to NIST's Guide to the Statistical Tests and Knuth's The <a href="http: en.wikipedia.org="" wiki="" the_art_of_computer_programming"="">Art of Computer Programming for a complete description.

In addition, NIST's SP800-22, A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications should be consulted. NIST's site includes software for testing the generator.

NIST requires a PRNG pass 16 statistical tests. The tests are listed below with a brief description.

  • Frequency - proportion of zeroes and ones for the entire sequence.
  • Frequency within a Block - determine whether the frequency of ones is an M-bit block is approximately M/2.
  • Runs - the total number of zero and one runs in the entire sequence, where a run is an uninterrupted sequence of identical bits.
  • Longest Run of Ones in a Block - determine whether the length of the longest run of ones within the tested sequence is consistent with the length of the longest run of ones that would be expected in a random sequence.
  • Random Binary Matrix Rank - check for linear dependence among fixed length substrings of the original sequence.
  • Discrete Fourier Transform (Spectral) - detect periodic features.
  • Non-overlapping (Aperiodic) Template Matching - number of occurrences of predefined target substrings.
  • Overlapping (Periodic) Template Matching - number of occurrences of predefined target substrings.
  • Maurer's Universal Statistical Test - number of bits between matching patterns. The purpose of the test is to detect whether or not the sequence can be significantly compressed without loss of information.
  • Lempel-Ziv Complexity - how far the tested sequence can be compressed. The sequence is considered to be non-random if it can be significantly compressed.
  • Linear Complexity - length of a generating feedback register.
  • Serial - frequency of each and every overlapping m-bit pattern across the entire sequence.
  • Approximate Entropy - frequency of each and every overlapping m-bit pattern.
  • Cumulative Sum - determine whether the cumulative sum of the partial sequences occurring in the tested sequence is too large or too small relative to the expected behavior of that cumulative sum for random sequences.
  • Random Excursions - determine if the number of visits to a state within a random walk exceeds what one would expect for a random sequence.
  • Random Excursions Variant - detect deviations from the expected number of occurrences of various states in the random walk.

Knuth sates a random sequence should pass 13 statistical tests. Tests which are not included in NIST's requirements are listed below.

  • Chi-Square - Classic Definition
  • Kolmogorov-Smirnov - Extends the Chi-Square test to the set of Real Numbers
  • Gap - detect gaps between a number over a range of numbers in a sequence
  • Poker (Partition) - n groups of five successive integers from the stream, observing the resulting pattern. One pair is aabcd, full house is aaabb, etc.
  • Coupon Collector's - Examines the length of the sequence required to observe all numbers in the set 0 to d-1.
  • Permutation - number of orderings of grouping the sequence into partitions
  • Collision - Used when the Chi-Squared Test exceeds a certain number of collisions
  • Birthday Spacing - Similar to the Birthday Paradox when selecting two integers in the sequence

Seeding

<a href="http: en.wikipedia.org="" wiki="" cryptographically_secure_pseudorandom_number_generator"="">Cryptographically Secure RNGs share an Achilles' heel with the other pseudo random number generators - both require an starting seed as an input. Should the secret seed become compromised, an adversary can generate the entire output sequence instantly by using the same algorithm.

In 1995 David Wagner, then a graduate student at Berkeley, and fellow student Ian Goldberg cracked the random number generator used by the Netscape web browser to secure online transactions. The pair determined that Netscape was constructing its seeds using easily predicted numbers, such as the time of day.

Key Derivation Functions

Key derivation functions are used (among others) to derive keys from secret passwords or passphrases. For guidance one can use <a href="http: www.faqs.org="" rfcs="" rfc2898.html"="" target="_blank">RFC 2898, Password-Based Cryptography Specification. In addition, the .NET platform provides <a href="http: msdn2.microsoft.com="" en-us="" library="" system.security.cryptography.rfc2898derivebytes.aspx" target="_blank">Rfc2898DeriveBytes, which is part of <a href="http: msdn2.microsoft.com="" en-us="" library="" system.security.cryptography.aspx" target="_blank">System.Security.Cryptography.

Key derivation functions are often used in conjunction with non-secret parameters to derive a keys from a common secret value. A KDF may also be used to ensure that derived keys have other desirable properties, such as avoiding weak keys in some specific encryption systems.

Key derivation functions are also used in applications to derive keys from secret passwords or passphrases, which typically do not have the desired properties to be used directly as cryptographic keys. In such applications, it is generally recommended that the key derivation function be made deliberately slow so as to frustrate brute force attack or dictionary attack on the password or passphrase input value.

Such use may be expressed as Dk = KDF(Key, Salt, Iterations) where Dk is the derived key, KDF is the key derivation function, Key is the original key or password, Salt is a random number which acts as cryptographic salt, and Iterations refers to the number of iterations of a sub-function. The derived key is used instead of the original key or password as the key to the system. The values of the salt and the number of iterations (if it isn't fixed) are stored with the hashed password or sent as plaintext with an encrypted message.

Standard C++ rand( ) Function

The Standard C++ rand() function uses a linear congruential generator. It offers a uniformly distributed bit stream quickly when the parameters m, a, c, and X0 are appropriately chosen. The LCG offers no Cryptographic Security.

X0 is colloquially referred to as the seed, often using the system time. The generator obtains numbers by using the following recurrence relation:

Xn+1 = (aXn + c) mod m

The values chosen for the generator in Microsoft's Visual C++ environment are shown below. Note that the & 0x7FFF is performing the modular reduction.

return (((holdrand = holdrand * 214013L + 2531011L) >> 16) & 0x7fff);

Joan Boyar's PhD dissertation, <a href="http: www.springerlink.com="" content="" p4526x2j040m7j12="" "="">Inferring Sequences Produced by a Linear Congruential Generator Missing low-order Bits is an interesting read. The astute reader should realize there probably exists an online gaming site using an insecure system. Note that this is a different attack than Wonging (named after Stanford Wong, a former IBM researcher and professional gambler), which is basically a card counting system.

Minimal Standard Generator

First proposed by Miller, Lewis, and Goldman in 1969, this generator is is a linear congruential generator with c=0. The refinement of dropping the constant resulted in a Multiplcative Generator:

Xn+1 = aXn mod m

Park and Miller suggest the values of a = 75 = 16807, m = 231-1 = 2147483647. 231-1 produces a period of 231-2. Other valuse of a exist when using 231-1: 48271 and 69627. No other values should be used.

Non Linear Congruential Generator

There are many fast random number generators available for use as a replacement to the standard C/C++ library's rand() and srand() funtions. The reader should familiarize themselves with <a href="http: portal.acm.org="" citation.cfm?id="63042"">Random Number Generators: Good Ones Are Hard To Find. Steve Park provides the source code to the Lehmer generator at his <a href="http: www.cs.wm.edu="" ~va="" software="" park="" park.html"="">Random Number Generator webpage. Park's code provides additional PRNGs based on the following distributions (these generators are referred to as Non Linear Congruential Generators):

  • Bernoulli
  • Binomial
  • Equilikely
  • Geometric
  • Pascal
  • Poisson

A Mersenne Twister, developed in 1997 by Makoto Matsumoto and Takuji Nishimura would also fall under this category. An interesting factiod in the world of computer viruses is that <a href="http: www.symantec.com="" enterprise="" security_response="" weblog="" 2007="" 03="" the_dread_pirate_roberts.html" target="_blank">30 viruses use the generator. It is speculated one author is responsible for these viruses.

Win32 API CryptGenRandom( )

CryptGenRandom() generates a requested number of bytes for the user. GenCryptRandom() is an excellent source of entropy on Windows since it is available for all Operating Systems except NT 3.51 and earlier, and Windows 95 (SR1) and earlier.

The latest seed value for CryptGenRandom is stored in the Windows Registry under the key \SOFTWARE\Microsoft\Cryptography\RNG\Seed of the HKEY_LOCAL_MACHINE hive. The seed is developed by the Operating Systems using various system parameters. For example, on a Windows CE device, the following will be used:

  • Thread and kernel switches (CeGetRandomSeed)
  • The current process identifier (GetCurrentProcessId)
  • The current thread identifier (GetCurrentThreadId)
  • Ticks since boot (GetTickCount)
  • Current time (GetLocalTime)
  • Memory information (GlobalMemoryStatus)
  • Object store statistics (GetStoreInformation)

After the seed is developed, it undergoes two cryptographic transforms: an MD4 hash and a RC4 encryption for additional mixing and chopping.

Sample 1 demonstrates using the Windows API to acquire pseudo random values. To remove the burden of an SDK, the program uses LoadLibrary() and GetProcAddress().

Compiling and Integrating Crypto++

Crypto++Please see the related article, Compiling and Integrating Crypto++ into the Microsoft Visual C++ Environment. This article is based upon basic assumptions presented in the previously mentioned article. It also addresses most problems encountered with projects from Command Line to MFC (Errors C1083, C1189, LNK1104, LNK2001, and LNK2005). Additionally, it provides some tips and other niceties for using the Crypto++ Library.

LC_RNG

LC_RNG is a Linear Congruential RNG. Though this generator has no cryptographic value, it does allow one to reproduce results when debugging a program. Additionally, it is generally faster at generating a byte block (or stream). If one seeds the LCG with 0x00, a steady stream of 0x80 is the result. This seed value produces a period of 1. Other seeds perform as desired. Sample 1 may be used to compare the results of various seed values of Linear Congruential Generators.

RandomPool

The RandomPool behaves similar to an LCG in that the same seed produces the same results. However, unlike LC_RNG, the cipher behind the RandomPool is currently MD5<SHA>. randpoool.cpp provides

typedef
MDC<SHA> RandomPoolCipher
. Then RandomPool would be initialized and used as follows (as demonstrated in Sample 3):

// Must be at least 16 for RandomPool
const unsigned int SEEDSIZE = 16;
byte pcbSeed[ SEEDSIZE ];

// Scratch Area
const unsigned int BLOCKSIZE = 16 * 8;
byte pcbScratch[ BLOCKSIZE ];

...

// Random Pool Initialization
CryptoPP::RandomPool rng( SEEDSIZE );
rng.Put( pcbSeed, SEEDSIZE );

// Use
rng.GenerateBlock( pcbScratch, BLOCKSIZE );

AutoSeededRandomPool

An auto seeded random pool was suggested by Leonard Janke, which Wei Dai later incorporated into Crypto++. AutoSeededRandomPool uses the RandPool design from PGP. It is suitable for all cryptographic purposes including generating keys and IVs. Depending on the Operating System, the generator will be seeded using the following:

  • CryptGenRandom() by way of a Cryptographic Service Provider
  • /dev/urand
  • /dev/rand

Sample 4 trivially demonstrates using an AutoSeedeRandomPool.

// Scratch Area
const unsigned int BLOCKSIZE = 16 * 8;
byte pcbScratch[ BLOCKSIZE ];

// Construction
AutoSeededRandomPool rng;

// Random Block
rng.GenerateBlock( pcbScratch, BLOCKSIZE );

AutoSeededX917RNG

Unlike LG_RNG and RandomPool, AutoSeededX917RNG does not require seeding. However, one must specify an approved Block Cipher as a template parameter. The two ANSI 9.17 approved ciphers are 3-key Triple DES and SHA1. Sample 5 demonstrates use with SHA1.

// Scratch Area
const unsigned int BLOCKSIZE = 16 * 8;
byte pcbScratch[ BLOCKSIZE ];

// Construction
AutoSeededX917RNG< DES_EDE3 > rng;

// Random Block
rng.GenerateBlock( pcbScratch, BLOCKSIZE );

Application Table

The following table should be applied when chosing a Random Number Generator in practice.

Image 5

Additional CSPRNGs

In addition to the previously mentioned, various FIPS standards recognizes other cryptographically secure generators. As the reader should now realize, a Cryptographically Secure Pseudo Random Number Generator wraps a deterministic generator in a difficult problem. FIPS 186-3 approves the Digital Signature Algorithm (DSA) and Elliptic Curve DSA (ECDSA) as CSPRNGs.

Poor Generators

This section is included for pandectic resaons. I hope the reader will appreciate (and sometimes enjoy) the shortcomings.

Middle Square Method

The Middle Square Method of generation was proposed by John von Nuemann. The generator was used from 1946 to the mid 1950s. Keep in mind that in 1946, the computer was 1 year old. The generator suffered from the fact it converged to a sequence of 0s too quickly; and once at 0, it was latched at 0.

IBM Mainframe RANDU

RANDU was a congruential generator supplied by IBM for use on it's mainfame computers. The choice of a=65539 and m=231 proved to be a poor choice. When the authors of Numerical Recipies in C generated a sample and plotted the planes, only 11 planes existed (where there should have been on the order of m1/k, where k is the number of random numbers chosen plotted in three dimensional space. The generator suffered from Correlation in k-space. The IBM representative stated:

<blockquore>

We guarantee that each number is random individually, but we don't guarantee that more than one of them is random.

Subtract with Borrow

In 1992, physicists discovered that even "high-quality" random-number generators, can yield incorrect results under certain circumstances. In preparation for simulating the three-dimensional Ising model, researchers tested the package on the two-dimensional version, which has a known answer. The result was incorrect. [1]

Summary

This article presented the reader with with various choices for a pseudo random number generator based on the problem domain. Should the reader require a source for Simulation or Sampling, choose an LCG. If the reader requires strength, choose the Win32 API, AutoSeededRandomPool, or AutoSeededX917RNG. If the reader requires Cryptographic Security, choose an AutoSeededX917. And finally, if the reader requires entropy, use Win32 API or AutoSeededRandomPool.

In addition, other generators exist (some of which are exposed in Crypto++). For example, the Blum Blum Shub CSPRNG produces a bit sequence based on Quadratic Residues. The generator will remain cryptographically secure as long as Integer Factorization remains hard. It has yet to be determined where Integer Factorization lies in Complexity Theory.

Acknowledgements

  • Wei Dai for Crypto++ and his invaluable help on the Crypto++ mailing list
  • Dr. A. Brooke Stephens who laid my Cryptographic foundations

Revisions

  • 04.10.2008 Added Reference to Schneier's Wired.com Article
  • 11.28.2007 Added 'Poor Generator' Section
  • 11.07.2007 Added Reference to NIST Test Suite
  • 09.13.2007 Added Explaination of Nonce
  • 09.05.2007 Added Reference to NIST 800-90
  • 09.05.2007 Added Reference to NIST 800-22
  • 08.28.2007 Added Reference to Mersenne Twister
  • 08.27.2007 Added Reference to RFC 2898
  • 01.12.2007 Added Non-Linear Congruential Generators
  • 01.09.2007 Expanded Usefulness of Random Numbers
  • 01.07.2007 Added Reference to Leonard Janke
  • 01.05.2007 Added Application Table
  • 01.04.2007 Added Key Derivation Function
  • 01.03.2007 Initial Release

Downloads

Checksums

  • ASRP.zip
    MD5: 773B2C409A7140CBA16EFF7903AA2C11
    SHA-1: 9CD4BC2C3E156157AA10ADF62994D4065980EC80
  • AutoSeededX917.zip
    MD5: B855380703A24DF9671AE8C7146113A9
    SHA-1: 1B724CD0C28CC4909F7A14F374EE1874CFB49E42
  • FIPS140-2.zip
    MD5: 7E106DBF757F281C764AD6F09DFEF4DA
    SHA-1: 0ACD845C20B4F8FA5EA58C3C04270A8DFB6DCAC7
  • GenRandom.zip
    MD5: B99A0CCC8BA862350235EBEBDFC3A0D2
    SHA-1: 33EA241D0E86E8AB6D86F64FB62C7C67678163CC
  • LCG.zip
    MD5: 0F4A91276B2CE558DB8C8F65218AA32D
    SHA-1: BB598A1A9FC13B519BD5D5CDB37FA52B39E1230C
  • RNGVS.zip
    MD5: 6E608C5AFFFE4D906E435C39BC2CC687
    SHA-1: ED5E73E39E3ADBEDA008A175E4243D8A59ABD254
  • RandomPool.zip
    MD5: 345CF4D59BED53B501ED1D58EF0FD892
    SHA-1: E699F73A53A57178E150A1A020DE4B173F4FF1E9

References

[1] The Bias of Random-Number Generators, <a href="http: www.sciencenews.org="" articles="" 20030927="" mathtrek.as"="">http://www.sciencenews.org/articles/20030927/mathtrek.as, accessed November 2007.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)