(untagged)

Fortuna - A Cryptographically Secure Pseudo Random Number Generator

citadel

0.00/5 (No votes)

3 Mar 2004

A C++ component to generate cryptographically secure pseudo random numbers using the Fortuna Algorithm

Fortuna Generator

Fortuna Pools

Fortuna Sources

Introduction

Pseudo Random Number Generators (PRNG) are used in many cryptographic operations. One of the most important cryptographic operations is generating encryption keys. If an attacker can compromise your (pseudo) random number generator, they can potentially also compromise your encryption key. This is bad; if an attacker knows the key used to encrypt your data, they can decrypt what you have encrypted. Therefore a good PRNG must generate 'random' numbers difficult to predict and the the state of the PRNG itself must be difficult to compromise. Although Fortuna is new (and cryptographers never trust anything new), it has been designed with both of these goals in mind. Fortuna was designed by two well known people in the cryptography / security field with experience in this area.

Fortuna is a random number generator developed by Bruce Schneier and Niels Ferguson in their book Practical Cryptography. Fortuna addresses some of the shortcomings of their previous PRNG Yarrow. The biggest challenge implementing Yarrow is that Yarrow requires entropy estimates. Fortuna overcomes this issue by removing the entropy esimators. The biggest design change besides the removal of the entropy esimators is that Fortuna has 32 pools to collect entropy.

My implementation of Fortuna is composed of 4 parts. There is a generator that generates the actual pseudo random data. There is an accumulator composed of 32 pools of source data (entropy). There is a source manager that manages the data sources. And there is a seed file that maintains a copy of the state of the pools and ensures that Fortuna can generate pseudo random data even after the computer has just rebooted. Fortuna avoids the problem of requiring an accurate estimate of the entropy by using 32 entropy pools. The entropy pools are used to periodically reseed the generator. Therefore if one pool or stream of pseudo random numbers is compromised, at the next reseed the generator will be using a new seed so that future PRN's are not also compromised.

The authors give a lot of implementation details for Fortuna. For the most part I followed their suggestions, but I did make a few changes. For more complete info goto www.citadelsoftware.ca/fortuna. For one thing, the authors recommend not using C or C++ for writing secure software, and this is a C++ component. I'll try to point out along the way where I deviate from their recommendations.

Design Goals

The main design goal of this implementation of Fortuna is to develop a PRNG which would be suitable for use on a web server. One distinguishing feature of a web server is that typically there is not much input from the keyboard or mouse. Therefore, this implementation of Fortuna uses the available information from Windows as 'entropy' sources to the generator. Note that this type of generator is more correctly called 'chaotic deterministic' than 'pseudo random' because the sources are not truly random. Some of the entropy for this implementation of Fortuna comes from using the high performance timer to time operating system calls. I realize that timing data is not true entropy because it is deterministic. I'll say more below on the high performance counter. And being unable to predict the outcome of the PRNG is main property desired.

Another design goal of this implementation of Fortuna is to make the PRNG difficult to successfully attack. This is done by

There is a large volume of data fed to the entropy pools, some with high entropy and some with low entropy. An attacker will have difficulty compromising the state of a single pool due to the high volume of data.
Data from the Windows registry is used to so that application data specific to your environment is included as a (low) entropy source to the accumulator
The status information of each process (page faults, memory, I/O Reads and Writes) are used as entropy sources
Extensive use of the high precision timer is used as a source of entropy
Avoiding simple contiguous data storage for key random data used in the prng to avoid memory scanning attacks
Being careful to randomize valuable random data in memory when the valuable data is no longer needed

Note that a true 'entropy' source requires a physical source; for a hardware RNG see ComScire (I am not affiliated with this company, but they appear to be a respected well established company with a good product from all I have read).

If you:

trust your network security
are confident an attacker cannot get access to the complete registry, and monitor the status of all the processes running on your machine
are confident that an attacker cannot run his own software on your machine to attempt to determine the contents of the 32 data pools and compromise the generator

I believe that this PRNG may suit your needs quite well.

In order to compromise this implementation of Fortuna, an attacker would require access to the entire registry, process monitoring etc. An attacker would have to severely compromise a machine to do this. Because of this, I believe this implementation of Fortuna would be acceptable for many web based applications. Below I list all of the data souces used to populate the data pools. Note that I said 'data pools', not 'entropy pools'. This is because some of the sources used don't have much entropy. For example, I use the entire registry as a source for the pools. Now, there is not a lot of entropy in the registry, but for an attacker to know the entire registry would require that the entire machine has been compromised. If this happens, you have a lot of problems besides the possibility of your PRNG being compromised.

Class Design

This implementation of Fortuna is highly multithreaded. Each data source runs on its own thread and each data pool also runs on its own thread. Each data source cycles over all the pools, distributing one byte to each pool in a round robin fashion. Each pool object has a mutex protecting the pool data. Because each source spreads it's data over each of the pools, this adds another level of 'deterministic chaos' to the distribution of source data over the pools. When two different sources access a pool at the same time, the thread scheduler determines which thread will have access first. While the thread scheduler is not random, given that there are 32 pools each on their own thread, and over 32 sources each on their own thread, predicting which source byte ends up in which pool quickly becomes intractible.

All of these threads do not add much overhead to your application because they spend most of their time waiting. Each source calls 'WaitForSingleObject' to determine if the component is shutting down.

UML diagrams for my implementation of Fortuna can be found on my web site at here.

Entropy Sources

Entropy sources (using the word entropy loosely as described above, they are more appropriately described as data sources) are distributed among the entropy pools. In the book Practical Cryptography, each source event is added to only one pool(see 10.5.6 in the book). In this implementation, each byte from each source event is distributed among the pools in a round robin fashion. With each byte of data there is also a byte which specifies the number of bits of entropy in that source byte of data. This allows the pools to keep track of how many bits of entropy are in the pool.

Each source must provide this virtual method to provide some 'chaotic data' (I know, I know, it's a cheesy name)

  // derived classes must implement this to return their data

  // pair - unsigned char, this is the chaotic data

  // pair - int,      this is the number of bits of 

  // entropy in the associated byte

  virtual bool GetChaoticData(std::vector<std::pair<unsigned char, 
    int> > vData)=0;

The High Resolution Performance Counter

Most new computers now come with a high resolution performance counter. The frequency of the counter is available from the api function:

  BOOL QueryPerformanceFrequency(LARGE_INTEGER* lpFrequency);

If there is no high performance timer, the return value is 0. On my 2.2GHZ laptop, the frequency of the high performance timer is 3,579,545. Note that the frequency is different from the CPU frequency. The value of the high resolution performance counter is obtained by:

  BOOL QueryPerformanceCounter(LARGE_INTEGER* lpPerformanceCount);

When timing data is obtained using the high performance counter, each byte that is different in the timer adds one bit of entropy to the pool.

I did some experiments regarding the high performance counter to determine how many bits of entropy I could claim by using the high performance counter to time the counts for a Sleep of 1ms. I wrote a separate program for these tests. I used the high performance timer to get the high performance count around the Sleep fn as follows:

  hpTimer.Start();
  Sleep(dwSleep);
  hpTimer.Stop();

  diff = hpTimer.m_stopTime.LowPart - hpTimer.m_startTime.LowPart;

  // subtract the number of counts to sleep

  diff -= dwFreq/1000*dwSleep;

where hpTimer is a simple class to wrap the high performance timer. Then I add the lsb from the difference in time to a bitvector (called BitBucket) I wrote myself (member m_bitbucket of type BitBucket). I need the mutex because I used 200 threads to speed up the bit generation and had a separate thread which extracted the accumulated lsb's and wrote them out to a file. The file writer thread Sleeps 100ms after extracting the bits from each bit generation thread.

  uc = (unsigned char)(diff & 0x01); // extract the lsb

  m_mutex.Wait(INFINITE);
  m_bitbucket.Add(uc);
  m_mutex.Release();

I accumulated all of the lsb's, writing 60MB or so to a file and then ran the DieHard randomness tests on the bits. The data passed most of the diehard tests. If you're interested in more of the gory details you can find them at my web site www.citadelsoftware.ca/hptimer.

The Windows Registry

The Windows registry is a huge data repository. This is one source of data used to populate the data pools. Again, this is not random data, but reproducing this data requires the attacker to have access to the entire registry. One of the challenges of working with the registry for this sort of application is ensuring that every handle to a registry key that is opened also gets closed. For this PRNG to be able to run on a server for weeks or months without rebooting, there can be no resource leaks. I have taken care to close registry key handles using RegCloseKey() when the handle is no longer needed. Also, registry keys are opened with KEY_ENUMERATE_SUB_KEYS | KEY_QUERY_VALUE so that only read permission is requested.

None of the data in the registry is considered to add entropy to the pools except for the CLSID's. Again this is a very conservative approach. Using the registry with four different source threads spreading the data over the pools complicates an attackers job considerably.

The registry is a system wide resource. Almost all applications will at some point make use of the registry. The FortunaCSI PRNG must be careful not to place too many demands on the registry. To accomplish this, every source thread has it's thread priority lowered. This is done by

BOOL m_bSetThreadPriority = SetThreadPriority(hThreadId, 
  THREAD_PRIORITY_BELOW_NORMAL);

Also, after every access to the registry, each source thread walking the registry sleeps for some length of time. The Sleep is typically from 3-10 ms. The actual sleep time is a function of the thread id, the 'this' pointer, and the data extracted from the registry. This makes it more difficult for an attacker to determine which of the 32 pools the data from the registry will end up in.

The case of a key with restricted access is also handled properly in the code. Attempting to open registry keys can result in an error code of ERROR_ACCESS_DENIED. When this happens, the subkey is abandoned and execution continues with the parent key of the key with the access restriction.

I tested the Registry walking logic by writing out a log file with every key and value in the registry. This pointed out a number of problems, particularly where I wasn't properly handling keys that didn't have read permission.

HKEY_CLASSES_ROOT\CLSID

One source thread of this Fortuna implementation cycles over all the guids in the HKCR\CLSID list. This is a list of every guid for every COM object installed on the machine. On my 2.2 Ghz Windows/XP laptop (about 1 year old) I have roughly 5600 guid's in this list and each cycle over these guids takes around 1 min.

When adding the bytes of the guids to the data pools, I only count 1 bit of entropy for each byte of the guid that is different than the preceeding guid. This keeps the estimate of the amount of 'entropy' very conservative. This is important to avoid reseeding the generator too often.

Reading all the CLSID's on my system once takes about 30 minutes.

HKEY_CLASSES_ROOT

Processing HKCR takes about 2 and 1/2 hours on my machine.

HKEY_CURRENT_USER

HKCU\Software is where the applications store application specific data in the registry. For example, your text editor will likely put the list of your most recently used files here. Each application contains a lot of data like this plus gui data such as your background colour etc. On my Win/XP laptop, it takes about 8 min for a dedicated thread to cycle through all this registry data. Dumping out the key names and only ascii data for the keys resulted in a file of around 270K.

On my machine it takes about 30 minutes to walk HKCU.

HKEY_LOCAL_MACHINE

HKLM is the largest repository of the registry. On my machine it takes about 5 hours to process.

Process Information

There is a wealth of information available for each process. If you look at the Task Manager, under "View", "Select Columns" you can see some of the different information that is available:

Process Name
Process id
Memory Usage
Page Faults
Handles
Threads
I/O Reads
I/O Read Bytes
I/O Writes
I/O Write Bytes
I/O Other
I/O Other Bytes

Each of the above quantities is available via different Win32 API calls. Following are some of the methods and the data that is available. For more information on the specific functions, see MSDN.

Each non zero byte from the following structures is added to the data pools. When a byte differs from the value obtained for the same process on a previous call, the byte is marked as containing one bit of entropy. Otherwise, the data is added to the pool but no entropy is claimed.

GetProcessIOCounters

The method fills in the IO_COUNTERS structure. This provides the following information:

typedef struct _IO_COUNTERS
{ ULONGLONG ReadOperationCount;
  ULONGLONG WriteOperationCount;
  ULONGLONG OtherOperationCount;
  ULONGLONG ReadTransferCount;
  ULONGLONG WriteTransferCount;
  ULONGLONG OtherTransferCount;
} IO_COUNTERS, *PIO_COUNTERS;

GetProcessTimes

This method provides the kernel time and user time for each process.

GetProcessMemoryInfo

This method fills in the PROCESS_MEMORY_COUNTERS structure. This provides the following information:

typedef struct _PROCESS_MEMORY_COUNTERS {
  DWORD cb;
  DWORD PageFaultCount;
  SIZE_T PeakWorkingSetSize;
  SIZE_T WorkingSetSize;
  SIZE_T QuotaPeakPagedPoolUsage;
  SIZE_T QuotaPagedPoolUsage;
  SIZE_T QuotaPeakNonPagedPoolUsage;
  SIZE_T QuotaNonPagedPoolUsage;
  SIZE_T PagefileUsage;
  SIZE_T PeakPagefileUsage;
} PROCESS_MEMORY_COUNTERS;

GetPerformanceInfo

This method fills in the PERFORMACE_INFORMATION (note the typo in the code from psapi.h) structure. This provides the following information:

typedef struct _PERFORMACE_INFORMATION {
  DWORD cb;
  SIZE_T CommitTotal;
  SIZE_T CommitLimit;
  SIZE_T CommitPeak;
  SIZE_T PhysicalTotal;
  SIZE_T PhysicalAvailable;
  SIZE_T SystemCache;
  SIZE_T KernelTotal;
  SIZE_T KernelPaged;
  SIZE_T KernelNonpaged;
  SIZE_T PageSize;
  DWORD HandleCount;
  DWORD ProcessCount;
  DWORD ThreadCount;
} PERFORMACE_INFORMATION, *PPERFORMACE_INFORMATION;

QueryWorkingSet

The Win32 API provides QueryWorkingSet to access the current working set of memory pages loaded for a process. I found a somewhat dated article to provide more information on the working set. If you are interested, here is the article by Matt Pietrek which gives more information on the working set (if someone knows of a more recent article I'd be glad to provide a reference to that instead).

Each time the working set for a process is obtained, the bytes are compared the bytes from the previous working set. If the size is different, then each non zero byte adds one bit of entropy. If the number of bytes is the same, then each each different byte adds one bit of entropy. Bytes that are the same as those from the previous working set are added to the pool but no entropy is claimed. This keeps the entropy estimate very conservative.

Some working sets can be quite large, well over 10MB of data. Because of this, I decided to limit the working sets considered to only those working sets below 1MB of data. Here are some of the applications I found with large working sets:

svchost.exe, 22MB
Windows Explorer, 21MB
Yahoo Pager, 18MB
Visual Studio 7, 45MB
Internet Explorer, 30MB

I was surprised how large the working set was for Yahoo Pager. Before I started filtering out the large working sets, Fortuna was using large amounts of memory and generating a lot of page faults. Removing the large working sets resulted in more reasonable system resource usage for the PRNG.

Microsoft Crypto API

Microsoft has introduced a cryptographic API which they claim is suitable for cryptographic purposes. Unfortunately, nothing is published regarding the internals of the Crypto API. Therefore some people are not comfortable using it due to privacy concerns. Also, given some Microsoft security practices in the past, it is difficult to know to what level should the Crypto API be trusted.

My approach here is to use the Crypto API as one source of 'random' data. Why not? It increases the workload of the attacker; now the attacker must compromise the Microsoft Crypto API as well as the other data sources used. The Crypto API generates 32 bytes of random data every 100ms. These bytes are distributed over all of the data pools using an estimate of 1 bit of entropy per byte of 'random' data. This is a conservative estimate obviously, because the internals of the Crypto API are unpublished.

Ping

Ping is a network command used to determine the time of a network transmission. I use the high resolution timer to time the various calls to perform a Ping. Even if the ping times out, the timing data still has useful randomness for seeding the prng pools. Ping timing data only compresses by around 2%. I wasn't able to gather enough timing data from Ping to analyze the results using Diehard.

Seed File

I have implemented the seed file a little differently than the authors of Practical Cryptography. If I'm understanding section 10.6.1 (page 178) correctly, they are only writing out the state from the generator. So, each time the PRNG is restarted, all the entropy in the pools is lost.

In my implementation, I write out the state of each of the 32 entropy pools. I also write out the key from the generator as well as the counter used by the generator. Some of the entropy pools are only used in the reseed very seldom and provide security during reseeds when the generator state is compromised. So in my opinion the state of the entropy pools is worth saving in the seed file.

I encrypt the seed file using AES with a 256 bit key in CTR mode. The key is the SHA256 hash of the user provided seed file password concatenated to a salt value.

One of the issues surrounding the use of the seed file is ensuring that the same seed file is never used twice. To avoid possible repeated values if the same seed file is used twice (see the book for a good discussion of how this could happen), after the seed file is read and before the Fortuna PRNG is allowed to return pseudo random bytes, more data is added to the hash of the pool entropy. This data includes a hash of the current snapshot of all processes, time information, and other available data. See the function GetMachineSignature() in FortunaUtils.cpp for all the gory details.

Using the code

A brief description of how to use the article or code. The class names, the methods and properties, any tricks or tips.

Using the code is quite straightforward. All of the complexity is hidden from the user. Here is how to set up the Fortuna PRNG.

 Fortuna* pFortuna = new Fortuna();

 std::string sFileName("FortunaCSI.bin");
 std::string sPassword("MtNorquay~246");
 bStatus = pFortuna->SetSeedFile(sFileName, sPassword);
 if (!bStatus)
 {
  printf("Set Seed File Failed\n");
  assert(0);
 }

 bool bReadSeedFile = false;
 if (bReadSeedFile)
 {
  CitadelSoftwareInc::Fortuna::FortunaErrors fError = 
    CitadelSoftwareInc::Fortuna::eNoErrors;
  bStatus = pFortuna->ReadSeedFile(fError);

  if (!bStatus)
  {
   printf("Failed to read seed file");
   assert(0);
  }
 }

 pFortuna->StartThreads();

Note that the use of the seed file is optional. The StartThreads() method is necessary to actually begin collecting data, walking the registry, monitoring processes, etc.

Once the set up is complete, then calculating random numbers is quite straightforward:

 unsigned int uint = 0;
 bool bStatus = pFortuna->GetRandomInt(uint);
 if (!bStatus)
   printf("Fortuna PRNG is not ready yet\n");
 else
   ...

There is also a method to get an arbitrarily long array of random bytes

  bool GetRandomBytes(unsigned char* pData, unsigned int numBytes);

Points of Interest

Writing server applications requires extensive testing. After adding a new feature, I would usually test by letting the program run for 8 to 12 hours, requesting a random number every 1 to 10 seconds. After adding the QueryWorkingSet data source and running for 8 hours I found (from the Task Manager) that there were some 24000 handles being used - oops. I quickly found the problem and fixed it. The GetPerformanceInfo function returns the number of handles; perhaps that value should be monitored and an indication of a problem be written to a log file if the value goes over some threshold.

To demonstrate the use of the prng I wrote a little MFC app which generates passwords. The screen shots at the top of this CodeProject page show the Fortuna monitor I added. This is an MFC dialog which monitors the state of the generator and the source and pool objects. The 'Peek' edit box for the Sources shows the actual data being fed to the pools from the currently selected source.

If you have any questions or comments on my implementation of Fortuna I'd be glad to hear from you. My email address is ron at citadelsoftware dot ca. Cheers!

History

Revision 1.0

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here