
Quantifying The Accuracy Of Sleep

Mike O'Neill


20 Mar 2003

An analysis of actual sleep time caused by Sleep(), particularly for multithreaded applications

Download source - 115 Kb

Introduction
Background
The TimerTest Program
Results
Conclusions and Comments

Introduction

This article collects and analyzes statistics on the accuracy of the Sleep function. The source files contain a console app that calls Sleep with a range of delays and a varying number of concurrent threads, then analyzes the results. (see footnote 1)

Background

I recently began some multithreaded projects and found myself using the Sleep function more than I had before. For example, I used Sleep(0) to relinquish the remainder of a thread's timeslice in situations where (on a single processor system) the thread needed a resource locked by another thread. In such situations, there is no real point in continuing the thread, which might as well let other threads run (including the thread that locked the resource) in hopes of getting the resource sooner. (see footnote 2) As another example, I used Sleep with a calculated short delay time to throttle the number of messages being sent by a worker thread to the main thread. Without throttling, too many messages were being sent too quickly, which prevented the main thread from responding to mouse and user input (which, of course, was the entire point of using a worker thread):

//... in a loop that gets a current line of text (CString tsCurLine)
//... uses GetTickCount() to find dwElapsedMilliSec

dwMessagesPerSec = 1000*(++dwTotalMessagesPosted)/dwElapsedMilliSec;
if (dwMessagesPerSec > dwMaxMessagesPerSec)
    Sleep( (DWORD)(1000*dwTotalMessagesPosted/dwMaxMessagesPerSec
                                                    - dwElapsedMilliSec) );

//
// OK, we've waited long enough, so allocate memory for the CString and
// post it to the main thread. The main thread is responsible for deleting
// the allocation.
            
CString* s = new CString (tsCurLine);
::PostMessage(hMainWnd, USER_MESS_ADDITEM, (WPARAM) s, (LPARAM) m_pDoc);

// ... continue
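
The delay calculation above can also be isolated into a small pure function, which makes it easier to check by hand. The following is a minimal sketch, not the code from my project: the name ThrottleDelayMs and its parameters are hypothetical, and it compares times rather than rates so that an elapsed time of zero (possible on the first pass through the loop) never ends up in a divisor.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the article's source): returns how many
// milliseconds to sleep so that the average posting rate drops back to
// maxPerSec. Returns 0 when no delay is needed.
std::uint32_t ThrottleDelayMs(std::uint64_t totalPosted,
                              std::uint64_t elapsedMs,
                              std::uint64_t maxPerSec)
{
    assert(maxPerSec > 0);  // a zero limit would divide by zero below

    // Earliest time (ms since start) at which totalPosted messages may
    // have been sent without exceeding the maximum rate.
    const std::uint64_t earliestMs = 1000 * totalPosted / maxPerSec;

    // Comparing times instead of rates avoids dividing by elapsedMs,
    // which can legitimately be 0 right after the loop starts.
    if (earliestMs <= elapsedMs)
        return 0;
    return static_cast<std::uint32_t>(earliestMs - elapsedMs);
}
```

For example, with a limit of 100 messages per second, 10 messages posted after only 50 msec gives a delay of 50 msec, while 10 messages posted after 200 msec gives no delay at all.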

Everything worked just fine, but in the back of my mind I remembered all those warnings about the granularity and inaccuracy of the various Windows timer functions. Joseph M. Newcomer, in his article entitled "Time Is The Simplest Thing...", provides a great summary of these inaccuracies, and basically advises not to rely on any accuracy at all, and to expect a granularity of around 55 milliseconds for Win 9x and around 10 milliseconds for Win NT.

So, just what was happening when I called Sleep?

The TimerTest Program

I wrote a console application that lets you enter the number of threads to open, and then puts the Sleep function through its paces. Each of 14 different time intervals was tested, ranging in geometrically-spaced increments from Sleep(0) to Sleep(1000). For each time interval, 50 iterations were performed, and statistics were collected for each.
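
To make "geometrically-spaced" concrete, here is a minimal sketch of how such a schedule could be generated. It is an illustration only: the helper name MakeSleepSchedule is hypothetical, and the values it produces are not necessarily the ones stored in TimerTest's stime[] table, which is not reproduced in this article.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: builds `count` sleep times in milliseconds, starting
// with 0 and then rising geometrically from 1 ms to maxMs. The values are
// illustrative and are NOT taken from the TimerTest source.
std::vector<unsigned> MakeSleepSchedule(int count, unsigned maxMs)
{
    assert(count >= 3 && maxMs >= 1);

    std::vector<unsigned> schedule;
    schedule.push_back(0);  // Sleep(0) cannot be a term of a geometric series

    // The remaining count-1 entries run from maxMs^0 = 1 up to maxMs^1.
    const int steps = count - 2;
    for (int k = 0; k <= steps; ++k)
    {
        const double ms = std::pow(static_cast<double>(maxMs),
                                   static_cast<double>(k) / steps);
        schedule.push_back(static_cast<unsigned>(std::lround(ms)));
    }
    return schedule;
}
```

Calling MakeSleepSchedule(14, 1000) yields 14 entries that start at 0 and end at 1000, with each nonzero entry roughly 1.8 times the previous one.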

To measure the time interval accurately, I created a CStopWatch class that uses the system performance counter to measure time intervals with sub-millisecond accuracy. A first version of TimerTest used GetTickCount to measure time, but I grew nervous that my results were corrupted by the inherent inaccuracy of GetTickCount. Later testing showed that the results were virtually the same, and that on my test machine GetTickCount actually returned times accurate to within 1 msec.

CStopWatch borrows heavily from Laurent Guinnard's CDuration class described in his article entitled "Precise Duration Measurement". (see footnote 3) Here are the function declarations; all functions are implemented inline:

class CStopWatch  
{
public:
    CStopWatch();
    virtual ~CStopWatch();
    void Start(void);
    void Stop(void);
    DWORD GetLapTime() const;  // in whole microseconds (less than 214 secs)
                               // -- stopwatch keeps running
    DWORD GetInterval() const; // in whole microseconds (less than 214 secs)
                               // -- must call Stop() first, or returns zero
    LONGLONG GetLapTimeLongLong() const;      // in whole microseconds -- 
                                              // stopwatch keeps running
    LONGLONG GetIntervalLongLong() const; // in whole microseconds 
                               // -- must call Stop() first, or returns zero

protected:
    LARGE_INTEGER m_liStart;
    LARGE_INTEGER m_liStop;
    LONGLONG m_llFrequency;
};
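
For readers who want to experiment outside Windows, the same interface can be approximated with the standard library. The sketch below is an assumption-laden stand-in, not the CStopWatch implementation from the download: the class name StopWatchSketch is hypothetical, and std::chrono::steady_clock takes the place of the performance counter.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Portable stand-in for CStopWatch (an assumption, not the author's code):
// std::chrono::steady_clock replaces the Windows performance counter.
class StopWatchSketch
{
    using Clock = std::chrono::steady_clock;

public:
    void Start()
    {
        m_start = Clock::now();
        m_running = true;
        m_stopped = false;
    }

    void Stop()
    {
        assert(m_running);  // Stop() without a prior Start() is a usage error
        m_stop = Clock::now();
        m_running = false;
        m_stopped = true;
    }

    // Whole microseconds since Start() -- stopwatch keeps running.
    std::int64_t GetLapTimeMicros() const
    {
        return ToMicros(Clock::now() - m_start);
    }

    // Whole microseconds between Start() and Stop() -- zero until Stop().
    std::int64_t GetIntervalMicros() const
    {
        return m_stopped ? ToMicros(m_stop - m_start) : 0;
    }

private:
    static std::int64_t ToMicros(Clock::duration d)
    {
        return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
    }

    Clock::time_point m_start{};
    Clock::time_point m_stop{};
    bool m_running = false;
    bool m_stopped = false;
};
```

Usage mirrors the original class: call Start(), do the work to be timed, call Stop(), then read GetIntervalMicros(). A 64-bit return type is used throughout, so the sketch has no counterpart to the DWORD range limit noted in the declarations above.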

In the TimerTest program, I allow the user to select the number of threads to open, and the selected number of threads are then started:

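do {
    cout << "Enter the number of threads (0-5)" << endl;
    cin >> nThreads;
} while ( nThreads > 5 || nThreads < 0 );

HANDLE hThread;

for (kk = 1; kk <= nThreads; kk++)
{
    // dwCreationFlags is a DWORD, so pass 0 (run immediately) rather than NULL
    hThread = ::CreateThread(NULL, 0, ThreadFunc, (LPVOID)kk, 0, &dwID);
    // note: WaitForInputIdle is documented for process handles, so with a
    // thread handle this call probably fails and returns at once
    ::WaitForInputIdle(hThread, INFINITE);
}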

The thread function itself performs mindless work: it simply counts an integer up to 0x40000000 (roughly half its maximum value) and then starts over, until a global variable g_bAbort is set to true by the console application:

DWORD WINAPI ThreadFunc (LPVOID pvParam)
{
    // a make-work thread -- endlessly performs mindless make-work
    
    DWORD dwThreadNum = (DWORD) pvParam;
    int ii = 0;
    
    while ( !g_bAbort )
    {
        ii++;
        if (ii >= 0x40000000) ii=0;
    }
    
    return (0);
}
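
The start/abort/finish lifecycle of these make-work threads can be sketched portably as follows. This is an illustration under stated assumptions rather than the TimerTest code: the names g_abortSketch, g_finishedSketch, MakeWorkSketch and RunWithBusyThreads are hypothetical, std::thread replaces CreateThread, and std::atomic<bool> replaces the plain global flag so that the write from the main thread is guaranteed to become visible to the workers.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Assumption: atomics replace the article's plain g_bAbort global.
std::atomic<bool> g_abortSketch{false};
std::atomic<int>  g_finishedSketch{0};

// Same busy loop as ThreadFunc: count, wrap around, repeat until aborted.
void MakeWorkSketch()
{
    int ii = 0;
    while (!g_abortSketch.load(std::memory_order_relaxed))
    {
        ii++;
        if (ii >= 0x40000000) ii = 0;
    }
    g_finishedSketch.fetch_add(1);
}

// Starts nThreads busy threads, runs the caller's measurement while they
// spin, then signals abort and waits for every thread to exit.
// Returns the number of threads that finished cleanly.
template <typename Measurement>
int RunWithBusyThreads(int nThreads, Measurement measure)
{
    assert(nThreads >= 0);
    g_abortSketch = false;
    g_finishedSketch = 0;

    std::vector<std::thread> workers;
    for (int kk = 0; kk < nThreads; ++kk)
        workers.emplace_back(MakeWorkSketch);

    measure();  // e.g. the timing loop around Sleep()

    g_abortSketch = true;  // setting the flag to true is what ends the loops
    for (std::thread& t : workers)
        t.join();
    return g_finishedSketch.load();
}
```

A call such as RunWithBusyThreads(3, [] { /* timing loop here */ }) keeps three cores or timeslices busy for exactly as long as the measurement runs.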

After all requested threads are up and running, the TimerTest program enters its main loop where it exercises the Sleep function. While in the loop, detailed results are written in comma-separated format to a .txt file on the desktop, which later can be opened in Excel to graph and otherwise analyze the results. In addition, the program keeps track of statistics on its own, which it displays to the user as shown in the screen shot above:

for ( ii=0; ii<=13; ii++)
{
    s = ss = 0.0;
    
    for ( jj=1; jj<=iter; jj++) // iter is nominally set to 50 above the loop
    {
        StopWatch.Start();
        ::Sleep(stime[ii]);
        StopWatch.Stop();
        interval = StopWatch.GetInterval()/1000.0;    // convert to millisecs
        s = s + interval;
        ss = ss + interval*interval;
        oFile << stime[ii] << ", " << interval << endl;
    }
        
    mean = double(s)/double(iter);
    stdev = sqrt(double(iter*ss - s*s))/double(iter);
                
    printf("Sleep = %4d: mean = %8.3f, std dev = %6.3f\n", stime[ii], mean, 
           stdev);
}
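
The statistics in this loop come from two running sums: s (the sum of the intervals) and ss (the sum of their squares). The sketch below isolates that calculation so it can be checked against a small data set; the names SleepStats and Summarize are hypothetical and do not appear in the TimerTest source. Like the loop above, it computes the population standard deviation (dividing by n, not n-1).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical result type: mean and population standard deviation, in msec.
struct SleepStats
{
    double mean;
    double stdev;
};

// Computes mean and standard deviation from the running sums s and ss,
// the same quantities accumulated in the TimerTest loop.
SleepStats Summarize(const std::vector<double>& intervalsMs)
{
    assert(!intervalsMs.empty());

    double s = 0.0;   // sum of intervals
    double ss = 0.0;  // sum of squared intervals
    for (double x : intervalsMs)
    {
        s += x;
        ss += x * x;
    }

    const double n = static_cast<double>(intervalsMs.size());
    const double mean = s / n;

    // Algebraically equal to sqrt(n*ss - s*s)/n; rounding can push the
    // variance slightly below zero for near-constant data, so clamp it.
    double variance = ss / n - mean * mean;
    if (variance < 0.0) variance = 0.0;

    return { mean, std::sqrt(variance) };
}
```

As a hand check, the eight values 2, 4, 4, 4, 5, 5, 7, 9 give s = 40 and ss = 232, so the mean is 5 and the standard deviation is sqrt(232/8 - 25) = 2.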

Results

I ran TimerTest with each of zero to five threads and collected the results into an Excel file that's included with the source files. I also used Excel to graph the results and the graphs are included below.

I ran these tests on an older machine: 500 MHz Pentium III, 196 MB RAM, Win98SE. A few other programs were running at the same time as the tests. Most notably, since the computer serves as an Internet gateway for our home network, the computer was running the "Personal Web Server" and Internet Connection Sharing (ICS). So, the computer was only moderately stressed.

Many of the results were unexpected (at least by me). Let's dive in.

Overall Statistics

The following two tables show the overall statistics of my results. The first table shows the mean (or average) value actually obtained for sleep time, as a function of the requested sleep time and the number of threads. The requested sleep time is in the column all the way on the left, and the mean sleep time actually obtained over the 50 tests is shown in the successive columns under the number of threads that were running.

The second table shows the standard deviation (or spread) of the actual sleep times, organized the same way (i.e., requested sleep time in the column on the left, and spread of the actually-received sleep times in successive columns under the number of other threads running).

These tables tell a lot about the overall statistics of Sleep. The first thing you notice is that for requested values above around 200 msecs, Sleep does a good job on average in giving your program the amount of sleep requested. Below 200 msecs, Sleep consistently gives higher values of sleep; for one or no threads, Sleep has difficulty giving less than around 9 msecs of sleep, no matter what was requested. For two or more threads, Sleep rarely gives less than 20 msecs of sleep.

As might be expected, the best results are obtained when there are no other threads running. Sleep is most consistent then (lowest values for standard deviation), and is able to match the requested amount of sleep most accurately (i.e., the mean matches the requested value of sleep).

For one or more threads, Sleep doesn't exactly fall apart, but it's clearly inconsistent (high values for the standard deviation) and it's only at the highest values for requested sleep that you get anything resembling your request.

Here's a more detailed discussion of three cases that seemed important: Sleep(0), results with no other threads running, and results with one or more threads running.

Sleep(0)

First, while Sleep(0) performed mostly as expected, there were two notable exceptions (described below). For the most part, Sleep(0) indeed relinquished the remainder of the thread's time slice to another thread. Where there were no other threads, Sleep(0) returned after an extremely short time interval, typically 10-15 microseconds. Where there were other threads, Sleep(0) didn't return for a much longer period, typically around 100-150 milliseconds, reflecting the fact that Windows didn't give the thread a new time slice for a while.

What were the exceptions? Well, where there were no other threads, the first 5-7 calls to Sleep(0) (i.e., the first 5-7 calls in the loop of 50 calls) only returned after an unexpectedly long time of 100-200 milliseconds. This effect was dramatic and repeatable, so the statistics shown above exclude the first 5-7 calls to Sleep(0). Here's a screen shot of a portion of the spreadsheet output of raw results. The requested Sleep time is in the first column all the way on the left; there are 50 entries for each Sleep time, corresponding to each of the 50 iterations (you can only see the first dozen or so iterations of Sleep(0) in this excerpt). Each column after the first shows the measured sleep time actually received depending on the number of extra threads. The odd behavior is circled in blue:

I don't know why this occurred; if anyone has an explanation please post it. For practical programs that rely on Sleep(0), it might be advisable to call it a few times before getting to the real work of the program (although I'm not really sure why a program with no extra threads would ever need Sleep(0)).

The second exception involved Sleep(0) where more than just one other thread was running. I expected Sleep(0) to return only after all the other threads had run. So, if the delay with one other thread running was 100 milliseconds, I expected the delay for two other threads running to be about 200 milliseconds. That's not what I got. Rather, the delay was remarkably consistent no matter how many other threads were running, and typically was about 110 milliseconds. You can see this behavior in the above excerpted screen shot.

No Extra Threads Running

When there were no extra threads running, Sleep() did a remarkably good and consistent job at timing. The measured sleep time was extremely close to the requested sleep time (at least for times above around 10 msecs -- see below), and the measured time was remarkably consistent from one call to another. Here's a scatter chart of measured vs. requested sleep time, in a log-log format:

For Sleep times below 10 msecs, the accuracy was not great, but the repeatability was. For Sleep below 10 msecs, Sleep consistently gave higher sleep times than requested, but did so with surprisingly good repeatability of about 1.0 to 1.5 msec (one sigma).

Extra Threads Running

When there were extra threads running, Sleep was all over the place. The scatter chart reflects this randomness:

Unless you asked for more than about 200 msec of sleep, it was nearly impossible to rely on the amount of sleep actually given. Even at that level, Sleep yielded times that were completely inconsistent from one call to another, such that repeatability was a poor 20 to 25 msec (one sigma). In practical terms, allowing for a plus/minus three sigma variation, and remembering that Sleep almost never gives less than the requested time, that means you should expect an error of anywhere from +150 msecs to -0 msecs, for any one call to Sleep.

If you string together many calls to Sleep, your results on average will improve, but only slowly. For example, even after stringing together fifty calls to Sleep(1000) with four threads, you still end up with an average value of 1024.612 msecs, or a total elapsed time of 51.230 seconds, in a situation where you only expected 50.000 elapsed seconds (i.e., an overall error of over a second). Clearly, with many threads running, you can't rely on Sleep() if timing is critical.

If average performance over the long haul is what you're after, then you might be able to rely on the Law Of Large Numbers to get acceptable performance. Roughly speaking, the Law Of Large Numbers states that the observed average tends towards the true average over the long run. If we assume the Sleep errors follow a Gaussian bell curve, then the uncertainty in the average shrinks as the square root of the number of calls. Taking 50 msecs as an expected standard deviation (it's roughly the largest number in the table above), you would need about 2,500 calls to Sleep before the uncertainty in the average drops to around one millisecond (50/sqrt(2500) = 1).
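
The square-root rule can be turned into a two-line calculation. The sketch below assumes, as the paragraph above does, that individual calls are independent with a common standard deviation; the function names StandardErrorMs and CallsNeeded are hypothetical. Note that it only describes how the random scatter averages out. It says nothing about the systematic overshoot (Sleep almost never returns early), which does not shrink with more calls.

```cpp
#include <cassert>
#include <cmath>

// Uncertainty (standard error) of the average of `calls` independent
// Sleep calls, each with standard deviation sigmaMs.
double StandardErrorMs(double sigmaMs, long calls)
{
    assert(calls > 0);
    return sigmaMs / std::sqrt(static_cast<double>(calls));
}

// Smallest number of calls for which the standard error of the average
// is no larger than targetMs: n = ceil((sigma / target)^2).
long CallsNeeded(double sigmaMs, double targetMs)
{
    assert(sigmaMs >= 0.0 && targetMs > 0.0);
    const double ratio = sigmaMs / targetMs;
    return static_cast<long>(std::ceil(ratio * ratio));
}
```

With a standard deviation of 50 msec, CallsNeeded(50.0, 1.0) gives 2,500 calls for a 1 msec uncertainty, and halving the target to 0.5 msec quadruples the requirement to 10,000 calls.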

Conclusions and Comments

Although my results were analyzed extensively for only one machine, I ran TimerTest on a few different machines, with differing loads and with different OS's. (I tried it on Win 95 and Win ME machines, with different speeds and memories, and with different loads.) Results similar to those above were obtained, although I did not analyze them as extensively as above. So, given that the results seem to match the documentation, I think that the above results would also apply to you.

Finally, here's a wrap-up of the major points in the article.

Everything written about inaccuracy in the Windows timer functions is correct as applied to Sleep. I think (without having done any testing at all) that the SetTimer API function would behave similarly, and would give your application a WM_TIMER message with the same inaccuracies shown here. (see footnote 4) SetTimer() has an additional caveat, however, mentioned in the MSDN article entitled "WM_TIMER Notification". According to this article, the WM_TIMER message is a low-priority message, such that GetMessage and PeekMessage post it only when no other higher-priority messages are in the thread's message queue. Thus, your application might never get a particular WM_TIMER message, and most certainly will not get it when you expect it if there are higher priority messages in the queue.
Decent accuracy can be obtained for calls above 200 msecs. Below that, Sleep() is accurate only if there are no other threads running.
For many repeated calls to Sleep, the Law Of Large Numbers helps reduce the average error, but the total elapsed time will probably always be slightly higher than the total amount of sleep requested.

Footnotes

1. OK, there are at least two legitimate criticisms that can be leveled at this article. First, you might ask, "how in the world can he go on and on about such a mundane topic?" If that's your criticism, go for it!! And read my bio to find a clue into the reason for my verbosity ;)

Second, and more seriously, this is a software site, and there's very little software in this article. Moreover, the little software given is not really reusable for your own projects. I recognize this, but felt that the results were interesting enough to justify posting anyway.

2. See MSDN article entitled "Sleep" which states that Sleep(0) relinquishes the remainder of a thread's timeslice to another thread of equal or greater priority, or if no such thread exists then does nothing.

3. I made one important modification for purposes of this project: I eliminated a call to Sleep(0) in the Start function, since this would cause a thread switch. In the context of the CDuration class, a thread switch was needed to ensure consistent timings, whereas here it would inject an element of predictability not found in real-world situations, hence yielding a poor simulation of them.

4. The same is probably also true of other types of timers, such as waitable timers. Read Nemanja Trifunovic's article "Timers Tutorial" for a description of various timers available in Windows.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
