CFile Replacement with Overlapped I/O and User-Defined Progress Callback

10 Apr 2003
Provides a class with a CFile-style interface and no MFC dependencies and, more importantly, the ability to read and write in full overlapped I/O mode with a user-defined callback between each segment read/write.

Screenshot - DemoApp.jpg

Introduction

This article aims to resolve some major issues that, at least in my opinion, are present in MFC's CFile class. My first concern was simply that it is an MFC class. I write a lot of ATL code and needed something more portable than what MFC has to offer. The second was a significant gap in the CFile implementation: it has no convenient overlapped I/O interface for file operations. In this article I present a replacement for the CFile class that provides overlapped I/O and is free of any MFC dependency.

Pros:

  • No MFC Dependency
  • Overlapped I/O
  • Ability to provide callbacks during I/O operations
  • Message Handling while doing overlapped I/O without using multiple threads.

Cons:

  • When used with MFC, does not provide any interface for use with CArchive.

Background

Before starting through this article there are some very basic requirements. You should be somewhat familiar with MFC ( the demo is written in it ) and have a strong knowledge of MSVC++. To truly understand how the message pump works within the file I/O routines you should already have a good understanding of Win32 message handling, as I have no intention of embarking on that vast subject in this article. Basic knowledge of file operations is a plus. If you already understand the basic overlapped I/O model, this should be a breeze; I will explain it in some detail, though, so if you don't have such an understanding, don't worry.

Overlapped I/O

Before we get into the file object itself, a solid understanding of what overlapped I/O is and what it does for us is necessary. A simple explanation would be that when a call is made that supports overlapped I/O, the call returns immediately instead of waiting until the operation completes. Behind the scenes Windows queues up the transfer and processes it in the order the call was made. Once the transfer itself is complete, Windows can notify you using a number of different methods. The method you select for receiving I/O completion notifications depends greatly on which model you adopt for your design and on what exactly you are doing.

One way you can determine if your I/O operation is complete is to use the ::GetOverlappedResult API function. You pass this function an OVERLAPPED structure and a place to hold the number of bytes transferred. You must also pass along the handle on which the I/O operation was made. When making this call you have two choices: 1) you can choose to wait until the I/O completes before the function returns, at which time you should have the full transfer amount, or 2) you can have it return immediately. If it returns FALSE when you did not ask it to wait, and ::GetLastError returns ERROR_IO_INCOMPLETE, the operation is still pending. In effect you would be polling for completion, which is not a good way to go.

This brings us to a much more useful method of being notified when a pending I/O operation completes: completion routines. To use this method ( in the case of file I/O ) we rely on the two API calls ::ReadFileEx and ::WriteFileEx. Both of these functions accept a callback function as a parameter. The callback looks similar to this:

void CALLBACK FileIoCompletionRoutine( DWORD dwErrorCode, DWORD dwNumTrans, 
                                       LPOVERLAPPED lpOverlapped )

Using this model, we would now call one of our I/O routines and it would return to us right away. Windows then queues up the I/O operation as before and transfers the data. Once the data has been transferred fully, it will call the completion routine that was specified in the I/O call itself.

Completion Routine Requirements

As you can see this can be very helpful when performing long I/O operations because the thread making the call is no longer held up at the I/O call waiting for it to complete. Now we can move forward and do whatever other processing may be pending and just wait for the system to notify us when the operation is done. There are, however, a couple of things that MUST be done to allow this to work. The first is that at some point your thread must stop and wait for something. If your thread never pauses then the completion routine will never be able to run, since it executes in the same thread. If you do intense data processing in this thread then it would be wise to move the I/O into another thread that at some point can go into an "alertable" wait state.

An alertable wait state is provided by any of the following functions: ::SleepEx, ::WaitForSingleObjectEx, ::WaitForMultipleObjectsEx, ::MsgWaitForMultipleObjectsEx. Each of these functions takes a flag in some manner that puts it into "alertable" mode. What this means is that on top of its normal wait-style behavior, whether that is an amount of time or a wait handle, the wait can also be ended when an I/O completion callback is ready to be executed. When this happens the callback is executed immediately and then your code presses onward. This is very important to remember because, as I mentioned, if you do not provide this functionality your completion routines will never run.

Other

There are other methods for capturing completions of I/O operations. One such method is the use of I/O completion ports. This, however, is a topic for another article, as it is a much more complicated methodology and requires a good deal more explanation. An understanding of I/O completion ports is in no way necessary to use the CFileEx class defined below.

CFileEx

Now that the surface ( and I do mean surface only ) of overlapped I/O has been scratched we can move on to the implementation and use details of the CFileEx class. The public interface portion of the class is defined as:

class CFileEx
{
    public:
        CFileEx();
        ~CFileEx();
        BOOL Open( LPCTSTR lpFile, DWORD dwCreateDisp = OPEN_EXISTING, 
           DWORD dwAccess = GENERIC_READ | GENERIC_WRITE, 
           DWORD dwShare = 0, LPSECURITY_ATTRIBUTES lpSec = NULL ) 
           throw( CFileExException );
        void Close();
        DWORD Read( BYTE* pBuffer, DWORD dwSize ) throw( CFileExException );
        DWORD Write( BYTE* pBuffer, DWORD dwSize ) throw( CFileExException );
        DWORD ReadOv( BYTE* pBuffer, DWORD dwSize, 
                     LPFN_LRGFILEOP_PROGCALLBACK lpCallback, 
                     LPVOID pParam, BOOL bUseMsgPump = TRUE ) 
              throw( CFileExException );
        DWORD WriteOv( BYTE* pBuffer, DWORD dwSize, 
                       LPFN_LRGFILEOP_PROGCALLBACK lpCallback,
                       LPVOID pParam, BOOL bUseMsgPump = TRUE ) 
               throw( CFileExException );
        BOOL SetOvSegReadWriteSize( DWORD dwSegmentSize
                  = OVLFILE_DEFAULT_SEGSIZE ) throw( CFileExException );
        DWORD GetOvSegReadWriteSize();
        BOOL AbortOverlappedOperation();
        BOOL IsOpen();
        void SetThrowErrors( BOOL bThrow = TRUE );
        ULONGLONG GetFileSize() throw( CFileExException );
        ULONGLONG Seek( LONGLONG lToMove, DWORD dwMoveFrom = FILE_BEGIN ) 
             throw( CFileExException );
        ULONGLONG SeekToBegin() throw( CFileExException );
        ULONGLONG SeekToEnd() throw( CFileExException );
        void Flush() throw( CFileExException );
        BOOL GetFileName( TCHAR* lpName, DWORD dwBufLen );
        BOOL GetTimeLastAccessed( SYSTEMTIME& sys );
        BOOL GetTimeLastModified( SYSTEMTIME& sys );
        BOOL GetTimeCreated( SYSTEMTIME& sys );
                
    public:
        static BOOL GetTimeLastAccessed( LPCTSTR lpFile, SYSTEMTIME& sys );
        static BOOL GetTimeLastModified( LPCTSTR lpFile, SYSTEMTIME& sys );
        static BOOL GetTimeCreated( LPCTSTR lpFile, SYSTEMTIME& sys );
        
    public:
            // public data member. we can let people use
            // the handle directly if they wish.
        HANDLE m_hFile;
        
    protected:
    
        ....
    
    private:
        
        ....
};

Most of this should be pretty self-explanatory. There are a few points that do need to be clarified before using this class though. First is the callback routine that is used with the ReadOv and WriteOv functions. This callback is NOT the same callback that is mentioned in the previous section. That completion routine is wrapped up within the class itself. The callback you provide to the two functions is called right around the same time and can take an additional parameter of your choosing. The definition for that function looks like:

BOOL CALLBACK ProgressCallback( DWORD dwWritten, DWORD dwTotalSize, 
                                LPVOID pParam );

The other point of interest is the "use message pump" flag that can be overridden in the two functions. This allows you to decide at the time of the call whether or not messages will be processed in between calls to the completion routine. Specifying FALSE here only turns off the message pump, NOT the callback function. The function you provide will still be called as normal. This can be useful if running in a thread that does not process any messages.

The other two Read and Write calls emulate the standard Read/Write calls found in MFC's CFile class. They take their respective buffers and perform the I/O operation not returning from the call until the I/O operation has completed. No callback can be used for these two functions. The only other "need to know" here is the SetThrowErrors function. When this is true the object will throw an exception in the form of a CFileExException object for any errors encountered. A definition of this object can be found in the FileEx.h header file.

This exception class provides two forms of error reporting. One reports standard errors returned by Win32 API calls, including those returned by ::GetLastError. The other reports custom errors defined specifically for this class. If you catch the exception you must check both error codes to ensure you are trapping the proper error. If the flag is set to false no exceptions will be thrown and the functions will attempt to report errors through their BOOL return value. For functions that return numbers, however, there is no really effective way to report an error. In ( most of ) these cases 0 is returned.

The Nuts and Bolts of it All

And now we get to the interesting stuff. How does it all really work? Sure, using the class is nice, but knowing how it works is better. To really understand this class there are two main functions to understand ( which do mostly the same thing ) and three other helper functions. We start with the most complicated of them, the DoFileOperationWithMsgPump function. The source for this function is shown below:

DWORD CFileEx::DoFileOperationWithMsgPump( BOOL bWrite, BYTE* pBuffer,
        DWORD dwSize, LPFN_LRGFILEOP_PROGCALLBACK lpCallback, LPVOID pParam )
{
    RDWROVERLAPPEDPLUS ovp;
    DWORD dwNumSegs    = 0;
    DWORD dwCurSeg    = 1;
    BOOL bDone = FALSE;
    BOOL bQuit = FALSE;
    int nRet        = 0;
    
    ZeroMemory( &ovp.ov, sizeof( OVERLAPPED ) );
    ovp.lpCallback = lpCallback;
    ovp.pParam = pParam;
    ovp.dwTotalSizeToTransfer = dwSize;
    ovp.dwTotalSoFar = 0;
    ovp.dwError = 0;
    ovp.bContinue = TRUE;

        // integer ceiling division; alleviates the need for a floating point lib calc.
    dwNumSegs = ( dwSize + m_dwSegSize - 1 ) / m_dwSegSize;

    
    if( ! NextIoSegment( bWrite, ovp, pBuffer, dwNumSegs, dwCurSeg ) ) {
            // something fouled up in our NextIoSegment routine ( ReadFileEx
            // or WriteFileEx failed ), so no I/O is pending. The check that
            // reports errors lives inside the wait loop below, and setting
            // bDone to TRUE means that loop never runs, so we record the
            // error and report it right here instead.
        ovp.dwError = ::GetLastError();
        bQuit = TRUE;
        bDone = TRUE;
        ThrowErr( ovp.dwError, FALSE );
    }

    while( ! bDone ) {
        nRet = ::MsgWaitForMultipleObjectsEx( 1, &m_hStop, INFINITE, 
                                             QS_ALLEVENTS, MWMO_ALERTABLE );
        switch( nRet ) 
        {
            case WAIT_OBJECT_0:    
                bQuit = TRUE;
                ::ResetEvent( m_hStop );
                break;
            case WAIT_OBJECT_0 + 1:
                PumpMsgs();
                break;
            case WAIT_IO_COMPLETION:
                {
                    bDone = ( ovp.dwTotalSoFar == ovp.dwTotalSizeToTransfer );
    
                    if( bDone || bQuit) {
                        break;
                    }
                    // this signals either an error happened on the last I/O
                    // completion routine or the user returned FALSE from
                    // their callback signaling to stop the I/O process.
                    if( ! ovp.bContinue ) {                            
                        bQuit = TRUE;
                    }else{
                        dwCurSeg++;
                        if( ! NextIoSegment( bWrite, ovp, pBuffer, dwNumSegs,
                                             dwCurSeg ) ) {
                            // something failed with our read/write call. This
                            // is an API error so we need to handle it
                            // accordingly. Setting the ovp.dwError and
                            // setting Quit to TRUE will force an exception to
                            // be thrown back to the user notifying them of
                            // the failure.
                            ovp.dwError = ::GetLastError();
                            bQuit        = TRUE;
                        }
                    }
                }
                break;
        };    

            // For some reason we are now dropping out of this loop. This is
            // most likely a kill event that got signaled but we need to
            // check for an actual API error just in case.
        if( ( nRet == WAIT_IO_COMPLETION ) && bQuit ) {
            if( ovp.dwError != 0 ) {
                ThrowErr( ovp.dwError, FALSE );
            }
            break;
        }
    }

    return ovp.dwTotalSoFar;
}

Before digging into this you should also take a quick second to review RDWROVERLAPPEDPLUS, which is defined as:

typedef struct _RDWROVERLAPPEDPLUS 
{ 
    OVERLAPPED ov; 
    DWORD dwTotalSizeToTransfer; 
    DWORD dwTotalSoFar; 
    LPFN_LRGFILEOP_PROGCALLBACK lpCallback; 
    LPVOID pParam; 
    BOOL bContinue; 
    DWORD dwError; 
} RDWROVERLAPPEDPLUS, *LPRDWROVERLAPPEDPLUS;
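
Because the OVERLAPPED member comes first in this structure, a pointer to ov can be cast back to the whole structure inside a completion routine. The article does not list the class's own completion routine, so the following is only a minimal sketch of what such a routine might look like, not the author's code. SketchCompletionRoutine is a hypothetical name, the LPFN_LRGFILEOP_PROGCALLBACK typedef is assumed from the ProgressCallback signature shown earlier, and the non-Windows typedefs are stand-ins so the sketch compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins ( assumption ) so the sketch also compiles off Windows.
typedef std::uint32_t DWORD;
typedef int BOOL;
typedef void* LPVOID;
#define CALLBACK
#define TRUE 1
#define FALSE 0
struct OVERLAPPED {
    std::uintptr_t Internal;
    std::uintptr_t InternalHigh;
    DWORD Offset;
    DWORD OffsetHigh;
    void* hEvent;
};
typedef OVERLAPPED* LPOVERLAPPED;
#endif

// Assumed to match the ProgressCallback signature from the article.
typedef BOOL ( CALLBACK* LPFN_LRGFILEOP_PROGCALLBACK )( DWORD, DWORD, LPVOID );

// Repeated from the article so the sketch stands alone.
struct RDWROVERLAPPEDPLUS {
    OVERLAPPED ov;                  // must stay first for the cast below
    DWORD dwTotalSizeToTransfer;
    DWORD dwTotalSoFar;
    LPFN_LRGFILEOP_PROGCALLBACK lpCallback;
    LPVOID pParam;
    BOOL bContinue;
    DWORD dwError;
};

// Hypothetical completion routine: recover the extended structure, record
// progress or the error, and let the user's callback decide whether to go on.
void CALLBACK SketchCompletionRoutine( DWORD dwErrorCode, DWORD dwNumTrans,
                                       LPOVERLAPPED lpOverlapped )
{
    assert( lpOverlapped != NULL );
    RDWROVERLAPPEDPLUS* pOvp =
        reinterpret_cast<RDWROVERLAPPEDPLUS*>( lpOverlapped );

    if( dwErrorCode != 0 ) {
        // the segment failed; remember why and stop issuing segments.
        pOvp->dwError = dwErrorCode;
        pOvp->bContinue = FALSE;
        return;
    }

    pOvp->dwTotalSoFar += dwNumTrans;

    if( pOvp->lpCallback != NULL ) {
        // FALSE from the user's callback means "stop the transfer".
        pOvp->bContinue = pOvp->lpCallback( pOvp->dwTotalSoFar,
                                            pOvp->dwTotalSizeToTransfer,
                                            pOvp->pParam );
    }
}
```

The cast is only safe while ov remains the first member, which is presumably why the structure is laid out this way.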

The main goal here is to break apart the buffer that we were given into a set number of segments. We don't actually break down the buffer, but it is done conceptually nonetheless. The next thing we do is set up our extended overlapped structure to hold various data concerning the transfer, including the callback function the user has requested be called during the I/O completion routine. We also track our total bytes transferred here, as well as a continue flag. The continue flag can be set by the user: if the user chooses to halt the operation, their callback function can return false and all further I/O will stop. Once the extended overlapped structure is set up we can kick off the I/O operation by calling the NextIoSegment function. This is defined as:

BOOL CFileEx::NextIoSegment( BOOL bWrite, RDWROVERLAPPEDPLUS& ovp, 
                             BYTE* pBuffer, DWORD dwTtlSegs, DWORD dwCurSeg )
{
    BOOL bSuccess = FALSE;
    BYTE* pOffBuf = NULL;
    DWORD dwTransfer = 0;


    pOffBuf = ( BYTE* ) POINTEROFFSET( pBuffer, ovp.dwTotalSoFar );
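        // transfer whatever is left, capped at one segment. Taking the
        // remainder ( total % segment ) for the last segment would give 0
        // when the total is an exact multiple of the segment size.
    dwTransfer = ovp.dwTotalSizeToTransfer - ovp.dwTotalSoFar;
    if( dwTransfer > m_dwSegSize ) {
        dwTransfer = m_dwSegSize;
    }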
    ovp.ov.Offset = ovp.dwTotalSoFar;

    if( bWrite ) {
        bSuccess = ::WriteFileEx( m_hFile, pOffBuf, dwTransfer, &ovp.ov, 
                                  CFileEx::FileIoCompletionRoutine );
    }else{
        bSuccess = ::ReadFileEx( m_hFile, pOffBuf, dwTransfer, &ovp.ov, 
                                 CFileEx::FileIoCompletionRoutine );
    }        

    return bSuccess;
}

Here we are calling one of our I/O routines depending on type. The main thing to notice here is that we have to set the offset variable within the overlapped structure. This is so that we keep reading/writing forward in the file depending on how many bytes we have already read/written. We calculate the amount to transfer based on which segment we are on. We have to do this because it is more than likely that the transfer size will not be evenly divisible by our segment length. The last segment is simply whatever remains of the buffer. Other than that there isn't much here. We just call the I/O routine with our custom completion routine. I am not going to dive into that function because of its simplicity. All it does essentially is bring up the extended overlapped structure, increment the total number of bytes transferred and call the user's callback function if provided.
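
The segment arithmetic described above can be tried in isolation. The following is a minimal sketch in plain C++, not code from the class; SegmentCount, NextSegmentSize, SplitOffset and OffsetParts are hypothetical names introduced for illustration. One case worth watching is a total that is an exact multiple of the segment size, where a plain remainder ( total % segment ) evaluates to 0, so the sketch sizes each segment from the bytes still remaining instead. It also shows how a 64-bit position would be split across the Offset and OffsetHigh members of OVERLAPPED should a transfer ever start beyond the 4 GB mark.

```cpp
#include <cassert>
#include <cstdint>

// Number of segments needed to move totalSize bytes in chunks of segSize.
// Written as divide-then-round-up so the addition cannot overflow 32 bits.
inline std::uint32_t SegmentCount( std::uint32_t totalSize,
                                   std::uint32_t segSize )
{
    assert( segSize > 0 );
    return totalSize / segSize + ( ( totalSize % segSize != 0 ) ? 1u : 0u );
}

// Size of the next segment given how many bytes are already transferred:
// whatever is left, capped at one full segment.
inline std::uint32_t NextSegmentSize( std::uint32_t totalSize,
                                      std::uint32_t doneSoFar,
                                      std::uint32_t segSize )
{
    assert( doneSoFar <= totalSize );
    std::uint32_t remaining = totalSize - doneSoFar;
    return ( remaining < segSize ) ? remaining : segSize;
}

// The two 32-bit halves that OVERLAPPED::Offset and OVERLAPPED::OffsetHigh
// expect for a 64-bit file position.
struct OffsetParts {
    std::uint32_t low;
    std::uint32_t high;
};

inline OffsetParts SplitOffset( std::uint64_t position )
{
    OffsetParts parts;
    parts.low  = static_cast<std::uint32_t>( position & 0xFFFFFFFFu );
    parts.high = static_cast<std::uint32_t>( position >> 32 );
    return parts;
}
```

For a 10-byte buffer and a 4-byte segment size this yields three segments of 4, 4 and 2 bytes; for an 8-byte buffer it yields two full 4-byte segments.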

Once this function returns we move on to the main loop of DoFileOperationWithMsgPump. Here I use the ::MsgWaitForMultipleObjectsEx function to wait for either an I/O completion or an internal stop event that I use for aborts and file closes. A quick aside on closing and aborting: you must never allow the file to close while there is an outstanding I/O operation. Doing so would let the overlapped structure go out of scope while the system is still using it and would cause erratic behavior within the program.

We are almost done here. When the wait function returns, there are three possibilities: 1) The stop event is signaled, in which case we set the quit flag, wait for the pending I/O operation to complete, and then exit the loop. 2) A message arrives in the message queue. When this happens we call our local PumpMsgs() helper to clear out any messages that have been received. 3) An I/O operation has completed. At this point we need to check whether we are finished. If we are done we can exit; otherwise we increment the segment counter and issue the next I/O call. That's it! Our operation will now complete itself.

The other function of relevance I mentioned earlier is the DoFileOperation function. You can look at it on your own if you would like, but it is basically the same as the function above except that it uses a different wait call and does not have a handler for incoming messages. This is called when the user chooses not to use a message pump in the ReadOv/WriteOv function calls.

Finishing Up

When opening files remember that this class just uses the standard ::CreateFile to create the file and takes the same parameters. The flags parameter is handled internally because we must ensure the FILE_FLAG_OVERLAPPED flag is set, or else none of the nice stuff contained within this class will work.

Progress Indications

As stated in the title, this class can help you provide progress indicators for your operations. It does not do this directly; it is done via the user-defined callback function. You specify the function you want to be called and in return receive a call after each segment telling you the total amount that will be transferred and how much of that total has been transferred already. The demo application shows this through the use of a progress bar. I found this an absolute must when transferring large files from a slow disk, such as my external hard drive when it is unlucky enough to be stuck on USB 1.x at work. The demo source is pretty self-explanatory. It demonstrates opening and closing files, doing reads and writes, as well as the various ways to abort a transfer.
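
As a rough illustration of the callback side, here is a minimal sketch of a progress callback that turns the two byte counts into a percentage and lets the caller cancel. It is not taken from the demo application. ProgressState and DemoProgressCallback are hypothetical names, the first argument is assumed to be the running total as described above, and the non-Windows typedefs are stand-ins so the sketch compiles anywhere.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins ( assumption ) so the sketch also compiles off Windows.
typedef std::uint32_t DWORD;
typedef int BOOL;
typedef void* LPVOID;
#define CALLBACK
#define TRUE 1
#define FALSE 0
#endif

// Hypothetical state shared between the UI and the callback through pParam.
struct ProgressState {
    int  nPercent;   // last percentage reported, 0..100
    BOOL bCancel;    // set to TRUE by the UI to stop the transfer
};

// Same shape as the ProgressCallback signature shown earlier.
BOOL CALLBACK DemoProgressCallback( DWORD dwWritten, DWORD dwTotalSize,
                                    LPVOID pParam )
{
    ProgressState* pState = static_cast<ProgressState*>( pParam );
    assert( pState != NULL );

    // widen to 64 bits so dwWritten * 100 cannot overflow on large transfers.
    pState->nPercent = ( dwTotalSize == 0 ) ? 100
        : static_cast<int>(
              ( static_cast<std::uint64_t>( dwWritten ) * 100 ) / dwTotalSize );

    // returning FALSE asks the class to stop issuing further segments.
    return pState->bCancel ? FALSE : TRUE;
}
```

Going by the public interface listed earlier, a call would presumably look like file.WriteOv( pBuffer, dwSize, DemoProgressCallback, &state ), with a progress bar reading state.nPercent after each segment.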

Enjoy!!

Comments

About the OSes. I have tested this on Windows 2000 and Windows XP Pro. According to the documentation I saw this code should be able to run under NT4 though I do not have an NT4 box to test it on right now. If anyone gets this code to run on such a machine let me know and I will update this article with it as a definite. Thanks.

History

4/10/2003 - Initial Public Release

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
