How to do Synchronous and Asynchronous web downloads

Nishant Sivakumar

28 Jun 2002

Explains the usage of WebRequest, WebResponse and related classes.

Download zipped Project Source [VC++ .NET] - 6 KB

Introduction

In this article we'll see how to download files from the web. This takes little effort with the WebRequest and WebResponse classes, which let us access data from the web as a stream, so we can use any of the reader/writer classes available for handling streams. There are two mechanisms we can use for downloading files. For small files we can use the synchronous mechanism; for large files, or for files downloaded from servers whose response times cannot be predicted, we can use the asynchronous mechanism. I'll demonstrate both methods in this article.

Synchronous download

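void DownloadFile(String* url, String* fpath)
{
    // Create picks the concrete request class from the URL scheme
    WebRequest* wrq = WebRequest::Create(url);
    // GetResponse blocks until the server's response arrives
    HttpWebResponse* hwr = static_cast<HttpWebResponse*>(wrq->GetResponse());
    Stream* strm = hwr->GetResponseStream();
    FileStream* fs = new FileStream(fpath,FileMode::Create,FileAccess::Write);
    BinaryWriter* br = new BinaryWriter(fs);
    int b;
    // ReadByte returns -1 at the end of the stream
    while((b=strm->ReadByte()) != -1)
    {
        br->Write(Convert::ToByte(b));
    }
    br->Close();   // also closes the underlying FileStream
    strm->Close(); // releases the connection
}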

I've used five classes there in quick succession. I guess that's just what the BCL is all about, a lavish abundance of classes. WebRequest is an abstract class that allows a user to request internet data in a protocol-independent manner. We use the static method Create to create a request for our file. The WebRequest class has a method called GetResponse, which returns a WebResponse object. Since in our particular case we have requested a file over HTTP, we cast our WebResponse object to an HttpWebResponse object. One big advantage of using these classes is that they all give us stream access. In our case the HttpWebResponse class has a GetResponseStream method that returns a Stream object encapsulating the requested file from the web. The rest of it is simple if you have used streams before. If not, you can read my article on files and streams here on CP. We simply read from the stream returned by the HttpWebResponse object and write the data to a file.
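The listing above moves one byte per call. As a rough illustration of the same read-then-write idea done in larger chunks, here is a minimal sketch in standard C++ rather than Managed C++, so it does not use the .NET classes at all. The names copy_stream and copy_string and the 4096-byte default are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Copies everything from `in` to `out` in fixed-size chunks and returns
// the number of bytes copied. (Illustrative helper, not a library call.)
std::size_t copy_stream(std::istream& in, std::ostream& out,
                        std::size_t chunk_size = 4096)
{
    assert(chunk_size > 0);
    std::vector<char> buf(chunk_size);
    std::size_t total = 0;
    for (;;)
    {
        in.read(buf.data(), static_cast<std::streamsize>(buf.size()));
        std::streamsize got = in.gcount();  // may be short on the last chunk
        if (got <= 0)
            break;                          // nothing left to read
        out.write(buf.data(), got);
        total += static_cast<std::size_t>(got);
    }
    return total;
}

// Convenience wrapper for trying the loop on in-memory data.
std::string copy_string(const std::string& src, std::size_t chunk_size)
{
    std::istringstream in(src);
    std::ostringstream out;
    copy_stream(in, out, chunk_size);
    return out.str();
}
```

The loop ends when a read returns no bytes, which plays the same role as the -1 check on ReadByte in the listing above.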

Asynchronous download

This is a little more complicated than a synchronous download, but, as you might expect, it is the more efficient method when you are downloading several large files. I vaguely remember someone from MS saying that asynchronous methods use high-performance techniques like I/O completion ports internally.

We create our WebRequest object just as we did above, but instead of calling GetResponse, we call BeginGetResponse, which begins an asynchronous request for an Internet resource. We specify a response callback function as one of the arguments, and we pass our WebRequest object as the state object for that callback. We then wait on a ManualResetEvent object, which is set by the read callback once the whole response has been read, so that our function blocks till the entire response is read and stored.
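The "start the operation, then block on an event that a callback sets" shape can be tried out without any networking. The following is a minimal sketch in standard C++, not the .NET API. ManualEvent is a hand-rolled stand-in for ManualResetEvent, and begin_fetch, fetch_and_wait and the "payload" string are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <string>
#include <thread>

// Stand-in for a manual-reset event: once set, it stays set and
// releases every waiter.
class ManualEvent
{
public:
    void set()
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;
        }
        cv_.notify_all();
    }
    void wait()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return signaled_; });
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_ = false;
};

// Hypothetical "begin" call: does the work on another thread and
// invokes the callback with the result when it is done.
std::thread begin_fetch(std::function<void(const std::string&)> on_done)
{
    return std::thread([on_done] { on_done("payload"); });
}

// Starts the operation, then blocks until the callback signals completion.
std::string fetch_and_wait()
{
    ManualEvent finished;
    std::string result;
    std::thread worker = begin_fetch([&](const std::string& data) {
        result = data;    // store the result first...
        finished.set();   // ...then release the waiting caller
    });
    finished.wait();
    assert(worker.joinable());
    worker.join();
    return result;
}
```

Calling fetch_and_wait() returns only after the callback has stored its result, which is the role finished->WaitOne() plays in the download function.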

void DownloadFileAsync(String* url, String* fpath)
{
    WebRequest* wrq = WebRequest::Create(url);
    finished = new ManualResetEvent(false);
    m_writeEvent = new AutoResetEvent(true);
    buffer = new unsigned char __gc[512];
    OutFile = new FileStream(fpath,
        FileMode::Create,FileAccess::Write);
    wrq->BeginGetResponse(
        new AsyncCallback(this,WebStuffDemo::ResponseCallback),
        wrq);
    finished->WaitOne();
    OutFile->Close(); 
}

Response callback

void ResponseCallback(IAsyncResult* ar)
{
    WebRequest* wrq = static_cast<WebRequest*>(ar->AsyncState);
    WebResponse* wrp = wrq->EndGetResponse(ar); 
    Stream* strm = wrp->GetResponseStream();
    strm->BeginRead(buffer,0,512,
        new AsyncCallback(this,WebStuffDemo::ReadCallBack),strm);
}

The EndGetResponse method concludes the asynchronous request that was initiated using the BeginGetResponse method and returns a WebResponse object, from which we can use GetResponseStream to get the underlying stream object. Now we begin our next asynchronous operation on the stream: an asynchronous read, started with BeginRead. If you are wondering why we do this, here is a snippet from MSDN: "Using synchronous calls in asynchronous callback methods may result in severe performance penalties. Internet requests made with WebRequest and its descendents must use Stream.BeginRead to read the stream returned by the WebResponse.GetResponseStream method."

Read callback

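void ReadCallBack(IAsyncResult* ar)
{
    Stream* strm = static_cast<Stream*>(ar->AsyncState);
    // EndRead returns the number of bytes placed in buffer (0 at end of stream)
    int count = strm->EndRead(ar);
    if(count > 0)
    {
        // Diagnostic output only: decode the bytes just read and print
        // how many characters they produced
        __wchar_t Temp __gc[] = new __wchar_t __gc[512];
        Decoder* d = Encoding::UTF8->GetDecoder();
        int chars = d->GetChars(buffer,0,count,Temp,0);
        String* s = new String(Temp,0,chars);
        Console::WriteLine(s->Length);

        // Copy the data, because buffer is reused by the next BeginRead
        unsigned char wbuff __gc[] = new unsigned char __gc[512];
        buffer->CopyTo(wbuff,0);

        OutFile->BeginWrite(wbuff,0,count,
            new AsyncCallback(this,WebStuffDemo::WriteCallBack),OutFile);

        // Ask for the next chunk
        strm->BeginRead(buffer,0,512,
            new AsyncCallback(this,WebStuffDemo::ReadCallBack),strm);
    }
    else
    {
        // Nothing more to read: clean up and release the waiting caller
        strm->Close();
        finished->Set();
    }
}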

We call EndRead on the stream and get back the count of bytes that were read. EndRead is a blocking call and must be called exactly once for each BeginRead call we have issued. If the count is greater than zero, we have received data and there may be more to come. If it is zero, the end of the stream has been reached, so we close the stream and set the event on which our main function is waiting. Just as we had to use asynchronous methods to read the data, we must use asynchronous methods to write the data to our file; otherwise we'd have blocking calls inside the asynchronous callback functions, which is highly inefficient.
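The control flow of the read callback (keep what arrived and ask for more, or stop when nothing arrived) can be shown in isolation. This is a minimal single-threaded sketch in standard C++, not the .NET API. FakeSource is a toy stand-in for the response stream that delivers queued completions from run(), and FakeSource, Download and download_all are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Toy source: each read completes later, from run(), never inline.
class FakeSource
{
public:
    explicit FakeSource(std::string data) : data_(std::move(data)) {}

    // Queues a read of up to max_bytes; on_read gets the bytes that "arrived".
    void begin_read(std::size_t max_bytes,
                    std::function<void(const std::string&)> on_read)
    {
        std::string chunk = data_.substr(pos_, max_bytes);
        pos_ += chunk.size();
        pending_.push([on_read, chunk] { on_read(chunk); });
    }

    // Delivers queued completions until nothing is outstanding.
    void run()
    {
        while (!pending_.empty())
        {
            std::function<void()> next = std::move(pending_.front());
            pending_.pop();
            next();
        }
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
    std::queue<std::function<void()>> pending_;
};

struct Download
{
    Download(FakeSource& s, std::size_t n) : source(s), chunk_size(n) {}

    void start_read()
    {
        source.begin_read(chunk_size,
            [this](const std::string& chunk) { on_read(chunk); });
    }

    void on_read(const std::string& chunk)
    {
        if (!chunk.empty())
        {
            received += chunk;  // keep what arrived
            start_read();       // and ask for the next chunk
        }
        else
        {
            finished = true;    // zero bytes: end of stream
        }
    }

    FakeSource& source;
    std::size_t chunk_size;
    std::string received;
    bool finished = false;
};

// Drives one download to completion and returns everything received.
std::string download_all(const std::string& data, std::size_t chunk_size)
{
    assert(chunk_size > 0);
    FakeSource source(data);
    Download d(source, chunk_size);
    d.start_read();
    source.run();
    return d.finished ? d.received : std::string();
}
```

Only one read is ever outstanding in this sketch, just as in the callback above, so the chunks are appended in the order they arrive.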

So what we do is call the BeginWrite method on our output stream object. We pass our write-callback function as the callback, and pass the output stream object as the callback function's state object. Because the read buffer is about to be reused, we hand BeginWrite a copy of the bytes. Once we have done this, we call BeginRead on our input stream object to start another asynchronous read, as there may still be more data to retrieve.

Write callback

void WriteCallBack(IAsyncResult* ar)
{
    m_writeEvent->WaitOne();
    FileStream* out = static_cast<FileStream*>(ar->AsyncState);
    out->EndWrite(ar);
    m_writeEvent->Set();
}

We call EndWrite on our output stream, which ends an asynchronous write operation started by BeginWrite. EndWrite blocks till all the data has been written, so we are saved the bother of making sure that all the data got written. As you can see, I have used an AutoResetEvent object so that two write callbacks don't run in parallel. If multiple write callbacks are invoked, they'll all wait at the WaitOne call and be released one at a time, each when the previous callback calls Set. The code expects them to be released in the order in which they called WaitOne, though that ordering is best treated as an assumption rather than a guarantee.
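To see the one-at-a-time behaviour of an auto-reset event in isolation, here is a minimal sketch in standard C++, not the .NET API. AutoEvent is a hand-rolled stand-in for AutoResetEvent, and max_overlap is a hypothetical helper that reports how many threads were ever inside the guarded section at once. The sketch demonstrates mutual exclusion only and says nothing about the order in which waiters are released.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Stand-in for an auto-reset event: each set() lets exactly one
// wait() through, then the event resets itself.
class AutoEvent
{
public:
    explicit AutoEvent(bool initially_set) : signaled_(initially_set) {}

    void wait()
    {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return signaled_; });
        signaled_ = false;  // auto-reset before anyone else can pass
    }
    void set()
    {
        {
            std::lock_guard<std::mutex> lock(m_);
            signaled_ = true;
        }
        cv_.notify_one();
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    bool signaled_;
};

// Runs `writers` threads through the gate and returns the largest number
// that were inside the guarded section at the same time.
int max_overlap(int writers)
{
    assert(writers > 0);
    AutoEvent gate(true);  // initially set, like m_writeEvent
    std::atomic<int> inside(0);
    std::atomic<int> peak(0);
    std::vector<std::thread> pool;
    for (int i = 0; i < writers; ++i)
    {
        pool.emplace_back([&] {
            gate.wait();                // like m_writeEvent->WaitOne()
            int now = ++inside;
            int seen = peak.load();
            while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
            --inside;
            gate.set();                 // like m_writeEvent->Set()
        });
    }
    for (std::thread& t : pool)
        t.join();
    return peak.load();
}
```

With the gate in place the peak never exceeds one, however many threads are started.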

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.