
ISAPI versus ASP file uploads in web applications

Adrian Bacaianu


15 Jan 2006

This article describes two ways to upload images and files on your web server, and the advantages and disadvantages of both.

Fig. Graphs of uploading time needed in ASP/ISAPI.

Description

This article describes two ways of uploading images and files to your web server, and the advantages and disadvantages of both. My contribution is based on the comparison between the different methods. I didn't invent any Microsoft upload technique :).

This article is based on an ASP technique which I got from my colleague Doru Paraschiv and on a very good article from MSDN. My special thanks to Panos Kougiouris, for his help in revealing the ISAPI binary streaming techniques. Check out Creating a DLL to Enable HTTP-based File Uploads with IIS.

Uploading files from a browser is often a web site requirement. For instance, let's say that you have created a web site for a real estate company. The site contains listings of properties along with a picture of each property. Realtors can add new listings through an administration section of the site. Adding text for the new listing is easy using an HTML form. Then one day a realtor tells you that she wants to upload a photo of a property for sale to your site. How would you do that with ASP and ISAPI?

How to use

Both methods use the same functional structure.

First, the user must complete the fields in the PostFile.asp form. The JavaScript function ValidateForm is used to validate the content of the required fields. In the Upload.asp file, we call the Upload function to perform the upload, followed by the SaveFile function to write the file to disk. The ASP uploading functions are located in the _incUpload.asp file.

Fig. Image of the uploading form.

Second, the user must receive the results of Upload.asp (in case of ASP) or UploadExt.dll (in case of ISAPI) in the ReceiveRedirect.asp file. It is also possible to finish the execution in Upload.asp or UploadExt.dll but in most cases a redirect is needed to catch all the parameters of the form to write the file on another server.

The ISAPI form uses two optional hidden parameters to tell the DLL which directory to store the upload in (if it isn't specified, c:\temp\ is used) and the name of the ASP redirection page (if the user wants one).

The default upload directory is the \Data\ directory under the site root directory. It is held in the ASP variable sPathData (defined in _include.asp) and placed into the form.

Fig. Image of the receiving ASP page.

Your IIS anonymous user account, usually IUSR_machinename, must have write access to the directory you want to save the files in. In order to execute the ISAPI extension, the UploadExt.dll file must have the "Scripts and Executables" execute permission set in the IIS management console.

ASP upload details

From MSDN: One of the most glaring omissions in older versions of Active Server Pages (ASP) is the ability to handle a file that's posted from an HTML form. While the ASP Request object has always allowed easy access to every other type of form field, posted files have been strictly off limits to pure ASP. The usual solution has been to implement support for file uploads using the Posting Acceptor or a third-party component. In this article, we'll show you how easy it is to get around this limitation and retrieve a file (or files!) from the data posted to an ASP file.

Upload the file

The first requirement for this process is to have an HTML form that actually uploads (or "posts") a file to your ASP page.

The ASP script that saves the file starts out by allowing the server a generous amount of time to process whatever file is uploaded to it. This is a very important consideration, because it can take a long time to process a large file. If your script times out before the process finishes, the uploaded file will be lost! Next, the script sets up some needed constants and defines several functions. We'll return to these functions in a moment, but for now, let's move on to the main part of the script and see how this process really works.

Catch the POST

The first thing the script does is retrieve all the data that was posted from the HTML file and place it into a variable called biData. This is done using the BinaryRead method of the Request object. As the name implies, this method reads a specified number of bytes from the post request in their raw, binary form. The actual number of bytes to read is provided by the TotalBytes property of the Request object.

You have to transform all that binary data into something friendlier. This is done with a simple loop that marches through the extracted binary data and, using a series of calls to some binary data manipulation functions, converts the data into an easily readable format and places it into the variable PostData.

Breaking up the raw data

Once we have the posted data in ASCII form, it's a fairly simple process to work through it and extract each form field from it. The first step is to determine if our data is properly encoded. After that, we can determine the boundary between each form element.

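Amazingly, both these pieces of information can be found in one convenient place: the HTTP_CONTENT_TYPE header. (Yes, you can still get at the headers after calling BinaryRead. You just can't get at the Form or QueryString collections.) The structure of this information is very simple: the encoding type comes first, followed by a semicolon and a boundary= parameter. In the posted data, each form field is then preceded by a line built from that boundary, which looks like this in our sample: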

-----------------------------7d22151d40264

Given this simple structure, it's easy to extract the information we need. First, we use the Split function to break the information around the semicolon. This allows us to check the encoding type easily by checking the first element of the array created by calling Split. Assuming that the encoding type is what we expect ("multipart/form-data"), we then extract the boundary, using another call to Split.
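
The article's own code for this step is ASP. As an illustration only, here is a minimal C++ sketch of the same idea. It assumes the header value has the form "multipart/form-data; boundary=..." with a lowercase type name and an unquoted boundary. The helper names Split, Trim and ExtractBoundary are hypothetical and do not come from the download files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Break `text` around every occurrence of `sep`, like VBScript's Split.
// The result always has at least one element.
std::vector<std::string> Split(const std::string& text, const std::string& sep) {
    assert(!sep.empty());  // an empty separator would never advance
    std::vector<std::string> pieces;
    std::size_t start = 0;
    for (;;) {
        std::size_t pos = text.find(sep, start);
        if (pos == std::string::npos) {
            pieces.push_back(text.substr(start));
            return pieces;
        }
        pieces.push_back(text.substr(start, pos - start));
        start = pos + sep.size();
    }
}

// Remove leading and trailing spaces and tabs.
std::string Trim(const std::string& s) {
    const char* kBlank = " \t";
    std::size_t first = s.find_first_not_of(kBlank);
    if (first == std::string::npos) return "";
    std::size_t last = s.find_last_not_of(kBlank);
    return s.substr(first, last - first + 1);
}

// Returns true and fills `boundary` when the header announces
// multipart/form-data and carries a boundary parameter.
// Assumption: the boundary value is not wrapped in quotes.
bool ExtractBoundary(const std::string& contentType, std::string& boundary) {
    const std::string kPrefix = "boundary=";
    std::vector<std::string> pieces = Split(contentType, ";");
    if (Trim(pieces[0]) != "multipart/form-data") return false;
    for (std::size_t i = 1; i < pieces.size(); ++i) {
        std::string param = Trim(pieces[i]);
        // Match on the prefix rather than splitting at '=', because the
        // boundary value itself is allowed to contain '='.
        if (param.compare(0, kPrefix.size(), kPrefix) == 0) {
            boundary = param.substr(kPrefix.size());
            return true;
        }
    }
    return false;
}
```

Keep in mind that inside the posted data each delimiter line is the boundary value prefixed with two extra hyphens, so the string returned here is two characters shorter than the lines seen in the body.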

Each chunk of form data is itself in two chunks. The first is an informational chunk that tells all about the form field (i.e., its name and any associated information). The second chunk is the actual form data that was transmitted.

Content-Disposition: form-data; name="RedirectPage"

ReceiveRedirect.asp
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Name"

Adrian Bacaianu
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Upload"

Submit Query
-----------------------------7d22151d40264--

As you can see in this listing, the boundary information is used to separate each form field and its data. Unfortunately, the informational chunk of a form field isn't separated from the data chunk by anything obvious. Well, actually, it is obvious, but we've all grown so used to ignoring this particular separator that it's hard to notice. Basically, two pairs of carriage return and line feed characters separate the information chunk and the data chunk. That's a total of four bytes between the chunks.
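
To make the role of the boundary concrete, here is a short C++ sketch (not the article's ASP code) that cuts a posted body into its parts. It assumes the body follows the usual multipart layout: every delimiter line is "--" plus the boundary, lines end with a carriage return and line feed pair, and the last delimiter carries a trailing "--". The function name SplitParts is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the raw text of each part (informational chunk, blank line, data)
// found between delimiter lines. `boundary` is the value taken from the
// Content-Type header, without the leading "--".
std::vector<std::string> SplitParts(const std::string& body,
                                    const std::string& boundary) {
    assert(!boundary.empty());
    const std::string delim = "--" + boundary;
    std::vector<std::string> parts;

    std::size_t pos = body.find(delim);
    while (pos != std::string::npos) {
        std::size_t start = pos + delim.size();
        // "--" right after the delimiter marks the end of the whole message.
        if (body.compare(start, 2, "--") == 0) break;
        // Skip the CRLF that ends the delimiter line.
        if (body.compare(start, 2, "\r\n") == 0) start += 2;
        // The part runs up to the CRLF that precedes the next delimiter.
        std::size_t next = body.find("\r\n" + delim, start);
        if (next == std::string::npos) break;  // truncated request: stop here
        parts.push_back(body.substr(start, next - start));
        pos = next + 2;  // index of the next delimiter itself
    }
    return parts;
}
```

Each returned string still contains both chunks of a field, separated by the four-byte sequence described above.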

Splitting apart the form data

Now, it would be great if we could extract all of this form data into a nice collection, just like the Request.Form collection. So, with that in mind, we will define a collection named myRequest and an array named myRequestFiles. While it would be nice to use a collection for any files that were uploaded, collections can't easily track all the information required to properly handle a file. For example, for a simple file, it's handy to track the posted field name, the actual file contents that were transmitted, the file/path name of the file (i.e. where it was on the client's disk), and the MIME type of the file. A simple two-dimensional array is the easiest solution to this problem.

At this point, it's a simple matter of looping through all of the form fields that were extracted when postData was split along the boundary. As each field is examined, the informational and data chunks are extracted using simple calls to the Mid function. Note that we don't use Split here. This is because there could be dual carriage return and line feed pairs in the data itself.
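
The reason for avoiding Split can be shown with a small C++ sketch (an illustration, not the article's ASP code): cutting only at the first pair of carriage return and line feed pairs keeps any later occurrence inside the data. The PartChunks type and the SplitChunks function are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct PartChunks {
    std::string info;  // informational chunk: the header lines of the field
    std::string data;  // data chunk: the field value or the file contents
};

// Cuts `part` at the FIRST "\r\n\r\n" only. Returns false when the
// separator is missing, in which case `out` is left untouched.
bool SplitChunks(const std::string& part, PartChunks& out) {
    const std::string sep = "\r\n\r\n";
    std::size_t pos = part.find(sep);
    if (pos == std::string::npos) return false;
    out.info = part.substr(0, pos);
    out.data = part.substr(pos + sep.size());
    // Sanity check: the two chunks plus the separator cover the whole part.
    assert(out.info.size() + sep.size() + out.data.size() == part.size());
    return true;
}
```

With a data chunk such as a text file containing an empty line, the second blank line stays in data instead of producing a third piece, which is what a plain split would do.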

After these chunks are extracted, the informational chunk is examined (via the InStr function) to see if it contains the string "filename=". If it does, that marks the field as an uploaded file. In that case, we call several functions, to extract the field name, filename and the MIME type of the file. This information is then added, along with the contents of the file itself, to the array of uploaded files.

If the field isn't a file, its name is extracted using the GetFieldName function and then added to the myRequest collection.
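
The checks on the informational chunk can be sketched in C++ as follows. This is a simplified illustration and not the article's GetFieldName code: it assumes attribute values are always wrapped in double quotes and that the MIME type sits on a "Content-Type: " line. FieldInfo, QuotedValue and DescribeField are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

struct FieldInfo {
    bool isFile;
    std::string name;      // value of name="..."
    std::string fileName;  // value of filename="..." (files only)
    std::string mimeType;  // value of the Content-Type line (files only)
};

// Returns the text between `key` (which must end with the opening quote)
// and the next double quote, or "" when the key is not present.
std::string QuotedValue(const std::string& info, const std::string& key) {
    assert(!key.empty());
    std::size_t start = info.find(key);
    if (start == std::string::npos) return "";
    start += key.size();
    std::size_t end = info.find('"', start);
    if (end == std::string::npos) return "";
    return info.substr(start, end - start);
}

FieldInfo DescribeField(const std::string& info) {
    FieldInfo field;
    // The presence of "filename=" marks the field as an uploaded file.
    field.isFile = info.find("filename=") != std::string::npos;
    // The leading space keeps this key from matching inside filename="...".
    field.name = QuotedValue(info, " name=\"");
    if (field.isFile) {
        field.fileName = QuotedValue(info, "filename=\"");
        const std::string header = "Content-Type: ";
        std::size_t pos = info.find(header);
        if (pos != std::string::npos) {
            pos += header.size();
            std::size_t end = info.find("\r\n", pos);
            field.mimeType = (end == std::string::npos)
                                 ? info.substr(pos)
                                 : info.substr(pos, end - pos);
        }
    }
    return field;
}
```

A caller would store file fields in its list of uploaded files and everything else in its collection of ordinary fields.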

That's it! Once the loop finishes, the myRequest collection will hold all of our "normal" form fields and the myRequestFiles array will hold up to 10 uploaded files. Yep, this will indeed work for more than one uploaded file! If you need more than 10, just change the appropriate dimension when the myRequestFiles array is created.

Saving the file

At this point, all that's left is to actually save the uploaded file (or files) on the server. Note that the contents of the other form fields are employed (via the myRequest collection) to determine what filename should be used to save the file. It's important to note that for the save operation to work, your IIS anonymous user account, usually IUSR_machinename, must have write access to the directory you want to save the files in!
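
One detail worth illustrating when choosing the filename: the name sent by the browser can be a full client path (such as C:\boot.ini), so only its last component should be used on the server. The C++ sketch below shows one way to do that; it is an assumption about good practice, not code from the article, and BaseName and BuildSavePath are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Strip any directory part, accepting both '\' and '/' as separators.
std::string BaseName(const std::string& clientPath) {
    std::size_t slash = clientPath.find_last_of("\\/");
    return slash == std::string::npos ? clientPath
                                      : clientPath.substr(slash + 1);
}

// Join the server directory and the bare file name, adding a backslash
// between them only when the directory does not already end with one.
std::string BuildSavePath(const std::string& directory,
                          const std::string& clientPath) {
    assert(!directory.empty());
    std::string path = directory;
    char last = path[path.size() - 1];
    if (last != '\\' && last != '/') path += '\\';
    return path + BaseName(clientPath);
}
```

For example, BuildSavePath("c:\\temp\\", "C:\\boot.ini") yields c:\temp\boot.ini, so the client's directory structure never influences where the file lands.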

Conclusion

Well, that wasn't really too hard, was it? Of course, there are some drawbacks to this approach: Large files take a long time to process and binary files can be difficult to handle properly. But, once you've got the basics down (and now you do!) it's a fairly straightforward process to adapt this technique to handle more troublesome files. And, best of all, you don't have to buy and install a third-party object or use the much maligned posting acceptor!

ISAPI upload details

In MSDN, Panos has explained this very well:

"The MIME-compliant content type, called multipart/form-data, makes writing HTML that uploads files almost trivial. On the server side though, ASP does not have a way to access data in the multipart/form-data format. The most flexible way to access the uploaded file is through a C++ ISAPI extension DLL. This article describes a reusable ISAPI extension DLL that allows you to upload images and files without writing C++ code.

On the server side though, ASP does not have a way to access data in the multipart/form-data format. There is a posting acceptor component, but its programmability is limited and it always stores the uploaded files into the file system. For instance, if you know in advance that the users must always load small files and you process these files in memory, you might want a more flexible solution. The most flexible way to access the uploaded file is through a C++ ISAPI extension DLL.

The ACTION attribute of the form element should point to a target component (for example, CGI or ISAPI extension DLL) that knows how to parse the multipart encoding and process the data. The user presses the Browse button, selects a file from the file system, and then presses the UPLOAD button to send the file. Of course, you might want to hire a designer to spice up the page with graphics and other cool JavaScript tricks, but as far as the HTML is concerned it couldn't be any simpler.

When a browser sends a request to a server, it always sends an HTTP packet with the data describing the request. The packet always contains the virtual path of the URL. For instance, if you call http://myserver/default.asp, the packet will contain the /default.asp path. In addition, if the request is the result of submitting an HTML form, the request will contain the contents of the INPUT tags in the form.

The next question, of course, is concerned with how the data is encoded. It turns out that it's done using MIME encoding. The default (and the simplest) encoding is application/x-www-form-urlencoded, which is described in the HTML 4.01 specification from the W3C. The application/x-www-form-urlencoded type is simple and perfect for submitting a few text fields.

For large text files or images, the specification defines another encoding: multipart/form-data. The following snippet shows what the HTTP packet of a form submitted using the multipart/form-data encoding looks like.

Raw data posted to the web server is in the following format:

-----------------------------7d22151d40264
Content-Disposition: form-data; name="Filename"

UploadFileName.txt
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Filedata"; filename="C:\boot.ini"
Content-Type: application/octet-stream


timeout=2
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT

-----------------------------7d22151d40264
Content-Disposition: form-data; name="PathData"

C:\Projects\articles\Discover WEB. ISAPI versus ASP web upload\Work\UploadISAPI\Data\
-----------------------------7d22151d40264
Content-Disposition: form-data; name="RedirectPage"

ReceiveRedirect.asp
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Name"

Adrian Bacaianu
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Upload"

Submit Query
-----------------------------7d22151d40264--

"The ISAPI extension DLL parses the input stream and collects all the name/value pairs. Then it creates a new dictionary COM object and stores the name/value pairs in the dictionary. Finally, it generates a new ID and stores the new dictionary into the dictionary of dictionaries using that ID as the key".

Accessing a parameter from that collection is very simple:

// look up the "PathData" parameter by name
MultipartEntry *pEntry = cParser["PathData"];

if (pEntry != NULL)
    sPath = pEntry->Data();    // get the data of the "PathData" parameter

sName = cParser[2]->Name();    // get the name ("PathData") of the
                               // parameter at index 2

If the response is printed to a web browser directly from an ISAPI extension, we use the SendToBrowser method:

void SendToBrowser(LPEXTENSION_CONTROL_BLOCK pECB, const String & sMsg)
{
    DWORD dwLen = sMsg.size();
    if (dwLen > 0)    // use HTTP WriteClient function
        pECB->WriteClient(pECB->ConnID, (LPVOID) sMsg.c_str(), &dwLen, 0);
}

If the response is redirected from the ISAPI extension to another ASP page, we use the RedirectBrowser method:

void RedirectBrowser(LPEXTENSION_CONTROL_BLOCK pECB, const String & sMsg)
{
    DWORD dwLen = sMsg.size();
    if (dwLen > 0)    // use HTTP ServerSupportFunction function
        pECB->ServerSupportFunction(pECB->ConnID, HSE_REQ_SEND_URL,
                                    (LPVOID) sMsg.c_str(), &dwLen, 0);
}

Conclusion

This article describes two ways of uploading images and files to your web server, and the advantages and disadvantages of both. There are also other ways to do the upload, which will be presented in another article!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
