Fig. Graphs of uploading time needed in ASP/ISAPI.
Description
This article describes two ways of uploading images and files to your web server, and the advantages and disadvantages of both. My contribution is based on the comparison between the different methods. I didn't invent any Microsoft upload technique :).
This article is based on an ASP technique which I got from my colleague Doru Paraschiv and on a very good article from MSDN. My special thanks to Panos Kougiouris, for his help in revealing the ISAPI binary streaming techniques. Check out Creating a DLL to Enable HTTP-based File Uploads with IIS.
Uploading files from a browser is a common web site requirement. For instance, say you have created a web site for a real estate company. The site contains listings of properties along with a picture of each property. Realtors can add new listings through an administration section of the site. Adding text for a new listing is easy using an HTML form. Then one day a realtor tells you that she wants to upload a photo of a property for sale to your site. How would you do that with ASP and ISAPI?
How to use
Both methods use the same functional structure.
First, the user must complete the fields in the PostFile.asp form. The JavaScript function ValidateForm is used to validate the content of the required fields. In the Upload.asp file, we call the Upload function to perform the upload, followed by the SaveFile function to write the file to disk. The ASP uploading functions are hosted in the _incUpload.asp file.
Fig. Image of the uploading form.
Second, the user receives the results of Upload.asp (for ASP) or UploadExt.dll (for ISAPI) in the ReceiveRedirect.asp file. Execution can also finish in Upload.asp or UploadExt.dll, but in most cases a redirect is needed so that another page can pick up all the form parameters, for instance to write the file on another server.
The ISAPI form uses two optional hidden parameters to tell the DLL which directory to store the upload in (if it isn't specified, c:\temp\ is used) and the name of the ASP redirection page (if the user wants a redirect).
The default upload directory (\Data\, relative to the site root directory) is written into the form from the ASP variable sPathData (defined in _include.asp).
Fig. Image of the receiving ASP page.
Your IIS anonymous user account, usually IUSR_machinename, must have write access to the directory you want to save the files in. To execute the ISAPI extension, the UploadExt.dll file must have the "Scripts and Executables" execute permission in the IIS console manager.
ASP upload details
From MSDN: One of the most glaring omissions in older versions of Active Server Pages (ASP) is the ability to handle a file that's posted from an HTML form. While the ASP Request object has always allowed easy access to every other type of form field, posted files have been strictly off limits to pure ASP. The usual solution has been to implement support for file uploads using the Posting Acceptor or a third-party component. In this article, we'll show you how easy it is to get around this limitation and retrieve a file (or files!) from the data posted to an ASP file.
Upload the file
The first requirement for this process is to have an HTML form that actually uploads (or "posts") a file to your ASP page.
The ASP script that saves the file starts out by allowing the server a generous amount of time to process whatever file is uploaded to it. This is a very important consideration, because it can take a long time to process a large file. If your script times out before the process finishes, the uploaded file will be lost! Next, the script sets up some needed constants and defines several functions. We'll return to these functions in a moment, but for now, let's move on to the main part of the script and see how this process really works.
Catch the POST
The first thing the script does is retrieve all the data that was posted from the HTML file and place it into a variable called biData. This is done using the BinaryRead method of the Request object. As the name implies, this method reads a specified number of bytes from the post request in their raw, binary form. The actual number of bytes to read is provided by the TotalBytes property of the Request object.
You have to transform all that binary data into something friendlier. This is done with a simple loop that marches through the extracted binary data and, using a series of calls to some binary data manipulation functions, converts the data into an easily readable format and places it into the variable PostData.
Breaking up the raw data
Once we have the posted data in ASCII form, it's a fairly simple process to work through it and extract each form field from it. The first step is to determine if our data is properly encoded. After that, we can determine the boundary between each form element.
Amazingly, both these pieces of information can be found in one convenient place: the HTTP_CONTENT_TYPE header. (Yes, you can still get at the headers after calling BinaryRead. You just can't get at the Form or QueryString collections.) As you can see here, the structure of this information is very simple:
multipart/form-data; boundary=---------------------------7d22151d40264
Given this simple structure, it's easy to extract the information we need. First, we use the Split function to break the information around the semicolon. This allows us to check the encoding type easily by checking the first element of the array created by calling Split. Assuming that the encoding type is what we expect ("multipart/form-data"), we then extract the boundary, using another call to Split.
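The same check can be sketched outside ASP. The following is a minimal C++ illustration, not the article's VBScript: the function name ExtractBoundary is hypothetical, and it assumes the header value has the exact form "multipart/form-data; boundary=..." with no extra parameters and a case-sensitive media type.

```cpp
#include <string>

// Hypothetical helper: returns the boundary from a Content-Type value such as
// "multipart/form-data; boundary=XYZ", or an empty string if the media type
// is not multipart/form-data or no boundary parameter is present.
std::string ExtractBoundary(const std::string& contentType)
{
    const std::string expectedType = "multipart/form-data";
    const std::string key = "boundary=";

    // The media type is everything before the first semicolon.
    std::string::size_type semi = contentType.find(';');
    if (semi == std::string::npos ||
        contentType.compare(0, semi, expectedType) != 0)
        return "";

    // Assumption: the boundary is the last parameter, so everything after
    // "boundary=" belongs to it.
    std::string::size_type pos = contentType.find(key, semi);
    if (pos == std::string::npos)
        return "";
    return contentType.substr(pos + key.size());
}
```

Note that each delimiter line inside the posted body is this boundary value prefixed with two extra dashes.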
Each chunk of form data is itself in two chunks. The first is an informational chunk that tells all about the form field (i.e., its name and any associated information). The second chunk is the actual form data that was transmitted.
Content-Disposition: form-data; name="RedirectPage"

ReceiveRedirect.asp
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Name"

Adrian Bacaianu
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Upload"

Submit Query
-----------------------------7d22151d40264--
As you can see in this listing, the boundary information is used to separate each form field and its data. Unfortunately, the informational chunk of a form field isn't separated from the data chunk by anything obvious. Well, actually, it is obvious, but we've all grown so used to ignoring this particular separator that it's hard to notice. Basically, two pairs of carriage return and line feed characters separate the information chunk and the data chunk. That's a total of four bytes between the chunks.
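To make the four-byte separator concrete, here is a small C++ sketch (the article's own implementation is VBScript). The name SplitPart is hypothetical. It cuts at the first CR LF CR LF only, so any later blank lines stay inside the data chunk.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits one form field into (informational chunk,
// data chunk) at the first CR LF CR LF sequence. If the separator is
// missing, the whole input is returned as the informational chunk.
std::pair<std::string, std::string> SplitPart(const std::string& part)
{
    const std::string separator = "\r\n\r\n";  // two CR LF pairs, four bytes
    std::string::size_type pos = part.find(separator);
    if (pos == std::string::npos)
        return std::make_pair(part, std::string());
    return std::make_pair(part.substr(0, pos),
                          part.substr(pos + separator.size()));
}
```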
Splitting apart the form data
Now, it would be great if we could extract all of this form data into a nice collection, just like the Request.Form collection. So, with that in mind, we will define a collection named myRequest and an array named myRequestFiles. While it would be nice to use a collection for any files that were uploaded, collections can't easily track all the information required to properly handle a file. For example, for a simple file, it's handy to track the posted field name, the actual file contents that were transmitted, the file/path name of the file (i.e. where it was on the client's disk), and the MIME type of the file. A simple two-dimensional array is the easiest solution to this problem.
At this point, it's a simple matter of looping through all of the form fields that were extracted when PostData was split along the boundary. As each field is examined, the informational and data chunks are extracted using simple calls to the Mid function. Note that we don't use Split here. This is because there could be dual carriage return and line feed pairs in the data itself.
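The splitting step can be sketched in C++ as follows. This is an illustration under stated assumptions rather than the article's VBScript: SplitBody is a hypothetical name, the body is assumed to be well formed, and the boundary string is assumed not to occur inside any field data.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits a multipart/form-data body into its fields.
// `boundary` is the value from the Content-Type header; every delimiter
// line in the body is "--" followed by that value.
std::vector<std::string> SplitBody(const std::string& body,
                                   const std::string& boundary)
{
    const std::string delimiter = "--" + boundary;
    std::vector<std::string> parts;

    std::string::size_type start = body.find(delimiter);
    while (start != std::string::npos) {
        start += delimiter.size();
        // "--" directly after the delimiter marks the end of the body.
        if (body.compare(start, 2, "--") == 0)
            break;
        start += 2;  // skip the CR LF that ends the delimiter line
        std::string::size_type next = body.find(delimiter, start);
        if (next == std::string::npos || next < start + 2)
            break;   // malformed input: stop rather than guess
        // Drop the CR LF that precedes the next delimiter.
        parts.push_back(body.substr(start, next - start - 2));
        start = next;
    }
    return parts;
}
```

Each returned element still contains both the informational chunk and the data chunk of one field.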
After these chunks are extracted, the informational chunk is examined (via the InStr function) to see if it contains the string "filename=". If it does, that marks the field as an uploaded file. In that case, we call several functions to extract the field name, filename and the MIME type of the file. This information is then added, along with the contents of the file itself, to the array of uploaded files.
If the field isn't a file, its name is extracted using the GetFieldName function and it is then added to the myRequest collection.
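As a rough C++ counterpart to this classification step, the sketch below checks for "filename=" and pulls quoted attribute values out of the informational chunk. The names IsFilePart and GetHeaderAttribute are hypothetical and are not the article's functions. It assumes attribute values are always double-quoted and contain no embedded quotes.

```cpp
#include <string>

// Hypothetical helper: returns the quoted value of `attribute` (for example
// "name" or "filename") from an informational chunk, or an empty string if
// the attribute is absent.
std::string GetHeaderAttribute(const std::string& header,
                               const std::string& attribute)
{
    // The leading space keeps a lookup of "name" from matching the tail
    // of "filename".
    const std::string key = " " + attribute + "=\"";
    std::string::size_type start = header.find(key);
    if (start == std::string::npos)
        return "";
    start += key.size();
    std::string::size_type end = header.find('"', start);
    if (end == std::string::npos)
        return "";
    return header.substr(start, end - start);
}

// A field is treated as an uploaded file when its informational chunk
// contains "filename=".
bool IsFilePart(const std::string& header)
{
    return header.find("filename=") != std::string::npos;
}
```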
That's it! Once the loop finishes, the myRequest collection will hold all of our "normal" form fields and the myRequestFiles array will hold up to 10 uploaded files. Yep, this will indeed work for more than one uploaded file! If you need more than 10, just change the appropriate dimension when the myRequestFiles array is created.
Saving the file
At this point, all that's left is to actually save the uploaded file (or files) on the server. Note that the contents of the other form fields are employed (via the myRequest collection) to determine what filename should be used to save the file. It's important to note that for the save operation to work, your IIS anonymous user account, usually IUSR_machinename, must have write access to the directory you want to save the files in!
Conclusion
Well, that wasn't really too hard, was it? Of course, there are some drawbacks to this approach: Large files take a long time to process and binary files can be difficult to handle properly. But, once you've got the basics down (and now you do!) it's a fairly straightforward process to adapt this technique to handle more troublesome files. And, best of all, you don't have to buy and install a third-party object or use the much maligned posting acceptor!
ISAPI upload details
In MSDN, Panos has explained this very well:
"The MIME-compliant content type, called multipart/form-data, makes writing HTML that uploads files almost trivial. On the server side though, ASP does not have a way to access data in the multipart/form-data format. The most flexible way to access the uploaded file is through a C++ ISAPI extension DLL. This article describes a reusable ISAPI extension DLL that allows you to upload images and files without writing C++ code.
There is a posting acceptor component, but its programmability is limited and it always stores the uploaded files into the file system. For instance, if you know in advance that the users must always load small files and you process these files in memory, you might want a more flexible solution.
The ACTION attribute of the form element should point to a target component (for example, CGI or ISAPI extension DLL) that knows how to parse the multipart encoding and process the data. The user presses the Browse button, selects a file from the file system, and then presses the UPLOAD button to send the file. Of course, you might want to hire a designer to spice up the page with graphics and other cool JavaScript tricks, but as far as the HTML is concerned it couldn't be any simpler.
When a browser sends a request to a server, it always sends an HTTP packet with the data describing the request. The packet always contains the virtual path of the URL. For instance, if you call http://myserver/default.asp, the packet will contain the /default.asp path. In addition, if the request is the result of submitting an HTML form, the request will contain the contents of the INPUT tags in the form.
The next question, of course, is concerned with how the data is encoded. It turns out that it's done using MIME encoding. The default (and the simplest) encoding is application/x-www-form-urlencoded, which is described in the W3C HTML 4.01 specification. The application/x-www-form-urlencoded type is simple and perfect for submitting a few text fields.
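For comparison with the multipart format, here is a minimal C++ sketch of how a single value is encoded under application/x-www-form-urlencoded: spaces become '+', and other non-alphanumeric bytes become %HH. The function name FormUrlEncode is hypothetical, and leaving '-', '_', '.' and '*' unescaped is an assumption that mirrors common browser behavior rather than something stated in the article.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: encodes one form value in the
// application/x-www-form-urlencoded style.
std::string FormUrlEncode(const std::string& value)
{
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '_' || c == '.' || c == '*') {
            out += static_cast<char>(c);   // assumed safe, left as-is
        } else if (c == ' ') {
            out += '+';                    // spaces become plus signs
        } else {
            out += '%';                    // everything else becomes %HH
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}
```

A complete request body joins each encoded name and value with '=' and separates the pairs with '&', which is why this format suits short text fields but not file contents.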
For large text files or images, the specification defines another encoding: multipart/form-data. The following snippet shows what the HTTP packet of a form submitted using the multipart/form-data encoding looks like.
Raw data posted to the web server is in the following format:
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Filename"

UploadFileName.txt
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Filedata"; filename="C:\boot.ini"
Content-Type: application/octet-stream

timeout=2
default=multi(0)disk(0)rdisk(0)partition(1)\WINNT
-----------------------------7d22151d40264
Content-Disposition: form-data; name="PathData"

C:\Projects\articles\Discover WEB. ISAPI versus ASP web upload\Work\UploadISAPI\Data\
-----------------------------7d22151d40264
Content-Disposition: form-data; name="RedirectPage"

ReceiveRedirect.asp
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Name"

Adrian Bacaianu
-----------------------------7d22151d40264
Content-Disposition: form-data; name="Upload"

Submit Query
-----------------------------7d22151d40264--
"The ISAPI extension DLL parses the input stream and collects all the name/value pairs. Then it creates a new dictionary COM object and stores the name/value pairs in the dictionary. Finally, it generates a new ID and stores the new dictionary into the dictionary of dictionaries using that ID as the key".
Inside the DLL, accessing a parsed parameter is very simple:
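// Look up an entry by field name; check for NULL before using it.
MultipartEntry *pEntry = cParser["PathData"];
if(pEntry != NULL)
    sPath = pEntry->Data();
// Entries can also be accessed by position.
sName = cParser[2]->Name();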
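The "dictionary of dictionaries" bookkeeping described in the quoted passage can be pictured with standard containers. The sketch below is not the DLL's code: it swaps the COM dictionary objects for std::map, the names FieldDictionary and UploadRegistry are hypothetical, and it ignores thread safety, which a real ISAPI extension serving concurrent requests would need to handle.

```cpp
#include <map>
#include <string>

// One request's form fields: field name -> field value.
typedef std::map<std::string, std::string> FieldDictionary;

// Hypothetical registry that keeps one FieldDictionary per request ID.
class UploadRegistry
{
public:
    UploadRegistry() : m_nextId(1) {}

    // Stores the fields of one request and returns the newly generated ID.
    int Store(const FieldDictionary& fields)
    {
        int id = m_nextId++;
        m_requests[id] = fields;
        return id;
    }

    // Returns the fields stored under `id`, or nullptr if the ID is unknown.
    const FieldDictionary* Find(int id) const
    {
        std::map<int, FieldDictionary>::const_iterator it = m_requests.find(id);
        return it == m_requests.end() ? nullptr : &it->second;
    }

private:
    int m_nextId;                              // next ID to hand out
    std::map<int, FieldDictionary> m_requests; // ID -> fields of that request
};
```

With this layout, only the ID has to travel to the page that later reads the fields back.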
If the response is printed to a web browser directly from an ISAPI extension, we use the SendToBrowser method:
void SendToBrowser(LPEXTENSION_CONTROL_BLOCK pECB,
                   const String & sMsg)
{
    // WriteClient takes the byte count by pointer and updates it
    // with the number of bytes actually sent.
    DWORD dwLen = sMsg.size();
    if(dwLen > 0)
        pECB->WriteClient(pECB->ConnID,
            (LPVOID) sMsg.c_str(), &dwLen, 0);
}
If the response is redirected from the ISAPI extension to another ASP page, we use the RedirectBrowser method:
void RedirectBrowser(LPEXTENSION_CONTROL_BLOCK pECB,
                     const String & sMsg)
{
    // sMsg holds the URL of the page that should handle the response.
    DWORD dwLen = sMsg.size();
    if(dwLen > 0)
        pECB->ServerSupportFunction(pECB->ConnID,
            HSE_REQ_SEND_URL, (LPVOID)sMsg.c_str(), &dwLen, 0);
}
Conclusion
This article describes two ways of uploading images and files to your web server, and the advantages and disadvantages of both. There are also other ways to do the upload, which will be presented in another article!