Introduction
ASP.NET is a fast, efficient way to build anything from small sites to enterprise-level websites. Most of its controls cover whatever web development calls for; however, the FileUpload control has some real shortcomings. Back when I used .NET 1.1, I tested how the control handled large files, and I was very displeased.
I tried to upload a huge 1 or 2 GB file. The web page errored out, and the memory was never released. I left it at that and did not mess with it anymore. I thought for sure that the 2.0 framework would have fixed that problem, but it appears that we are not that lucky (at least not that I could find).
Anyhow, the project I was working on required that I have control over what my users do. You never know when someone is going to try to upload some huge file, on purpose or by accident.
Purpose
This module provides a controlled way to process uploaded files and lets end users see the current progress of their upload.
Code Examples
Below are the main source listings for this module. If you have searched the crap out of this subject on the internet, you should see some similarities; however, there is one big difference that, for some reason, others are not doing. Let me explain:
Many examples of how to do this show developers converting byte data to strings and then searching the string data. That is very inefficient. Most people build this kind of module for large file uploads, and most of the data being posted is binary. Why tax your CPU with conversions to strings and then search the strings? I have no idea, but maybe that just sounded easier at the time. Instead, I search for byte patterns inside each byte buffer.
Another benefit is that I can handle file delimiters that are split across multiple byte buffers. In most of the examples I saw, people looked at the buffer once and then threw it away. I keep the previous buffer in memory for reference, so if I do not find the start of a file in my current buffer, I merge the two byte arrays together and search again.
I don't want to say that the way I am doing it is the best way, but I feel that just a few enhancements to the examples out there would make them worthy of deployment. Many of the examples that I tested would bring data in at approximately 10 - 40 KB per second. Yeah, like I am going to wait for a 100 - 2000 MB file to upload at that rate. With my module, I was able to bring in 1 - 4 MB per second with low CPU usage.
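To illustrate the idea of searching raw bytes across a buffer boundary, here is a minimal sketch. It is not the module's own code (that is listed further down); the class and method names are hypothetical, and it assumes only the tail of the previous buffer needs to be kept, rather than merging both buffers in full.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class BoundarySearchSketch
{
    // Returns the index of the first occurrence of pattern in data at or
    // after start, or -1 when there is none.
    public static int IndexOf(byte[] data, byte[] pattern, int start)
    {
        if (pattern.Length == 0) return -1;
        for (int i = Math.Max(start, 0); i <= data.Length - pattern.Length; i++)
        {
            int j = 0;
            while (j < pattern.Length && data[i + j] == pattern[j]) j++;
            if (j == pattern.Length) return i;
        }
        return -1;
    }

    // Searches for a pattern that may straddle two consecutive buffers.
    // The result is relative to the start of 'current': a negative value
    // means the match begins that many bytes before 'current', inside
    // 'previous'. Returns null when the pattern is not found.
    public static int? FindAcross(byte[] previous, byte[] current, byte[] pattern)
    {
        // Only the last (pattern.Length - 1) bytes of the previous buffer
        // can be part of a match that reaches into the current buffer.
        int tail = Math.Max(0, Math.Min(previous.Length, pattern.Length - 1));
        byte[] window = new byte[tail + current.Length];
        Buffer.BlockCopy(previous, previous.Length - tail, window, 0, tail);
        Buffer.BlockCopy(current, 0, window, tail, current.Length);

        int hit = IndexOf(window, pattern, 0);
        if (hit == -1) return null;
        return hit - tail;
    }
}
```

With a previous buffer ending in "--bou" and a current buffer starting with "ndary", FindAcross reports the pattern "--boundary" at offset -5, that is, five bytes before the current buffer begins. No string conversion of the payload is involved.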
Features
- Upload direct to disk support
- File progress bar (AJAX for the example)
- Automatic removal of files after a configured period of time
- Fast buffer parsing resulting in lower CPU requirements
- Configurable in web.config
How it works
The upload module implements IHttpModule and is hooked into the context.BeginRequest event. When that event fires, I check that the requested page is one that may be uploading file(s) (configured in the web.config). I then create my FileProcessor object and start reading the buffered data from the worker request object. The comments in the source code explain the process as it happens.
UploadModule.cs
using System;
using System.Collections.Generic;
using System.Text;
using System.Web;
using System.Reflection;
using System.IO;
using System.Diagnostics;
using System.Threading;
namespace TravisWhidden
{
public class UploadModule : System.Web.IHttpModule
{
#region "ClassLevelVars"
private string _baseFileLocation = "";
#endregion
#region "Constructors"
public UploadModule()
{
_baseFileLocation =
TravisWhidden.Properties.Settings.Default.BaseFileUpload;
}
#endregion
#region "Properties"
public static string BasePath
{
get { return TravisWhidden.Properties.Settings.Default.BaseFileUpload; }
}
#endregion
#region "Methods"
private bool IsUploadPages()
{
HttpApplication app = HttpContext.Current.ApplicationInstance;
string[] uploadPages = (string[])
TravisWhidden.Properties.Settings.Default.Pages.Split(
new string[] {";"}, StringSplitOptions.RemoveEmptyEntries);
for (int i = 0; i < uploadPages.Length; i++)
{
if (uploadPages[i].ToLower() ==
app.Request.Path.Substring(1).ToLower())
return true;
}
return false;
}
#endregion
#region IHttpModule Members
void System.Web.IHttpModule.Dispose()
{
}
void System.Web.IHttpModule.Init(System.Web.HttpApplication context)
{
context.BeginRequest += new EventHandler(context_BeginRequest);
}
#endregion
#region "Event Handlers / Events"
void context_BeginRequest(object sender, EventArgs e)
{
HttpApplication httpApplication = (HttpApplication)sender;
HttpContext context = httpApplication.Context;
if (!IsUploadPages())
{
Debug.WriteLine("UploadModule - Not IsUploadPages().");
return;
}
Debug.WriteLine("UploadModule - IsUploadPages().");
FileProcessor fp = new FileProcessor(_baseFileLocation);
string rawURL = context.Request.RawUrl;
Guid currentFileID = Guid.NewGuid();
if (context.Request.QueryString["PostID"] != null &&
context.Request.QueryString["PostID"].Length ==
Guid.NewGuid().ToString().Length)
{
currentFileID = new Guid(context.Request.QueryString["PostID"]);
}
else
{
if (rawURL.IndexOf("?") == -1)
{
rawURL = rawURL + "?PostID=" + currentFileID.ToString();
}
else
{
rawURL = rawURL + "&PostID=" + currentFileID.ToString();
}
context.Response.Redirect(rawURL);
}
// Create the status object only after the final ID is known, so that it
// carries the same ID it is stored under in Application state.
UploadStatus uploadStatus = new UploadStatus(currentFileID);
if (context.Request.ContentType.IndexOf("multipart/form-data") == -1)
{
return;
}
Debug.WriteLine("UploadModule Executing.");
HttpWorkerRequest workerRequest =
(HttpWorkerRequest)context.GetType().GetProperty(
"WorkerRequest", BindingFlags.Instance |
BindingFlags.NonPublic).GetValue(context, null);
if (workerRequest.HasEntityBody())
{
try
{
context.Application.Add(currentFileID.ToString(), uploadStatus);
long contentLength = long.Parse((workerRequest.GetKnownRequestHeader(
HttpWorkerRequest.HeaderContentLength)));
uploadStatus.TotalBytes = contentLength;
long receivedcount = 0;
long defaultBuffer = 8192;
byte[] preloadedBufferData = workerRequest.GetPreloadedEntityBody();
if (preloadedBufferData == null)
{
throw new Exception("GetPreloadedEntityBody() " +
"was null. Try again");
}
fp.GetFieldSeperators(ref preloadedBufferData);
fp.ProcessBuffer(ref preloadedBufferData, true);
uploadStatus.CurrentBytesTransfered += preloadedBufferData.Length;
if (!workerRequest.IsEntireEntityBodyIsPreloaded())
{
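do
{
long tempBufferSize = (uploadStatus.TotalBytes -
uploadStatus.CurrentBytesTransfered);
if (tempBufferSize < defaultBuffer)
{
defaultBuffer = tempBufferSize;
}
byte[] bufferData = new byte[defaultBuffer];
receivedcount =
workerRequest.ReadEntityBody(bufferData, bufferData.Length);
if (receivedcount > 0)
{
// ReadEntityBody may return fewer bytes than requested;
// only process and count the bytes that were actually read.
if (receivedcount < bufferData.Length)
{
Array.Resize(ref bufferData, (int)receivedcount);
}
fp.ProcessBuffer(ref bufferData, true);
uploadStatus.CurrentBytesTransfered += receivedcount;
}
} while (receivedcount != 0);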
}
}
catch (Exception ex)
{
Debug.WriteLine("Error: " + ex.Message);
Debug.WriteLine(ex.ToString());
}
finally
{
fp.CloseStreams();
}
string finishedFiles = string.Join(";",
fp.FinishedFiles.ToArray());
// Use "?" only when the URL has no query string yet.
string separator = (rawURL.IndexOf("?") == -1) ? "?" : "&";
rawURL = rawURL + separator + "finishedFiles=" + finishedFiles;
fp.Dispose();
context.Response.Redirect(rawURL);
}
}
#endregion
}
}
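UploadModule relies on an UploadStatus class that is not listed in this article. The following is only a sketch of what such a class might look like, assuming nothing beyond the members the module uses above (a constructor taking a Guid, TotalBytes, and CurrentBytesTransfered). The class name UploadStatusSketch, the FileId property, and the PercentComplete helper are hypothetical additions for illustration.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the UploadStatus class used by UploadModule.
public class UploadStatusSketch
{
    private readonly Guid _fileId;
    private long _totalBytes;
    private long _currentBytesTransfered;

    public UploadStatusSketch(Guid fileId)
    {
        _fileId = fileId;
    }

    public Guid FileId
    {
        get { return _fileId; }
    }

    // Interlocked is used because the upload request writes these values
    // while a separate progress request may be reading them.
    public long TotalBytes
    {
        get { return Interlocked.Read(ref _totalBytes); }
        set { Interlocked.Exchange(ref _totalBytes, value); }
    }

    public long CurrentBytesTransfered
    {
        get { return Interlocked.Read(ref _currentBytesTransfered); }
        set { Interlocked.Exchange(ref _currentBytesTransfered, value); }
    }

    // Whole-number percentage for a progress bar, clamped to 0..100.
    public int PercentComplete
    {
        get
        {
            long total = TotalBytes;
            if (total <= 0) return 0;
            long done = Math.Max(0, Math.Min(CurrentBytesTransfered, total));
            return (int)(done * 100 / total);
        }
    }
}
```

A progress page could look the object up in Application state by the PostID value and render PercentComplete; how the example project actually does this is not shown here.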
FileProcessor.cs
The FileProcessor class processes the buffers that come in off the WorkerRequest object. It is responsible for finding the files in the form post, extracting the data out, and keeping track of the files it extracted.
using System;
using System.Collections.Generic;
using System.Text;
using System.Diagnostics;
using System.IO;
using System.Threading;
namespace TravisWhidden
{
public class FileProcessor : IDisposable
{
#region "Class Vars"
private string _formPostID = "";
private string _fieldSeperator = "";
private long _currentBufferIndex = 0;
private bool _startFound = false;
private bool _endFound = false;
private string _currentFilePath = @"C:\upload\";
private string _currentFileName = Guid.NewGuid().ToString() + ".bin";
private FileStream _fileStream = null;
private long _startIndexBufferID = -1;
private int _startLocationInBufferID = -1;
private long _endIndexBufferID = -1;
private int _endLocationInBufferID = -1;
private Dictionary<long, byte[]> _bufferHistory =
new Dictionary<long, byte[]>();
private List<string> _finishedFiles = new List<string>();
#endregion
#region "Constructors"
public FileProcessor(string uploadLocation)
{
_currentFilePath = uploadLocation;
}
#endregion
#region "Properties"
public List<string> FinishedFiles
{
get { return _finishedFiles; }
}
#endregion
#region "Methods"
public void ProcessBuffer(ref byte[] bufferData, bool addToBufferHistory)
{
int byteLocation = -1;
if (!_startFound)
{
byteLocation = GetStartBytePosition(ref bufferData);
if (byteLocation != -1)
{
_startIndexBufferID = _currentBufferIndex + 1;
_startLocationInBufferID = byteLocation;
_startFound = true;
}
}
if (_startFound)
{
int startLocation = 0;
if (byteLocation != -1)
{
startLocation = byteLocation;
}
int writeBytes = ( bufferData.Length - startLocation );
int tempEndByte = GetEndBytePosition(ref bufferData);
if (tempEndByte != -1)
{
writeBytes = (tempEndByte - startLocation);
_endFound = true;
_endIndexBufferID = (_currentBufferIndex + 1);
_endLocationInBufferID = tempEndByte;
}
if (writeBytes > 0)
{
if (_fileStream == null)
{
_fileStream = new FileStream(_currentFilePath +
_currentFileName, FileMode.OpenOrCreate);
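int fileTimeToLive =
global::TravisWhidden.Properties.Settings.Default.FileTimeToLive;
// Schedule a one-time delete after fileTimeToLive seconds (period 0 means
// the callback fires once). Note: a Timer that is no longer referenced can
// be garbage collected before it fires, so keep a reference to it (for
// example in a static list) if the cleanup must be guaranteed.
Timer t = new Timer(new TimerCallback(DeleteFile),
_currentFilePath + _currentFileName,
(fileTimeToLive * 1000), 0);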
}
_fileStream.Write(bufferData, startLocation, writeBytes);
_fileStream.Flush();
}
}
if (_endFound)
{
CloseStreams();
_startFound = false;
_endFound = false;
ProcessBuffer(ref bufferData, false);
}
if (addToBufferHistory)
{
_bufferHistory.Add(_currentBufferIndex, bufferData);
_currentBufferIndex++;
RemoveOldBufferData();
}
}
private void RemoveOldBufferData()
{
for (long bufferIndex = _currentBufferIndex;
bufferIndex >= 0; bufferIndex--)
{
if (bufferIndex > (_currentBufferIndex - 3))
{
}
else
{
if (_bufferHistory.ContainsKey(bufferIndex))
{
_bufferHistory.Remove(bufferIndex);
}
else
{
bufferIndex = 0;
}
}
}
GC.Collect();
}
public void CloseStreams()
{
if (_fileStream != null)
{
_fileStream.Close();
_fileStream = null;
_finishedFiles.Add(_currentFileName);
_currentFileName = Guid.NewGuid().ToString() + ".bin";
}
}
public void GetFieldSeperators(ref byte[] bufferData)
{
try
{
_formPostID = Encoding.UTF8.GetString(bufferData).Substring(29, 13);
_fieldSeperator =
"-----------------------------" + _formPostID;
}
catch (Exception ex)
{
Debug.WriteLine("Error in GetFieldSeperators(): " + ex.Message);
}
}
private int GetStartBytePosition(ref byte[] bufferData)
{
int byteOffset = 0;
if (_startIndexBufferID == (_currentBufferIndex + 1))
{
byteOffset = _startLocationInBufferID;
}
if (_endIndexBufferID == (_currentBufferIndex +1))
{
byteOffset = _endLocationInBufferID;
}
int tempContentTypeStart = -1;
byte[] searchString = Encoding.UTF8.GetBytes("Content-Type: ");
tempContentTypeStart =
FindBytePattern(ref bufferData, ref searchString, byteOffset);
if (tempContentTypeStart != -1)
{
searchString = Encoding.UTF8.GetBytes("\r\n\r\n");
int tempSearchStringLocation = FindBytePattern(ref bufferData,
ref searchString, tempContentTypeStart);
if (tempSearchStringLocation != -1)
{
int byteStart = tempSearchStringLocation + 4;
return byteStart;
}
}
else if((byteOffset - searchString.Length) > 0 ){
return -1;
}
else
{
if (_currentBufferIndex > 0)
{
byte[] previousBuffer = _bufferHistory[_currentBufferIndex - 1];
byte[] mergedBytes = MergeArrays(ref previousBuffer, ref bufferData);
byte[] searchString2 = Encoding.UTF8.GetBytes("Content-Type: ");
tempContentTypeStart = FindBytePattern(ref mergedBytes,
ref searchString2, previousBuffer.Length - searchString2.Length);
if (tempContentTypeStart != -1)
{
// As in the single-buffer case above, the file data starts right
// after the blank line ("\r\n\r\n") that ends the part headers.
searchString2 = Encoding.UTF8.GetBytes("\r\n\r\n");
int tempSearchStringLocation = FindBytePattern(ref mergedBytes,
ref searchString2, tempContentTypeStart);
if (tempSearchStringLocation != -1)
{
int byteStart = tempSearchStringLocation + 4;
if (byteStart > previousBuffer.Length)
{
// Convert from merged-buffer to current-buffer coordinates.
int currentBufferByteLocation =
(byteStart - previousBuffer.Length);
return currentBufferByteLocation;
}
else
{
return 0;
}
}
}
}
}
return -1;
}
private int GetEndBytePosition(ref byte[] bufferData)
{
int byteOffset = 0;
if (_startIndexBufferID == (_currentBufferIndex + 1))
{
byteOffset = _startLocationInBufferID;
}
int tempFieldSeperator = -1;
byte[] searchString = Encoding.UTF8.GetBytes(_fieldSeperator);
tempFieldSeperator = FindBytePattern(ref bufferData,
ref searchString, byteOffset);
if (tempFieldSeperator != -1)
{
if (tempFieldSeperator - 2 < 0)
{
}
else
{
return (tempFieldSeperator - 2);
}
}
else if ((byteOffset - searchString.Length) > 0)
{
return -1;
}
else
{
if (_currentBufferIndex > 0)
{
byte[] previousBuffer = _bufferHistory[_currentBufferIndex - 1];
byte[] mergedBytes = MergeArrays(ref previousBuffer, ref bufferData);
byte[] searchString2 = Encoding.UTF8.GetBytes(_fieldSeperator);
tempFieldSeperator = FindBytePattern(ref mergedBytes,
ref searchString2,
previousBuffer.Length - searchString2.Length + byteOffset);
if (tempFieldSeperator != -1)
{
// The file data ends two bytes (the CRLF) before the separator.
// Convert that position from merged-buffer to current-buffer
// coordinates.
int currentBufferByteLocation =
(tempFieldSeperator - 2) - previousBuffer.Length;
if (currentBufferByteLocation > 0)
{
return currentBufferByteLocation;
}
else
{
// The data ended in the previous buffer, so none of the
// current buffer belongs to this file.
return 0;
}
}
}
}
return -1;
}
private static int FindBytePattern(ref byte[] containerBytes,
ref byte[] searchBytes, int startAtIndex)
{
int returnValue = -1;
for (int byteIndex = startAtIndex; byteIndex <
containerBytes.Length; byteIndex++)
{
if (byteIndex + searchBytes.Length > containerBytes.Length)
{
return -1;
}
if (containerBytes[byteIndex] == searchBytes[0])
{
bool found = true;
int tempStartIndex = byteIndex;
for (int searchBytesIndex = 1; searchBytesIndex <
searchBytes.Length; searchBytesIndex++)
{
tempStartIndex++;
if (!(searchBytes[searchBytesIndex] ==
containerBytes[tempStartIndex]))
{
found = false;
break;
}
}
if (found)
{
return byteIndex;
}
}
}
return returnValue;
}
private static byte[] MergeArrays(ref byte[] arrayOne, ref byte[] arrayTwo)
{
byte[] newArray = new byte[arrayOne.Length + arrayTwo.Length];
arrayOne.CopyTo(newArray, 0);
arrayTwo.CopyTo(newArray, arrayOne.Length);
return newArray;
}
private static void DeleteFile(object filePath)
{
try
{
if (System.IO.File.Exists((string)filePath))
{
System.IO.File.Delete((string)filePath);
}
}
catch { }
}
#endregion
#region IDisposable Members
public void Dispose()
{
_bufferHistory.Clear();
GC.Collect();
}
#endregion
}
}
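GetFieldSeperators above reads the separator from a fixed offset in the first buffer, which depends on the exact boundary format the browser sends. As an alternative, here is a minimal sketch that reads the boundary parameter from the request's Content-Type header (the module already has this value as context.Request.ContentType). The class and method names are hypothetical, and the sketch assumes the boundary value contains no semicolon.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class MultipartBoundarySketch
{
    // Extracts the boundary parameter from a Content-Type header value such
    // as "multipart/form-data; boundary=----abc123".
    // Returns null when no boundary parameter is present.
    public static string GetBoundary(string contentType)
    {
        if (string.IsNullOrEmpty(contentType)) return null;
        const string key = "boundary=";
        foreach (string part in contentType.Split(';'))
        {
            string trimmed = part.Trim();
            if (trimmed.StartsWith(key, StringComparison.OrdinalIgnoreCase))
            {
                string value = trimmed.Substring(key.Length).Trim();
                // The value may be wrapped in double quotes.
                if (value.Length >= 2 && value[0] == '"' &&
                    value[value.Length - 1] == '"')
                {
                    value = value.Substring(1, value.Length - 2);
                }
                return value.Length > 0 ? value : null;
            }
        }
        return null;
    }

    // In the request body, each part is introduced by "--" followed by the
    // boundary, so these are the bytes to search for in the buffers.
    public static byte[] GetDelimiterBytes(string boundary)
    {
        return Encoding.ASCII.GetBytes("--" + boundary);
    }
}
```

The resulting byte array could be handed to a byte-pattern search such as FindBytePattern in place of the bytes built from _fieldSeperator.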
Instructions for use
This module will be seen by every page; however, we don't want to use it unless it's the file upload page. You will see in the web.config a section for setting the pages to watch, along with some other settings:
<applicationSettings>
<TravisWhidden.Properties.Settings>
<setting name="BaseFileUpload" serializeAs="String">
<value>C:\upload\</value>
</setting>
<setting name="FileTimeToLive" serializeAs="String">
<value>3600</value>
</setting>
<setting name="Pages" serializeAs="String">
<value>UploadModuleExample/Default.aspx;UploadModuleAtlasExample/Default.aspx;/MembersOnly/ImageUploadManagement.aspx</value>
</setting>
</TravisWhidden.Properties.Settings>
</applicationSettings>
Closing
This module is to be distributed free of charge. If you plan to sell this module, at least send me some of the proceeds. He he.
Contact Info
You can email me at travis@lvfbody.com if you have any questions or comments. I would love to hear about how my module is being used and how it is acting.
References
- http://forums.asp.net/thread/106552.aspx
Description: Very good information on how you can get this to work. Some of the code examples are extremely inefficient, but they get the point across.
Other references (which look much like the material on the ASP.NET forums) can be found on CodeProject.com. Some of the examples I saw did not do everything in one module.
Additional Information
Note: as I was writing this up, I found this MSDN reference, and I don't know if this project could have been made simpler. I hope we did not re-invent the wheel. It may be nothing, but worth reading about anyways: http://msdn2.microsoft.com/en-us/library/system.web.httppostedfile.aspx.
This is what caught my eye: "Files are uploaded in MIME multipart/form-data format. By default, all requests, including form fields and uploaded files, larger than 256 KB are buffered to disk, rather than held in server memory."