Introduction
Ever since I started working with ASP pages, I've wanted to be able to test them straight from the file system without having to configure IIS or set up a virtual directory, especially when downloading and looking at sample code from sites like CodeProject. While working on an application for web-based acceptance testing, I discovered how to make it happen. First, ASP.NET lets you host the runtime outside of IIS. Second, Microsoft provides the Asynchronous Pluggable Protocol interfaces for creating custom protocols for IE. I just needed to hook a custom protocol up to an instance of the ASP.NET runtime and everything would work great. My final solution came out pretty close to that, but there were quite a few problems I had to work through along the way.
I've included two protocols, echo: and aspx:. The echo protocol just echoes the URL line back as HTML text; it's the test protocol I wrote to make sure I got the headers right. The aspx protocol runs any aspx page locally and loads the result in IE. It supports most headers and postback, but doesn't support cookies just yet. I've also added a shell extension that lets you right-click on any aspx file and execute it using the aspx protocol.
Using the code
You can use the protocol anywhere you would use a regular protocol, such as the Run line or a link in the browser. The aspx protocol should let you test, develop and distribute your web applications without having to install a web server of any kind. Since I wrote two protocols, I factored the registration and base buffer code out into a base class and some attributes. If you want to experiment with your own protocols, all you need to do is derive from BaseProtocol, override the Start method and assign an attribute to your class. You can also use the ContextHandlerAttribute to register any shell extensions you write.
Registration
Since the protocol is a COM component that has to be loadable from any directory, it needs to be registered in the GAC or set up with DEVPATH, as well as registered with regasm. I've included a register.cmd file to register the component. If you have problems, most likely your .NET install is not in the default location.
If you want to compile the code and install it that way, first register it in the GAC or use the DEVPATH variable, which is easier once you have it set up. Using fuslogvw.exe was essential in getting DEVPATH configured. One note: you need to include the trailing \ in the directory that you put in DEVPATH, since the last character is trimmed off no matter what it is. Then you must run regasm on the component. The component handles its own registration, including all the additional registry keys needed to set up the protocol and shell extensions.
The Interface
IInternetProtocol derives from IInternetProtocolRoot. IInternetProtocolRoot covers the methods required to start and stop downloading content; the four methods that IInternetProtocol adds all deal with getting the results of the download. The methods are designed to support an asynchronous download and handle situations like user aborts, pausing and resuming. Fortunately, we can get away with doing a synchronous download. After much trial and error, I determined that I only needed to implement three methods to get this simple protocol working.
First, the Start method is called with the full address string, including the initial protocol prefix. This tells us to begin downloading and hands us two interfaces: one to notify IE with, and one to get additional information about the request from. Since we are implementing a simple protocol, all we need is the URL, so we can focus on the first interface. We need to do a minimum of two things here: prepare our data, then send notification that the data is ready to read. We prepare the data by creating a writer, writing to the data stream and flushing the writer, like this:
public void WriteBasicMessage(string Message, string Title)
{
    // Wrap the stream in a writer, emit the page, then rewind the
    // stream so IE can read the data from the beginning.
    StreamWriter Writer = new StreamWriter(Stream);
    Writer.Write("<html><head><title>{1}</title></head><body>{0}</body></html>",
        Message, Title);
    Writer.Flush();
    Stream.Position = 0;
}
To notify IE, we need to call two methods on IInternetProtocolSink. The first is ReportData, passing in the last notification code and the number of bytes in our stream. WARNING: the documentation states that you just need to pass in a number that tells how far the download has progressed. When I passed in a value of 100, thinking a percentage would be just fine, I got errors on large file downloads. I didn't figure out the problem until I wrote my own UrlMon client and saw that it was returning the file size to this method. When I gave that a try, everything magically started working. The second method is ReportResult. I just pass in S_OK and 200 for OK, and that works fine.
Now we come to the second gotcha when dealing with these interfaces. The documentation states that the calls to Read should be bracketed by calls to Lock and Unlock, but that you might get some calls to Read after Unlock. Well, it is also likely that you will get calls to Read before calls to Lock. What does this mean for us component writers? We need to initialize our buffer in the Start method before we call ReportData, instead of flushing everything in the Lock method, which would seem more natural. With that out of the way, Read is a pretty simple method to implement, as long as we use a .NET stream to store our data and a byte[] array to do the transfer. My first implementation used a StringWriter based on a StringBuilder, which worked all right until I ran into encoding issues. Switching to a MemoryStream and creating the proper writer objects for each case turned out to be the right move.
This worked, except that IE was interpreting the output as a text file and not taking any of the HTML formatting into account. I needed to send the encoding headers to the client so it would know this is HTML text. To do that, I had to access yet another couple of interfaces. First, we call IServiceProvider, which is a standard interface for getting lazily initialized objects from an interface. Then we just query for the IHttpNegotiate interface. This lets us do two things: get request headers from the client, and send response headers back. For our simple echo protocol, we just need to send some basic headers. They look like this:
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 234
The Content-Type: text/html header tells the client that we are sending back HTML content, and Content-Length tells it how much we are sending. If it seems like quite a bit of work just to write out a couple of lines of HTML in the browser, that's because it is. However, it is setting us up for the cool part: running ASP.NET locally without using a web server.
ASPXProtocol
Hosting ASP.NET outside of IIS is the best-documented part of this project. A very good example is Cassini, the open source web server used with ASP.NET Web Matrix. Since there are plenty of good articles out there, I'm just going to provide a short overview. There are two steps to hosting ASP.NET: initializing the runtime, and sending requests to it. For security, ASP.NET loads up in a separate AppDomain. This prevents code from one site getting access to another site without special permissions. However, it makes things more complicated for us, since we need to send requests from our domain to the ASP.NET domain. To make this easier, the System.Web.Hosting namespace contains the ApplicationHost.CreateApplicationHost method. This creates a host object for us and returns a proxy reference to it. This can be pretty confusing the first time you encounter it, but it makes perfect sense once you get used to it. Trust me. Once we have a reference to the host object, what good is it? The host object acts as our gateway to the ASP.NET runtime, so we need to give it some useful functionality. We just need to give it a simple method; I decided to call mine SendRequest. It just turns around and calls the HttpRuntime.ProcessRequest method, which loads up the runtime, processes the request and sends the response back.
One of the difficulties in initializing the runtime is that we need to find the root directory of the site and the virtual directory. Normally this information is maintained by IIS; in our case we don't have it, and need to search for the root of the site. A simple heuristic works for most cases. First we walk up the directory tree toward the root of the file system. If we find a global.asax file, we can be pretty sure that we have found the root of the site containing the page. If we don't find a global.asax file, then we use the lowest directory that contains a web.config. If that fails, the lowest directory that contains an aspx page is used. This works for nearly all sites, except for virtual directories without an asax file. I plan on adding a configuration site to the object that lets these be configured if need be.
The System.Web.Hosting namespace contains the SimpleWorkerRequest class to handle basic requests. It does not handle POST or header data, so I had to use a derived class for those. Also, I got errors if I created it outside of the ASP.NET AppDomain, so I had to create a simple data class that I pass into the SendRequest method. After that, I just matched up some overrides to the data class and everything ran great. There was just one hitch.
SimpleWorkerRequest uses a TextWriter to output the content. This worked fine for simple text, and for outputting directly to a file, but if the content was an image it caused problems. TextWriter does encoding and decoding on write. Since I'm streaming directly to IE, which does its own decoding, the stream was being encoded twice. I overrode a couple more methods and swapped in a BinaryWriter, and everything came through, including GIF and JPEG images. I lucked out with this shortcut, since I can just load the image files by having ASP.NET process them like aspx pages; a more robust solution would load them directly. Now that I had the hosting working, I tied it in with the protocol code that I wrote for the echo protocol, and everything was working fine. I could load up an aspx page locally with just a link like aspx:c:\test\test.aspx. However, links from the page were broken and POST didn't work.
Fixing the links and IInternetProtocolInfo
The problem with the links is that, without any additional information, IE will just append the link onto the current URL of the page. If we are on aspx:c:/test/test.aspx and have a link to aspx:c:/test/test2.aspx, it will be interpreted as aspx:c:/test/test.aspxaspx:c:/test/test2.aspx. Each protocol has its own way of combining URLs. Fortunately, the IInternetProtocolInfo interface gives us a way to tell the client how to combine them. There are three main types of links that we need to parse. Fully qualified links need to be passed through as is, without including the current page. Links beginning with / need to map to the root of the site we are running in. And local links need to map starting at the current directory. The article DB2XML Implements Pluggable Protocol Handler gives a good minimal implementation of the IInternetProtocolInfo interface.
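The three combine rules can be sketched as follows. This is a Python illustration of the logic (the helper name and signature are my own, not the project's CombineUrl implementation):

```python
# Combine a link found on a page with the page's own aspx: URL.
#   1. fully qualified aspx: links pass through unchanged
#   2. links starting with "/" resolve against the site root
#   3. everything else resolves against the current page's directory
def combine_url(base: str, relative: str, site_root: str) -> str:
    if relative.lower().startswith("aspx:"):
        return relative                                      # rule 1
    if relative.startswith("/"):
        return "aspx:" + site_root.rstrip("/") + relative    # rule 2
    current_dir = base[len("aspx:"):].rsplit("/", 1)[0]      # rule 3
    return "aspx:" + current_dir + "/" + relative
```

Without this logic, the client falls back to naive concatenation, which is exactly the aspx:c:/test/test.aspxaspx:c:/test/test2.aspx mangling shown above.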
POST data and IInternetBindInfo
Getting this to work was a matter of setting up the COM plumbing with the Marshal class and finding the KB article HOWTO: Handle POST Requests in a Pluggable Protocol Handler. Following the instructions in that article led to the following implementation:
public byte[] GetPostData(BINDINFO BindInfo)
{
    if (BindInfo.dwBindVerb != BINDVERB.BINDVERB_POST)
        return new byte[0];

    byte[] result = new byte[0];
    if (BindInfo.stgmedData.enumType == TYMED.TYMED_HGLOBAL)
    {
        // Copy the POST body out of the HGLOBAL into a managed array.
        UInt32 length = BindInfo.cbStgmedData;
        result = new byte[length];
        Marshal.Copy(BindInfo.stgmedData.u, result, 0, (int)length);

        // If no IUnknown was supplied to release the data, we must free it.
        if (BindInfo.stgmedData.pUnkForRelease == null)
            Marshal.FreeHGlobal(BindInfo.stgmedData.u);
    }
    return result;
}
Browsing for files
To make things easier, I've added a simple check that runs the URL through the System.IO.Directory.Exists method. If it is a directory, the protocol returns a simple list of the parent directory, the subdirectories and the local files. This makes it easier to find specific files.
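The listing itself is just generated HTML. A Python sketch of the idea (a hypothetical helper mirroring the behavior described above, not the project's code):

```python
import os
import tempfile

# When the URL resolves to a directory, render a small HTML page
# linking to the parent, each subdirectory, and each local file.
def render_directory_listing(path: str) -> str:
    entries = sorted(os.listdir(path))
    dirs = [e for e in entries if os.path.isdir(os.path.join(path, e))]
    files = [e for e in entries if os.path.isfile(os.path.join(path, e))]
    lines = ["<html><body>", f"<h3>{path}</h3>", '<a href="..">[parent]</a><br>']
    for d in dirs:
        lines.append(f'<a href="{d}/">[{d}]</a><br>')
    for f in files:
        lines.append(f'<a href="{f}">{f}</a><br>')
    lines.append("</body></html>")
    return "\n".join(lines)

# Demo against a throwaway directory.
_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(_dir, "sub"))
open(os.path.join(_dir, "page.aspx"), "w").close()
listing = render_directory_listing(_dir)
```

Since the links are relative, the URL-combining rules from the previous section take care of resolving each one against the directory being browsed.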
COM Registration
Registering the protocol required adding a few extra registry keys. This can be accomplished by adding a couple of static methods to each exported .NET object and marking them with attributes. Since I implemented a couple of objects, I refactored this into a base class and created a simple attribute to feed the specific information from the derived class back to the base during registration.
Shell integration
Adding a few registry keys pointing at c:\program files\internet explorer\iexplore.exe would have been enough for most registrations. However, I wanted to do something a little more robust. This requires another object, this time implementing IContextMenu and IShellExtInit. It was pretty simple: I just look up the application registered to handle http and send the request to it. The .NET Framework samples provide a decent example of implementing IContextMenu.
Implementing in .NET
Without a doubt, the most tedious part of this project was getting the COM interop to work correctly. Interop has a number of tools that make accessing any component with a type library easy. I initially tried to generate a type library from the IDL for these interfaces and import it with tlbimp.exe. I had a number of problems with those imports and ended up implementing the interfaces by hand. This article described the technique that helped me out the most.
However, working with low-level interfaces that require direct memory access has a high learning curve, with little documentation to help. I would have been totally lost without the years I devoted to learning IDL and doing COM under C++. It wasn't a lot of fun having to learn a completely new syntax to express basic IDL constructs. However, I was pleasantly surprised that the CLR's COM interop was able to handle all the interfaces I threw at it. I really only expected interop to work as well as VB6's.
Future Enhancements
There are a few loose ends that I'm currently working on when I have time. First, I'm working on an HTML-based configuration site so that you can configure virtual domains like aspx:www.mydomain.com/ to map to directories on your hard drive. Second, I need to figure out how to get cookies working: either I need to get IE to handle them, or implement them myself. I think it is the protocol's responsibility, but I'm not sure. Also, I currently don't add any additional header information to responses, like mime type or create date. It would be good to add that in the future.
References
The excellent HTML Editor control started this project off.
I've been working on a unit/acceptance testing application for the web development I've been doing recently. I had a version working that automates IE, except the application kept losing control of IE and didn't have enough low-level control of the networking. I've written a version using the WebBrowser control and one using the HtmlEditor control, and the HtmlEditor control has by far been the most stable. However, it doesn't give any HTTP status notifications on error. I wrote this protocol to figure out more about what is going on under the covers with MSHTML and UrlMon. I also plan to use it for some phases of automated testing, because it lets me gain access to the internals of the ASP.NET process during and between page executions. This is important to make sure that all my pages handle lost session, cache and other catastrophic events correctly.
History