Introduction
Ever since I started working with ASP pages, I've wanted to be able to test them straight from the file system without having to configure IIS or set up a virtual directory, especially when downloading and looking at sample code from sites like CodeProject. While working on an application for web-based acceptance testing, I discovered how to make it happen. First, ASP.NET lets you host the runtime outside of IIS. Second, Microsoft provides the Asynchronous Pluggable Protocol interfaces for creating custom protocols for IE. I just needed to hook a custom protocol up to an instance of the ASP.NET runtime and everything would work great. My final solution came out pretty close to that, but there were quite a few problems I had to work through along the way.
I've included two protocols, echo: and aspx:. The echo protocol just echoes the URL line back as HTML text; it's the test protocol I wrote to make sure I got the headers right. The aspx protocol runs any aspx page locally and loads the result in IE. It supports most headers and postback, but doesn't support cookies just yet. I've also added a shell extension that lets you right-click on any aspx file and execute it using the aspx protocol.
Using the code
You can use the protocol anywhere you would use a regular protocol, such as the Run line or a link in the browser. The aspx protocol should let you test, develop and distribute your web applications without having to install a web server of any kind. Since I wrote two protocols, I factored the registration and base buffer code out into a base class and some attributes. If you want to experiment with your own protocols, all you need to do is derive from BaseProtocol, override the Start method and assign an attribute to your class. You can also use the ContextHandlerAttribute to register any shell extensions you write.
Registration
Since the protocol is a COM component that has to be loadable from any directory, it needs to be registered in the GAC or set up with DEVPATH, as well as registered with regasm. I've included a register.cmd file to register the component. If you have problems, most likely your .NET install is not in the default location.
If you want to compile the code and install it that way, first register it in the GAC or use the DEVPATH variable, which is easier once you have it set up. Using fuslogvw.exe was essential in getting DEVPATH configured. One note: you need to include the trailing \ in the directory that you put in DEVPATH, since the last character is trimmed off no matter what it is. Then you must run regasm on the component. The component handles its own registration, including all the additional registry keys needed to set up the protocol and shell extensions.
The Interface
IInternetProtocol derives from IInternetProtocolRoot. IInternetProtocolRoot covers the methods required to start and stop downloading content; the four methods that IInternetProtocol adds all deal with getting the results of the download. The methods are designed to support an asynchronous download and handle situations like user aborts, pausing and resuming. Fortunately, we can get away with doing a synchronous download. After much trial and error, I determined that I only needed to implement three methods to get this simple protocol working.
First, the Start method is called with the full address string, including the initial protocol prefix. This tells us to begin downloading and hands us two interfaces: one to notify IE with, and one to get additional information about the request from. Since we are implementing a simple protocol, all we need is the URL, so we can focus on the first interface. We need to do a minimum of two things here: prepare our data, then send notification that the data is ready to read. We prepare the data by creating a writer, writing to the data stream and flushing the writer, like this:
public void WriteBasicMessage(string Message, string Title)
{
    // Wrap the stream in a writer, emit the page, then rewind the
    // stream so IE can read the data from the beginning.
    StreamWriter Writer = new StreamWriter(Stream);
    Writer.Write("<html><head><title>{1}</title></head><body>{0}</body></html>",
        Message, Title);
    Writer.Flush();
    Stream.Position = 0;
}
To notify IE, we need to call two methods on IInternetProtocolSink. The first is ReportData, passing in the last notification code and the number of bytes in our stream. WARNING: the documentation states that you just need to pass in a number that tells how far the download has progressed. When I passed in a value of 100, thinking a percentage would be just fine, I got errors on large file downloads. I didn't figure out the problem until I wrote my own UrlMon client and saw that it was returning the file size to this method. When I gave that a try, everything magically started working. The second method is ReportResult. I just pass in S_OK and 200 for OK, and that works fine.
Now we come to the second gotcha when dealing with these interfaces. The documentation states that the calls to Read should be bracketed by calls to Lock and Unlock, but that you might get some calls to Read after Unlock. Well, it is also likely that you will get calls to Read before calls to Lock. What does this mean for us component writers? We need to initialize our buffer in the Start method before we call ReportData, instead of flushing everything in the Lock method, which would seem more natural. With that out of the way, Read is a pretty simple method to implement, as long as we use a .NET stream to store our data and a byte[] array to do the transfer. My first implementation used a StringWriter based on a StringBuilder, which worked all right until I ran into encoding issues. Switching to a MemoryStream and creating the proper writer objects for each case turned out to be the right move.
This worked, except that IE was interpreting the output as a text file and not taking any of the HTML formatting into account. I needed to send the encoding headers to the client so it would know this is HTML text. To do that, I had to access yet another couple of interfaces. First, we call IServiceProvider, which is a standard interface for getting lazily initialized objects from an interface. Then we just query for the IHttpNegotiate interface. This lets us do two things: get request headers from the client, and send response headers back. For our simple echo protocol, we just need to send some basic headers. They look like this:
HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 234
The Content-Type: text/html header tells the client that we are sending back HTML content, and Content-Length tells it how much we are sending. If it seems like quite a bit of work just to write out a couple of lines of HTML in the browser, that's because it is. However, it is setting us up for the cool part: running ASP.NET locally without using a web server.
ASPXProtocol
Hosting ASP.NET outside of IIS is the best-documented part of this project. A very good example is Cassini, the open source web server used with ASP.NET Web Matrix. Since there are plenty of good articles out there, I'm just going to provide a short overview. There are two steps to hosting ASP.NET: initializing the runtime, and sending requests to it. For security, ASP.NET loads up in a separate AppDomain. This prevents code from one site getting access to another site without special permissions. However, it makes things more complicated for us, since we need to send requests from our domain to the ASP.NET domain. To make this easier, the System.Web.Hosting namespace contains the ApplicationHost.CreateApplicationHost method. This creates a host object for us and returns a proxy reference to it. This can be pretty confusing the first time you encounter it, but it makes perfect sense once you get used to it. Trust me. Once we have a reference to the host object, what good is it? The host object acts as our gateway to the ASP.NET runtime, so we need to give it some useful functionality. We just need to give it a simple method; I decided to call mine SendRequest. It just turns around and calls the HttpRuntime.ProcessRequest method, which loads up the runtime, processes the request and sends the response back.
One of the difficulties in initializing the runtime is that we need to find the root directory of the site and the virtual directory. Normally this information is maintained by IIS; in our case we don't have it, and need to search for the root of the site. A simple heuristic works for most cases. First we walk up the directory tree toward the root of the file system. If we find a global.asax file, we can be pretty sure that we have found the root of the site containing the page. If we don't find a global.asax file, then we use the lowest directory that contains a web.config. If that fails, the lowest directory that contains an aspx page is used. This works for nearly all sites, except for virtual directories without an asax file. I plan on adding a configuration site to the object that lets these be configured if need be.
The System.Web.Hosting namespace contains the SimpleWorkerRequest class to handle basic requests. It does not handle POST or header data, so I had to use a derived class for those. Also, I got errors if I created it outside of the ASP.NET AppDomain, so I had to create a simple data class that I pass into the SendRequest method. After that, I just matched up some overrides to the data class and everything ran great. There was just one hitch.
SimpleWorkerRequest uses a TextWriter to output the content. This worked fine for simple text, and for outputting directly to a file, but if the content was an image it caused problems. TextWriter does encoding and decoding on write. Since I'm streaming directly to IE, which does its own decoding, the stream was being encoded twice. I overrode a couple more methods and swapped in a BinaryWriter, and everything came through, including GIF and JPEG images. I lucked out with this shortcut, since I can just load the image files by having ASP.NET process them like aspx pages; a more robust solution would load them directly. Now that I had the hosting working, I tied it in with the protocol code that I wrote for the echo protocol, and everything was working fine. I could load up an aspx page locally with just a link like aspx:c:\test\test.aspx. However, links from the page were broken and POST didn't work.
Fixing the links and IInternetProtocolInfo
The problem with the links is that, without any additional information, IE will just append the link onto the current URL of the page. If we are on aspx:c:/test/test.aspx and have a link to aspx:c:/test/test2.aspx, it will be interpreted as aspx:c:/test/test.aspxaspx:c:/test/test2.aspx. Each protocol has its own way of combining URLs. Fortunately, the IInternetProtocolInfo interface gives us a way to tell the client how to combine them. There are three main types of links that we need to parse. Fully qualified links need to be passed through as is, without including the current page. Links beginning with / need to map to the root of the site we are running in. And local links need to map starting at the current directory. The article DB2XML Implements Pluggable Protocol Handler gives a good minimal implementation of the IInternetProtocolInfo interface.
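The three combine rules can be sketched as follows. This is a Python illustration of the logic (the helper name and signature are my own, not the project's CombineUrl implementation):

```python
# Combine a link found on a page with the page's own aspx: URL.
#   1. fully qualified aspx: links pass through unchanged
#   2. links starting with "/" resolve against the site root
#   3. everything else resolves against the current page's directory
def combine_url(base: str, relative: str, site_root: str) -> str:
    if relative.lower().startswith("aspx:"):
        return relative                                      # rule 1
    if relative.startswith("/"):
        return "aspx:" + site_root.rstrip("/") + relative    # rule 2
    current_dir = base[len("aspx:"):].rsplit("/", 1)[0]      # rule 3
    return "aspx:" + current_dir + "/" + relative
```

Without this logic, the client falls back to naive concatenation, which is exactly the aspx:c:/test/test.aspxaspx:c:/test/test2.aspx mangling shown above.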
POST data and IInternetBindInfo
Getting this to work was a matter of setting up the COM plumbing with the Marshal class and finding the KB article HOWTO: Handle POST Requests in a Pluggable Protocol Handler. Following the instructions in that article led to the following implementation:
public byte[] GetPostData(BINDINFO BindInfo)
{
    if (BindInfo.dwBindVerb != BINDVERB.BINDVERB_POST)
        return new byte[0];

    byte[] result = new byte[0];
    if (BindInfo.stgmedData.enumType == TYMED.TYMED_HGLOBAL)
    {
        // Copy the POST body out of the HGLOBAL into a managed array.
        UInt32 length = BindInfo.cbStgmedData;
        result = new byte[length];
        Marshal.Copy(BindInfo.stgmedData.u, result, 0, (int)length);

        // If no IUnknown was supplied to release the data, we must free it.
        if (BindInfo.stgmedData.pUnkForRelease == null)
            Marshal.FreeHGlobal(BindInfo.stgmedData.u);
    }
    return result;
}
Browsing for files
To make things easier, I've added a simple check that runs the URL through the System.IO.Directory.Exists method. If it is a directory, the protocol returns a simple list of the parent directory, the subdirectories and the local files. This makes it easier to find specific files.
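The listing itself is just generated HTML. A Python sketch of the idea (a hypothetical helper mirroring the behavior described above, not the project's code):

```python
import os
import tempfile

# When the URL resolves to a directory, render a small HTML page
# linking to the parent, each subdirectory, and each local file.
def render_directory_listing(path: str) -> str:
    entries = sorted(os.listdir(path))
    dirs = [e for e in entries if os.path.isdir(os.path.join(path, e))]
    files = [e for e in entries if os.path.isfile(os.path.join(path, e))]
    lines = ["<html><body>", f"<h3>{path}</h3>", '<a href="..">[parent]</a><br>']
    for d in dirs:
        lines.append(f'<a href="{d}/">[{d}]</a><br>')
    for f in files:
        lines.append(f'<a href="{f}">{f}</a><br>')
    lines.append("</body></html>")
    return "\n".join(lines)

# Demo against a throwaway directory.
_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(_dir, "sub"))
open(os.path.join(_dir, "page.aspx"), "w").close()
listing = render_directory_listing(_dir)
```

Since the links are relative, the URL-combining rules from the previous section take care of resolving each one against the directory being browsed.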
COM Registration
Registering the protocol required adding a few extra registry keys. This can be accomplished by adding a couple of static methods to each exported .NET object and marking them with attributes. Since I implemented a couple of objects, I refactored this into a base class and created a simple attribute to feed the specific information from the derived class back to the base during registration.
Shell integration
Adding a few registry keys pointing at c:\program files\internet explorer\iexplore.exe would have been enough for most registrations. However, I wanted to do something a little more robust. This requires another object, this time implementing IContextMenu and IShellExtInit. It was pretty simple: I just look up the application registered to handle http and send the request to it. The .NET Framework samples provide a decent example of implementing IContextMenu.
Implementing in .NET
Without a doubt, the most tedious part of this project was getting the COM interop to work correctly. Interop has a number of tools that make accessing any component with a type library easy. I initially tried to generate a type library from the IDL for these interfaces and import it with tlbimp.exe. I had a number of problems with those imports and ended up implementing the interfaces by hand. This article described the technique that helped me out the most.
However, working with low-level interfaces that require direct memory access has a high learning curve, with little documentation to help. I would have been totally lost without the years I devoted to learning IDL and doing COM under C++. It wasn't a lot of fun having to learn a completely new syntax to express basic IDL constructs. However, I was pleasantly surprised that the CLR's COM interop was able to handle all the interfaces I threw at it. I really only expected interop to work as well as VB6's.
Future Enhancements
There are a few loose ends that I'm currently working on when I have time. First, I'm working on an HTML-based configuration site so that you can configure virtual domains like aspx:www.mydomain.com/ to map to directories on your hard drive. Second, I need to figure out how to get cookies working: either I need to get IE to handle them, or implement them myself. I think it is the protocol's responsibility, but I'm not sure. Also, I currently don't add any additional header information to responses, like mime type or create date. It would be good to add that in the future.
References
The excellent HTML Editor control started this project off.
I've been working on a unit/acceptance testing application for the web development I've been doing recently. I had a version working that automates IE, except the application kept losing control of IE and didn't have enough low-level control of the networking. I've written a version using the WebBrowser control and one using the HtmlEditor control, and the HtmlEditor control has by far been the most stable. However, it doesn't give any HTTP status notifications on error. I wrote this protocol to figure out more about what is going on under the covers with MSHTML and UrlMon. I also plan to use it for some phases of automated testing, because it lets me gain access to the internals of the ASP.NET process during and between page executions. This is important to make sure that all my pages handle lost session, cache and other catastrophic events correctly.
History