Introduction
Developers often need to automate Internet Explorer: opening a browser, filling in forms, and posting data programmatically.
The most common approach is to use shdocvw.dll (the Microsoft Web Browser control) and Mshtml.dll (the HTML parsing and rendering component), or Microsoft.Mshtml.dll, which is a .NET wrapper for Mshtml.dll. You can get more information in Internet Explorer - About the Browser.
If you take this approach, here are some of the problems you may have to deal with:
- You have to distribute these DLLs because your project depends on them, and this is a serious problem if you cannot deploy them correctly. Search for shdocvw and mshtml.dll distribution problems and you'll see what I'm talking about.
- You have to deploy an 8 MB Microsoft.Mshtml.dll because this DLL is not part of the .NET Framework.
What we need instead is late binding: we write our own wrappers for the DLLs mentioned above and bind to them at run time. This is also more convenient than using the DLLs directly. For instance, we won't need to check whether the document download is complete, because IEHelper will do this for us.
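To make the idea of late binding concrete, here is a minimal sketch that drives Internet Explorer through its ProgID and reflection, without referencing any interop assembly. It is not the IEHelper implementation; the class name LateBoundIeSketch is made up for illustration, and the sketch assumes Internet Explorer is installed and registered on the machine.

```csharp
using System;
using System.Reflection;
using System.Runtime.InteropServices;

// Hypothetical sketch, not part of the article's source code.
static class LateBoundIeSketch
{
    static void Main()
    {
        // Resolve the COM class from its ProgID at run time.
        // Returns null when the ProgID is not registered.
        Type ieType = Type.GetTypeFromProgID("InternetExplorer.Application");
        if (ieType == null)
        {
            Console.WriteLine("InternetExplorer.Application is not registered.");
            return;
        }

        object ie = Activator.CreateInstance(ieType);
        try
        {
            // Set the Visible property by name.
            ieType.InvokeMember("Visible", BindingFlags.SetProperty,
                null, ie, new object[] { true });

            // Call Navigate by name, passing only the URL.
            ieType.InvokeMember("Navigate", BindingFlags.InvokeMethod,
                null, ie, new object[] { "about:blank" });
        }
        finally
        {
            // Release our reference; the browser window itself stays open.
            Marshal.ReleaseComObject(ie);
        }
    }
}
```

Every member is looked up by name at run time, so a typo in "Visible" or "Navigate" only shows up as an exception when the call executes. That is the price paid for not shipping the interop DLLs.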
Background
The late binding technique we need is explained well in this article. Thanks to Ariadne for this great article.
What that article does not cover is attaching to COM events, which we need in order to find out when the document download is complete.
Here is how we connect to the web browser's events:
private IConnectionPointContainer connectionPointContainer;
private IConnectionPoint connectionPoint;
private int pdwCookie = 0;

private void registerDocumentEvents()
{
    // The browser object exposes its event sources as connection points.
    connectionPointContainer = IeApplication as IConnectionPointContainer;

    // IID of the DWebBrowserEvents2 event interface.
    Guid guid = new Guid("34A715A0-6587-11D0-924A-0020AFC7AC4D");
    connectionPointContainer.FindConnectionPoint(ref guid, out connectionPoint);

    // Subscribe our sink; the cookie identifies the subscription for Unadvise.
    browserEvents = new DWebBrowserEvents2_Helper();
    connectionPoint.Advise(browserEvents, out pdwCookie);

    // Start with the "download complete" flag cleared.
    DocumentStatus ds = DocumentStatus.Instance;
    ds.DownloadComplete = false;
}
After we have attached to the web browser events, we need a singleton class to query the document status. That is what DocumentStatus does.
After we get feedback from the DocumentComplete event, we set DownloadComplete to true so our code can continue.
void DWebBrowserEvents2.DocumentComplete(object pDisp, ref object URL)
{
    DocumentStatus ds = DocumentStatus.Instance;
    ds.DownloadComplete = true;
}
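The article only shows DocumentStatus being used, so the following is a rough guess at what such a singleton and a matching wait loop could look like. Only Instance and DownloadComplete come from the article; the DocumentWait class, the WaitForDownload method, the timeout, and the 50 ms polling interval are assumptions made for this sketch.

```csharp
using System;
using System.Threading;

// Hypothetical reconstruction: only Instance and DownloadComplete
// appear in the article.
public sealed class DocumentStatus
{
    private static readonly DocumentStatus instance = new DocumentStatus();
    private volatile bool downloadComplete;

    private DocumentStatus() { }

    public static DocumentStatus Instance
    {
        get { return instance; }
    }

    public bool DownloadComplete
    {
        get { return downloadComplete; }
        set { downloadComplete = value; }
    }
}

// Hypothetical helper, not part of the article's source code.
public static class DocumentWait
{
    // Polls the flag until it is set or the timeout elapses.
    // Returns true if the download completed in time, false otherwise.
    public static bool WaitForDownload(TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (!DocumentStatus.Instance.DownloadComplete)
        {
            if (DateTime.UtcNow >= deadline)
            {
                return false;
            }
            Thread.Sleep(50);
        }
        return true;
    }
}
```

One caveat: when the event sink lives on a single-threaded apartment thread, incoming COM event calls are delivered through that thread's message loop, so a real wait loop on that thread may need to pump messages (for example with Application.DoEvents in a Windows Forms application) rather than only sleeping.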
And, of course, we need to unregister events when we are done...
private void unregisteDocumentEvents()
{
    // Cancel the subscription made by Advise, then release the COM reference.
    connectionPoint.Unadvise(pdwCookie);
    Marshal.ReleaseComObject(connectionPoint);
}
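Since registering and unregistering always come in pairs, one option is to wrap both in a disposable object. The sketch below is an assumption-laden variation, not the article's code: EventSubscription is a made-up name, and it uses the IConnectionPointContainer and IConnectionPoint definitions from System.Runtime.InteropServices.ComTypes instead of the hand-written wrappers in the download.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes;

// Hypothetical helper, not part of the article's source code.
public sealed class EventSubscription : IDisposable
{
    private IConnectionPoint connectionPoint;
    private int cookie;

    // source: the COM object that raises events (for example the browser).
    // eventInterfaceId: IID of the event interface to subscribe to.
    // sink: an object implementing that event interface.
    public EventSubscription(object source, Guid eventInterfaceId, object sink)
    {
        IConnectionPointContainer container = source as IConnectionPointContainer;
        if (container == null)
        {
            throw new ArgumentException(
                "Source does not expose connection points.", "source");
        }

        container.FindConnectionPoint(ref eventInterfaceId, out connectionPoint);
        connectionPoint.Advise(sink, out cookie);
    }

    public void Dispose()
    {
        if (connectionPoint == null)
        {
            return; // already disposed
        }

        // Cancel the subscription, then release the COM reference.
        connectionPoint.Unadvise(cookie);
        Marshal.ReleaseComObject(connectionPoint);
        connectionPoint = null;
    }
}
```

With a helper like this, a using block around the automation code would guarantee that Unadvise runs even when an exception is thrown in between.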
If you are curious about how I wrote those COM wrappers, here is a clue: use Comto.net by Aurigma.
Using the code
Simply create a new IEHelper instance, navigate to the page you want, find input elements by name or ID, set their values, and post them.
IEHelper ie = new IEHelper();
ie.OpenAVisibleBlankDocument();

// Placeholder passed for the Navigate arguments we do not use.
object p = null;
string url = @"http://mail.google.com/mail/?hl=en&tab=wm";
bool ret = ie.Navigate(url, ref p, ref p, ref p, ref p);

// Fill in the login form and submit it.
ie.SetValueById("Email", txtUserName.Text);
ie.SetValueById("Passwd", txtPassword.Text);
ie.ClickButtonByName("signIn");
What to do next
What remains is to fill in the blanks. When you download the source code, you'll see that most of the events are not implemented yet. If you need those events, implement them according to your requirements.
void DWebBrowserEvents2.OnQuit()
{
    throw new Exception("The method or operation is not implemented.");
}