Introduction
In this article I explain how to build a Windows application that automates downloading files from an authentication-protected site without going through the Open/Save dialog box. To illustrate the process I created a web application that acts as a protected site with a login page. Within this site, one page has a text field for a user name and a button that generates a file on the fly containing the name typed in the text field. I then created a Windows application that uses the WebBrowser control to connect to the site and enter the credentials, and the HttpWebRequest object to download the file, bypassing the dialog box.
Background
Sometimes we have to interact with the file download (Open/Save) dialog box. It appears because the HTTP response that delivers the file includes the header Content-Disposition: attachment. This interaction becomes a problem when you have to write a program that automates downloading files from a site (for example, as a scheduled task). If the file already exists on the web server, you can simply download it with a WebClient object; if the file is created on the fly, you have to simulate the user's navigation and click.
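For comparison, the simple case mentioned above (a file that already exists on the server) can be sketched as follows. This is a minimal illustration, not part of the original project: the class name StaticDownloadSketch, the method name and both parameters are hypothetical.

```csharp
using System;
using System.Net;

public static class StaticDownloadSketch
{
    // Downloads a file that already exists at a fixed URL and saves it locally.
    // No dialog box is involved because no browser is involved.
    public static void DownloadStaticFile(string fileUrl, string localPath)
    {
        using (WebClient client = new WebClient())
        {
            client.DownloadFile(new Uri(fileUrl), localPath);
        }
    }
}
```

This only works when the URL points directly at the file and no login is required, which is exactly what the rest of the article cannot assume.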
The problem is harder if the site is protected by a login and password: before you can reach the page that generates the files, you must authenticate on the login page and keep the authenticated session. To navigate the site I used a WebBrowser control, which made it easy to log in and keep the session, but when I simulated the user's click to download the file, the program lost control of the generated file to the dialog box. The alternative is to download the file with the HttpWebRequest object, but because the site is protected, the authentication state must be passed from the WebBrowser control to the HttpWebRequest object, and the request must then be recreated with all the values of the POST. My goal is to automate this process, bypassing the file download dialog and saving the file directly to a local drive.
Using the code
In my application I use a WebBrowser control and a Timer control. The WebBrowser control allows developers to build web browsing capability into Windows Forms applications. The Timer control acts as a safeguard if the session is not initialized for some reason, or if the site changes and the program tries to connect to pages that no longer exist. To pass the cookies to the HttpWebRequest object (which takes care of downloading the files) I used InternetGetCookieEx, a WinINet function that retrieves the data stored in cookies associated with a specified URL. I suggest reading http://blogs.msdn.com/b/ieinternals/archive/2009/08/20/wininet-ie-cookie-internals-faq.aspx (which also explains why transferring cookies directly between WebBrowser and HttpWebRequest with Request.CookieContainer.SetCookies(Url, webBrowser1.Document.Cookie) sometimes does not work) and http://social.technet.microsoft.com/wiki/contents/articles/17366.get-cookies-from-webbrowser-controls-in-visual-c.aspx. By default, the cookies returned by this function do not include HttpOnly cookies; to retrieve them, you have to pass the INTERNET_COOKIE_HTTPONLY flag.
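// Requires: using System; using System.Runtime.InteropServices; using System.Text;
[DllImport("wininet.dll", CharSet = CharSet.Auto, SetLastError = true)]
static extern bool InternetGetCookieEx(string pchURL, string pchCookieName,
    StringBuilder pchCookieData, ref uint pcchCookieData, int dwFlags, IntPtr lpReserved);

// Flag that makes InternetGetCookieEx return HttpOnly cookies as well.
const int INTERNET_COOKIE_HTTPONLY = 0x00002000;

// Returns the cookies stored for the given URL as a comma-separated string
// (the format expected by CookieContainer.SetCookies), or null if none are available.
private string GetGlobalCookies(string uri)
{
    uint uiDataSize = 2048;
    StringBuilder sbCookieData = new StringBuilder((int)uiDataSize);
    if (InternetGetCookieEx(uri, null, sbCookieData, ref uiDataSize,
        INTERNET_COOKIE_HTTPONLY, IntPtr.Zero) && sbCookieData.Length > 0)
    {
        // The function returns "name=value; name2=value2"; replace the separators.
        return sbCookieData.ToString().Replace(";", ",");
    }
    return null;
}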
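To make the effect of the separator replacement concrete, here is a small sketch that loads a browser-style cookie string into a CookieContainer. The class and method names (CookieTransferSketch, ToCookieContainer) are hypothetical and only illustrate the idea; they are not part of the original application.

```csharp
using System;
using System.Net;

public static class CookieTransferSketch
{
    // Converts a "name=value; name2=value2" cookie string into a CookieContainer
    // that can be assigned to HttpWebRequest.CookieContainer.
    public static CookieContainer ToCookieContainer(Uri pageUri, string cookieHeader)
    {
        CookieContainer container = new CookieContainer();
        if (!string.IsNullOrEmpty(cookieHeader))
        {
            // SetCookies expects several cookies to be separated by commas,
            // while the browser separates them with semicolons.
            container.SetCookies(pageUri, cookieHeader.Replace(";", ","));
        }
        return container;
    }
}
```

One limitation of this approach is that a cookie whose value itself contains a comma would be split in the wrong place; session and authentication cookies usually do not, but it is worth keeping in mind.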
The WebBrowser.DocumentCompleted event occurs when the WebBrowser control finishes loading a document. At that point the new document is fully loaded and its elements can be accessed. To reach the form elements I use the GetElementById method, which makes it simple to set the login and password and then invoke a click on the button to submit the values.
private void webBrowser1_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
    // Fill in the login form only once, when the login page has finished loading.
    if (bFirstTime && e.Url.ToString().Contains(sPageLogin))
    {
        bFirstTime = false;
        HtmlElement heElement = webBrowser1.Document.GetElementById("UserEmail");
        heElement.InnerText = "name";
        heElement = webBrowser1.Document.GetElementById("UserPass");
        heElement.InnerText = "pass";
        // Submit the form by simulating a click on the button.
        heElement = webBrowser1.Document.GetElementById("Submit1");
        heElement.InvokeMember("click");
        // Give the login request time to complete, then go to the download page.
        System.Threading.Thread.Sleep(2000);
        webBrowser1.Navigate(sWebPath + sPageDw);
    }
    // ...
}
In my example I use a web application created with ASP.NET to test the Windows application. By default, the ASP.NET page framework uses view state to preserve control values between round trips, and the state is usually stored in a hidden field. In a similar way, event validation ensures that events raised on the client originate from the controls rendered in the page, and its value is also stored in a hidden field. When the page is posted back to the server, the ASP.NET page framework recomputes the hash value of the view state and event validation data and compares it with the stored value. For this reason the page must be requested twice: the first request (GET) reads and saves the values of these hidden input fields, and the second (POST) sends them back to the server together with the other values (in this example, the name typed). I used Live HTTP Headers (an add-on for Mozilla Firefox) to see the HTTP requests and responses; being able to display the browser's request was very useful when recreating it with the HttpWebRequest object.
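The listing below calls a helper, ExtractInputHidden, whose body is not included in this article. The following is only one possible sketch of such a helper, written under two assumptions: the name attribute appears before the value attribute inside the input tag (as in typical ASP.NET output), and the helper is responsible for URL-encoding the value so it can be placed directly in the POST body. The class name HiddenFieldSketch is hypothetical.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HiddenFieldSketch
{
    // Returns the URL-encoded value of the hidden input with the given name,
    // or an empty string if the field is not found in the page.
    public static string ExtractInputHidden(string html, string fieldName)
    {
        // Assumption: name="..." precedes value="..." within the same <input> tag.
        string pattern = "<input[^>]*\\bname=\"" + Regex.Escape(fieldName) +
                         "\"[^>]*\\bvalue=\"([^\"]*)\"";
        Match match = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        if (!match.Success)
        {
            return string.Empty;
        }
        // Undo HTML entity encoding, then encode for use in a form post
        // (Base64 view state contains '+', '/' and '=' characters).
        string rawValue = WebUtility.HtmlDecode(match.Groups[1].Value);
        return Uri.EscapeDataString(rawValue);
    }
}
```

A regular expression is enough for a page whose markup you control; for arbitrary pages an HTML parser would be a more robust choice.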
// Requires: using System; using System.IO; using System.Net; using System.Text;
// Cookies of the authenticated WebBrowser session, in CookieContainer.SetCookies format.
string sTmpCookieString = GetGlobalCookies(webBrowser1.Url.AbsoluteUri);
string sPageData;
string PathRemote = sWebPath + sPageDw;

// First request (GET): read the page to obtain the hidden field values.
HttpWebRequest fstRequest = (HttpWebRequest)WebRequest.Create(PathRemote);
fstRequest.Method = "GET";
fstRequest.CookieContainer = new System.Net.CookieContainer();
fstRequest.CookieContainer.SetCookies(webBrowser1.Document.Url, sTmpCookieString);
using (HttpWebResponse fstResponse = (HttpWebResponse)fstRequest.GetResponse())
using (StreamReader sr = new StreamReader(fstResponse.GetResponseStream()))
{
    sPageData = sr.ReadToEnd();
}
string sViewState = ExtractInputHidden(sPageData, "__VIEWSTATE");
string sEventValidation = ExtractInputHidden(sPageData, "__EVENTVALIDATION");

// Second request (POST): send the form back as the browser would.
string sUrl = sWebPath + sPageDw;
HttpWebRequest hwrRequest = (HttpWebRequest)WebRequest.Create(sUrl);
hwrRequest.Method = "POST";
hwrRequest.ContentType = "application/x-www-form-urlencoded";
hwrRequest.CookieContainer = new System.Net.CookieContainer();
hwrRequest.CookieContainer.SetCookies(webBrowser1.Document.Url, sTmpCookieString);
// The hidden field values must be URL-encoded at this point
// (ExtractInputHidden is not shown here and is assumed to take care of it).
string sPostData = "__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=" +
    sViewState + "&__EVENTVALIDATION=" + sEventValidation +
    "&Name=" + Uri.EscapeDataString(sMyName) + "&Button1=Button";
byte[] bByteArray = Encoding.UTF8.GetBytes(sPostData);
hwrRequest.ContentLength = bByteArray.Length;
using (Stream sDataStream = hwrRequest.GetRequestStream())
{
    sDataStream.Write(bByteArray, 0, bByteArray.Length);
}

using (WebResponse response = hwrRequest.GetResponse())
using (Stream sResponseStream = response.GetResponseStream())
{
    // Take the file name from the header, e.g. "attachment; filename=report.txt".
    string sHeader = response.Headers["Content-Disposition"];
    int iFileNameStartIndex = sHeader.IndexOf("filename=") + "filename=".Length;
    string sFileName = sHeader.Substring(iFileNameStartIndex).Trim('"');
    // Copy the response bytes straight to disk, so binary files are not altered
    // by a text decode/encode round trip (Stream.CopyTo requires .NET 4 or later).
    using (FileStream fs = File.Open(sLocalPath + sFileName,
        FileMode.Create, FileAccess.Write))
    {
        sResponseStream.CopyTo(fs);
    }
}
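The file name handling above relies on plain string searching. As a hedged alternative, the header can be parsed with the System.Net.Mime.ContentDisposition class, which copes with quoted names and additional parameters. The class name FileNameSketch, the method name and the fallbackName parameter are hypothetical and not part of the original code.

```csharp
using System;
using System.IO;
using System.Net.Mime;

public static class FileNameSketch
{
    // Extracts the file name from a Content-Disposition header value.
    // Returns fallbackName when the header is missing, malformed or has no file name.
    public static string GetDownloadFileName(string contentDisposition, string fallbackName)
    {
        if (string.IsNullOrEmpty(contentDisposition))
        {
            return fallbackName;
        }
        try
        {
            string name = new ContentDisposition(contentDisposition).FileName;
            if (string.IsNullOrEmpty(name))
            {
                return fallbackName;
            }
            // Keep only the file name part so the server cannot choose the folder.
            return Path.GetFileName(name);
        }
        catch (FormatException)
        {
            return fallbackName;
        }
    }
}
```

The fallback name avoids a crash when the server answers with an error page instead of a file, in which case the Content-Disposition header is normally absent.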
If the session is lost or another problem occurs, the program does not hang: when the timer raises its Tick event, I write the error to the log file and close the program.
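The timer logic is not listed in this article, so the following is only a sketch of how such a safeguard might look. The class DownloadWatchdog, its Completed property and the closeApplication callback are all hypothetical names introduced for illustration.

```csharp
using System;
using System.IO;

public sealed class DownloadWatchdog
{
    private readonly string logPath;
    private readonly Action closeApplication;

    // Set to true by the download code once the file has been saved.
    public bool Completed { get; set; }

    public DownloadWatchdog(string logPath, Action closeApplication)
    {
        this.logPath = logPath;
        this.closeApplication = closeApplication;
    }

    // The signature matches EventHandler, so the method can be attached to a Timer's Tick event.
    public void OnTick(object sender, EventArgs e)
    {
        if (Completed)
        {
            return;
        }
        // The download did not finish before the timer fired: log it and stop.
        File.AppendAllText(logPath,
            DateTime.Now.ToString("s") + " Timeout: download did not complete" + Environment.NewLine);
        closeApplication();
    }
}
```

In a Windows Forms application the handler could be attached to the Timer control's Tick event, with Application.Exit passed as the callback; keeping the logic in a separate class makes it possible to exercise it without a form.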
Points of interest
The most interesting part of the project is passing the authentication from the WebBrowser control to the HttpWebRequest object. This makes it possible to bypass the dialog box while keeping the whole download under the application's control.