UPDATE
I've updated the code and the binary with the great improvements that Piers Lawson suggested in the comments. The app should no longer have trouble taking snapshots of some pages with JavaScript, or fail for seemingly random reasons. It is also slightly optimized with suggestions from Frank Herget. It looks like he's based a very nice service around it on his site - check it out!
Thanks again for your great support!
Introduction
This article describes a console application that loads a Web page, takes a screenshot of it and saves it as a JPG file.
Our beloved sys admin (we all bow to him and worship his skills) recently asked whether it's possible to write a .NET application that makes a thumbnail of a Website. With Windows Forms the task is actually pretty trivial. But with him being the Linux guy and all, I decided to pick the more challenging route and make it a console app. An interesting use case anyway.
In WinForms, all you really need to do is drop a WebBrowser from your Toolbox onto your form and, once it has loaded the page, call:
// Render the control into a bitmap of the same size; the target rectangle
// is relative to the bitmap, so it starts at (0, 0).
Bitmap bitmap = new Bitmap(webBrowser1.Width, webBrowser1.Height);
webBrowser1.DrawToBitmap(bitmap,
    new Rectangle(0, 0, webBrowser1.Width, webBrowser1.Height));
Obvious enough. Where it gets tricky is when you want to do it in a console application that can take shots of a multitude of Websites listed in a batch file. There is a dirty way: instantiate a whole form, show it (or not), do the work and then exit the WinForms app. That would probably be enough for a quick solution, but I wanted a clean piece of code, and I would NOT take pride in something along those lines.
How is it done then...
So we instantiate the Web control in our class constructor...
public WebPageBitmap(string url, int width, int height, bool scrollBarsEnabled)
{
    this.url = url;
    this.width = width;
    this.height = height;
    webBrowser = new WebBrowser();
    webBrowser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(documentCompletedEventHandler);
    webBrowser.Size = new Size(width, height);
    webBrowser.ScrollBarsEnabled = scrollBarsEnabled;
}
Easy so far, and pretty similar to what a regular WinForms app would do anyway. The documentCompletedEventHandler is a delegate the control invokes to signal that the page has loaded. (I initially wanted to draw the bitmap there, but after I added the resizing part I deferred that to the point where the bitmap is actually fetched.) Now comes the interesting part.
The Neat Part
Since the call is asynchronous, a simple webBrowser.Navigate(url); just won't cut it. We are in a single thread, and the browser does not create a separate thread for loading. This makes sense given the canonical Windows rule: only the thread that creates a control may access it. We need to somehow let the control take over the flow of the thread and do its work. Navigate only tells the control that it should perform the action and returns immediately. It is then the developer's responsibility to know when the control is ready for consumption, which is the case when webBrowser.ReadyState progresses to (or returns to) WebBrowserReadyState.Complete.
The Solution
To pass the flow to the app's controls, you need to call Application.DoEvents(); which was a bit of a wild guess when I tried it. Surprise, surprise: it works just like it did in other Windows frameworks I have used before.
public void Fetch()
{
    webBrowser.Navigate(url);
    // Pump the message queue until the control reports that the page is loaded.
    while (webBrowser.ReadyState != WebBrowserReadyState.Complete)
    {
        Application.DoEvents();
    }
}
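The loop above waits forever if a page never finishes loading, and it spins the CPU while it waits. The following is only a rough sketch of one way to guard against that, not the code shipped with the article: NavigateAndWait, its timeout parameter and the 10 ms sleep are my own assumptions. It also shows the [STAThread] attribute on Main, which a console app needs because the WebBrowser control wraps an ActiveX component and must be created on a single-threaded apartment thread. The sketch assumes a freshly created control.

```csharp
using System;
using System.Diagnostics;
using System.Drawing;
using System.Threading;
using System.Windows.Forms;

static class FetchSketch
{
    // Hypothetical helper (not part of the article's source): navigates and
    // waits for the page to load, giving up once the timeout has elapsed.
    public static bool NavigateAndWait(WebBrowser browser, string url, TimeSpan timeout)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        browser.Navigate(url);
        while (browser.ReadyState != WebBrowserReadyState.Complete)
        {
            if (elapsed.Elapsed > timeout)
            {
                return false; // the page did not finish loading in time
            }
            Application.DoEvents(); // let the control process its pending messages
            Thread.Sleep(10);       // assumed pause so the loop does not hog the CPU
        }
        return true;
    }

    // The thread that creates the WebBrowser control must be an STA thread.
    [STAThread]
    static void Main(string[] args)
    {
        using (WebBrowser browser = new WebBrowser())
        {
            browser.Size = new Size(800, 600);
            bool loaded = NavigateAndWait(browser, "http://www.cognifide.com/", TimeSpan.FromSeconds(30));
            Console.WriteLine(loaded ? "Page loaded." : "Timed out.");
        }
    }
}
```

Returning a bool rather than throwing keeps a batch run going when a single site is slow; the caller can log the failure and move on to the next URL.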
The result is a tiny and (I hope) neat app that pulls a Web page from the net and takes a screenshot of it, optionally rescaling it.
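For readers who want to see the drawing and rescaling step in one place, here is a minimal sketch of how it could be done with System.Drawing. It is not the article's own implementation: SaveThumbnail and its parameters are hypothetical names, and the choice of bicubic interpolation is an assumption. It expects a browser control that has already finished loading its page.

```csharp
using System.Drawing;
using System.Drawing.Drawing2D;
using System.Drawing.Imaging;
using System.Windows.Forms;

static class ThumbnailSketch
{
    // Hypothetical helper: renders an already loaded browser control,
    // scales the image to the requested size and saves it as a JPEG file.
    public static void SaveThumbnail(WebBrowser browser, int thumbWidth, int thumbHeight, string path)
    {
        using (Bitmap full = new Bitmap(browser.Width, browser.Height))
        {
            // Draw the control at the bitmap's origin, at its full size.
            browser.DrawToBitmap(full, new Rectangle(0, 0, browser.Width, browser.Height));

            using (Bitmap thumb = new Bitmap(thumbWidth, thumbHeight))
            {
                using (Graphics g = Graphics.FromImage(thumb))
                {
                    // Assumed quality setting; slower but smoother than the default.
                    g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                    g.DrawImage(full, 0, 0, thumbWidth, thumbHeight);
                }
                thumb.Save(path, ImageFormat.Jpeg);
            }
        }
    }
}
```

Note that this stretches the page to whatever thumbnail size is requested, so picking thumbnail dimensions with the same aspect ratio as the browser size avoids distortion.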
You can get the source code or get the app directly. App usage:
GetSiteThumbnail.exe http://www.yoursite.com/ thumbnail.jpg
[browser_width (defaults to 800) browser_height (defaults to 600)]
[thumbnail_width thumbnail_height]
Sample:
GetSiteThumbnail.exe http://www.cognifide.com/ cognifide.jpg 1280 1024 640 480
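The command line above could be parsed along these lines. This is a sketch under assumptions, not the parser from the downloadable source: ThumbnailOptions and Parse are hypothetical names, and falling back to the browser size when no thumbnail size is given is my guess at a sensible default. Only the 800 and 600 defaults come from the usage text.

```csharp
using System;

// Hypothetical holder for the parsed command-line options.
sealed class ThumbnailOptions
{
    public string Url;
    public string FileName;
    public int BrowserWidth = 800;   // default from the usage text
    public int BrowserHeight = 600;  // default from the usage text
    public int ThumbnailWidth;
    public int ThumbnailHeight;
}

static class ArgsSketch
{
    public static ThumbnailOptions Parse(string[] args)
    {
        if (args.Length < 2)
        {
            throw new ArgumentException("Expected at least a URL and an output file name.");
        }

        ThumbnailOptions options = new ThumbnailOptions();
        options.Url = args[0];
        options.FileName = args[1];

        // Optional browser size: both values must be present.
        if (args.Length >= 4)
        {
            options.BrowserWidth = int.Parse(args[2]);
            options.BrowserHeight = int.Parse(args[3]);
        }

        // Assumption: without explicit thumbnail dimensions, keep the browser size.
        options.ThumbnailWidth = options.BrowserWidth;
        options.ThumbnailHeight = options.BrowserHeight;
        if (args.Length >= 6)
        {
            options.ThumbnailWidth = int.Parse(args[4]);
            options.ThumbnailHeight = int.Parse(args[5]);
        }
        return options;
    }

    static void Main(string[] args)
    {
        ThumbnailOptions o = Parse(args);
        Console.WriteLine("{0} -> {1}, browser {2}x{3}, thumbnail {4}x{5}",
            o.Url, o.FileName, o.BrowserWidth, o.BrowserHeight,
            o.ThumbnailWidth, o.ThumbnailHeight);
    }
}
```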