Introduction
In this article, I show how to capture an HTML document as an image using a WebBrowser object and the IViewObject.Draw method, which, according to MSDN, draws a representation of an object onto the specified device context. Before we get started, I want to mention that the results I obtained were identical to those produced by commercial libraries, so I hope this will be useful to someone.
The IViewObject interface
The first thing we must do is define the IViewObject interface.
Imports System.Runtime.InteropServices
Imports System.Runtime.InteropServices.ComTypes
Imports System.Drawing

' Managed declaration of the COM IViewObject interface.
' The methods must stay in this order because it matches the native vtable layout.
<ComVisible(True), ComImport> _
<GuidAttribute("0000010d-0000-0000-C000-000000000046")> _
<InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)> _
Public Interface IViewObject
    ' The only method used in this article; it returns an HRESULT.
    <PreserveSig()> _
    Function Draw(<MarshalAs(UnmanagedType.U4)> dwDrawAspect As UInt32, lindex As Integer, _
        pvAspect As IntPtr, <[In]()> ptd As IntPtr, hdcTargetDev As IntPtr, hdcDraw As IntPtr, _
        <MarshalAs(UnmanagedType.Struct)> ByRef lprcBounds As Rectangle, _
        <MarshalAs(UnmanagedType.Struct)> ByRef lprcWBounds As Rectangle, _
        pfnContinue As IntPtr, <MarshalAs(UnmanagedType.U4)> dwContinue As UInt32) _
        As <MarshalAs(UnmanagedType.I4)> Integer

    <PreserveSig()> _
    Function GetColorSet(<[In](), MarshalAs(UnmanagedType.U4)> dwDrawAspect As Integer, _
        lindex As Integer, pvAspect As IntPtr, <[In]()> ptd As IntPtr, _
        hicTargetDev As IntPtr, <Out()> ppColorSet As IntPtr) As Integer

    <PreserveSig()> _
    Function Freeze(<[In](), MarshalAs(UnmanagedType.U4)> dwDrawAspect As Integer, _
        lindex As Integer, pvAspect As IntPtr, <Out()> pdwFreeze As IntPtr) As Integer

    <PreserveSig()> _
    Function Unfreeze(<[In](), MarshalAs(UnmanagedType.U4)> dwFreeze As Integer) As Integer

    Sub SetAdvise(<[In](), MarshalAs(UnmanagedType.U4)> aspects As Integer, <[In](), _
        MarshalAs(UnmanagedType.U4)> advf As Integer, <[In](), _
        MarshalAs(UnmanagedType.[Interface])> pAdvSink As IAdviseSink)

    Sub GetAdvise(<[In](), Out(), MarshalAs(UnmanagedType.LPArray)> paspects As Integer(), _
        <[In](), Out(), MarshalAs(UnmanagedType.LPArray)> advf As Integer(), _
        <[In](), Out(), MarshalAs(UnmanagedType.LPArray)> pAdvSink As IAdviseSink())
End Interface
Below is a summary of the parameters that the Draw method takes (this is the only method we will use):
dwDrawAspect As UInt32 - specifies the aspect to be drawn. Valid values are taken from the DVASPECT and DVASPECT2 enumerations. In this example, I'm using DVASPECT_CONTENT, so the value passed is 1.
lindex As Integer - the portion of the object that is of interest for the draw operation. Currently, only -1 is supported.
pvAspect As IntPtr - pointer to additional information (not used here, so IntPtr.Zero is passed).
ptd As IntPtr - describes the device for which the object is to be rendered. We render for the default target device, so the value passed is IntPtr.Zero.
hdcTargetDev As IntPtr - information context for the target device indicated by the ptd parameter (IntPtr.Zero here, because ptd is IntPtr.Zero).
hdcDraw As IntPtr - device context on which to draw.
ByRef lprcBounds As Rectangle - the rectangle on hdcDraw into which the object is drawn; it determines the size of the captured image.
ByRef lprcWBounds As Rectangle - the region of the WebBrowser object that we want to capture.
pfnContinue As IntPtr - pointer to a callback function (not used here).
dwContinue As UInt32 - value to pass as a parameter to the callback function (not used here).
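The literal 1 and the two rectangle arguments are easier to reason about when they are named. The following is a minimal sketch and not part of the original code: it relies on the DVASPECT enumeration that ships in System.Runtime.InteropServices.ComTypes, while Rectl, DrawArguments, ContentAspect and ToRectl are hypothetical names I introduce for illustration. It shows how a System.Drawing.Rectangle (X, Y, Width, Height) maps onto the left, top, right, bottom layout of the native RECTL structure.

```vbnet
Imports System.Drawing
Imports System.Runtime.InteropServices
Imports System.Runtime.InteropServices.ComTypes

' Hypothetical helper type: four 32-bit integers in the order used by the native RECTL.
<StructLayout(LayoutKind.Sequential)> _
Public Structure Rectl
    Public Left As Integer
    Public Top As Integer
    Public Right As Integer
    Public Bottom As Integer
End Structure

Public Module DrawArguments
    ' DVASPECT_CONTENT has the value 1, the same literal passed to Draw in this article.
    Public ReadOnly ContentAspect As UInteger = CUInt(DVASPECT.DVASPECT_CONTENT)

    ' Converts position-plus-size into the edge coordinates stored by RECTL.
    Public Function ToRectl(r As Rectangle) As Rectl
        Dim result As Rectl
        result.Left = r.Left
        result.Top = r.Top
        result.Right = r.Right    ' X + Width
        result.Bottom = r.Bottom  ' Y + Height
        Return result
    End Function
End Module
```

For a rectangle located at (0, 0), Right equals Width and Bottom equals Height, so the two layouts hold the same four numbers. A rectangle with any other origin would need a conversion along these lines.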
The HtmlCapture class
Now that we have defined the IViewObject interface, it is time to create a class that captures a web page as an image.
Imports System
Imports System.Windows.Forms
Imports System.Drawing

Public Class HtmlCapture
    Private _Web As WebBrowser
    Private _Timer As Timer
    Private _Screen As Rectangle
    Private _ImgSize As System.Nullable(Of Size) = Nothing

    Public Delegate Sub HtmlCaptureEvent(sender As Object, url As Uri, image As Bitmap)
    Public Event HtmlImageCapture As HtmlCaptureEvent

    Public Sub New()
        _Web = New WebBrowser()
        _Timer = New Timer()
        ' Capture two seconds after DocumentCompleted, unless a new navigation starts first.
        _Timer.Interval = 2000
        _Screen = Screen.PrimaryScreen.Bounds
        _Web.Width = _Screen.Width
        _Web.Height = _Screen.Height
        _Web.ScriptErrorsSuppressed = True
        _Web.ScrollBarsEnabled = False
        AddHandler _Web.Navigating, AddressOf web_Navigating
        AddHandler _Web.DocumentCompleted, AddressOf web_DocumentCompleted
        AddHandler _Timer.Tick, AddressOf tready_Tick
    End Sub

#Region "Public methods"
    ' Captures the page at its full size.
    Public Sub Create(url As String)
        _ImgSize = Nothing
        _Web.Navigate(url)
    End Sub

    ' Captures the page scaled into an image of the given size.
    Public Sub Create(url As String, imgsz As Size)
        _ImgSize = imgsz
        _Web.Navigate(url)
    End Sub
#End Region

#Region "Events"
    Private Sub web_DocumentCompleted(sender As Object, e As WebBrowserDocumentCompletedEventArgs)
        _Timer.Start()
    End Sub

    Private Sub web_Navigating(sender As Object, e As WebBrowserNavigatingEventArgs)
        _Timer.[Stop]()
    End Sub

    Private Sub tready_Tick(sender As Object, e As EventArgs)
        _Timer.[Stop]()

        ' Nothing to capture if the document or its body is missing.
        If _Web.Document Is Nothing OrElse _Web.Document.Body Is Nothing Then Return

        ' TryCast returns Nothing if the document does not expose IViewObject.
        Dim ivo As IViewObject = TryCast(_Web.Document.DomDocument, IViewObject)
        If ivo Is Nothing Then Return

        ' Resize the browser to the full document so the whole page is rendered.
        Dim body As Rectangle = _Web.Document.Body.ScrollRectangle
        Dim docRectangle As New Rectangle(0, 0, _
            Math.Max(body.Width, _Screen.Width), _
            Math.Max(body.Height, _Screen.Height))
        _Web.Width = docRectangle.Width
        _Web.Height = docRectangle.Height

        ' Use the requested image size if there is one, otherwise the document size.
        Dim imgRectangle As Rectangle
        If _ImgSize.HasValue Then
            imgRectangle = New Rectangle(Point.Empty, _ImgSize.Value)
        Else
            imgRectangle = docRectangle
        End If

        Dim bitmap As New Bitmap(imgRectangle.Width, imgRectangle.Height)
        Using g As Graphics = Graphics.FromImage(bitmap)
            Dim hdc As IntPtr = g.GetHdc()
            Try
                ' Both rectangles start at (0, 0), so Rectangle's X, Y, Width, Height hold
                ' the same values as the left, top, right, bottom fields of the native RECTL.
                ivo.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, hdc, _
                    imgRectangle, docRectangle, IntPtr.Zero, 0)
            Finally
                ' Always release the device context, even if Draw throws.
                g.ReleaseHdc(hdc)
            End Try
        End Using

        ' The subscriber receives the bitmap and is responsible for it from here on.
        RaiseEvent HtmlImageCapture(Me, _Web.Url, bitmap)
    End Sub
#End Region
End Class
As you can see, I'm using a Timer object to decide when the HTML document is fully loaded and can be captured. I do this because a single page can raise the DocumentCompleted event multiple times (for example, once for each frame it contains). DocumentCompleted starts the timer and Navigating stops it, so the tready_Tick method runs only when two seconds pass after a DocumentCompleted event without a new navigation starting.
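The rule the timer implements can be stated without any UI code: a completed load arms a countdown, a new navigation cancels it, and the capture is allowed only once the countdown runs out. The sketch below isolates that rule so it can be read and tested on its own. QuietPeriod and its members are hypothetical names used for illustration; they do not appear in HtmlCapture, which gets the same effect from Timer.Start and Timer.Stop.

```vbnet
Imports System

' Hypothetical illustration of the start/stop rule used by the Timer in HtmlCapture.
Public Class QuietPeriod
    Private ReadOnly _interval As TimeSpan
    Private _armedAt As Nullable(Of DateTime) = Nothing

    Public Sub New(interval As TimeSpan)
        _interval = interval
    End Sub

    ' Corresponds to DocumentCompleted: start the countdown if it is not already running.
    Public Sub Arm(currentTime As DateTime)
        If Not _armedAt.HasValue Then _armedAt = currentTime
    End Sub

    ' Corresponds to Navigating: a new load has begun, so cancel the countdown.
    Public Sub Cancel()
        _armedAt = Nothing
    End Sub

    ' True once the interval has passed since Arm with no Cancel in between.
    Public Function IsElapsed(currentTime As DateTime) As Boolean
        Return _armedAt.HasValue AndAlso (currentTime - _armedAt.Value) >= _interval
    End Function
End Class
```

The two-second interval is a trade-off: a shorter value captures sooner but risks catching a page whose frames are still loading, and a longer one is safer but slower.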
Using the code
HtmlCapture has an overloaded method named Create. If you use Create(url As String), the image will be the same size as the HTML document (or the size of the screen, if the document is smaller). If you want to create a thumbnail image of the HTML document, use Create(url As String, imgsz As Size).
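Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
    Dim hc As New HtmlCapture()
    AddHandler hc.HtmlImageCapture, AddressOf hc_HtmlImageCapture
    hc.Create("http://www.codeproject.com")
End Sub

Private Sub hc_HtmlImageCapture(sender As Object, url As Uri, image As Bitmap)
    ' OutputDirectory is assumed to be a String field, defined elsewhere, that holds the target folder.
    ' The format is passed explicitly: Save(path) alone would write PNG data into the .bmp file.
    image.Save(System.IO.Path.Combine(OutputDirectory, url.Authority & ".bmp"), _
        System.Drawing.Imaging.ImageFormat.Bmp)
    image.Dispose()
    Process.Start(OutputDirectory)
End Sub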
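Because the thumbnail overload of Create draws the whole document into whatever size is requested, a thumbnail whose proportions differ from the page will probably look stretched. One way around this, offered as a sketch rather than as part of the original class, is to compute a size that fits inside the requested box while keeping the document's aspect ratio. ThumbnailSizing and FitWithin are hypothetical names, and the example assumes the document size is already known (for instance, from an earlier full-size capture).

```vbnet
Imports System
Imports System.Drawing

Public Module ThumbnailSizing
    ' Returns the largest size that fits inside maxSize and keeps the aspect ratio of source.
    ' Integer arithmetic is used throughout so the result does not depend on rounding of doubles.
    Public Function FitWithin(source As Size, maxSize As Size) As Size
        If source.Width <= 0 OrElse source.Height <= 0 Then
            Throw New ArgumentException("source must have positive dimensions", "source")
        End If

        ' Compare maxW / srcW with maxH / srcH by cross-multiplying (Long avoids overflow).
        Dim widthLimited As Boolean = _
            CLng(maxSize.Width) * source.Height <= CLng(maxSize.Height) * source.Width

        If widthLimited Then
            Dim h As Integer = CInt((CLng(maxSize.Width) * source.Height) \ source.Width)
            Return New Size(maxSize.Width, Math.Max(1, h))
        Else
            Dim w As Integer = CInt((CLng(maxSize.Height) * source.Width) \ source.Height)
            Return New Size(Math.Max(1, w), maxSize.Height)
        End If
    End Function
End Module
```

The result can then be passed as the imgsz argument, for example hc.Create(url, ThumbnailSizing.FitWithin(documentSize, New Size(320, 240))), where documentSize is whatever size you measured for the page.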