Capturing Entire Content of an Internet Explorer Window

Kaveh Yazdi Nezhad

14 Jun 2014, CPOL

A simple way to capture the entire HTML of a web page that is currently displayed in Internet Explorer

Introduction

Normally, when you need to capture the contents of a web page, the .NET WebBrowser control is enough. Recently, however, I ran into a different situation: the page loaded its data through Ajax and JavaScript, so the WebBrowser component could not capture everything and showed only the static parts of the page.

Because Internet Explorer displayed everything correctly, I decided to look for a way to capture the live content while the page is displayed in Internet Explorer. After searching the web and working through a small programming error, I adapted some existing code for this purpose.

It works fine for me, so it may be useful to someone else too.

Using the Code

Open an Internet Explorer window, go to the website you want to capture, and then run the following code. To keep the example simple, I assume the goal is to put the HTML of a specific web page into a text box.

Required references: MSHTML, SHDocVw

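using System.IO;   // needed for Path
using System.Runtime.InteropServices;
using mshtml;
using System.Reflection;
.
.
.

// Enumerate all open shell windows (Internet Explorer and Windows Explorer).
// Creating the ShellWindows interface (not ShellWindowsClass) also compiles
// when "Embed Interop Types" is enabled for the SHDocVw reference.
SHDocVw.ShellWindows shellWindows = new SHDocVw.ShellWindows();
string filename;
foreach (SHDocVw.InternetExplorer ie in shellWindows)
{
    // Keep only windows hosted by iexplore.exe that show the target page.
    filename = Path.GetFileNameWithoutExtension(ie.FullName).ToLower();
    if (filename.Equals("iexplore") && ie.Name.Equals("Internet Explorer") &&
        ie.LocationURL.Equals("http://www.mysite.com"))
    {
        // Late binding through dynamic reads the live DOM, including
        // content that scripts added after the page was loaded.
        dynamic d = ie;
        // ActiveElement is the element that has focus, normally the body.
        object v = d.Document.ActiveElement.InnerHTML;
        textBox1.Text = v.ToString();
    }
}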
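
One caveat about the exact string comparison on LocationURL: the browser may report the address in a normalized form (for example with a trailing slash, as in "http://www.mysite.com/"), in which case Equals would not match. The following is only a sketch of a more tolerant comparison. UrlMatcher and SamePage are hypothetical names that are not part of the original code, and treating the path as case-insensitive is an assumption that may be too lenient for some sites.

```csharp
using System;

static class UrlMatcher
{
    // Hypothetical helper (not from the article): returns true when both
    // strings are absolute URLs with the same scheme, host, port, path
    // and query. Letter case and the fragment (#...) are ignored.
    public static bool SamePage(string left, string right)
    {
        Uri a;
        Uri b;
        if (!Uri.TryCreate(left, UriKind.Absolute, out a) ||
            !Uri.TryCreate(right, UriKind.Absolute, out b))
        {
            return false; // not a usable absolute URL
        }

        // System.Uri normalizes an empty HTTP path to "/", so
        // "http://www.mysite.com" and "http://www.mysite.com/" compare equal.
        return Uri.Compare(a, b,
                           UriComponents.HttpRequestUrl,
                           UriFormat.Unescaped,
                           StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

With such a helper, the last condition of the if statement could be written as UrlMatcher.SamePage(ie.LocationURL, "http://www.mysite.com").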

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)