Let's customize Google to show previous searches

8 May 2012 | CPOL | 11 min read
The application adds a custom button and a few anchor elements to the Google page; each anchor links to a previously searched text on Google.

Image 1 

Introduction

When I Google something, there is a good chance that I'll search for the same thing again a few days later to recall what I learned the first time. I usually open Internet Explorer and look through its history to find the keywords I used, then run the search again. This works, but one day I wondered why Google doesn't show recently searched text on its own page; doing so would make things much easier for its users. Since I can't approach Google with the idea, I decided to customize Google on my own machine so that it shows me what I searched for the last few times.

How is it done

This is done by modifying Google's HTML each time it is loaded in a browser window and adding a few HTML elements to it. A button with the text "I'm Feeling Really Good" is added to the page next to the standard Google buttons ("Google Search" and "I'm Feeling Lucky"). When this button is clicked, all the text that has been searched in the past is rendered as HTML in a span element of the Google page (the red text in the image above). When a new search is performed, the search query is read from the Google URL in the address bar and stored as a history record (the Internet Explorer history plays no role here).

Components Used

This application uses the following components:

  1. SHDocVw.dll - This component is hosted directly by Internet Explorer and provides the functionality associated with navigation, in-place linking, favorites and history management. SHDocVw also exposes interfaces to its host, which allows it to be hosted separately as an ActiveX control. The SHDocVw.dll component is also referred to as the WebBrowser control. It also hosts the MSHTML.dll component.
  2. MSHTML.dll - The MSHTML.dll component is responsible for HTML parsing and rendering in browser (Internet Explorer 4.0 or later) and provides access to the loaded HTML document through DOM. It also hosts Scripting Engines, ActiveX controls, plug-ins and all other objects which could be referenced by the HTML document.

For detailed reading on these components and on how to choose one of the components read this.

Understanding the Google HTML Structure

Before a website's HTML can be changed, it has to be understood. That means knowing which HTML elements on the page the custom elements will be added to.

The source code of any page can be viewed in the web browser. Many browsers (such as Internet Explorer 9.0) also offer tools that navigate straight to the HTML of a visible element on the page. Here I'll explain the Google HTML structure using the Internet Explorer 9.0 Developer Tools.

First, let's look at the parent element of the buttons "Google Search" and "I'm Feeling Lucky". The buttons are named "btnK" and "btnI" respectively. The custom button will be added as the last child of the parent of these button elements. To make the custom button look like Google's buttons, it is enough to set its type property to submit, because Google's CSS applies to all input elements whose type property is "submit".

Image 2

The previous search texts will be shown as anchor elements, each linking to Google with the same query as last time. These anchor elements need to be displayed inside some element of the Google page. The bottom part of the page, which is a large empty white area, will be used for this.

LetsCustomizeGoogle/footer_element.png

The div element has "footer" as its ID property. To display the anchor elements, a span element is added as a child of this div, and the anchor elements are then added to the span. No styles need to be added to the anchor elements; Google's CSS already styles all anchor elements that are children of the footer div.

How it all works

Before going into the code, it is worth looking at the application flow shown below. Pay special attention to the DocumentComplete event of the window and the OnClick event of the custom button.

LetsCustomizeGoogle/FlowDiagram2.png

Register all windows

When the application starts, the Main method creates a new instance of the SHDocVw.ShellWindows class. ShellWindows is a collection of all open windows that belong to the shell, which includes both Internet Explorer and Windows Explorer windows. The ShellWindows class exposes two events: WindowRegistered, which is fired whenever a new window instance is created, and WindowRevoked, which is fired whenever a window instance is closed. For our purpose we add an event handler for the WindowRegistered event so that every new instance of Internet Explorer can be captured.

C#
//This will be used to detect a new window.

ExplorerWindow = new SHDocVw.ShellWindows();
ExplorerWindow.WindowRegistered += 
  new DShellWindowsEvents_WindowRegisteredEventHandler(ExplorerWindow_WindowRegistered);

When a new instance of Internet Explorer is created, the WindowRegistered event is fired. The handler for this event enumerates all the windows in the ShellWindows collection and casts each one to the InternetExplorer class of SHDocVw.dll. For every window that does not yet have the custom property "IsRegistered", the property is set and an event handler is added to the window's DocumentComplete event. The "IsRegistered" property ensures that the handler is added to each window's DocumentComplete event only once.

C#
SHDocVw.ShellWindows AllWindows = new ShellWindows();

//Check if a window is already registered if not then register it
//and wait for the document complete event to get fired

foreach (object Window in AllWindows)
{
    SHDocVw.InternetExplorer IE = (InternetExplorer)Window;
    if (IE.GetProperty("IsRegistered") == null)
    {
        IE.PutProperty("IsRegistered", true);

        IE.DocumentComplete += new DWebBrowserEvents2_DocumentCompleteEventHandler(IE_DocumentComplete);
    }
}

The DocumentComplete event is fired when the readyState property of the window changes to READYSTATE_COMPLETE, which happens once the browser has finished downloading the web page. The next step is to make sure that the loaded page is actually Google. The simplest way to do this is to look at the LocationURL property of the InternetExplorer class, which holds the location of the page currently displayed, and test it with a regular expression.

Once it is known that the loaded page is Google, the next step is to create the elements and add them to the page. The page could also be a Google search results page rather than the home page; in that case the search query should be saved in the XML file. Below is the code snippet from the DocumentComplete event handler.

C#
//if the loaded document is Google then create required HTML Elements
//on page, save queries in XML and load results from XML to Google page

if (Doc != null && System.Text.RegularExpressions.Regex.Match(
                     IE.LocationURL, @"www\.google\..*").Success)
{
    CreateElements(Doc);
    SaveResults(URL.ToString()); 
}
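
The pattern above is unanchored, so it matches "www.google." anywhere in the URL, including inside the query string of an unrelated site. As a minimal sketch of a stricter test, assuming it is enough to inspect the host name, the URL can be parsed with System.Uri first. The class and method names here (GoogleUrlCheck, IsGoogleUrl) are hypothetical and not part of the attached source.

```csharp
using System;

static class GoogleUrlCheck
{
    // Hypothetical helper (not in the original source): returns true only when
    // the host part of an http/https URL starts with "www.google.".
    // The path and query string are ignored.
    public static bool IsGoogleUrl(string url)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri))
            return false;

        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
            return false;

        return uri.Host.StartsWith("www.google.", StringComparison.OrdinalIgnoreCase);
    }

    static void Main()
    {
        Console.WriteLine(IsGoogleUrl("http://www.google.com/search?q=test"));
        Console.WriteLine(IsGoogleUrl("http://example.com/?ref=www.google.com"));
    }
}
```

This is still a loose check, since any host that begins with "www.google." passes, but it no longer reacts to text in the path or query.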

Create Elements

The document loaded in the browser window is accessed through an instance of the MSHTML.HTMLDocument class, which holds a reference to that document. All JavaScript methods that are available to the page's scripts are also available through MSHTML, and MSHTML provides access to the page's DOM as well.

To create a new input element, the createElement method of the MSHTML.HTMLDocument class is used. The reference to this new input element is held in the MSHTML.HTMLInputElement class. The type property of the new button is set to "Submit" so that it is styled by Google's CSS, which applies to all elements whose type property is "Submit". A child span element is also added to the custom button, with its innerText property set to "I'm Feeling Really Good" (the text displayed on the button). Note that the snippet below follows Google's later markup: it creates a button element and styles it through the CSS classes "gbqfba" and "gbqfsb" instead of relying on the type property.

To add the custom button ("I'm Feeling Really Good") to the page, a reference to an existing Google button ("I'm Feeling Lucky") is found first, and the custom button is then added to its parent. As seen above (in the Understanding the Google HTML Structure section), the HTML name property of the element for the "I'm Feeling Lucky" button is "btnI". The getElementsByName method of the MSHTML.HTMLDocument class returns all HTML elements named "btnI" as an object implementing the MSHTML.IHTMLElementCollection interface. A reference to the "I'm Feeling Lucky" button can be extracted from this collection into the MSHTML.HTMLInputElement class.

After the button has been added, a span element needs to be added to the page's HTML to display the previous search queries as anchor elements. The span is added to the div element whose ID is "footer". The MSHTML.HTMLDivElement and MSHTML.HTMLSpanElement classes hold the references to the div and span elements. The display property of the span is set to none and its ID property to SpanSearchResults. The span is displayed when the custom button is clicked.

C#
//if the button doesn't exist already then create one and add
//an event handler to its onclick event
if (Doc.getElementById(ButtonId) == null)
{
    mshtml.IHTMLElementCollection LuckyButtonCollection = 
      (mshtml.IHTMLElementCollection)Doc.getElementsByName("btnI");
    
    if (LuckyButtonCollection.length <= 0) return;
    mshtml.HTMLButtonElement LuckyButton = 
      (mshtml.HTMLButtonElement)LuckyButtonCollection.item(name: Type.Missing, index: 0);

    mshtml.HTMLButtonElement CustomButton = 
      (mshtml.HTMLButtonElement)Doc.createElement("button");
    CustomButton.id = ButtonId;
    CustomButton.className = "gbqfba";

    //build a span element which will hold the button name
    mshtml.HTMLSpanElement CustomButtonText = (mshtml.HTMLSpanElement)Doc.createElement("span");
    CustomButtonText.className = "gbqfsb";
    CustomButtonText.innerText = "I'm Feeling Really Good";

    CustomButton.appendChild((mshtml.IHTMLDOMNode)CustomButtonText);

    LuckyButton.parentNode.appendChild((mshtml.IHTMLDOMNode)CustomButton);

    mshtml.HTMLButtonElementClass CustomButtonClass = (mshtml.HTMLButtonElementClass)CustomButton;
    CustomButtonClass.HTMLButtonElementEvents2_Event_onclick += 
      new HTMLButtonElementEvents2_onclickEventHandler(
      CustomButtonClass_HTMLInputTextElementEvents2_Event_onclick);
}

//create the Span element if one doesn't exist
mshtml.HTMLSpanElement Ele = (mshtml.HTMLSpanElement)Doc.getElementById(SpanSearchResults);

if (Ele == null)
{
    mshtml.HTMLDivElement ParentDiv = (mshtml.HTMLDivElement)Doc.getElementById("footer");
    
    mshtml.HTMLSpanElement TargetSpan = (mshtml.HTMLSpanElement)Doc.createElement("span");
    TargetSpan.id = SpanSearchResults;
    if (ParentDiv == null) return;
    
    ParentDiv.insertBefore((mshtml.IHTMLDOMNode)TargetSpan, ParentDiv.firstChild);

    TargetSpan.style.display = "none";
}

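//append a note to the logo text; skip this step when the logo element is missing
mshtml.HTMLDivElement LogoDiv = (mshtml.HTMLDivElement)Doc.getElementById("hplogo");
if (LogoDiv != null && LogoDiv.firstChild != null)
{
    mshtml.HTMLDivElement LogoChildDiv = (mshtml.HTMLDivElement)LogoDiv.firstChild;
    LogoChildDiv.innerText = LogoChildDiv.innerText + ". Customized By Hitesh";
}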

Button click Event Handler

Now that the elements have been added to the Google page, the next step is to attach an action to the button click. When the "I'm Feeling Really Good" button is clicked, it makes the span element ('SpanSearchResults') visible and adds anchor elements to it that display the previous search queries.

C#
//onclick of our custom button, show the span which holds the previous searches and
//reload the searches into the Google page from the saved XML.
static bool CustomButtonClass_HTMLInputTextElementEvents2_Event_onclick(IHTMLEventObj pEvtObj)
{
    mshtml.HTMLDocument doc = (mshtml.HTMLDocument)pEvtObj.srcElement.document;
    mshtml.HTMLSpanElement SearchResults = (mshtml.HTMLSpanElement)doc.getElementById("SpanSearchResults");
    SearchResults.style.display = "";
    if (!LoadResultsInGoogle(doc))
        NoResultsMessage(doc);
    return true;
}

To add anchor elements to the page, the queries are first read from the XML file and an anchor element is created for each search query. Each anchor element displays the search query and, when clicked, opens a new window with Google's results for the same query. The tooltip of the anchor element is set to the date and time at which the query was searched and saved in the application's XML file.

The structure of the XML which will store search queries is as below.

XML
<searches>
   <search text="" URL="" datetime=""/>
</searches>

Each search element corresponds to one search query. Its text attribute holds the searched text, its URL attribute holds the URL that sends the same query to Google again (usually of the form "http://www.google.com/search?q=query"), and its datetime attribute stores the date and time at which the query was searched.
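
As a minimal sketch of this structure, the following builds a document with one record using System.Xml. The class name (HistoryXmlDemo) and the sample attribute values are illustrative assumptions, not taken from the attached source.

```csharp
using System;
using System.Xml;

static class HistoryXmlDemo
{
    // Builds a <searches> document holding a single <search> record.
    // The attribute values are made-up sample data.
    public static XmlDocument BuildSample()
    {
        XmlDocument doc = new XmlDocument();
        XmlElement root = doc.CreateElement("searches");
        doc.AppendChild(root);

        XmlElement search = doc.CreateElement("search");
        search.SetAttribute("text", "shdocvw events");
        search.SetAttribute("URL", "http://www.google.com/search?q=shdocvw+events");
        search.SetAttribute("datetime", "08-05-2012 10:15:00");
        root.AppendChild(search);

        return doc;
    }

    static void Main()
    {
        // Prints the serialized document.
        Console.WriteLine(BuildSample().OuterXml);
    }
}
```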

Load Previous Search Queries in Google

To show the previous search queries on the web page, all the "search" nodes are first read from the XML file. An XPath query selects all the "search" nodes in the document.

C#
XmlNodeList Nodes = XMLDoc.SelectNodes("//search");

For each node found, an anchor element is created and added to the custom span element. The href attribute of the anchor element is set to the URL attribute of the XML "search" node, its innerText to the node's text attribute, and its title property to the string "Searched On " followed by the node's datetime attribute. The target property is set to _blank so that every query opens in a new window. Each of these anchor elements is then added to the custom span whose ID property is 'SpanSearchResults'.

C#
for (int i = 0; i < Nodes.Count; i++)
{
    XmlNode CurrNode = Nodes[i];

    if (CurrNode.Attributes["text"].Value == "")
        continue;
    string URL = CurrNode.Attributes["URL"].Value;
    mshtml.HTMLAnchorElement anch = (mshtml.HTMLAnchorElement)Doc.createElement("a");

    //insert the newest anchor at the top of the span
    if (TargetSpan.firstChild != null)
        TargetSpan.insertBefore((mshtml.IHTMLDOMNode)anch, TargetSpan.firstChild);
    else
        TargetSpan.appendChild((mshtml.IHTMLDOMNode)anch);

    anch.setAttribute("href", URL);
    anch.setAttribute("name", "searchanchors"); 
    anch.target = "_blank";
    anch.innerText = CurrNode.Attributes["text"].Value;
    anch.style.color = "#aa0000";
    anch.title = "Searched On " + CurrNode.Attributes["datetime"].Value;
}
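
The data handling in this loop can be tried without a browser. The sketch below, under the assumption that only the three values each anchor needs are of interest, reads the "search" nodes and returns href, text and title for each record. The HistoryReader and Link names are hypothetical, and the sample XML in Main is made up.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

static class HistoryReader
{
    // Hypothetical container for the three values each anchor needs.
    public sealed class Link
    {
        public string Href;
        public string Text;
        public string Title;
    }

    public static List<Link> ReadLinks(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<Link> links = new List<Link>();
        foreach (XmlElement node in doc.SelectNodes("//search"))
        {
            // GetAttribute returns "" when the attribute is missing,
            // so incomplete records are skipped instead of throwing.
            string text = node.GetAttribute("text");
            if (text == "")
                continue;

            Link link = new Link();
            link.Href = node.GetAttribute("URL");
            link.Text = text;
            link.Title = "Searched On " + node.GetAttribute("datetime");
            links.Add(link);
        }
        return links;
    }

    static void Main()
    {
        string xml = "<searches>" +
            "<search text=\"c# regex\" URL=\"http://www.google.com/search?q=c%23+regex\" datetime=\"08-05-2012 10:15:00\"/>" +
            "<search text=\"\" URL=\"\" datetime=\"\"/>" +
            "</searches>";

        foreach (Link link in ReadLinks(xml))
            Console.WriteLine(link.Text + " -> " + link.Href + " (" + link.Title + ")");
    }
}
```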

Saving New Queries in XML

Once you have opened Google and searched for something, the search query should be saved in the XML file as a history record so that it can be displayed the next time Google is opened. In Google's URL the search query is carried in the parameter "q", in the form "q=queried+text". Only the part after the = sign that follows "q" and before the next & (which marks the start of a new parameter) needs to be extracted. This can be done with a regular expression, as shown in the code below.

C#
System.Text.RegularExpressions.Regex reg = 
   new System.Text.RegularExpressions.Regex(@"q=(?<query>[^&]+)");
System.Text.RegularExpressions.Match mt = reg.Match(URL);

string query = "";
string text = "";
if (mt.Success)
{
    query = mt.Result("${query}");
}
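
One thing to watch for: the pattern "q=" also matches the tail of any parameter whose name merely ends in "q" (for example a hypothetical "oq=..."), if that parameter appears earlier in the URL. A minimal sketch of a stricter pattern, assuming the parameter may follow "?", "&" or "#", is shown below. QueryExtractor and ExtractRawQuery are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

static class QueryExtractor
{
    // (?:^|[?&#]) requires "q" to be the start of a parameter name,
    // so names that only end in "q" are not picked up.
    static readonly Regex QueryParam =
        new Regex(@"(?:^|[?&#])q=(?<query>[^&#]+)");

    // Returns the raw (still URL-encoded) value of "q", or "" when absent.
    public static string ExtractRawQuery(string url)
    {
        Match m = QueryParam.Match(url);
        return m.Success ? m.Groups["query"].Value : "";
    }

    static void Main()
    {
        Console.WriteLine(ExtractRawQuery(
            "http://www.google.com/search?oq=old&q=new+text&hl=en"));
    }
}
```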

When text is searched on Google, the spaces are replaced by a '+' sign in the URL. So every '+' should be replaced by a space character, and every URL-encoded (percent-encoded) character should be replaced by the character it represents.

C#
text = query.Replace("+", " ");

//find the URL-encoded (%XX) characters in the query text and replace them
//with the character they represent; XX must be two hexadecimal digits
System.Text.RegularExpressions.Regex HexReg = 
  new System.Text.RegularExpressions.Regex(@"(?<val>%(?<hex>[0-9A-Fa-f]{2}))");

System.Text.RegularExpressions.Match HexMatch = HexReg.Match(text);
while (HexMatch.Success)
{
  text = text.Replace(HexMatch.Result("${val}"), 
      char.ConvertFromUtf32(int.Parse(HexMatch.Result("${hex}"), 
      System.Globalization.NumberStyles.HexNumber)));
  HexMatch = HexMatch.NextMatch();
}
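
The loop above decodes one %XX escape at a time and treats each as a separate character, so a multi-byte UTF-8 sequence such as %C3%A9 does not come out as a single character. As an alternative sketch, not the approach used in the attached source, the framework's Uri.UnescapeDataString can do the decoding. QueryDecoder and DecodeQuery are hypothetical names.

```csharp
using System;

static class QueryDecoder
{
    // Turns the raw value of the "q" parameter into readable text.
    public static string DecodeQuery(string rawQuery)
    {
        // '+' means space in a query string; convert it first so that an
        // encoded plus sign (%2B) survives as a literal '+'.
        return Uri.UnescapeDataString(rawQuery.Replace("+", " "));
    }

    static void Main()
    {
        Console.WriteLine(DecodeQuery("c%23+regex"));
        Console.WriteLine(DecodeQuery("1%2B1"));
    }
}
```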

Now that the searched text has been retrieved from the URL, the XML file is checked to see whether this text has already been saved. If it has not, a new "search" node is created and saved under the root node of the XML.

C#
XmlDocument xmlDoc = LoadXML();

//look for an existing record with the same text
XmlNodeList ExistingNodes = 
  xmlDoc.SelectNodes("//search[@text='" + text + "']");

//prepare the query and save it in XML
if (ExistingNodes.Count > 0) return false;
XmlElement searchEle = xmlDoc.CreateElement("search");
searchEle.SetAttribute("text", text);
searchEle.SetAttribute("URL", "http://www.google.com/search?q=" + query);
searchEle.SetAttribute("datetime", DateTime.Now.ToString());

XmlNode root = xmlDoc.DocumentElement;
if (root != null)
    root.AppendChild(searchEle);
else
    xmlDoc.AppendChild(searchEle);
SaveXML(xmlDoc);
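
Building the XPath expression by string concatenation fails when the searched text contains an apostrophe, because the quote ends the XPath string literal early. A minimal sketch of one way around this, assuming the history file is small enough to scan, compares the attribute values in code instead. HistoryStore, ContainsSearch and AddSearch are hypothetical names, and loading and saving the file are left out.

```csharp
using System;
using System.Xml;

static class HistoryStore
{
    // Returns true when a <search> element with exactly this text exists.
    public static bool ContainsSearch(XmlDocument doc, string text)
    {
        foreach (XmlElement node in doc.SelectNodes("//search"))
        {
            if (node.GetAttribute("text") == text)
                return true;
        }
        return false;
    }

    // Adds the record unless it is already present; returns true if added.
    public static bool AddSearch(XmlDocument doc, string text, string rawQuery)
    {
        if (ContainsSearch(doc, text))
            return false;

        XmlElement search = doc.CreateElement("search");
        search.SetAttribute("text", text);
        search.SetAttribute("URL", "http://www.google.com/search?q=" + rawQuery);
        search.SetAttribute("datetime", DateTime.Now.ToString());

        // Assumes the document already has a root element such as <searches>.
        doc.DocumentElement.AppendChild(search);
        return true;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<searches/>");

        // The second call finds the existing record and adds nothing.
        Console.WriteLine(AddSearch(doc, "what's new", "what%27s+new"));
        Console.WriteLine(AddSearch(doc, "what's new", "what%27s+new"));
    }
}
```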

The next time Google is opened, this newly saved query will be displayed on the Google page and can be searched again by clicking the anchor element that represents it.

How to Use

To use the application, put a shortcut to the executable in the Windows startup folder. Each time Windows starts, the executable runs without a UI and customizes every Google page that is opened in Internet Explorer.

During development, the application can be tested by running it from the build and then opening Google in a new IE window.

Points of Interest

The same approach can be applied to customize many other websites on the fly. The only requirements are a little understanding of the target website's HTML and knowing how to use SHDocVw.dll and MSHTML.dll.

History

  • 16-02-2012 - Updated attached code which was modified as per changed Google markup.
  • 06-05-2012 - Updated code and images as per the updated Google's markup.
  • 08-05-2012 - Images were not getting displayed in new version. Issue resolved 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)