Internet web macros in C#

4 May 2004
Write web macro agents with plugin libraries for data processing

Introduction

Sometimes an application needs to retrieve information from the web, or submit information to it, but you don't want to write a full library for that. You would rather focus on your specific need and assume the information is already available as HTML or as an mshtml DOM.

KUMO was written for this case. A macro can call your own objects, defined as plugins, and a web macro can be exported as a DLL or an .EXE.

Background

In 1998, Compaq introduced WebL, a language for automating actions on the web (http://research.compaq.com/SRC/WebL/). The project has since stopped, and a few commercial products and Java frameworks now provide web automation features. Unfortunately, nothing serious has appeared for .NET.

Using the code

The code is based on the KUMO web macro methodology: a web macro is written in modified C#. The modification is the ## prefix. An instruction that starts with ## tells the macro to wait until the browser has finished its other work before moving on. A ## instruction also does not need a declared type for its result.

A web macro uses three objects: SPBrowser, SPBrowserObject and SPBrowserCollection. SPBrowser represents the current browser, SPBrowserObject wraps an mshtml.IHTMLElement object, and SPBrowserCollection is an array of SPBrowserObject.
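One way to picture a ## instruction is as a helper that blocks until the browser is idle and only then runs the step. The sketch below is not KUMO's implementation: FakeBrowser, MacroRuntime and WaitThenRun are hypothetical names, the page load is simulated with a short sleep, and I assume the wait happens before the step runs.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a browser; not a KUMO or mshtml type.
class FakeBrowser
{
    private int pendingLoads;
    public string CurrentUrl = "";

    // True while at least one simulated page load is still running.
    public bool IsBusy { get { return Volatile.Read(ref pendingLoads) > 0; } }

    // Starts a simulated asynchronous page load and returns immediately.
    public void GoToUrl(string url)
    {
        Interlocked.Increment(ref pendingLoads);
        ThreadPool.QueueUserWorkItem(delegate
        {
            Thread.Sleep(50);                       // pretend the page takes time to load
            CurrentUrl = url;
            Interlocked.Decrement(ref pendingLoads);
        });
    }
}

static class MacroRuntime
{
    // Blocks until the browser is idle, then runs the step and returns its result.
    public static T WaitThenRun<T>(FakeBrowser browser, Func<T> step)
    {
        while (browser.IsBusy) Thread.Sleep(10);    // simple polling, enough for a sketch
        return step();
    }
}
```

With this helper, calling GoToUrl and then WaitThenRun(browser, () => browser.CurrentUrl) reads the URL only after the simulated load has completed, which is the ordering guarantee the ## prefix gives a macro line.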

To add custom methods to SPBrowser, SPBrowserObject and SPBrowserCollection, write your own .NET DLL that implements the KUMOFrwk.Plugin.IPlugin interface and put it in the /Plugins directory under the KUMO installation folder. The new methods then appear in the KUMO editor, whose AutoComplete feature recognizes plugins.

As a simple example, here is a macro that calls getEmails(), a function of the ContactPlugin plugin that I describe later:
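To show the general shape of a plugin method, here is a minimal sketch. It does not reference the KUMO assemblies: IPluginSketch is a hypothetical stand-in that declares only doFunction, the entry point mentioned in this article, and the real KUMOFrwk.Plugin.IPlugin interface may require other members. TextLengthPlugin is an invented example.

```csharp
using System;

// Hypothetical stand-in; the real KUMOFrwk.Plugin.IPlugin may declare more members.
interface IPluginSketch
{
    object doFunction(params object[] allparameters);
}

// Invented example plugin: returns the length of the string passed as first argument.
class TextLengthPlugin : IPluginSketch
{
    public object doFunction(params object[] allparameters)
    {
        // Be defensive: the caller may pass nothing, or something that is not a string.
        if (allparameters == null || allparameters.Length == 0) return 0;
        string text = allparameters[0] as string;
        return text == null ? 0 : text.Length;
    }
}
```

Because the arguments arrive as an untyped object array and the result is returned as object, the plugin is responsible for checking and casting its own inputs.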

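// Navigate to the Google job advertisement page.
// The ## prefix makes the macro wait until the browser has finished.
## browser.goToURL("http://www.google.com/jobs/eng.html");

// No type declaration is needed for the result of a ## instruction.
## emails = browser.getEmails();
if (emails.Length > 0)
{
    MessageBox.Show(emails[0]);
}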

The plugin source code is included in the source download. The important part is doFunction, the function that KUMO calls. The version shown here goes through every element of the current web page and keeps those whose text looks like an email address. There are several ways to make this function faster, but that is not the point of this article.

// Requires: using System.Collections; using System.Text.RegularExpressions;
// doc and localBrowser are not declared in this excerpt; they are assumed
// to be members of the plugin class.
public object doFunction(params object[] allparameters)
{
    // None of the input parameters is needed in this case.

    // The pattern is anchored (^...$), so an element matches only when
    // its whole text is an email address.
    string strRegex = @"^([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)"
      + @"|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$";
    Regex emailReg = new Regex(strRegex);
    doc = (mshtml.HTMLDocument)localBrowser.Document;
    mshtml.IHTMLElementCollection allTags = doc.all;
    Queue aQueue = new Queue();
    foreach (mshtml.IHTMLElement anObj in allTags)
    {
        string text = anObj.innerText;
        if (text != null && text != "" && emailReg.IsMatch(text))
        {
            aQueue.Enqueue(text);
        }
    }
    // Copy the queue into the result array in one step. Dequeuing inside a
    // loop bounded by aQueue.Count would stop early, because Count shrinks
    // with every Dequeue.
    string[] allEmails = new string[aQueue.Count];
    aQueue.CopyTo(allEmails, 0);
    return allEmails;
}
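The matching logic can be tried without a browser by running it over plain strings. The sketch below is a variation, not the plugin's code: EmailScan and FindEmails are hypothetical names, and the pattern is a simplified, unanchored one of my own, so it also finds an address embedded in a longer text. It removes duplicates as well, because doc.all returns nested elements and a parent's innerText repeats the text of its children.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class EmailScan
{
    // Simplified pattern (an assumption, not the article's regex): no ^...$ anchors,
    // so it matches addresses anywhere inside a string.
    static readonly Regex Embedded =
        new Regex(@"[A-Za-z0-9_.\-]+@(?:[A-Za-z0-9\-]+\.)+[A-Za-z]{2,4}");

    // Returns each distinct address found in the given texts, in order of first appearance.
    public static string[] FindEmails(IEnumerable<string> texts)
    {
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var found = new List<string>();
        foreach (string text in texts)
        {
            if (string.IsNullOrEmpty(text)) continue;   // skip elements without text
            foreach (Match m in Embedded.Matches(text))
            {
                if (seen.Add(m.Value)) found.Add(m.Value);
            }
        }
        return found.ToArray();
    }
}
```

In a plugin, the strings passed to FindEmails would be the innerText values collected from the page. Whether the unanchored pattern is preferable depends on the page: it finds more addresses, but it can also pick up text that only resembles one.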

Points of Interest

KUMO can be downloaded from www.softmorning.net.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
