A Really Vain How are my articles doing Web Spider

4 Feb 2013
A simple web spider to fetch CodeProject articles.

Introduction

This article was written purely out of curiosity really. I had a friend who built a pub crawl site (barcrawl) which uses a little web spider to spider into a mapping site and extract map co-ordinates. This is something I had not done before, so I decided to give this a try.

But first, I needed a site to get some data out of, so, being totally into The Code Project, I decided to pick it. The next question was what data to fetch. Well, being the vain sort of chap I'm sure we can all be at times, I thought I'd write a web spider that fetches my article summary area, similar to the area shown below, which can be found by navigating to any CodeProject user's article area using a URL like: http://www.codeproject.com/script/articles/list_articles.asp?userid=userId.

So that is what this article is trying to grab. In trying to do this, I feel this article shows how to grab useful information out of what is potentially a vast area of data.

So How is it All Done

There are a couple of steps involved with the VainWebSpider application, which are as follows:

  1. Get a user ID so that the full URL for that user's article list can be formed.
  2. Store this user ID within the Registry to allow the VainWebSpider application to know which user to fetch data for the next time it's run.
  3. Grab the entire web page based on the full URL, such as http://www.codeproject.com/script/articles/list_articles.asp?userid=569009.
  4. Store the web page in a suitable object, a string is fine.
  5. Split the string to grab only the area that is of interest, largely reducing the size of the string in memory. We are only interested in the article summary details; anything else, we don't care about.
  6. Use Regular Expressions to grab the data we want.

These 6 steps are the basis of this article.
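
Step 5 can be sketched in isolation as follows. This is only a minimal illustration under stated assumptions, not the application's own code: the class name SummaryTrimmer and the method TrimToSummary are made up for this example, and the marker is the anchor tag that appears at the start of the sample page content in this article.

```csharp
using System;

public static class SummaryTrimmer
{
    // Assumption: the summary area starts at this anchor tag, as it does
    // in the sample page content shown in this article.
    private const string Marker = "<a name='#Author0'>";

    // Returns the page content from the marker onwards, or an empty
    // string when the page is empty or the marker is missing.
    public static string TrimToSummary(string page)
    {
        if (string.IsNullOrEmpty(page))
        {
            return string.Empty;
        }
        int idx = page.IndexOf(Marker, StringComparison.Ordinal);
        return idx >= 0 ? page.Substring(idx) : string.Empty;
    }
}
```

Returning an empty string when the marker is missing lets the caller treat "unknown user" and "no articles" the same way, without a null check.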

Step 6 is really the most interesting as it allows us to pluck the data we want out of the junk. Let's have a look at an example, shall we?

Using Regular Expressions to Grab the Data

<hr size=1 noshade><a name='#Author0'></a><h2>Articles 
  by Sacha Barber (10 articles found)</h2><h4>Average article rating: 
  4.5</h4><h3>C# Algorithms</h3>

  <p><a href='http://www.codeproject.com/cs/algorithms/#Evolutional'>Evolutional</a></p>

  <div class=smallText style='width:600;margin-left:40px;margin-top:10px'>

  <a href='http://www.codeproject.com/cs/algorithms/Genetic_Algorithm.asp'>
  <b>AI - Simple Genetic 
  Algorithm (GA) to solve a card problem</b></a><div style='font-size:8pt; 
  color:#666666'>Last Updated: 8 Nov 2006  Page views: 7,164  Rating: 
  4.7/5  Votes: 17  Popularity: 5.8</div>

  <div style='margin-top:3;font-size:8pt;'>A simple Genetic Algorithm (GA) 
  to solve a card problem.</div>
</div>

The actual web site content for an article is shown above. Let's say, we only want to grab the number of views for the article. How can this be done? Well, quite simply, actually. We just create a well formed Regular Expression such as:

private List<long> getViews()
{
    string pattern = "Page views: [0-9,]*";
    MatchCollection matches = Regex.Matches(this.webContent, 
        pattern, RegexOptions.ExplicitCapture);
    List<long> lViews = new List<long>();
    foreach (Match m in matches)
    {
        int idx = m.Value.LastIndexOf(":") + 2;
        lViews.Add(long.Parse(m.Value.Substring(idx).Replace(",", "")));
    }
    return lViews;
}

This nifty little bit of code is enough to match all the Page views: XXXX entries in the web content. Each match holds the whole matched text, such as Page views: 7,164 for the example above; the loop then strips the label and the thousands separators to leave the number 7164. From here, it's easy; we simply repeat this for all the parts of the web content we are interested in.
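
As an alternative to slicing the matched text with LastIndexOf and Substring, a named capture group can pull out just the number. The following is a hedged sketch rather than the code used by VainWebSpider; the class name ViewParser, the method ParseViews and the group name views are assumptions made for this example. Note that with RegexOptions.ExplicitCapture only named groups are captured, which is why the group is given a name.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ViewParser
{
    // The named group "views" captures only the digits and the
    // thousands separators, not the "Page views:" label.
    private static readonly Regex ViewsPattern = new Regex(
        @"Page views:\s*(?<views>[0-9][0-9,]*)",
        RegexOptions.ExplicitCapture);

    // Returns one entry per "Page views: N" occurrence in the content.
    public static List<long> ParseViews(string content)
    {
        List<long> views = new List<long>();
        foreach (Match m in ViewsPattern.Matches(content))
        {
            // Remove the thousands separators before parsing.
            string digits = m.Groups["views"].Value.Replace(",", "");
            views.Add(long.Parse(digits, CultureInfo.InvariantCulture));
        }
        return views;
    }
}
```

Requiring at least one digit in the group also means an empty value after the label is skipped instead of causing long.Parse to throw.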

We end up with Regular Expressions within the WebScreenScraper class to grab the following details:

  • Article Views
  • Article Votes
  • Article Popularity
  • Article Ratings
  • Article URLs

All this is done inside the WebScreenScraper class. Once we have the results, they are simply made available as a standard ADO.NET DataTable to allow the main interface (frmMain) to show them in a nice manner.
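
To illustrate the DataTable hand-off, here is a small sketch that builds a table from two parallel arrays. It is an assumption-laden example, not the WebScreenScraper implementation: the class ArticleTableSketch and its Build method are hypothetical, and only two of the columns (ArticleURL and Views, the names used in the listing in this article) are included to keep it short.

```csharp
using System;
using System.Data;

public static class ArticleTableSketch
{
    // Builds a DataTable with one row per article from parallel arrays.
    public static DataTable Build(string[] urls, long[] views)
    {
        // The arrays are indexed together, so they must line up.
        if (urls.Length != views.Length)
        {
            throw new ArgumentException(
                "urls and views must have the same length");
        }

        DataTable dt = new DataTable("CPArticles");
        dt.Columns.Add("ArticleURL", typeof(string));
        dt.Columns.Add("Views", typeof(long));

        for (int i = 0; i < urls.Length; i++)
        {
            // Values are supplied in column order.
            dt.Rows.Add(urls[i], views[i]);
        }
        return dt;
    }
}
```

The length check matters because each value list comes from a separate Regular Expression pass; if one pattern misses an entry, the lists no longer describe the same articles.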

Class Diagram

The VainWebSpider class diagram is as follows:

Code Listings

The code to do this is basically as follows:

Program Class

This class holds various pop-ups and commonly used functions, as well as the Main method:

using System;
using System.Collections.Generic;
using System.Windows.Forms;
using Microsoft.Win32;

namespace VainWebSpider
{
    #region Program CLASS
    /// <summary>
    /// provides the main access point into the application. Also
    /// provides several generic helper methods, such as InputBox(..),
    /// ErrorBox(..),InfoBox(..) and also provides read/write functions
    /// to store the current UserID within the registry
    /// </summary>
    public static class Program
    {
        #region Instance fields
        //instance fields
        private static long userId;

        #endregion
        #region Public Methods/Properties

        /// <summary>
        /// gets or sets the UserID which will be used to retrieve codeproject
        /// articles for. When a new UserID is set, the new value is also written
        /// to the windows registry, using the writeToRegistry(..) method. This 
        /// ensures the next time the VainWebSpider application is run, the last
        /// selected UserID will be used. 
        /// </summary>
        public static long UserID
        {
            get { return Program.userId; }
            set
            {
                Program.userId = value;
                Program.writeToRegistry(value);
            }
        }

        /// <summary>
        /// Creates a new "VainWebSpider" subkey (if none exists) under 
        /// the HKEY_LOCAL_MACHINE\SOFTWARE registry key. It also creates
        /// a new value within the newly created VainWebSpider subkey, for
        /// the userId input parameter. This is done so that the VainWebSpider
        /// application can know which user it was looking at last time
        /// </summary>
        /// <param name="userId">
        ///     The userId to store within the registry</param>
        public static void writeToRegistry(long userId)
        {
            try
            {
                RegistryKey hklm = Registry.LocalMachine;
                RegistryKey hkSoftware = 
                            hklm.OpenSubKey("Software", true);
                RegistryKey hkVainWebSpider = 
                    hkSoftware.CreateSubKey("VainWebSpider");
                hkVainWebSpider.SetValue("userId", userId);
            }
            catch (Exception ex)
            {
                Program.ErrorBox(
                    "There was a problem creating " + 
                    "the Registry key for VainWebSpider");
            }
        }

        /// <summary>
        /// Returns the userId value within the
        /// HKEY_LOCAL_MACHINE\SOFTWARE\VainWebSpider registry key
        /// </summary>
        /// <returns>The value of the userId value within the
        /// HKEY_LOCAL_MACHINE\SOFTWARE\VainWebSpider registry key, 
        /// if it exists, else returns -1</returns>
        public static long readFromRegistry()
        {
            try
            {
                RegistryKey hklm = Registry.LocalMachine;
                RegistryKey hkSoftware = hklm.OpenSubKey("Software");
                RegistryKey hkVainWebSpider = 
                            hkSoftware.OpenSubKey("VainWebSpider");
                return long.Parse(hkVainWebSpider.GetValue("userId").ToString());
            }
            catch (Exception ex)
            {
                return -1;
            }
        }

        /// <summary>
        /// InputBox, returns user input string
        /// </summary>
        /// <param name="prompt">the prompt</param>
        /// <param name="title">the form title</param>
        /// <param name="defaultValue">the default value to use</param>
        /// <returns>the string the user entered</returns>
        public static string InputBox(string prompt,
          string title, string defaultValue)
        {
            InputBoxDialog ib = new InputBoxDialog();
            ib.FormPrompt = prompt;
            ib.FormCaption = title;
            ib.DefaultValue = defaultValue;
            ib.ShowDialog();
            string s = ib.InputResponse;
            ib.Close();
            return s;
        } 

        /// <summary>
        /// Shows an error message within a MessageBox
        /// </summary>
        /// <param name="error">the error message</param>
        public static void ErrorBox(string error)
        {
            MessageBox.Show(error,"Error", 
               MessageBoxButtons.OK,MessageBoxIcon.Error);
        }

        /// <summary>
        /// Shows an information message within a MessageBox
        /// </summary>
        /// <param name="info">the information message</param>
        public static void InfoBox(string info)
        {
            MessageBox.Show(info, "Information",
                MessageBoxButtons.OK, MessageBoxIcon.Information);
        }

        /// <summary>
        /// Shows a Yes/No query within a MessageBox
        /// </summary>
        /// <param name="query">the query message</param>
        /// <returns>DialogResult,
        /// which is the result of the Confirmation query</returns>
        public static DialogResult YesNoBox(string query)
        {
            return MessageBox.Show(query,
                "Confirmation", MessageBoxButtons.YesNo,
                MessageBoxIcon.Question);
        }
        #endregion
        #region MAIN THREAD
        /// <summary>
        /// The main entry point for the application.
        /// Expects 0 command line arguments
        /// </summary>
        [STAThread]
        static void Main()
        {
            Application.EnableVisualStyles();
            Application.SetCompatibleTextRenderingDefault(false);
            Application.Run(new frmLoader());

        }
        #endregion
    }
    #endregion
}

WebScreenScraper Class

This class does the job of fetching and extracting the data from the relevant CodeProject web page:

using System;
using System.Collections.Generic;
using System.Text;
using System.Data;
using System.Text.RegularExpressions;
using System.IO;
using System.Net;
using System.Windows.Forms;


namespace VainWebSpider
{
    #region WebScreenScraper CLASS
    /// <summary>
    /// This class reads the entire contents of the article summary codeproject
    /// web page for the currently selected user. An example URL for such a codeproject
    /// page may be 
    /// http://www.codeproject.com/script/articles/list_articles.asp?userid=569009
    /// which would fetch all articles for author 569009 that's Sacha Barber, which
    /// is Me.
    /// Data within this page is then extracted using regular expressions which are then
    /// used to create new <see cref="CPArticle">
    ///       CPArticle</see> objects. The values within
    /// these new CPArticle objects are then used to create 
    /// a <see cref="DataTable">DataTable
    /// </see> which is used by the
    ///       <see cref="frmMain">main interface </see>
    /// </summary>
    public class WebScreenScraper
    {
        #region Instance Fields
        // Fields
        private List<CPArticle> cpArticlesForUser;
        private bool hasArticles;
        private string authorName;
        private long userId;
        public string webContent;
        public event EventHandler EndParse;
        public event EventHandler StartParse;
        #endregion
        #region Constructor
        /// <summary>
        /// Constructs a new WebScreenScraper using the parameters provided
        /// </summary>
        /// <param name="userId">The codeproject
        ///             user to fetch articles for</param>
        public WebScreenScraper(long userId)
        {
            this.hasArticles = true;
            this.cpArticlesForUser = new List<CPArticle>();
            this.userId = userId;
        }
        #endregion
        #region Public Properties / Methods
        /// <summary>
        /// Raises the start event, then calls the following methods
        /// readInSiteContents(..) and getArticleSummaryArea(..)
        /// </summary>
        public void getInitialData()
        {
            this.OnStartParse(this, new EventArgs());
            this.readInSiteContents();
            this.getArticleSummaryArea();
        }

        /// <summary>
        /// Returns a <see cref="DataTable">DataTable</see>
        /// of all the articles found (if any) for the current
        /// codeproject user
        /// </summary>
        /// <returns>A <see cref="DataTable">DataTable</see>
        /// which holds all the articles found for the current
        /// codeproject user</returns>
        public DataTable getWebData()
        {
            //screen scrape the web page, to gather the
            //data that we are interested in
            List<long> lViews = this.getViews();
            List<string> lRatings = this.getRatings();
            List<int> lVotes = this.getVotes();
            List<float> lPopularity = this.getPopularity();
            List<string> lURLS = this.getArticleURLS();
            //create new CPArticles using the extracted data
            for (int i = 0; i < lViews.Count; i++)
            {
                this.cpArticlesForUser.Add(new CPArticle(
                                                    lViews[i],
                                                    lRatings[i],
                                                    lVotes[i],
                                                    lPopularity[i],
                                                    lURLS[i]));
            }
            //raise the finished event, to alert the event subscribers
            //that we are now done
            this.OnEndParse(this, new EventArgs());
            //return the DataTable to the caller
            return this.createDataSet();
        }

        /// <summary>
        /// Returns true if the currently parsed web page has 
        /// codeproject articles. Some codeproject users don't
        /// have any articles published
        /// </summary>
        public bool HasArticles
        {
            get { return this.hasArticles; }
        }

        /// <summary>
        /// Gets the number of articles for the currently
        /// requested codeproject member
        /// </summary>
        public int NoOfArticles
        {
            get { return this.cpArticlesForUser.Count; }
        }

        /// <summary>
        /// Gets the name for the currently requested 
        /// codeproject member
        /// </summary>
        public string AuthorName
        {
            get { return this.authorName; }
        }
        #endregion
        #region Events
        /// <summary>
        /// Raised when the parsing of the requested codeproject page is completed
        /// </summary>
        /// <param name="sender"><see
        ///    cref="WebScreenScraper">
        ///         WebScreenScraper</see></param>
        /// <param name="e"><see
        ///     cref="EventArgs">EventArgs</see></param>
        public void OnEndParse(object sender, EventArgs e)
        {
            if (this.EndParse != null)
            {
                this.EndParse(this, e);
            }
        }

        /// <summary>
        /// Raised at the start of parsing of the requested codeproject page
        /// </summary>
        /// <param name="sender"><see
        ///     cref="WebScreenScraper">WebScreenScraper</see></param>
        /// <param name="e"><see
        ///     cref="EventArgs">EventArgs</see></param>
        public void OnStartParse(object sender, EventArgs e)
        {
            if (this.StartParse != null)
            {
                this.StartParse(this, e);
            }
        }
        #endregion
        #region Private Methods
        /// <summary>
        /// Returns a <see cref="DataTable">DataTable</see>
        /// of all the articles found (if any) for the current
        /// codeproject user
        /// </summary>
        /// <returns>A <see cref="DataTable">DataTable</see>
        ///  which holds all the article details for the current
        ///  code project user </returns>
        private DataTable createDataSet()
        {
            //create a new DataTable and set up the column types
            DataTable dt = new DataTable("CPArticles");
            dt.Columns.Add("ArticleURL", typeof(string));
            dt.Columns.Add("Views", typeof(long));
            dt.Columns.Add("Ratings", typeof(string));
            dt.Columns.Add("Votes", typeof(int));
            dt.Columns.Add("Popularity", typeof(float));
            //loop through all the previously fetched CPArticle(s) and
            //add the contents of each to the DataTable
            foreach (CPArticle cpa in this.cpArticlesForUser)
            {
                DataRow row = dt.NewRow();
                row["ArticleURL"] = cpa.ArticleURL;
                row["Views"] = cpa.Views;
                row["Ratings"] = cpa.Ratings;
                row["Votes"] = cpa.Votes;
                row["Popularity"] = cpa.Popularity;
                dt.Rows.Add(row);
            }
            return dt;
        }

        /// <summary>
        /// Trims the entire web content read from the codeproject page down to
        /// just the article summary area, which is a much smaller, more manageable
        /// string. This means that the webContent instance field now contains
        /// a string which has ALL the details we need, but none of the other stuff
        /// which is of no interest.
        /// </summary>
        private void getArticleSummaryArea()
        {
            //clear all the articles that may have been stored for
            //the previous run
            this.cpArticlesForUser.Clear();
            //check for no articles found
            if (this.webContent.Contains("(No articles found)"))
            {
                this.webContent = "";
                this.hasArticles = false;
                this.authorName = "";
            }
            else
            {
                //check for an author name, codeproject article summary page
                //always uses <a name='#Author0'> to denote author text
                int idx = this.webContent.IndexOf("<a name='#Author0'>", 0);
                if (idx >= 0)
                {
                    this.webContent = this.webContent.Substring(idx);
                    this.hasArticles = true;
                    this.authorName = getAuthor();
                }
                //ERROR, no author, no articles, this is bad, must be a totally
                //unknown user at the codeproject web site
                else
                {
                    this.webContent = "";
                    this.hasArticles = false;
                    this.authorName = "";
                }
            }
        }

        /// <summary>
        /// returns a string, which represents the name of the author
        /// for all the articles for the current codeproject user
        /// This name is extracted by using the RegEx
        /// pattern "Articles by [a-z\sA-Z]*" on the 
        /// codeproject web page for the current user
        /// </summary>
        /// <returns>a string, which represent the authors name
        /// for all the articles for the current codeproject user</returns>
        private string getAuthor()
        {
            string pattern = @"Articles by [a-z\sA-Z]*";
            //only the first match is needed, it holds the author name
            Match m = Regex.Match(this.webContent, 
                pattern, RegexOptions.ExplicitCapture);
            if (!m.Success)
            {
                //no author text found, so there is no name to return
                return "";
            }
            //skip the fixed "Articles by " prefix; searching for the last "by"
            //would break on names that themselves contain "by"
            return m.Value.Substring("Articles by ".Length).Trim();
        }

        /// <summary>
        /// returns a list of strings, which represent the URLs
        /// for all the articles for the current codeproject user
        /// These URLs are extracted by using the RegEx
        /// pattern "<a href='([-a-zA-Z_/#0-9]*).asp'>" on the 
        /// codeproject web page for the current user
        /// </summary>
        /// <returns>a generic list of strings, which represent the URLs
        /// for all the articles for the current codeproject user</returns>
        private List<string> getArticleURLS()
        {
            string pattern = "<a href='([-a-zA-Z_/#0-9]*).asp'>";
            MatchCollection matches = Regex.Matches(this.webContent, 
                pattern, RegexOptions.ExplicitCapture);
            List<string> urls = new List<string>();
            foreach (Match m in matches)
            {
                urls.Add(m.Value.Replace("<a href='", "").Replace("'>", ""));
            }
            return urls;
        }

        /// <summary>
        /// returns a list of floats, which represent the Popularity
        /// for all the articles for the current codeproject user
        /// These Popularity are extracted by using the RegEx
        /// pattern "Popularity: [0-9.]*" on the codeproject web page 
        /// for the current user
        /// </summary>
        /// <returns>a generic list of floats, which represent the Popularity
        /// for all the articles for the current codeproject user</returns>
        private List<float> getPopularity()
        {
            string pattern = "Popularity: [0-9.]*";
            MatchCollection matches = Regex.Matches(this.webContent, 
                pattern, RegexOptions.ExplicitCapture);
            List<float> lPopularity = new List<float>();
            foreach (Match m in matches)
            {
                int idx = m.Value.LastIndexOf(":") + 2;
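                //parse with the invariant culture, the page always uses '.'
                //as the decimal separator whatever the local settings are
                lPopularity.Add(float.Parse(m.Value.Substring(idx),
                    System.Globalization.CultureInfo.InvariantCulture));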
            }
            return lPopularity;
        }

        /// <summary>
        /// returns a list of strings, which represent the Ratings
        /// for all the articles for the current codeproject user
        /// These Ratings are extracted by using the RegEx
        /// pattern "Rating: [0-9./]*" on the codeproject web page 
        /// for the current user
        /// </summary>
        /// <returns>a generic list of strings, which represent the Ratings
        /// for all the articles for the current codeproject user</returns>
        private List<string> getRatings()
        {
            string pattern = "Rating: [0-9./]*";
            MatchCollection matches = Regex.Matches(this.webContent, 
                pattern, RegexOptions.ExplicitCapture);
            List<string> lRatings = new List<string>();
            foreach (Match m in matches)
            {
                int idx = m.Value.LastIndexOf(":") + 2;
                lRatings.Add(m.Value.Substring(idx));
            }
            return lRatings;
        }

        /// <summary>
        /// returns a list of longs, which represent the views
        /// for all the articles for the current codeproject user
        /// These views are extracted by using the RegEx
        /// pattern "Page views: [0-9,]*" on the codeproject web page 
        /// for the current user
        /// </summary>
        /// <returns>a generic list of longs, which represent the views
        /// for all the articles for the current codeproject user</returns>
        private List<long> getViews()
        {
            string pattern = "Page views: [0-9,]*";
            MatchCollection matches = Regex.Matches(this.webContent, 
                pattern, RegexOptions.ExplicitCapture);
            List<long> lViews = new List<long>();
            foreach (Match m in matches)
            {
                int idx = m.Value.LastIndexOf(":") + 2;
                lViews.Add(long.Parse(m.Value.Substring(idx).Replace(",", "")));
            }
            return lViews;
        }

        /// <summary>
        /// returns a list of ints, which represent the votes
        /// for all the articles for the current codeproject user
        /// These votes are extracted by using the RegEx
        /// pattern "Votes: [0-9]*" on the codeproject web page 
        /// for the current user
        /// </summary>
        /// <returns>a generic list of ints, which represent the votes
        /// for all the articles for the current codeproject user</returns>
        private List<int> getVotes()
        {
            string pattern = "Votes: [0-9]*";
            MatchCollection collection1 = Regex.Matches(this.webContent, 
                pattern, RegexOptions.ExplicitCapture);
            List<int> lVotes = new List<int>();
            foreach (Match m in collection1)
            {
                int num1 = m.Value.LastIndexOf(":") + 2;
                lVotes.Add(int.Parse(m.Value.Substring(num1)));
            }
            return lVotes;
        }

        /// <summary>
        /// Reads the entire contents of the article summary codeproject web page
        /// for the currently selected user. An example URL for such a codeproject
        /// page may be 
        /// http://www.codeproject.com/script/articles/list_articles.asp?userid=569009
        /// which would fetch all articles for author 569009 that's Sacha Barber, which
        /// is Me.
        /// </summary>
        private void readInSiteContents()
        {
            WebClient wc = null;
            Stream strm = null;

            try
            {
                //open the codeproject site, for the currently selected user
                //basically get the article summary for the currently selected user
                wc = new WebClient();
                strm = wc.OpenRead(
                    "http://www.codeproject.com/script/articles/" +
                    "list_articles.asp?userid=" + this.userId);
                //read the contents into the webContent instance field
                using (StreamReader reader = new StreamReader(strm))
                {
                    string line;
                    StringBuilder sBuilder = new StringBuilder();
                    while ((line = reader.ReadLine()) != null)
                    {
                        sBuilder.AppendLine(line);
                    }
                    this.webContent = sBuilder.ToString();
                }
            }
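            catch (Exception)
            {
                //leave an empty page rather than null, so that
                //getArticleSummaryArea() reports no articles instead of failing
                this.webContent = "";
                Program.ErrorBox(
                "Could not access web site http://www.codeproject.com/script/" +
                "articles/list_articles.asp?userid=" + this.userId);
            }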
            finally
            {
                //release the held resources if they need releasing
                if (wc != null) { wc.Dispose(); }
                if (strm != null) { strm.Close(); }
            }
        }
        #endregion
    }
    #endregion
}

CPArticle Class

This class represents a CodeProject article:

using System;
using System.Collections.Generic;
using System.Text;

namespace VainWebSpider
{
    #region CPArticle CLASS
    /// <summary>
    /// Provides a single code project article summary object, which has
    /// the following properties : Votes, views, popularity, ratings and
    /// an article URL
    /// </summary>
    public class CPArticle
    {
        #region Instance fields
        // Fields
        private string articleURL;
        private float popularity;
        private string ratings;
        private long views;
        private int votes;
        #endregion
        #region Constructor
        /// <summary>
        /// Creates a new CPArticle object, assigning the constructor parameters
        /// to public properties
        /// </summary>
        /// <param name="views">The number of views for the article</param>
        /// <param name="ratings">The ratings for the article</param>
        /// <param name="votes">The number of votes for the article</param>
        /// <param name="popularity">The popularity for the article</param>
        /// <param name="articleURL">The article url</param>
        public CPArticle(long views, string ratings, int votes, 
                         float popularity, string articleURL)
        {
            this.views = views;
            this.ratings = ratings;
            this.votes = votes;
            this.popularity = popularity;
            this.articleURL = articleURL;
        }
        #endregion
        #region Public Properties
        /// <summary>
        /// Gets the Views for the current CPArticle
        /// </summary>
        public long Views
        {
            get { return this.views; }
        }

        /// <summary>
        /// Gets the Ratings for the current CPArticle
        /// </summary>
        public string Ratings
        {
            get { return this.ratings; }
        }

        /// <summary>
        /// Gets the Votes for the current CPArticle
        /// </summary>
        public int Votes
        {
            get { return this.votes; }
        }

        /// <summary>
        /// Gets the Popularity for the current CPArticle
        /// </summary>
        public float Popularity
        {
            get { return this.popularity; }
        }

        /// <summary>
        /// Gets the ArticleURL for the current CPArticle
        /// </summary>
        public string ArticleURL 
        {
            get { return this.articleURL; }
        }
        #endregion
    } 
    #endregion
}

frmLoader Class

This class is the initial form shown (for the complete Designer listing, refer to the attached application).

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using Microsoft.Win32;

namespace VainWebSpider
{
    #region frmLoader CLASS
    /// <summary>
    /// This form obtains the currently selected user from the registry
    /// (if there is a current user; it may be the 1st run, so there won't be)
    /// by using the <see cref="Program">Programs </see>readFromRegistry(..) 
    /// method. This class also allows the user to change the currently selected user
    /// via clicking a change user hyperlink. The user may also show the 
    /// <see cref="frmMain"> main interface</see> from this form using the hyperlink
    /// provided
    /// </summary>
    public partial class frmLoader : Form
    {
        #region Constructor
        /// <summary>
        /// Constructs a new frmLoader object
        /// </summary>
        public frmLoader()
        {
            InitializeComponent();
        }
        #endregion
        #region Private Methods
        /// <summary>
        /// Allows the user to specify a new UserId
        /// to fetch codeproject articles for by the
        /// use of a <see cref="InputBoxDialog">InputBoxDialog </see>
        /// The value entered must be a positive number
        /// </summary>
        /// <param name="sender">lnkChangeUser</param>
        /// <param name="e">LinkLabelLinkClickedEventArgs</param>
        private void lnkChangeUser_LinkClicked(object sender,
            LinkLabelLinkClickedEventArgs e)
        {
            //get the new userId
            string stringEntered = 
              Program.InputBox("Enter a new user ID to examine",
              "Enter a new user ID", "");
            //check for empty
            if (stringEntered.Equals(string.Empty)) 
            {
                Program.ErrorBox("You must enter a value for the userId");
            }
            else 
            {
                try 
                {
                    //make sure its a positive number, then update the Program
                    //held property
                    long userId = long.Parse(stringEntered);
                    if (userId > 0)
                    {
                        Program.UserID = userId;
                        lblCurrentUser.Text = 
                        "Currently set-up to fetch articles for user ID : " + 
                        Program.UserID.ToString();

                    }
                    else
                    {
                        Program.ErrorBox("User ID must be a positive value");
                    }
                }
                //it's not a number that was entered, tell them off
                catch (Exception)
                {
                    Program.ErrorBox("The value you entered was not valid\r\n" +
                                    "The user ID must be a number");
                }
            }
        }

        /// <summary>
        /// Checks whether a user to fetch CodeProject articles for is already stored
        /// in the registry (from last time), using the <see cref="Program">Program</see>
        /// class's readFromRegistry(..) method, and updates this form's GUI accordingly.
        /// </summary>
        /// <param name="sender">frmLoader</param>
        /// <param name="e">EventArgs</param>
        private void frmLoader_Load(object sender, EventArgs e)
        {
            //check if there is a user in the registry, if there is a user
            //update the Program class and the GUI label
            long userId = Program.readFromRegistry();
            Program.UserID = userId;
            if (userId != -1)
            {
                lblCurrentUser.Text = "Currently set-up to fetch " + 
                                      "articles for user ID : " + userId.ToString();
            }
            else
            {
                lblCurrentUser.Text = "Not setup for any user as yet, " + 
                                      "use the link to pick a new user";
            }
        }

        /// <summary>
        /// Create and show a new <see
        ///    cref="frmMain">frmMain</see> object, and hide this form
        /// </summary>
        /// <param name="sender">lnkLoadMainForm</param>
        /// <param name="e">LinkLabelLinkClickedEventArgs</param>
        private void lnkLoadMainForm_LinkClicked(object sender, 
                     LinkLabelLinkClickedEventArgs e)
        {
            frmMain fMain = new frmMain();
            this.Hide();
            fMain.ShowDialog(this);
        }
        #endregion
    }
    #endregion
}
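
Both frmLoader and frmMain validate the entered user ID by calling long.Parse inside a try/catch. As a possible alternative, the same check can be written with long.TryParse so that no exception is needed for ordinary bad input. This is only a sketch: the UserIdInput class and its TryParseUserId method are hypothetical names of mine and are not part of the attached application. The -1 value mirrors the "no user" value that frmLoader_Load compares against.

```csharp
using System;

// Hypothetical helper, not part of the attached application.
public static class UserIdInput
{
    // Returns true and sets userId only when text is a whole number above zero.
    // On failure userId is set to -1, the "no user" value used in frmLoader_Load.
    public static bool TryParseUserId(string text, out long userId)
    {
        // long.TryParse returns false for null, empty or non-numeric text
        // instead of throwing, so no try/catch is required.
        if (long.TryParse(text, out userId) && userId > 0)
        {
            return true;
        }
        userId = -1;
        return false;
    }
}
```

With a helper like this, the link handlers could show one error box when the method returns false and update Program.UserID when it returns true.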

frmMain Class

This class is the main interface (for the complete Designer listing, refer to the attached application).

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using Microsoft.Win32;

namespace VainWebSpider
{
    #region frmMain CLASS
    /// <summary>
    /// The main interface. Creates a BackgroundWorker which creates a new
    /// <see cref="WebScreenScraper">WebScreenScraper</see>
    /// and subscribes to its StartParse/EndParse events. If the WebScreenScraper
    /// data signifies that the currently selected CodeProject user has some
    /// articles, the article data is fetched from the WebScreenScraper and
    /// displayed in a DataGridView.
    /// </summary>
    public partial class frmMain : Form
    {
        #region Instance Fields
        //instance fields
        private Boolean formShown = true;
       
        #endregion
        #region Constructor
        /// <summary>
        /// Constructs a new frmMain object
        /// </summary>
        public frmMain()
        {
            InitializeComponent();
        }
        #endregion
        #region Private Methods

        /// <summary>
        /// User double clicked the system tray icon, so if the form
        /// is shown it is hidden, if its hidden its shown
        /// </summary>
        /// <param name="sender">The notify icon</param>
        /// <param name="e">The event arguments</param>
        private void nfIcon_DoubleClick(object sender, EventArgs e)
        {
            if (formShown)
            {
                this.Hide();
                formShown = false;
            }
            else
            {
                this.Show();
                formShown = true;
            }
        }

        /// <summary>
        /// Shows the form
        /// </summary>
        /// <param name="sender">The show menu</param>
        /// <param name="e">The event arguments</param>
        private void showFormToolStripMenuItem_Click(object sender, EventArgs e)
        {
            this.Show();
        }

        /// <summary>
        /// Hides the form
        /// </summary>
        /// <param name="sender">The hide menu</param>
        /// <param name="e">The event arguments</param>
        private void hideFormToolStripMenuItem_Click(object sender, EventArgs e)
        {
            this.Hide();
        }

        /// <summary>
        /// Exits the application if the user confirms they
        /// wish to quit.
        /// </summary>
        /// <param name="sender">The exit menu</param>
        /// <param name="e">The event arguments</param>
        private void exitToolStripMenuItem_Click(object sender, EventArgs e)
        {
            DialogResult dr = MessageBox.Show("Are you sure you want to quit?",
                "Exit",
                 MessageBoxButtons.YesNo, MessageBoxIcon.Question);
            if (dr.Equals(DialogResult.Yes))
            {
                Application.Exit();
            }
        }
        
        /// <summary>
        /// Creates a new <see cref="WebScreenScraper">WebScreenScraper </see>
        /// and subscribe to its StartParse/EndParse events. If the  WebScreenScraper
        /// data signifies that the currently selected codeproject user has some
        /// articles, get the article data out of the WebScreenScraper, and display
        /// the data in a DataGridView.
        /// </summary>
        /// <param name="sender">BackgroundWorker</param>
        /// <param name="e">DoWorkEventArgs</param>
        private void bgw_DoWork(object sender, DoWorkEventArgs e)
        {
            //create a new WebScreenScraper and subscribe to its events
            WebScreenScraper wss = new WebScreenScraper(Program.UserID);
            wss.StartParse += new EventHandler(wss_StartParse);
            wss.EndParse += new EventHandler(wss_EndParse);
            //get the initial article summary area only, discarding the rest
            //of the page, which holds nothing we need to parse
            wss.getInitialData();

            //need to test for an invoke initially, as the BackgroundWorker
            //that does the web site parsing runs on a different thread
            //to the one that owns this form's controls, so calls must be
            //marshalled to the UI thread in order to change properties
            if (this.InvokeRequired)
            {
                this.Invoke(new EventHandler(delegate
                {
                    //are there any articles for the current user
                    if (wss.HasArticles)
                    {
                        //only worry about getting the rest if the
                        //author has articles
                        DataTable dt = wss.getWebData();
                        lblCurrentUser.Text = wss.AuthorName + " " +
                            wss.NoOfArticles + " articles available";
                        
                        //check there is at least 1 article, before showing the 
                        //article DataGridView
                        if (dt.Rows.Count > 0)
                        {
                            dgArticles.Columns.Clear();
                            dgArticles.DataSource = dt;
                            alterColumns();
                            resizeColumns();
                            dgArticles.Visible = true;
                            pnlResults.Visible = true;
                            this.Invalidate();
                            Application.DoEvents();
                        }
                        //known author, but no articles to show
                        else
                        {
                            dgArticles.Visible = false;
                            pnlResults.Visible = false;
                            this.Invalidate();
                            Application.DoEvents();
                        }
                    }
                    //there are no articles to show, so update GUI to show this
                    else
                    {
                        pnlResults.Visible = false;
                        lblCurrentUser.Text = "Unknown Or Unpublished Author";
                        lblProgress.Visible = false;
                        prgBar.Visible = false;
                        dgArticles.Visible = false;
                        pnlResults.Visible = false;
                        pnlUser.Visible = true;
                        this.Invalidate();
                        Application.DoEvents();
                        Program.InfoBox(
                            "There are no CodeProject articles available for user ("
                            + Program.UserID + ")");
                    }
                }));
            }
        }

        /// <summary>
        /// Alter the article DataGridView columns, by firstly adding an image column
        /// which will be a new column index of 4. And then Delete the auto mapped
        /// "ArticleURL" column, and create a new DataGridViewLinkColumn column for
        /// the "ArticleURL" column, which will be column index 5.
        /// </summary>
        private void alterColumns()
        {

            //the column may not exist when the request to remove it is made,
            //so check for it first rather than catching the resulting exception
            if (dgArticles.Columns.Contains("ArticleURL"))
            {
                //remove existing ArticleURL column
                dgArticles.Columns.Remove("ArticleURL");
            }
            //create a new image column
            DataGridViewImageColumn imgs = new DataGridViewImageColumn();
            imgs.Image = global::VaneWebSpider.FormResources.LinkIcon;
            imgs.DisplayIndex = 0;
            imgs.Width = 40;
            dgArticles.Columns.Add(imgs);
            //create a new hyperlink column
            DataGridViewLinkColumn links = new DataGridViewLinkColumn();
            links.HeaderText = "ArticleURL";
            links.DataPropertyName = "ArticleURL";
            links.ActiveLinkColor = Color.Blue;
            links.LinkBehavior = LinkBehavior.SystemDefault;
            links.LinkColor = Color.Blue;
            links.SortMode = DataGridViewColumnSortMode.Automatic;
            links.TrackVisitedState = true;
            links.VisitedLinkColor = Color.Blue;
            links.DisplayIndex = 1;
            links.Width = 300;
            dgArticles.Columns.Add(links);
        }

        /// <summary>
        /// Resize all article DataGridView columns to fixed sizes
        /// </summary>
        private void resizeColumns()
        {
            //resize all other columns to have default width of 60
            dgArticles.Columns[0].Width = 60; //Views column
            dgArticles.Columns[1].Width = 60; //Ratings column
            dgArticles.Columns[2].Width = 60; //Votes column
            dgArticles.Columns[3].Width = 60; //Popularity column
        }

        /// <summary>
        /// Puts all the GUI components into a EndParse state
        /// </summary>
        /// <param name="sender">The <see cref="WebScreenScraper">
        /// WebScreenScraper</see></param>
        /// <param name="e">EventArgs</param>
        private void wss_EndParse(object sender, EventArgs e)
        {
            lblProgress.Visible = false;
            prgBar.Visible = false;
            pnlUser.Visible = true;
            pnlGridMainFill.Visible = true;
            this.Invalidate();
            Application.DoEvents();
        }

        /// <summary>
        /// Puts all the GUI components into a StartParse state
        /// </summary>
        /// <param name="sender">The <see cref="WebScreenScraper">
        /// WebScreenScraper</see></param>
        /// <param name="e">EventArgs</param>
        private void wss_StartParse(object sender, EventArgs e)
        {
            //need to test for an invoke initially, as the BackgroundWorker
            //that does the web site parsing runs on a different thread
            //to the one that owns this form's controls, so calls must be
            //marshalled to the UI thread in order to change properties
            if (this.InvokeRequired)
            {
                this.Invoke(new EventHandler(delegate
                {
                    lblProgress.Visible = true;
                    prgBar.Visible = true;
                    this.Invalidate();
                    Application.DoEvents();
                }));
            }
        }

        /// <summary>
        /// If the column of the DataGridView clicked was the link column,
        /// call startProcess, passing it the correct URL to navigate to
        /// </summary>
        /// <param name="sender">dgArticles</param>
        /// <param name="e">DataGridViewCellEventArgs</param>
        private void dgArticles_CellContentClick(object sender, 
                     DataGridViewCellEventArgs e)
        {
            const int LINK_COLUMN_INDEX = 5;
            //the link column is at index 5: the WebScreenScraper.createDataSet()
            //method originally produced 5 auto-generated columns, we then
            //deleted the auto-generated ArticleURL column, and added an image
            //column and a hyperlink column to the end of the remaining
            //auto-generated columns. That's why it's at index 5, which is a
            //little strange, but there you go.
            //RowIndex is -1 for header cells, so ignore those clicks
            if (e.RowIndex >= 0 && e.ColumnIndex == LINK_COLUMN_INDEX)
            {
                startProcess(@"http://www.codeproject.com" +
                    dgArticles[e.ColumnIndex, e.RowIndex].Value.ToString());
            }
        }

        /// <summary>
        /// Attempts to start the process which has
        /// the name of the parameter supplied, So
        /// long as the process is a URL. Must start
        /// with www or http, as we are attempting
        /// to start a web browser
        /// </summary>
        /// <param name="target">The process to start</param>
        private void startProcess(string target)
        {
            // If the value looks like a URL, navigate to it.
            if (null != target && (target.StartsWith("www") || 
                               target.StartsWith("http")))
            {
                try
                {
                    System.Diagnostics.Process.Start(target);
                }
                catch (Exception ex)
                {
                    Program.ErrorBox("Problem with starting process " + target);
                }
            }
        }

        /// <summary>
        /// Creates a new BackgroundWorker thread and calls the 
        /// BackgroundWorkers bgw_DoWork(..) method, where the 
        /// argument is the value of the <see cref="Program">
        /// Program classes </see>UserID
        /// </summary>
        /// <param name="sender">frmMain</param>
        /// <param name="e">EventArgs</param>
        private void frmMain_Load(object sender, EventArgs e)
        {
            pnlUser.Visible = false;
            pnlGridMainFill.Visible = false;
            BackgroundWorker bgw = new BackgroundWorker();
            bgw.DoWork += new DoWorkEventHandler(bgw_DoWork);
            bgw.RunWorkerAsync(Program.UserID);
        }

        /// <summary>
        /// Allows the user to specify a new UserId
        /// to fetch codeproject articles for by the
        /// use of a <see cref="InputBoxDialog">InputBoxDialog </see>
        /// The value entered must be a positive number
        /// </summary>
        /// <param name="sender">lnkChangeUser</param>
        /// <param name="e">LinkLabelLinkClickedEventArgs</param>
        private void lnkChangeUser_LinkClicked(object sender, 
                     LinkLabelLinkClickedEventArgs e)
        {
            //get the new userId
            string stringEntered = 
                Program.InputBox("Enter a new user ID to examine",
                "Enter a new user ID", "");
            //check for empty
            if (stringEntered.Equals(string.Empty)) 
            {
                Program.ErrorBox("You must enter a value for the userId");
            }
            else 
            {
                try 
                {
                    //make sure its a positive number, then update the Program
                    //held property
                    long uId = long.Parse(stringEntered);
                    if (uId > 0)
                    {
                        Program.UserID = uId;
                        BackgroundWorker bgw = new BackgroundWorker();
                        bgw.DoWork += new DoWorkEventHandler(bgw_DoWork);
                        bgw.RunWorkerAsync(Program.UserID);
                    }
                    else
                    {
                        Program.ErrorBox("User ID must be a positive value");
                    }
                }
                //it's not a number that was entered, tell them off
                catch (Exception)
                {
                    Program.ErrorBox("The value you entered was not valid\r\n" +
                                    "The user ID must be a number");
                }
            }
        }

        /// <summary>
        /// Hide the notify icon, and shutdown the application
        /// </summary>
        /// <param name="sender">frmMain</param>
        /// <param name="e">FormClosedEventArgs</param>
        private void frmMain_FormClosed(object sender, FormClosedEventArgs e)
        {
            nfIcon.Visible = false;
            Application.Exit();
        }

        /// <summary>
        /// Create and show a new <see
        ///     cref="frmPie">frmPie</see> object, and hide this form
        /// </summary>
        /// <param name="sender">lnkResults</param>
        /// <param name="e">LinkLabelLinkClickedEventArgs</param>
        private void lnkResults_LinkClicked(object sender, 
                     LinkLabelLinkClickedEventArgs e)
        {
            frmPie fPie = new frmPie();
            fPie.GridIsUse = dgArticles;
            fPie.AuthorString = lblCurrentUser.Text;
            this.Hide();
            fPie.ShowDialog(this);
            this.Show();
        }
        #endregion
    }
    #endregion
}
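
The comment in dgArticles_CellContentClick explains why the link column ends up at index 5. The same bookkeeping can be replayed with plain strings standing in for grid columns, which may make the ordering easier to follow. This is only an illustration: the ColumnOrderSketch class is hypothetical, and "LinkIcon" is a placeholder name for the image column, which has no header text in the listing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration, not part of the attached application.
public static class ColumnOrderSketch
{
    // Replays the steps of alterColumns() on a simple list of column names.
    public static List<string> BuildColumns()
    {
        // The five auto-generated columns, in the order the listing describes.
        var columns = new List<string>
        {
            "Views", "Ratings", "Votes", "Popularity", "ArticleURL"
        };
        columns.Remove("ArticleURL"); // drop the auto-mapped text column (4 left)
        columns.Add("LinkIcon");      // image column is appended at index 4
        columns.Add("ArticleURL");    // link column is appended at index 5
        return columns;
    }
}
```

Calling ColumnOrderSketch.BuildColumns().IndexOf("ArticleURL") gives 5, matching LINK_COLUMN_INDEX. In the real form, looking the column up by name rather than by a fixed number might be less fragile if the scraper ever returns a different set of columns.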

frmPie Class

This class is the pie chart display window (for the complete Designer listing, refer to the attached application). This form makes use of a third-party DLL by Julijan Sribar, which is available right here at CodeProject: pie library. Credit where credit is due. Thanks Julijan, great work.

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Text;
using System.Windows.Forms;
using Microsoft.Win32;

//http://www.codeproject.com/csharp/julijanpiechart.asp
using System.Drawing.PieChart;


namespace VainWebSpider
{
    #region frmPie CLASS
    /// <summary>
    /// Sets up the pie chart with the values that match
    /// the data type the user selected "views", "votes", 
    /// "popularity", "ratings"
    /// </summary>
    public partial class frmPie : Form
    {
        #region Instance Fields
        //instance fields
        private DataGridView gridInUse;
        private int CLUSTERED_THRESHOLD = 20;
        #endregion
        #region Constructor
        /// <summary>
        /// Constructs a new frmPie object
        /// </summary>
        public frmPie()
        {
            InitializeComponent();
        }
        #endregion
        #region Public Properties
        /// <summary>
        /// Sets the <see cref="DataGridView">DataGridView</see> to use
        /// </summary>
        public DataGridView GridIsUse
        {
            set { gridInUse = value; }
        }

        /// <summary>
        /// Sets the AuthorString to use
        /// </summary>
        public string AuthorString
        {
            set { lblCurrentUser.Text = value; }
        }
        #endregion
        #region Private Methods
        /// <summary>
        /// Calls the populatePieData() method and sets up some
        /// other miscellaneous pie chart values
        /// </summary>
        private void setupPie()
        {
            populatePieData();
            pnlPie.Font = new Font("Arial", 8F);
            pnlPie.ForeColor = SystemColors.WindowText;
            pnlPie.EdgeColorType = EdgeColorType.DarkerThanSurface;
            pnlPie.LeftMargin = 10F;
            pnlPie.RightMargin = 10F;
            pnlPie.TopMargin = 10F;
            pnlPie.BottomMargin = 10F;
            pnlPie.SliceRelativeHeight = 0.25F;
            pnlPie.InitialAngle = -90F;
        }

        /// <summary>
        /// Sets up the pie chart with the values that match
        /// the data type the user selected "views", "votes", 
        /// "popularity", "ratings"
        /// </summary>
        private void populatePieData()
        {
            //Switch on the current data type
            switch (cmbViewData.SelectedItem.ToString().ToLower())
            {
                //Views DataGridView column = 0
                //Rating DataGridView column = 1
                //Votes DataGridView column = 2
                //Popularity DataGridView column = 3
                //URL DataGridView column = 5
                case "views" :
                    getGridData("views", 0);
                    break;
                case "votes":
                    getGridData("votes", 2);
                    break;
                case "popularity":
                    getGridData("popularity", 3);
                    break;
                case "ratings":
                    getGridData("ratings", 1);
                    break;
                default:
                    getGridData("views", 0);
                    break;
            }
        }

        /// <summary>
        /// Builds a single dimension decimal array of data, extracted
        /// from this form's gridInUse field, and displays it
        /// on the embedded pie chart
        /// </summary>
        /// <param name="type">The type
        ///       of columns "views", "votes", 
        /// "popularity", "ratings" </param>
        /// <param name="column">Column number 0-3</param>
        private void getGridData(string type, int column)
        {
            try
            {
                //set up some holding fields for the pie data
                int qty = gridInUse.RowCount;
                decimal[] results = new decimal[qty];
                string[] pieToolTips = new string[qty];
                string[] pieText = new string[qty];
                float[] pieRelativeDisplacements = new float[qty];
                Color[] pieColors = new Color[qty];
                int alpha = 60;
                Random rnd = new Random();
                Color[] colorsAvailable = new Color[] { Color.FromArgb(alpha, Color.Red), 
                                                Color.FromArgb(alpha, Color.Green), 
                                                Color.FromArgb(alpha, Color.Yellow), 
                                                Color.FromArgb(alpha, Color.Blue),
                                                Color.FromArgb(alpha, Color.CornflowerBlue), 
                                                Color.FromArgb(alpha, Color.Cyan), 
                                                Color.FromArgb(alpha, Color.DarkGreen), 
                                                Color.FromArgb(alpha, Color.PeachPuff),
                                                Color.FromArgb(alpha, Color.Plum), 
                                                Color.FromArgb(alpha, Color.Peru)         };
                //loop through the grid and set up the pie chart to use the grids data
                for (int i = 0; i < gridInUse.RowCount; i++)
                {
                    //Views DataGridView column = 0
                    //Rating DataGridView column = 1
                    //Votes DataGridView column = 2
                    //Popularity DataGridView column = 3
                    //URL DataGridView column = 5
                    pieToolTips[i] = "URL " + gridInUse[5, i].Value.ToString() + " " +
                                     "Views " + gridInUse[0, i].Value.ToString() + " " +
                                     "Rating " + gridInUse[1, i].Value.ToString() + " " +
                                     "Votes " + gridInUse[2, i].Value.ToString() + " " +
                                     "Popularity " + gridInUse[3, i].Value.ToString();
                    if (type.Equals("ratings"))
                    {
                        string val = gridInUse[column, i].Value.ToString();
                        int idx = val.LastIndexOf("/");
                        string sNewValue = val.Substring(0, idx);
                        results[i] = decimal.Parse(sNewValue);
                    }
                    else
                    {
                        results[i] = decimal.Parse(gridInUse[column, i].Value.ToString());
                    }
                    //if there are loads of articles, we dont want any text on pie chunks
                    //as it becomes illegible
                    if (gridInUse.RowCount < CLUSTERED_THRESHOLD)
                    {
                        pieText[i] = gridInUse[column, i].Value.ToString();
                    }
                    else
                    {
                        pieText[i] = " ";
                    }
                    pieRelativeDisplacements[i] = 0.1F;
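                    //the upper bound of Random.Next is exclusive, so pass the
                    //full length to allow every colour to be picked
                    int idxColor = rnd.Next(colorsAvailable.Length);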
                    pieColors[i] = colorsAvailable[idxColor];
                }
                //update the pie components
                pnlPie.ToolTips = pieToolTips;
                pnlPie.Texts = pieText;
                pnlPie.SliceRelativeDisplacements = pieRelativeDisplacements;
                pnlPie.Colors = pieColors;
                pnlPie.Values = results;

            }
            catch (Exception ex)
            {
                //Cant do much about it, but catch it all the same.
                //just dont update pie chart if we get an Exception
            }
        }

        /// <summary>
        /// Selects index 1 (the second entry) in the cmbViewData combobox and then
        /// calls the setupPie() method
        /// </summary>
        /// <param name="sender">frmPie</param>
        /// <param name="e">EventArgs</param>
        private void frmPie_Load(object sender, EventArgs e)
        {
            cmbViewData.SelectedIndex = 1;
            setupPie();
        }

        /// <summary>
        /// Calls the setupPie() method
        /// </summary>
        /// <param name="sender">cmbViewData</param>
        /// <param name="e">EventArgs</param>
        private void cmbViewData_SelectedValueChanged(object sender, EventArgs e)
        {
            setupPie();
        }
        #endregion
    }
    #endregion
}
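
In getGridData, a "ratings" cell is cut at the last "/" before being parsed. If a cell has no "/", Substring(0, -1) throws, the catch block swallows it, and the pie is simply not updated. A more forgiving version of that step might look like the sketch below. The RatingText class and TryParseRating method are hypothetical names, and I am assuming the cell text looks something like "4.56/5" with a "." as the decimal separator; the listing does not show the exact format.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of the attached application.
public static class RatingText
{
    // Extracts the numeric part of a cell such as "4.56/5" (assumed format).
    // Returns false instead of throwing when there is no usable number.
    public static bool TryParseRating(string cellText, out decimal rating)
    {
        rating = 0m;
        if (string.IsNullOrEmpty(cellText))
        {
            return false;
        }
        int slash = cellText.LastIndexOf('/');
        // With no slash present, treat the whole cell as the number.
        string numberPart = slash >= 0 ? cellText.Substring(0, slash) : cellText;
        // Invariant culture so "4.56" parses the same on every machine.
        return decimal.TryParse(numberPart, NumberStyles.Number,
            CultureInfo.InvariantCulture, out rating);
    }
}
```

A row whose rating cannot be parsed could then be given a value of 0 or skipped, rather than stopping the whole chart from being drawn.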

InputBoxDialog Class

This class is a simple input box.

using System;
using System.Drawing;
using System.Collections;
using System.ComponentModel;
using System.Windows.Forms;

namespace VainWebSpider
{
    #region InputBoxDialog CLASS
    /// <summary>
    /// Provides a generic modal text input box, for use with
    /// any other form
    /// </summary>
    public class InputBoxDialog : System.Windows.Forms.Form
    {
        #region Instance Fields
        //instance fields
        string formCaption = string.Empty;
        string formPrompt = string.Empty;
        string inputResponse = string.Empty;
        string defaultValue = string.Empty;
        private System.Windows.Forms.Label lblPrompt;
        private System.Windows.Forms.Button btnOK;
        private System.Windows.Forms.Button btnCancel;
        private System.Windows.Forms.TextBox txtInput;
        private System.ComponentModel.Container components = null;
        #endregion
        #region Constructor
        /// <summary>
        /// Constructs a new InputBoxDialog object
        /// </summary>
        public InputBoxDialog()
        {
            InitializeComponent();
        }
        #endregion
        #region Windows Form Designer generated code
        /// <summary>
        /// Required method for Designer support - do not modify
        /// the contents of this method with the code editor.
        /// </summary>
        private void InitializeComponent()
        {
            this.lblPrompt = new System.Windows.Forms.Label();
            this.btnOK = new System.Windows.Forms.Button();
            this.btnCancel = new System.Windows.Forms.Button();
            this.txtInput = new System.Windows.Forms.TextBox();
            this.SuspendLayout();
            // 
            // lblPrompt
            // 
            this.lblPrompt.Anchor = (
                (System.Windows.Forms.AnchorStyles)((((
                System.Windows.Forms.AnchorStyles.Top | 
                System.Windows.Forms.AnchorStyles.Bottom)
                        | System.Windows.Forms.AnchorStyles.Left)
                        | System.Windows.Forms.AnchorStyles.Right)));
            this.lblPrompt.BackColor = System.Drawing.SystemColors.Control;
            this.lblPrompt.Font = new System.Drawing.Font("Microsoft Sans Serif",
                8.25F, System.Drawing.FontStyle.Regular,
                System.Drawing.GraphicsUnit.Point, ((byte)(0)));
            this.lblPrompt.Location = new System.Drawing.Point(9, 35);
            this.lblPrompt.Name = "lblPrompt";
            this.lblPrompt.Size = new System.Drawing.Size(302, 22);
            this.lblPrompt.TabIndex = 3;
            // 
            // btnOK
            // 
            this.btnOK.DialogResult = System.Windows.Forms.DialogResult.OK;
            this.btnOK.Location = new System.Drawing.Point(265, 59);
            this.btnOK.Name = "btnOK";
            this.btnOK.Size = new System.Drawing.Size(60, 20);
            this.btnOK.TabIndex = 1;
            this.btnOK.Text = "Ok";
            this.btnOK.Click += new System.EventHandler(this.btnOK_Click);
            // 
            // btnCancel
            // 
            this.btnCancel.DialogResult = System.Windows.Forms.DialogResult.Cancel;
            this.btnCancel.Location = new System.Drawing.Point(331, 59);
            this.btnCancel.Name = "btnCancel";
            this.btnCancel.Size = new System.Drawing.Size(60, 20);
            this.btnCancel.TabIndex = 2;
            this.btnCancel.Text = "Cancel";
            this.btnCancel.Click += new System.EventHandler(this.btnCancel_Click);
            // 
            // txtInput
            // 
            this.txtInput.Location = new System.Drawing.Point(8, 59);
            this.txtInput.MaxLength = 40;
            this.txtInput.Name = "txtInput";
            this.txtInput.Size = new System.Drawing.Size(251, 20);
            this.txtInput.TabIndex = 0;
            // 
            // InputBoxDialog
            // 
            this.AutoScaleBaseSize = new System.Drawing.Size(5, 13);
            this.ClientSize = new System.Drawing.Size(398, 103);
            this.Controls.Add(this.txtInput);
            this.Controls.Add(this.btnCancel);
            this.Controls.Add(this.btnOK);
            this.Controls.Add(this.lblPrompt);
            this.FormBorderStyle = System.Windows.Forms.FormBorderStyle.FixedDialog;
            this.KeyPreview = true;
            this.MaximizeBox = false;
            this.MinimizeBox = false;
            this.Name = "InputBoxDialog";
            this.StartPosition = System.Windows.Forms.FormStartPosition.CenterScreen;
            this.Text = "InputBox";
            this.KeyDown += new System.Windows.Forms.KeyEventHandler(
                this.InputBoxDialog_KeyDown);
            this.Load += new System.EventHandler(this.InputBox_Load);
            this.ResumeLayout(false);
            this.PerformLayout();
        }

        #region Dispose
        /// <summary>
        /// Clean up any resources being used.
        /// </summary>
        protected override void Dispose(bool disposing)
        {
            if (disposing)
            {
                if (components != null)
                {
                    components.Dispose();
                }
            }
            base.Dispose(disposing);
        }

        #endregion
        #endregion
        #region Public Properties
        // property FormCaption
        public string FormCaption
        {
            get { return formCaption; }
            set { formCaption = value; }
        }
        // property FormPrompt
        public string FormPrompt
        {
            get { return formPrompt; }
            set { formPrompt = value; }
        }
        // property InputResponse
        public string InputResponse
        {
            get { return inputResponse; }
            set { inputResponse = value; }
        }
        // property DefaultValue
        public string DefaultValue
        {
            get { return defaultValue; }
            set { defaultValue = value; }
        } 

        #endregion
        #region Form and Control Events
        /// <summary>
        /// The InputBoxDialog form load event, sets focus to the
        /// txtInput control
        /// </summary>
        /// <param name="sender">The InputBoxDialog</param>
        /// <param name="e">The event arguments</param>
        private void InputBox_Load(object sender, System.EventArgs e)
        {
            this.txtInput.Text = defaultValue;
            this.lblPrompt.Text = formPrompt;
            this.Text = formCaption;
            this.txtInput.SelectionStart = 0;
            this.txtInput.SelectionLength = this.txtInput.Text.Length;
            this.txtInput.Focus();
        }

        /// <summary>
        /// The btnOk click event, sets the InputResponse=txtInput
        /// and then closes the form
        /// </summary>
        /// <param name="sender">The btnOK</param>
        /// <param name="e">The event arguments</param>
        private void btnOK_Click(object sender, System.EventArgs e)
        {
            InputResponse = this.txtInput.Text;
            this.Close();
        }

        /// <summary>
        /// The btnCancel click event, closes the form
        /// </summary>
        /// <param name="sender">The btnCancel</param>
        /// <param name="e">The event arguments</param>
        private void btnCancel_Click(object sender, System.EventArgs e)
        {
            this.Close();
        }

        /// <summary>
        /// The InputBoxDialog key down event, if the key == Enter, sets the
        /// InputResponse=txtInput and then closes the form
        /// </summary>
        /// <param name="sender">The InputBoxDialog</param>
        /// <param name="e">The event arguments</param>
        private void InputBoxDialog_KeyDown(object sender, KeyEventArgs e)
        {
            if (e.KeyCode == Keys.Enter)
            {
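                InputResponse = this.txtInput.Text;
                // Report OK, matching btnOK, so that a caller checking the
                // ShowDialog() result treats Enter as a confirmation.
                this.DialogResult = System.Windows.Forms.DialogResult.OK;
                this.Close();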
            }
        }
        #endregion
    }
    #endregion
}

Demonstration Screenshots

The first screen that is shown is the form (frmLoader) which looks as follows:

The VainWebSpider user can either select a different CodeProject user whose articles should be fetched, or simply proceed to the main interface, using the links provided.

It can be seen in the screenshot above that there is already a user configured for the application. This user ID is stored within the Registry, and the Registry is updated any time a new user ID is picked.

The VainWebSpider key and its associated values are stored under HKEY_LOCAL_MACHINE\SOFTWARE\, where a new "VainWebSpider" key is created, and the currently selected user ID is stored in that key. This allows the VainWebSpider application to know at start up which user it was using last time, or whether there was a user last time at all: the first time the application is run, there won't be any Registry key or associated values. They will, of course, be created as soon as a user ID is selected.
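The Registry round trip described above could be sketched roughly as follows. This is not the application's actual code, just a minimal illustration: the class name `UserIdStore` and the value name "UserId" are assumptions of mine, and only the key location comes from the text above. Bear in mind that writing under HKEY_LOCAL_MACHINE typically requires elevated rights on current versions of Windows.

```csharp
using System;
using Microsoft.Win32;

public static class UserIdStore
{
    // The key path follows the article; the value name "UserId" is an assumption.
    private const string KeyPath = @"SOFTWARE\VainWebSpider";
    private const string ValueName = "UserId";

    // Pure helper: turns a raw Registry value into a positive user ID,
    // or null when the value is missing or is not a positive integer.
    public static long? ParseStoredUserId(object raw)
    {
        long id;
        if (raw != null && long.TryParse(raw.ToString(), out id) && id > 0)
        {
            return id;
        }
        return null;
    }

    // Returns the last stored user ID, or null on a first run
    // (when the key or the value does not exist yet).
    public static long? Load()
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(KeyPath))
        {
            return key == null ? null : ParseStoredUserId(key.GetValue(ValueName));
        }
    }

    // Creates the key if needed and stores the newly selected user ID.
    public static void Save(long userId)
    {
        using (RegistryKey key = Registry.LocalMachine.CreateSubKey(KeyPath))
        {
            key.SetValue(ValueName, userId.ToString(), RegistryValueKind.String);
        }
    }
}
```

At start up, a null result from `Load()` would correspond to the "first run" case, where the user has to pick a user ID before anything can be fetched.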

The main interface (frmMain) when loaded looks as shown above, where all the articles for the currently selected user are presented in a standard Windows Forms DataGridView. The user may sort these entries using any of the column headers, and may also open an article by clicking on the hyperlink provided for it.

The main interface (frmMain) also provides a notify icon within the system tray to allow the VainWebSpider user to hide/show the main interface (frmMain) or exit the application entirely.

From the main interface (frmMain), the VainWebSpider user may also choose to examine the web results using pie charts. Big thanks to Julijan Sribar for his great (award winning, even) pie chart library, which is available here and which I simply had to find a use for. The VainWebSpider user may choose which results are shown within the pie chart, and the tooltips show all the web results as the user hovers over the pie chunks.

The VainWebSpider user may also select a new user from the main interface (frmMain) using the "Select a different user" link provided. The application will grab the new entry via the input box (InputBoxDialog). If the value entered is a positive integer, the relevant website is queried and the new data extracted. This is shown below for CodeProject user number 1, that is Chris Maunder, the originator of CodeProject.
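The "positive integer" check and the forming of the URL could look something like the sketch below. The class and method names (`UserIdInput`, `TryParseUserId`, `BuildArticleListUrl`) are hypothetical and are not taken from the VainWebSpider source; only the URL pattern comes from this article.

```csharp
using System;
using System.Globalization;

public static class UserIdInput
{
    // Accepts only plain digits (after trimming) that form a number above zero.
    public static bool TryParseUserId(string text, out long userId)
    {
        string trimmed = (text ?? string.Empty).Trim();

        // NumberStyles.None rejects signs, separators and embedded whitespace.
        bool parsed = long.TryParse(trimmed, NumberStyles.None,
            CultureInfo.InvariantCulture, out userId);

        if (!parsed || userId <= 0)
        {
            userId = 0;
            return false;
        }
        return true;
    }

    // Forms the article list URL for the given user ID.
    public static string BuildArticleListUrl(long userId)
    {
        return "http://www.codeproject.com/script/articles/list_articles.asp?userid="
            + userId.ToString(CultureInfo.InvariantCulture);
    }
}
```

A caller would pass the dialog's `InputResponse` to `TryParseUserId` and only query the website when it returns true.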

It can be seen that Chris Maunder has quite a few articles, 102 when this article was published. As such, the pie chart does not include any text on the pie chunks. This is for visual clarity: when dealing with a CodeProject user who has loads of articles, the pie simply becomes too cluttered.

What Do You Think?

That's it. I would just like to ask, if you liked the article, please vote for it.

Conclusion

I think this article shows just how easy it is to trawl what could potentially be a very, very large amount of data and extract only the data required. On a personal level, I am fairly happy with how it all works, and will probably use it, as it's quicker than firing up Firefox, going to my articles and examining them. It also shows the results in the nice pie charts (again, thanks to Julijan Sribar for his great (award winning, even) pie chart library, which is available here).

Bugs

None that I know of.

History

  • v1.0: 22/12/06: Initial issue.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
