A class for getting the RSS feed list of a website

Paw Jershauge


14 Aug 2007, CPOL


A very simple class for listing the RSS feeds of a website.

Download source code - 175 KB

Introduction

This is a very simple class for getting the RSS feed list of a website. Just pass the website URL to the constructor and you are off.

Example:

FeedListOfWebsite MyFeedList = new FeedListOfWebsite(
   new Uri("http://www.codeproject.com"));
if (MyFeedList.Success)
{
    foreach (FeedDetail fd in MyFeedList.FeedDetails)
    {
        //Handle each feed here, for example by reading:
        //fd.Name
        //fd.Url
    }
}

The code

Here is the class constructor:

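public FeedListOfWebsite(Uri WebsiteUrl)
//Constructor where you have to pass the Url to look for feeds.
{
    //NOTE: the end of this pattern was garbled in the published listing.
    //Everything after "application" (the rss+xml type and the tag end) is an
    //assumed reconstruction; check the downloadable source for the exact text.
    Regex RegX = new Regex("<link.*type=\"application/rss\\+xml\"[^>]*>");
    //Collect every matching link tag from the HTML of the website
    MatchCollection FeedsOnWebsite = RegX.Matches(GetHtml(WebsiteUrl));
    if (FeedsOnWebsite.Count > 0)
                 //If we find some link tags then...
    {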
        //Set FeedDetail Array
        FeedDetails = new FeedDetail[FeedsOnWebsite.Count];
        int fdi = 0;        //      Array index count up value
        foreach (Match Feed in FeedsOnWebsite)
        //Loop through the link tags
        {
            //Extract data from the html line
            FeedDetails[fdi] = ExtractFeed(Feed.Value.ToString());
            fdi++;      //Count Array index 1 up
        }
        _success = true;
    }
}
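
The matching step in the constructor can be tried on its own, without any network access. The following is only a minimal sketch and not the code from the download: FeedLinkFinder and FindFeedLinks are hypothetical names, and treating the application/rss+xml and application/atom+xml types as feed links is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's download.
public static class FeedLinkFinder
{
    // Assumption: feeds are advertised as <link ... type="application/rss+xml" ...>
    // or the Atom equivalent. [^>]* keeps each match inside a single tag.
    private static readonly Regex LinkTag = new Regex(
        "<link\\b[^>]*type\\s*=\\s*[\"']application/(rss|atom)\\+xml[\"'][^>]*>",
        RegexOptions.IgnoreCase);

    // Returns the raw text of every feed <link> tag found in the HTML.
    public static List<string> FindFeedLinks(string html)
    {
        List<string> tags = new List<string>();
        foreach (Match m in LinkTag.Matches(html ?? ""))
        {
            tags.Add(m.Value);
        }
        return tags;
    }
}
```

Using [^>]* instead of .* stops a match from running across several tags when the page puts them all on one line, so each returned string should be a single link tag.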

Here are the private functions of the class:

private string GetHtml(Uri UriPath)
//A function for getting the HTML code of a website.
{
    try
    {
        //Create a response; the using blocks dispose every object, even on error
        using (HttpWebResponse Hwr = (HttpWebResponse)WebRequest.Create(UriPath).GetResponse())
        using (Stream Hwrstrm = Hwr.GetResponseStream())        //Get streamed data
        using (StreamReader HwrSr = new StreamReader(Hwrstrm))  //Create a StreamReader
        {
            return HwrSr.ReadToEnd();   //Read and return all HTML code of the website
        }
    }
    catch
    {
        return "";  //Return an empty string upon error.
    }
}


private FeedDetail ExtractFeed(string HtmlLine)
//A function for extracting feed data from one line of HTML code
{
    string name = "";
    string url = "";
    
    #region Find The Title

    try
    {
        Match Title = Regex.Match(HtmlLine, "(?<=title=).*");
        if (Title.Success)
        {
            int EndOfTitle = Title.Value.ToString().IndexOf("\"", 1);
            if (EndOfTitle == -1)
            { EndOfTitle = Title.Value.ToString().IndexOf("'", 1); }
            name = Title.Value.ToString().Substring(0, EndOfTitle);
            name = name.Replace("\"", "").Replace("'", "");
        }
    }
    catch { name = "[Error finding name...]"; }

    #endregion

    #region Find the url

    try
    {
        Match Url = Regex.Match(HtmlLine, "(?<=href=).*");
        if (Url.Success)
        {
            int EndOfHref = Url.Value.ToString().IndexOf(" ", 1);
            if (EndOfHref == -1) { EndOfHref = Url.Value.ToString().IndexOf("\"", 1); }
            if (EndOfHref == -1) { EndOfHref = Url.Value.ToString().IndexOf("'", 1); }
            url = Url.Value.ToString().Substring(0, EndOfHref);
            url = url.Replace("\"", "").Replace("'", "");
        }
    }
    catch { url = ""; }

    #endregion

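    //Note: an empty url makes new Uri(url) throw a UriFormatException,
    //and relative URLs are not resolved against the website address here.
    return new FeedDetail(new Uri(url), name);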
}
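
The attribute lookup in ExtractFeed depends on which quote character the page happens to use, and it does not resolve a relative href against the page address. As an alternative sketch (not the author's implementation), a single regular expression can capture a quoted attribute value, and Uri.TryCreate can combine the href with the page address. LinkAttributes, GetAttribute and ResolveUrl are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's download.
public static class LinkAttributes
{
    // Returns the value of a quoted attribute (name="..." or name='...'),
    // or null when the attribute is missing.
    public static string GetAttribute(string tag, string name)
    {
        // The backreference \1 requires the closing quote to match the opening one,
        // so a double quote inside a single-quoted value does not end the value.
        Match m = Regex.Match(
            tag,
            "\\b" + Regex.Escape(name) + "\\s*=\\s*([\"'])(.*?)\\1",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[2].Value : null;
    }

    // Resolves href against the page address; returns null if no valid URL results.
    public static Uri ResolveUrl(Uri page, string href)
    {
        Uri result;
        if (href != null && Uri.TryCreate(page, href, out result))
        {
            return result;
        }
        return null;
    }
}
```

A caller could pass each matched link tag to GetAttribute for "title" and "href", then hand the href to ResolveUrl together with the website Uri. Returning null instead of throwing lets the caller skip a malformed entry rather than lose the whole feed list.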

History

Updated the code after Marc Jacobi pointed out some basic stuff.
I have also added a few more classes, just to illustrate working with RSS feeds.

The classes added are the following:

RssFeed
RssFeedEntries (sub class of RssFeed, holds all entries from the feed)
RssFeedEntry (sub class of RssFeedEntries, holds data for each entry in the feed)

I did not have time to write descriptions for these classes, sorry, so they are posted as is.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)