Introduction
In part one of this series, I covered the history of the RSS versions and Atom, the newer standard for news feeds, and introduced what news reader applications can do. By now you should have a feel for how important blogs are. I hope you have set up your own blog on one of the engines introduced there; you can check my blog at Cairo Cafe. In this part, we'll analyze the RSS formats and take a quick look at the Atom format. We'll then build a custom RSS feed and, for simplicity, consume the same feed that we develop. Note that we'll stick to RSS 2.0, as it's the best known RSS version and the simplest one too.
Background
RSS 1.0, RSS 2.0, and Atom are all XML-based languages. Each format has its own schema, and a feed's output must follow the schema of the format it uses. RSS 2.0 is the simplest of the three and is more widely used than the other formats, so we'll stick to it here. At the end of this article, you will be able to make your own news feeds and consume other websites' feeds, like the latest articles provided by Code Project for the ASP.NET category.
Different feed formats and OPML
RSS 1.0 format
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://purl.org/rss/1.0/" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel rdf:about="http://www.KareemShaker.com/myBlog/Articles.xml">
<title>Kareem Shaker Website</title>
<link>http://www.KareemShaker.com/Articles.aspx</link>
<description>Kareem Shaker is an Egyptian developer who likes to
exchange knowledge with developers all over the world</description>
<image rdf:resource="http://www.KareemShaker.com/Articles/images/Logo.gif"/>
<items>
<rdf:Seq>
<rdf:li resource="http://www.KareemShaker.com/Articles.aspx?Id=3212" />
<rdf:li resource="http://www.KareemShaker.com/Articles.aspx?Id=3552" />
</rdf:Seq>
</items>
</channel>
<image rdf:about="http://www.KareemShaker.com/Articles/images/Logo.gif">
<title>KareemShaker.com</title>
<link>http://www.KareemShaker.com</link>
<url>http://www.KareemShaker.com/Articles/images/Logo.gif</url>
</image>
<item rdf:about="http://www.KareemShaker.com/Articles.aspx?Id=3212">
<title>DataSet Nitty Gritty</title>
<link>http://www.KareemShaker.com/Articles.aspx?Id=3212</link>
<description>Explains all disconnected environment provided by ADO.NET,
you will be using SQL Server and Oracle to build a simple POS System that
posts all the sales to a central headquarter</description>
<dc:date>2004-01-13T17:16:44.5605908-08:00</dc:date>
</item>
<item rdf:about="http://www.KareemShaker.com/Articles.aspx?Id=3552">
<title>Custom Controls Revisited</title>
<link>http://www.KareemShaker.com/Articles.aspx?Id=3552</link>
<description>Build a custom control that encapsulates the functionality
of Image gallery</description>
<dc:date>2004-01-13T17:16:44.5605908-08:00</dc:date>
</item>
</rdf:RDF>
As you can see above, RSS 1.0 is based on RDF and it's namespace qualified. You don't need to know more about RDF, but if you want to dig into it you can review the W3C RDF standard. The items themselves are listed after the closing tag of the "channel" element, and each one is referenced from the "items" collection that sits between the channel's opening and closing tags. This provides the flexibility of referencing any item anywhere within the RSS 1.0 document.
RSS 2.0 format
<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Kareem Shaker Website</title>
<link>http://www.KareemShaker.com/Articles.aspx</link>
<description>Kareem Shaker is an Egyptian developer who
likes to exchange knowledge with developers all
over the world</description>
<image>
<url>http://www.KareemShaker.com/Articles/images/Logo.gif</url>
<title>KareemShaker.com</title>
<link>http://www.KareemShaker.com</link>
</image>
<item>
<title>DataSet Nitty Gritty</title>
<link>http://www.KareemShaker.com/Articles.aspx?Id=3212</link>
<description>Explains all disconnected environment provided by ADO.NET,
you will be using SQL Server and Oracle to build a simple POS System
that posts all the sales to a central headquarter</description>
<pubDate>Wed, 14 Jan 2004 16:16:16 GMT</pubDate>
</item>
<item>
<title>Custom Controls Revisited</title>
<link>http://www.KareemShaker.com/Articles.aspx?Id=3552</link>
<description>Build a custom control that encapsulates the functionality
of Image gallery</description>
<pubDate>Wed, 14 Jan 2004 20:50:44 GMT</pubDate>
</item>
</channel>
</rss>
RSS 2.0 is the simplest standard and it's widely used. The root element is "rss" and its version attribute is mandatory. As you can see, the items are simply serialized within the channel body and no namespaces are used, which makes RSS 2.0 simple to consume and produce. Some elements are required in an RSS 2.0 document and others are optional. You can review the complete detailed schema definition here.
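To make the required/optional distinction concrete, here is a minimal sketch that checks a feed for the three channel elements the RSS 2.0 specification requires (title, link, and description). The class and method names are hypothetical, chosen only for this illustration, and the check is deliberately shallow: it does not validate the items or the optional elements.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper for illustration; not part of the article's projects.
public static class RssChannelCheck
{
    // Returns the names of the required channel elements that are missing.
    public static List<string> MissingChannelElements(string rssXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(rssXml);

        List<string> missing = new List<string>();

        // RSS 2.0 uses no namespace, so a plain XPath is enough here.
        XmlNode channel = doc.SelectSingleNode("/rss/channel");
        if (channel == null)
        {
            missing.Add("channel");
            return missing;
        }

        // title, link, and description are the required channel elements.
        foreach (string name in new string[] { "title", "link", "description" })
        {
            if (channel.SelectSingleNode(name) == null)
            {
                missing.Add(name);
            }
        }
        return missing;
    }
}
```

Calling RssChannelCheck.MissingChannelElements with the RSS 2.0 document shown above should return an empty list, since that channel carries all three elements.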
Atom 0.3 format
<?xml version="1.0" encoding="utf-8"?>
<feed version="0.3" xml:lang="en-us" xmlns="http://purl.org/atom/ns#">
<title>Kareem Shaker Atom Feeder</title>
<link>http://www.KareemShaker.com/Articles.aspx</link>
<modified>2004-01-13T17:16:45.0004199-07:00</modified>
<tagline>Kareem Shaker is an Egyptian developer who likes to exchange
knowledge with developers all over the world</tagline>
<author>
<name>Kareem Shaker</name>
</author>
<entry>
<title>DataSet Nitty Gritty</title>
<link>http://www.KareemShaker.com/Articles.aspx?Id=3212</link>
<created>Wed, 14 Jan 2004 16:16:16 GMT</created>
<content type="text/html" mode="xml">
<body xmlns="http://www.w3.org/1999/xhtml">
<p>Explains all disconnected environment provided by
ADO.NET, you will be using SQL Server and Oracle to build a simple
POS System that posts all the sales to a central headquarter</p>
</body>
</content>
</entry>
<entry>
<title>Custom Controls Revisited</title>
<link>http://www.KareemShaker.com/Articles.aspx?Id=3552</link>
<created>Wed, 14 Jan 2004 16:02:16 GMT</created>
<content type="text/html" mode="xml">
<body xmlns="http://www.w3.org/1999/xhtml">
<p>Build a custom control that encapsulates the functionality
of Image gallery</p>
</body>
</content>
</entry>
</feed>
The Atom root element is "feed" and the version attribute is mandatory. The Atom standard sits somewhere between RSS 1.0 and RSS 2.0: it's namespace qualified but it's not based on RDF. Here you have an "entry" element instead of "item". For further information you can visit the Atom official website.
OPML format
Outline Processor Markup Language (OPML) is nothing more than an XML file, and it is very simple to grasp. The main element is "outline", and you just have to supply the type, title, description, xmlUrl, and htmlUrl attributes. Most news readers support importing and exporting OPML files. I find this especially useful when I take some featured feeds from a friend: he just exports his channels as an OPML file and passes it to me, and I can then import that file easily.
<?xml version="1.0" encoding="utf-8"?>
<opml>
<head>
<title>Kareem Shaker's HotList</title>
</head>
<body>
<outline type="rss" title="Arabic Developers Bloggers"
description="This is a great collection of Arabic programming bloggers"
xmlUrl="http://www.arabicbloggers.org/GiantFeed.rss"
htmlUrl="http://www.ArabicBloggers.org" />
<outline type="rss"
title="Kareem Shaker ASP.NET Blog"
description="Kareem Shaker's ASP.NET Community Central for Arabs"
xmlUrl="http://www.KareemShaker.com/blog.xml"
htmlUrl="http://www.KareemShaker.com" />
<outline type="rss"
title="MacroCell Developers Blogs"
description="MacroCell is an innovative software house"
xmlUrl="http://www.MacroCell.com/team/blogs.rss"
htmlUrl="http://www.MacroCell.com/eg/team" />
</body>
</opml>
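Importing such a file comes down to collecting the xmlUrl attribute of every outline element. The following is a minimal sketch of that step using XmlDocument; the class and method names are hypothetical, and a real news reader would also read the title and htmlUrl attributes and handle nested outlines as folders.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Hypothetical helper for illustration; not taken from any news reader.
public static class OpmlReader
{
    // Returns the feed URLs of all outline elements that carry an xmlUrl attribute.
    public static List<string> ReadFeedUrls(string opmlXml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(opmlXml);

        List<string> urls = new List<string>();

        // "//outline" also matches nested outlines, wherever they appear in the body.
        foreach (XmlNode outline in doc.SelectNodes("//outline[@xmlUrl]"))
        {
            urls.Add(outline.Attributes["xmlUrl"].Value);
        }
        return urls;
    }
}
```

Passed the OPML document above, ReadFeedUrls should return the three xmlUrl values in document order.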
Generating an RSS 2.0 document using the Repeater control
An RSS 2.0 document is nothing more than an XML document that adheres to one schema. We can generate an XML document in several ways: we can use the XmlTextWriter class from the System.Xml namespace, the DataSet.WriteXml method, or even the System.IO classes; of these, XmlTextWriter is the easiest to code against. If you are not familiar with these classes, you can review the article at C# Corner. However, as you saw in the RSS 2.0 document above, the output follows a fixed, standard layout, so we can use a Repeater control to output the RSS document we want even more easily. Indeed, after reviewing many articles and ways to output an RSS document, I found that this is the easiest one. For the sake of simplicity, we'll generate the RSS items on the fly, but in the real world you would get the RSS items from a SQL Server database, or you could point to an RSS file that's generated periodically.
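For comparison, the XmlTextWriter route mentioned above might look like the following sketch. It is not the approach this article uses; the class name, method name, and parameters are hypothetical, and each item carries only a title to keep the example short.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; the article itself uses a Repeater instead.
public static class RssWriterSketch
{
    // Builds a small RSS 2.0 document in memory and returns it as a string.
    public static string BuildFeed(string title, string link, string description, string[] itemTitles)
    {
        StringWriter buffer = new StringWriter();
        XmlTextWriter writer = new XmlTextWriter(buffer);
        writer.Formatting = Formatting.Indented;

        writer.WriteStartElement("rss");
        writer.WriteAttributeString("version", "2.0");

        writer.WriteStartElement("channel");
        writer.WriteElementString("title", title);
        writer.WriteElementString("link", link);
        writer.WriteElementString("description", description);

        // One item element per title; WriteElementString escapes the text for us.
        foreach (string itemTitle in itemTitles)
        {
            writer.WriteStartElement("item");
            writer.WriteElementString("title", itemTitle);
            writer.WriteEndElement(); // item
        }

        writer.WriteEndElement(); // channel
        writer.WriteEndElement(); // rss
        writer.Close();

        return buffer.ToString();
    }
}
```

The writer takes care of escaping and of closing tags in the right order, which is its main advantage over concatenating strings by hand. The Repeater trades that safety for a template that looks like the final document.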
Page HTML
The page markup hard-codes the fixed lines of the RSS document and binds only the item values that we want to output inside the Repeater. Don't forget to set the ContentType attribute of the page directive to "text/xml".
Page Code behind
We will be generating the RSS items dynamically, and we hold the number of items to generate in web.config, in a variable called rssItemsNumber. I check this value before I generate the RSS items. You should add this variable inside the configuration node, before the system.web node.
<appSettings>
<add key="rssItemsNumber" value="10"></add>
</appSettings>
You should read this value at the page load:
System.Int32 numberOfGeneratedItems =
System.Int32.Parse(
System.Configuration.ConfigurationSettings.AppSettings["rssItemsNumber"]);
rssProducts.DataSource = GenerateRss(numberOfGeneratedItems);
rssProducts.DataBind();
The GenerateRss function is responsible for producing a DataTable that we bind to the Repeater control; we then call the Repeater's DataBind method to bind the data. In GenerateRss, we just build a DataTable object on the fly by adding the required columns to it. We then fill it by looping numberOfItems times and generating one RSS item per iteration. The number of RSS items is read from web.config as shown above:
private DataTable GenerateRss(int numberOfItems)
{
    DataTable dtItems = new DataTable("rssItems");

    // Auto-incrementing integer key for each item.
    DataColumn dcItem = new DataColumn();
    dcItem.ColumnName = "Id";
    dcItem.DataType = System.Type.GetType("System.Int32");
    dcItem.AutoIncrement = true;
    dtItems.Columns.Add(dcItem);

    // One column per RSS item element: title, description, pubDate, link.
    dcItem = new DataColumn();
    dcItem.ColumnName = "title";
    dcItem.DataType = System.Type.GetType("System.String");
    dtItems.Columns.Add(dcItem);

    dcItem = new DataColumn();
    dcItem.ColumnName = "description";
    dcItem.DataType = System.Type.GetType("System.String");
    dtItems.Columns.Add(dcItem);

    dcItem = new DataColumn();
    dcItem.ColumnName = "pubDate";
    dcItem.DataType = System.Type.GetType("System.DateTime");
    dtItems.Columns.Add(dcItem);

    dcItem = new DataColumn();
    dcItem.ColumnName = "link";
    dcItem.DataType = System.Type.GetType("System.String");
    dtItems.Columns.Add(dcItem);

    // Use the Id column as the primary key.
    DataColumn[] pk = {dtItems.Columns[0]};
    dtItems.PrimaryKey = pk;

    // Generate the requested number of sample items.
    for (int iCounter = 1; iCounter <= numberOfItems; iCounter++)
    {
        DataRow drItem = dtItems.NewRow();
        drItem["title"] =
            "Product No. " + iCounter.ToString() + " From MacroCell";
        drItem["description"] = "Product " + iCounter.ToString() +
            " is the most promising product in our wide group";
        drItem["pubDate"] = DateTime.Now;
        drItem["link"] = "http://www.kareemshaker.com/products.aspx?id="
            + iCounter.ToString();
        dtItems.Rows.Add(drItem);
    }

    return dtItems;
}
The code is straightforward. Once the DataTable is returned and bound to the Repeater, you get the resulting RSS 2.0 document as an XML file. Don't forget that we have set the ContentType attribute to "text/xml".
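One detail worth watching when binding the pubDate column: RSS 2.0 expects dates in the RFC 822 style used in the sample feed earlier ("Wed, 14 Jan 2004 16:16:16 GMT"), while a DateTime printed with its default formatting depends on the server's culture. A minimal sketch of a formatting helper follows; the class and method names are hypothetical, and how you call it from the Repeater template is left to your page.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the article's project.
public static class RssDates
{
    // Formats a DateTime the way RSS 2.0 pubDate values are written.
    // The "r" format specifier produces "ddd, dd MMM yyyy HH:mm:ss GMT",
    // so the value is converted to UTC first to keep the "GMT" suffix truthful.
    public static string ToPubDate(DateTime value)
    {
        return value.ToUniversalTime().ToString("r", CultureInfo.InvariantCulture);
    }
}
```

For example, RssDates.ToPubDate(new DateTime(2004, 1, 14, 16, 16, 16, DateTimeKind.Utc)) returns "Wed, 14 Jan 2004 16:16:16 GMT". Note that ToUniversalTime treats a DateTime of unspecified kind as local time.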
If you have a news reader application installed, you can add this channel to it using the URL "http://localhost/RssFeed/rss.aspx" (replace localhost with your server) and see how the news reader handles the RSS document. If it reads the feed correctly and throws no exceptions or errors, the page is emitting a well-formed RSS document.
Consuming RSS 2.0 news feed
You can consume the RSS feed we have just developed with a few lines of code, and the same code consumes any other RSS feed. You will find a web project called RSSReader. It contains just one WebForm, in which you will find a grid that is bound to the RSS feed we read.
In the code behind, we read the RSS feed using the DataSet method ReadXml. This is the easiest way I have seen to consume an RSS feed. If you look closely at how the DataSet maps hierarchical data (XML nodes) into tabular data (DataTables), you will find that each nesting level gets its own DataTable. The item node is nested inside the rss and channel nodes, so with a zero-based index the table that holds all the items is at index 2, that is, the third table. If we bind the grid to that table, it will hold all the RSS items. If you want to read the channel's title or description, you should read the table at index 1. Write the following code in the page load event handler:
private void Page_Load(object sender, System.EventArgs e)
{
DataSet dsFeed = new DataSet("Feed");
dsFeed.ReadXml("http://localhost/RssFeed/rss.aspx");
recentProducts.DataSource = dsFeed.Tables[2];
recentProducts.DataBind();
}
You can supply any other URL to the ReadXml method. For example, try the CodeProject latest articles RSS feed, and you will get all the latest articles listed.
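When you point the code at other sites' feeds, the table index can shift: a channel that contains other nested elements, such as image, gets extra tables for them. A variation that looks the tables up by element name avoids that assumption. The sketch below reads the feed from a string so it can run without a web server; the class and method names are hypothetical.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical helper for illustration; not part of the RSSReader project.
public static class FeedTables
{
    // Loads an RSS 2.0 document and returns the table inferred for the item elements,
    // or null when the feed contains no items.
    public static DataTable ReadItems(string rssXml)
    {
        DataSet dsFeed = new DataSet("Feed");

        // ReadXml infers one table per nested element; tables are named after the elements.
        dsFeed.ReadXml(new StringReader(rssXml));

        // Look the table up by name instead of relying on its position.
        return dsFeed.Tables["item"];
    }
}
```

In the page, the same idea is a one-line change: bind the grid to dsFeed.Tables["item"] instead of dsFeed.Tables[2], and read the channel's title from dsFeed.Tables["channel"].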
Conclusion
In this part, we have seen the various formats for news feeds. We dug into RSS 2.0, as it's the simplest and most widely used one. You have also seen how to make your own feed and how to consume others' feeds. I may write a third part to discuss more advanced topics.