
Generating a Valid Sitemap Automatically with .NET

30 Dec 2012
In this article, I show how to automatically generate a valid sitemap for use with search engines.

Overview

Jambr is still a young site, so its content and structure are changing. It originally existed on two URLs (www and non-www), and Google was indexing both of them. On top of that, I recently changed the URL structure for articles to be more SEO-friendly. All of these changes confuse search engine indexers, and one way to help them out is to provide a sitemap. My rough list of requirements was:

  • To comply fully with the Sitemap protocol
  • To generate automatically, when /sitemap.xml was called
  • To be able to decorate fixed controller actions with an attribute that includes them in the map
  • To provide a simple way of adding the dynamic content
  • To cache the output for a period of time

Implementation

First things first: we need an XML document that matches the Sitemap protocol. We create a new XmlDocument, then add the root element "urlset" in the Sitemap protocol namespace, which is what emits the xmlns attribute.

''' <summary>
''' The scheme we add to the document
''' </summary>
Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"

''' <summary>
''' The full URL to your website, for example http://www.jambr.co.uk
''' </summary>
Private Property FullyQualifiedUrl As String

Private _document As XmlDocument
''' <summary>
''' Returns the XML document
''' </summary>
Private ReadOnly Property Document As XmlDocument
    Get
        Return _document
    End Get
End Property

''' <summary>
''' Create a new instance of the SiteMapGenerator, initialise the XML document
''' and add the required namespaces
''' </summary>
''' <param name="FullyQualifiedUrl">The full URL to your website, 
''' for example http://www.jambr.co.uk</param>
Public Sub New(ByVal FullyQualifiedUrl As String)

    'Normalise slashes and drop any trailing slash so AddUrl never produces "//"
    Me.FullyQualifiedUrl = FullyQualifiedUrl.Replace("\", "/").TrimEnd("/"c)

    _document = New XmlDocument
    Document.AppendChild(Document.CreateNode(XmlNodeType.XmlDeclaration, Nothing, Nothing))

    'Create the root element and add the sitemap namespace
    Dim rootelement = Document.CreateElement("urlset", SiteMapSchemaURL)
    Document.AppendChild(rootelement)

End Sub
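To make the result of the constructor concrete, here is a minimal console sketch of the same idea outside the class. The module name and the example.com URL are placeholders of mine, and I pass an explicit UTF-8 encoding to CreateXmlDeclaration, which the constructor above does not do.

```vbnet
Imports System
Imports System.Xml

Module UrlSetSketch
    ' Namespace URI taken from the SiteMapSchemaURL constant above
    Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"

    Sub Main()
        Dim doc As New XmlDocument()
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", "UTF-8", Nothing))

        ' Passing the namespace to CreateElement is what emits the xmlns attribute
        Dim root As XmlElement = doc.CreateElement("urlset", SiteMapSchemaURL)
        doc.AppendChild(root)

        ' One hand-built url/loc entry; the address is a placeholder
        Dim url As XmlElement = doc.CreateElement("url", SiteMapSchemaURL)
        Dim loc As XmlElement = doc.CreateElement("loc", SiteMapSchemaURL)
        loc.InnerText = "http://www.example.com/"
        url.AppendChild(loc)
        root.AppendChild(url)

        ' Prints the whole document on one line
        Console.WriteLine(doc.OuterXml)
    End Sub
End Module
```

Child elements need the same namespace as the root. If "url" or "loc" were created without it, they would be serialised with an empty xmlns attribute and the sitemap would no longer validate.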

Next, I wanted a flexible method for adding new URLs that accepts all the valid child elements of url as optional parameters and only adds the ones that are actually passed:

''' <summary>
''' Adds a URL to the site map
''' </summary>
''' <param name="Location">The URL to the page, will check for your domain and add if required.</param>
''' <param name="LastModified">Optional: The date the URL was last modified</param>
''' <param name="ChangeFrequency">Optional: The expected change frequency of the URL</param>
''' <param name="Priority">Optional: The priority of the page, ranging from 0.0 to 1.0, default is 0.5
''' </param>
Public Sub AddUrl(ByVal Location As String,
                  Optional ByVal ChangeFrequency As ChangeFrequency = Nothing,
                  Optional ByVal Priority As Decimal = Nothing,
                  Optional LastModified As DateTime = Nothing)

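    'sanitise the url: normalise slashes and prefix the domain if it is missing.
    'StartsWith avoids a false match when the domain only appears later in the url
    Location = Location.Replace("\", "/")
    If Not Location.StartsWith(FullyQualifiedUrl, StringComparison.OrdinalIgnoreCase) Then
        Location = FullyQualifiedUrl & If(Left(Location, 1) = "/", Location, "/" & Location)
    End If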

    'check we haven't added it already in a stored list of urls we've added
    If AddedUrls.Contains(Location) Then Exit Sub
    AddedUrls.Add(Location)

    'Required elements
    Dim newUrl = Document.CreateElement("url", SiteMapSchemaURL)
    newUrl.AppendChild(CreateTextElement("loc", Location))

    'Optional Elements
    If Not LastModified = Nothing Then
        newUrl.AppendChild(CreateTextElement("lastmod", LastModified.ToW3C))
    End If

    If Not ChangeFrequency = Nothing Then
        newUrl.AppendChild(CreateTextElement("changefreq", ChangeFrequency.ToString))
    End If

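    'Note: Nothing is 0 for a Decimal, so a priority of exactly 0.0 counts as "not supplied"
    If Not Priority = Nothing Then
        'Use the invariant culture so the decimal separator is always "."
        newUrl.AppendChild(CreateTextElement("priority", Priority.ToString(CultureInfo.InvariantCulture)))
    End If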

    Document.DocumentElement.AppendChild(newUrl)

End Sub
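AddUrl relies on two helpers that are not listed above: CreateTextElement and the ToW3C extension on DateTime. The versions below are my own guesses at what they might look like, not the code from the download. In particular, this CreateTextElement takes the document as an extra parameter so the sketch can run on its own, and ToW3C always converts to UTC.

```vbnet
Imports System
Imports System.Globalization
Imports System.Runtime.CompilerServices
Imports System.Xml

Module SiteMapHelpersSketch
    Private Const SiteMapSchemaURL As String = "http://www.sitemaps.org/schemas/sitemap/0.9"

    ' Hypothetical stand-in for ToW3C: formats a date in the W3C Datetime
    ' profile, always as UTC with a trailing Z. Dates of unspecified kind
    ' are treated as local time by ToUniversalTime.
    <Extension()>
    Public Function ToW3C(ByVal value As DateTime) As String
        Return value.ToUniversalTime().ToString("yyyy-MM-dd'T'HH:mm:ss'Z'", CultureInfo.InvariantCulture)
    End Function

    ' Hypothetical stand-in for CreateTextElement: an element in the sitemap
    ' namespace whose only content is the given text
    Public Function CreateTextElement(ByVal doc As XmlDocument,
                                      ByVal name As String,
                                      ByVal value As String) As XmlElement
        Dim element As XmlElement = doc.CreateElement(name, SiteMapSchemaURL)
        element.InnerText = value
        Return element
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        Dim url As XmlElement = doc.CreateElement("url", SiteMapSchemaURL)
        Dim modified As New DateTime(2012, 12, 30, 9, 30, 0, DateTimeKind.Utc)
        Dim priority As Decimal = 0.7D

        url.AppendChild(CreateTextElement(doc, "loc", "http://www.example.com/articles"))
        url.AppendChild(CreateTextElement(doc, "lastmod", modified.ToW3C()))
        url.AppendChild(CreateTextElement(doc, "priority", priority.ToString(CultureInfo.InvariantCulture)))

        Console.WriteLine(url.OuterXml)
    End Sub
End Module
```

Formatting both the date and the priority with the invariant culture keeps the output stable regardless of the server's regional settings.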

Reflection

I mentioned previously that I wanted an easy way to add URLs. I didn't want a class that needed me to call AddUrl() over and over for all my pages, so I created a custom attribute, SiteMapAttribute, that I could just stick at the top of the controller actions I wanted to map, like this:

<SiteMap(ChangeFrequency:=ChangeFrequency.daily, Priority:=0.7)>
Function Index() As ActionResult
    Return View(New HomeViewModel)
End Function
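The attribute class itself is not listed in this article, so the following is only a sketch of one possible shape. The enum members (including the "unspecified" zero value), the property types and DemoController are assumptions of mine. Attribute arguments must be compile-time constants, which is why Priority is a Double and LastModified a String here rather than Decimal and DateTime.

```vbnet
Imports System
Imports System.Globalization
Imports System.Reflection

' Assumed enum. The zero member means "not supplied", so passing Nothing
' never collides with a real changefreq value.
Public Enum ChangeFrequency
    unspecified = 0
    always
    hourly
    daily
    weekly
    monthly
    yearly
    never
End Enum

' Hypothetical shape of the attribute used in the example above
<AttributeUsage(AttributeTargets.Method, AllowMultiple:=False)>
Public Class SiteMapAttribute
    Inherits Attribute

    Public Property ChangeFrequency As ChangeFrequency
    Public Property Priority As Double
    Public Property LastModified As String
End Class

' Stand-in for a controller, so the sketch has no MVC dependency
Public Class DemoController
    <SiteMap(ChangeFrequency:=ChangeFrequency.daily, Priority:=0.7)>
    Public Function Index() As String
        Return "index"
    End Function
End Class

Module SiteMapAttributeSketch
    Sub Main()
        ' Read the attribute back from the decorated method
        Dim method As MethodInfo = GetType(DemoController).GetMethod("Index")
        Dim attr As SiteMapAttribute =
            CType(Attribute.GetCustomAttribute(method, GetType(SiteMapAttribute)), SiteMapAttribute)

        ' Prints: daily 0.7
        Console.WriteLine(attr.ChangeFrequency.ToString() & " " &
                          attr.Priority.ToString(CultureInfo.InvariantCulture))
    End Sub
End Module
```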

Neat, huh? Now, you've probably realised that this only works for static URLs; dynamic actions that require parameters won't work. In the context of Jambr, I have two controllers that serve dynamic content: Articles and News. I decided to create an interface with a subroutine that the sitemap generator can call, like this:

''' <summary>
''' Populate the site map with the dynamic data
''' </summary>
''' <param name="generator">the generator object that gets passed</param>
Public Sub PopulateSiteMap(ByRef generator As SiteMapGenerator) Implements ISiteMap.PopulateSiteMap

    'We need to initialise the UrlHelper because of the way we've invoked this method
    Url = New UrlHelper(System.Web.HttpContext.Current.Request.RequestContext)

    Using db As New JambrDBContext

        'Lets add dynamic data, starting with my articles
        Dim articles = (db.
                       ArticlePosts.
                       Where(Function(w) w.IsLive = True).
                       OrderByDescending(Function(o) o.LastUpdated).
                       Select(Function(s) New With {.SEOUrl = s.SEOUrl,
                                                    .LastUpdated = s.LastUpdated})).tolist

        'Add my root element, with a last modified date of the latest article.
        'Guard against an empty list, as First would throw if there are no live articles
        If articles.Any() Then
            generator.AddUrl(Url.Action("Index", "Article"), _
            ChangeFrequency.daily, Nothing, articles.First.LastUpdated)
            'Add the RSS feed, as it has the same last updated date
            generator.AddUrl(Url.Action("RSS", "Article"), _
            ChangeFrequency.daily, Nothing, articles.First.LastUpdated)
        End If

        'Add my other elements
        For Each post In articles
            generator.AddUrl(Url.Action("View", "Article", _
            New With {.SEOUrl = post.SEOUrl}), Nothing, Nothing, post.LastUpdated)
        Next
        articles = Nothing

    End Using

End Sub
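The ISiteMap interface is not listed here, but its contract can be inferred from the Implements clause above. The sketch below is an assumption on my part: SiteMapGenerator is cut down to a stub that only collects strings, and NewsSource is a made-up class standing in for a controller.

```vbnet
Imports System
Imports System.Collections.Generic

' Minimal stand-in for the article's generator so the sketch compiles on its own
Public Class SiteMapGenerator
    Public ReadOnly Urls As New List(Of String)

    Public Sub AddUrl(ByVal location As String)
        If Not Urls.Contains(location) Then Urls.Add(location)
    End Sub
End Class

' Assumed contract, inferred from the Implements clause above
Public Interface ISiteMap
    Sub PopulateSiteMap(ByRef generator As SiteMapGenerator)
End Interface

' Hypothetical controller-like class contributing two dynamic URLs
Public Class NewsSource
    Implements ISiteMap

    Public Sub PopulateSiteMap(ByRef generator As SiteMapGenerator) Implements ISiteMap.PopulateSiteMap
        For Each slug As String In New String() {"launch", "update"}
            generator.AddUrl("/news/" & slug)
        Next
    End Sub
End Class

Module ISiteMapSketch
    Sub Main()
        Dim generator As New SiteMapGenerator()
        Dim source As ISiteMap = New NewsSource()
        source.PopulateSiteMap(generator)

        ' Prints: /news/launch, /news/update
        Console.WriteLine(String.Join(", ", generator.Urls))
    End Sub
End Module
```

Because SiteMapGenerator is a class, ByVal would work just as well as ByRef here: the method only calls members on the generator and never reassigns the reference.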

Using reflection, we then look for controllers that implement ISiteMap and for action methods decorated with the SiteMapAttribute, and read the associated details like so:

''' <summary>
''' When called, the site map generator will attempt to load any action methods
''' that are decorated with the SiteMapAttribute from your controller classes and
''' add a url for them based on it
''' </summary>
''' <remarks></remarks>
Public Sub LoadFromAttribute()

    'Get all the controllers in the project
    Dim controllers = Assembly.
                      GetExecutingAssembly.
                      GetTypes().
                      Where(Function(t) GetType(System.Web.Mvc.ControllerBase).IsAssignableFrom(t))

    'First we want to get all controllers that implement the ISiteMap interface and fire the method
    For Each c In controllers.Where(Function(t) GetType(ISiteMap).IsAssignableFrom(t))

        'Create an instance
        Dim obj As ISiteMap = Activator.CreateInstance(c, True)
        obj.PopulateSiteMap(Me)

    Next

    'Now get all the methods which are decorated with the SiteMapAttribute
    Dim objs = (From c In controllers
               From act In c.GetMembers
               Where act.GetCustomAttributes(True).OfType(Of SiteMapAttribute)().Count > 0
               Select New With {.controller = c,
                                .action = act,
                                .actionnameattribute = act.GetCustomAttributes(True).
                                    OfType(Of ActionNameAttribute)().FirstOrDefault,
                                .sitemapattribute = act.GetCustomAttributes(True).
                                    OfType(Of SiteMapAttribute)().FirstOrDefault}).ToList

    'We need a url helper to help us generate the url path
    Dim UrlHelper = New UrlHelper(HttpContext.Current.Request.RequestContext)

    For Each p In objs
        'Now we have the objects, we need to build the url. 
        'We need to look out for the ActionNameAttribute in case people are using it
        'to name their action methods, we also need to remove Controller from the name of the controller
        Dim url As String = UrlHelper.Action(
            If(p.actionnameattribute Is Nothing, p.action.Name, p.actionnameattribute.Name),
            p.controller.Name.Replace("Controller", ""))

        'Add the object
        AddUrl(url,
               p.sitemapattribute.ChangeFrequency,
               p.sitemapattribute.Priority,
               If(p.sitemapattribute.LastModified Is Nothing,
                  Nothing,
                  DateTime.Parse(p.sitemapattribute.LastModified, (New CultureInfo("en-us")))
                  )
               )
    Next

End Sub

Now, add a route for sitemap.xml in your RouteConfig.vb (remember, this article is based around ASP.NET MVC).

'This is to overwrite the sitemap request
routes.MapRoute( _
    name:="SiteMap", _
    url:="sitemap.xml", _
    defaults:=New With {.controller = "SiteMap", .action = "Index"})

Set the controller and action to wherever you're going to put your method; I decided to put mine in a new controller. Finally, create your action method. I've decorated mine with the OutputCache attribute and set it to 6 hours (21600 seconds), with the ability to bypass the cached copy by using the query string parameter ClearCache.

''' <summary>
''' Returns the site map
''' </summary>
<OutputCache(Duration:=21600, VaryByParam:="ClearCache", Location:=OutputCacheLocation.Server)>
Function Index() As ActionResult

    'Create our site map
    Dim p As New SiteMapGenerator("http://www.jambr.co.uk")

    'Load any methods which are tagged with the attribute
    p.LoadFromAttribute()

    'Return the content
    'Return the content (the MIME type uses a forward slash)
    Return Content(p.ToString, "text/xml")

End Function

Something to note here is that I have created a ToString method, which takes the XmlDocument and outputs it as a UTF-8 string. UTF-8 matters because the Sitemap protocol requires the file to be UTF-8 encoded, while a plain StringWriter reports UTF-16. The source code therefore includes another class that creates a UTF-8 based string writer.
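The UTF-8 string writer lives in the download rather than in this article, so here is a minimal sketch of the usual approach, under the assumption that the real class works along similar lines. The class name Utf8StringWriter and the module around it are my own.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

' StringWriter reports UTF-16 by default. Overriding Encoding changes the
' encoding that XmlDocument.Save writes into the XML declaration.
Public Class Utf8StringWriter
    Inherits StringWriter

    Public Overrides ReadOnly Property Encoding As System.Text.Encoding
        Get
            Return System.Text.Encoding.UTF8
        End Get
    End Property
End Class

Module Utf8WriterSketch
    Sub Main()
        Dim doc As New XmlDocument()
        doc.AppendChild(doc.CreateXmlDeclaration("1.0", Nothing, Nothing))
        doc.AppendChild(doc.CreateElement("urlset", "http://www.sitemaps.org/schemas/sitemap/0.9"))

        Using writer As New Utf8StringWriter()
            ' Save takes the declared encoding from the writer
            doc.Save(writer)
            Console.WriteLine(writer.ToString())
        End Using
    End Sub
End Module
```

With a plain StringWriter in place of Utf8StringWriter, the declaration would advertise utf-16 instead, which would not match the bytes actually sent in the response.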

Conclusion

I hope this article has shown you a clean way to implement a dynamic sitemap in ASP.NET MVC using flexible attributes. The full source code can be downloaded from here. If you want to see my sitemap, check it out here, and as usual, if you have any questions, please drop me a comment!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
