
Search Engine Optimization in AJAX site

11 Aug 2010, CPOL
How to achieve search engine optimization in AJAX site

Introduction

Search engine crawlers typically do not follow JavaScript links. If you use AJAX to load text dynamically, that text will not be indexed by search engines. URL rewriting is one method that can be used to make crawlers index all the links on the site.

Objective

  1. If a normal user hits the site, he should see the AJAX-enabled site with JavaScript links.
  2. If a crawler hits the site, the links should be anchor links which can be indexed.
  3. If an anchor link that was indexed by the crawler is browsed by a normal user, the dynamic content should be visible while the address bar keeps the URL of the AJAX page.

Example: If a user browses the site http://example.com and clicks the link 'First Article', the content of the first article is loaded dynamically. If a search crawler browses the site, the link will instead be http://example.com/Articles/FirstArticle, which is crawlable. If a normal user later requests this indexed URL, First Article should be displayed while the URL shown remains http://example.com.

Using the Code

URL Rewriting

  1. Write the anchor tag as <a href="http://www.example.com/Aditya/Bhave/">Aditya Bhave</a> in HTML.
  2. When the user clicks this link, the browser sends a request with the above URL.
  3. This URL is processed by the custom HttpModule.
  4. The HttpModule converts the above URL into the URL required by the application, i.e. http://www.example.com/alllinks.aspx?firstname=Aditya&lastname=Bhave
  5. The request is processed as if it had been made for http://www.example.com/alllinks.aspx?firstname=Aditya&lastname=Bhave
  6. The response is sent to the browser while the URL still remains http://www.example.com/Aditya/Bhave/

This is comparable to Server.Transfer("/…./…") in ASP.NET: the server processes a different resource while the URL in the browser stays unchanged.

Simple Example of HttpModule which rewrites URL

C#
using System;
using System.Web;

public class SimpleRewriter : IHttpModule
{
    public void Dispose()
    {
        // Nothing to release.
    }

    public void Init(HttpApplication context)
    {
        // Inspect every request before ASP.NET maps it to a handler.
        context.BeginRequest += new EventHandler(context_BeginRequest);
    }

    void context_BeginRequest(object sender, EventArgs e)
    {
        HttpContext httpContext = ((HttpApplication)sender).Context;

        try
        {
            // Drop everything after the last '/', e.g. "/Aditya/Bhave/" -> "/Aditya/Bhave".
            string path = httpContext.Request.Path;
            string requestUrl = path.Substring(0, path.LastIndexOf("/"));

            // "/Aditya/Bhave" splits into { "", "Aditya", "Bhave" }.
            string[] parameters = requestUrl.Split(new char[] { '/' });

            // Note: this simple check matches any path with two or more segments
            // before the last '/', so a real site needs a stricter rule.
            if (parameters.Length > 2)
            {
                int paramLength = parameters.Length;
                string firstname = parameters[paramLength - 2];
                string lastname = parameters[paramLength - 1];

                // Encode the values so that characters such as '&' cannot break the query string.
                httpContext.RewritePath(string.Format(
                    "~/alllinks.aspx?firstname={0}&lastname={1}",
                    HttpUtility.UrlEncode(firstname),
                    HttpUtility.UrlEncode(lastname)));
            }
        }
        catch (Exception)
        {
            //Redirect to error page
            //Or throw custom exception
        }
    }
}
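To see which paths the module above rewrites, the parsing rule can be pulled out into a plain function and tried without a web server. The following is only a sketch: the class name `RewriteRule` and the method `ToInternalUrl` are hypothetical and not part of the attached solution, and `Uri.EscapeDataString` is used here in place of any System.Web encoder so that the snippet has no ASP.NET dependency.

```csharp
using System;

// Hypothetical helper (not in the attached solution) that mirrors the
// path handling of the SimpleRewriter module.
public static class RewriteRule
{
    // Returns the internal URL for a friendly path, or null when the
    // path would be left untouched by the module.
    public static string ToInternalUrl(string path)
    {
        int lastSlash = path.LastIndexOf('/');
        if (lastSlash < 0)
        {
            return null; // No '/' at all: nothing to parse.
        }

        // "/Aditya/Bhave/" -> "/Aditya/Bhave" -> { "", "Aditya", "Bhave" }
        string[] parameters = path.Substring(0, lastSlash).Split('/');
        if (parameters.Length <= 2)
        {
            return null; // Fewer than two segments before the last '/'.
        }

        string firstname = parameters[parameters.Length - 2];
        string lastname = parameters[parameters.Length - 1];

        return string.Format(
            "~/alllinks.aspx?firstname={0}&lastname={1}",
            Uri.EscapeDataString(firstname),
            Uri.EscapeDataString(lastname));
    }
}
```

With this rule, both "/Aditya/Bhave/" and "/alllinks.aspx/Aditya/Bhave/" map to "~/alllinks.aspx?firstname=Aditya&lastname=Bhave", while "/alllinks.aspx" is left alone. A path such as "/images/icons/logo.png" is also rewritten (to firstname=images, lastname=icons), which is why a production rule would probably need to be more selective.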

URL Rewriting as a Search Engine Optimization Technique in AJAX Site

  1. In an AJAX site, all links are JavaScript links.
  2. If we use a LinkButton, then the link will be:
    HTML
    <a href="javascript:__doPostBack('….','…..');">Aditya Bhave</a> 

    as rendered in the browser.

  3. If a crawler comes across a JavaScript link, it is skipped (not indexed).
  4. If we want our link, Aditya Bhave, to be indexed by the crawler and searchable in a search engine, then the link should be in an anchor tag, e.g.
    HTML
    <a href="http://www.example.com/Aditya/Bhave/">Aditya Bhave</a> 
  5. Most crawlers do not support JavaScript. When a crawler hits the site, the user agent sent with the request tells us that this is a crawler and not a normal user, so we can check whether the browser supports JavaScript. If it does not, we should hide the LinkButtons and add anchor tags at runtime. So there should be two links:
    ASP.NET
    <asp:LinkButton ID="lnkLink" runat="server" Text='Some Text'    
    OnClick="lnkLink_Click"></asp:LinkButton> 

    and

    ASP.NET
    <a href="" runat="server" id="htmlLnkLink"></a> 
  6. If the browser does not support JavaScript, hide the LinkButton, set the href attribute of the anchor link, and make the anchor visible.

Example:

C#
if (Request.Browser.EcmaScriptVersion.Major <= 0) // True when the browser does not support JS
{
    lnkLink.Visible = false; // Hide the LinkButton
    htmlLnkLink.Attributes["href"] =
        "http://www.example.com/alllinks.aspx/Firstname/Lastname/";
    htmlLnkLink.InnerText = "FirstName LastName"; // To show the link text
    htmlLnkLink.Visible = true;
}
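The href in the example above is hard-coded. A small helper along the following lines could build the crawlable URL from the actual names; this is a sketch under assumptions, and `CrawlableLink.Build` is a hypothetical name that does not appear in the attached solution. The trailing slash is deliberate, because the rewriting module shown earlier discards whatever follows the last '/'.

```csharp
using System;

// Hypothetical helper (not in the attached solution) that builds the
// crawlable anchor URL for a given page and name pair.
public static class CrawlableLink
{
    public static string Build(string pageUrl, string firstname, string lastname)
    {
        // Escape each segment so spaces and reserved characters stay inside
        // their own path segment, and end with '/' so the last name is not
        // dropped by the rewriting module.
        return string.Format(
            "{0}/{1}/{2}/",
            pageUrl.TrimEnd('/'),
            Uri.EscapeDataString(firstname),
            Uri.EscapeDataString(lastname));
    }
}
```

For example, `CrawlableLink.Build("http://www.example.com/alllinks.aspx", "Aditya", "Bhave")` yields `http://www.example.com/alllinks.aspx/Aditya/Bhave/`, which could then be assigned to the href attribute of the anchor.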

When the crawler hits the anchor link, i.e. http://www.example.com/alllinks.aspx/Aditya/Bhave/, URL rewriting changes the URL to http://www.example.com/alllinks.aspx?firstname=Aditya&lastname=Bhave.

In the Page_Load event of the alllinks.aspx page, check whether the query string variable firstname or lastname exists.
If it exists, execute the same code as in the LinkButton_Click event to show the content to the crawler.

C#
string firstname = Request.QueryString["firstname"];
string lastname = Request.QueryString["lastname"];
SetName(firstname, lastname); 

The crawler now sees the content and can index your URL, e.g. http://www.example.com/Aditya/Bhave/, which can then appear in search results.

Now, if a normal user clicks the above link in the search results, he must see the dynamic content while the address bar shows http://www.example.com/alllinks.aspx.
We can implement the following approach to achieve this:

  1. The URL http://www.example.com/Aditya/Bhave/ will be rewritten by the HttpModule as
    http://www.example.com/alllinks.aspx?firstname=Aditya&lastname=Bhave
  2. Check whether the query string exists and whether the browser supports JavaScript. If both are true, put firstname and lastname in Session variables and redirect to the same page.
  3. Now the query string is gone but the Session variables exist. If they exist, execute the same code as in LinkButton_Click to show the dynamic content, then remove the Session variables.
C#
#region Process RequestParameters
if (Request.QueryString["firstname"] != null)
{
    if (Request.Browser.EcmaScriptVersion.Major > 0)
    {
        // Normal user: keep the names in Session and reload the page
        // without the query string.
        Session["firstname"] = Request.QueryString["firstname"];
        Session["lastname"] = Request.QueryString["lastname"];
        Response.Redirect("alllinks.aspx", true);
    }
    else
    {
        // Crawler: show dynamic content.
        string firstname = Request.QueryString["firstname"];
        string lastname = Request.QueryString["lastname"];
        SetName(firstname, lastname);
    }
}
#endregion

#region Process Session Variables
if (Session["firstname"] != null)
{
    // Normal user: show dynamic content and remove session variables.
    string firstname = Convert.ToString(Session["firstname"]);
    string lastname = Convert.ToString(Session["lastname"]);
    SetName(firstname, lastname);
    Session.Remove("firstname");
    Session.Remove("lastname");
}
#endregion
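
The branching in the Page_Load code above can be summarised as a small decision function. This is a simplified sketch for illustration only: the names `PageAction` and `RequestFlow.Decide` are hypothetical and not part of the attached solution, and the three flags stand in for the query string check, the EcmaScriptVersion check and the Session check.

```csharp
// Hypothetical summary of the Page_Load branching (not in the attached solution).
public enum PageAction
{
    ShowDefault,           // Plain visit to alllinks.aspx
    RedirectViaSession,    // Normal user arriving on an indexed URL
    RenderFromQueryString, // Crawler arriving on an indexed URL
    RenderFromSession      // Normal user after the redirect
}

public static class RequestFlow
{
    public static PageAction Decide(
        bool hasQueryString, bool supportsJavaScript, bool hasSessionValues)
    {
        if (hasQueryString)
        {
            // A JavaScript-capable browser is redirected so that the address
            // bar shows the plain page URL; a crawler gets the content directly.
            return supportsJavaScript
                ? PageAction.RedirectViaSession
                : PageAction.RenderFromQueryString;
        }

        // No query string: either the second leg of the redirect, or a normal visit.
        return hasSessionValues
            ? PageAction.RenderFromSession
            : PageAction.ShowDefault;
    }
}
```

A normal user following an indexed link therefore passes through two requests: the first returns RedirectViaSession, and the second, now without a query string, returns RenderFromSession.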

Please see the attached solution.

How to Check If It Is Working Fine

Use the Mozilla Firefox add-on 'PrefBar' and select Lynx as the user agent. (Firefox will then browse the page as if it were Lynx, a text browser which does not support JavaScript.)

Disable colour, images, JavaScript and Flash, and browse the site. Browse the same site in Internet Explorer and compare the two. In Firefox (acting as Lynx), the links are normal anchor links, and clicking one performs a full page load. In Internet Explorer, clicking a link uses AJAX and the content is loaded dynamically. Copy a link from Firefox, e.g., http://localhost:1234/alllinks.aspx/Aditya/Bhave/, and browse it in Internet Explorer. Internet Explorer should show the URL as http://localhost:1234/alllinks.aspx while the dynamic content is loaded.

Your suggestions are most welcome. Thanks for reading.

History

  • 11th August, 2010: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)