
Creating Multilingual Websites - Part 2

25 Aug 2004

Introduction

In Part 1, we briefly looked at how localizing in .NET is achieved. We then extended the functionality by building our own ResourceManager class as well as expanding a number of Server Controls to be localization-aware.

In this second part, we'll go into more depth on the architecture of a multilingual web application. We'll begin by using URL Rewriting to maintain the culture the user is in (as opposed to the simpler querystring method we used before), then talk about database design and integration. Finally, we'll discuss more advanced issues and possible solutions.

URL Rewriting

In our previous samples, we simply used the QueryString to determine a user's language of choice. While great to showcase localization, it definitely won't do in a real-world application - namely because it would be a maintenance nightmare across multiple pages. An alternative I've used in the past is to use different domains or subdomains per supported culture. While this has the advantage of requiring no maintenance, not everyone has access to multiple domains or subdomains. Another alternative would be to store the culture in a cookie on the client. Of course, this has all the advantages and disadvantages of any other cookie-dependent solution. In the end, I'm of the strong opinion that using URL Rewriting is the best alternative.

URL Rewriting Basics

If you're new to URL Rewriting, you'll soon wonder how you ever lived without it. There are a lot of reasons to use URL rewriting, but the best is to make your URLs a little friendlier and less complicated. Basically, URL Rewriting allows you to create a link to a page which doesn't exist, capture the request, extract information from the URL and send the request to the correct page. The rewrite happens on the server and there's no redirect, so the browser never makes a second round trip. A simple example would be if you had some type of member system where each member had their own public profile. Typically, that page would be accessed something like:

http://www.domain.com/userDetails.aspx?UserId=93923

With URL Rewriting, you could support a much nicer URL, such as:

http://www.domain.com/johnDoe/details.aspx

Even though johnDoe/details.aspx doesn't really exist, you would capture the URL, parse the address, look up the UserId for the username johnDoe and rewrite the URL to userDetails.aspx?UserId=93923. While the end effect is the same, the URL is far more personal, clean and easy to remember.
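The parse-and-look-up step can be sketched as a small, framework-independent function. This is only an illustration in modern C#: the FriendlyUrlRewriter class, its Rewrite method and the in-memory dictionary (standing in for a real user lookup) are assumptions and not part of the article's download.

```csharp
using System;
using System.Collections.Generic;

public static class FriendlyUrlRewriter
{
    // Hypothetical stand-in for a database lookup of username -> UserId.
    private static readonly Dictionary<string, int> UserIds =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            { "johnDoe", 93923 }
        };

    // Maps "/johnDoe/details.aspx" to "/userDetails.aspx?UserId=93923".
    // Any path that does not match the friendly pattern is returned unchanged.
    public static string Rewrite(string path)
    {
        string[] parts = path.Trim('/').Split('/');
        if (parts.Length == 2
            && parts[1].Equals("details.aspx", StringComparison.OrdinalIgnoreCase)
            && UserIds.TryGetValue(parts[0], out int userId))
        {
            return "/userDetails.aspx?UserId=" + userId;
        }
        return path;
    }
}
```

In a real application, the returned path would be handed to HttpContext.RewritePath during BeginRequest, which is the approach the rest of this section takes for cultures.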

URL Rewriting Culture

We want to embed the culture's name in the URL, extract that out to load the culture and rewrite the URL as though the culture had never been in there. For example:

http://www.domain.com/en-CA/login.aspx  --> en-CA culture --> http://www.domain.com/login.aspx
http://www.domain.com/fr-CA/login.aspx  --> fr-CA culture --> http://www.domain.com/login.aspx
http://www.domain.com/en-CA/users/admin.aspx?id=3  --> en-CA culture --> http://www.domain.com/users/admin.aspx?id=3
http://www.domain.com/fr-CA/users/admin.aspx?id=3  --> fr-CA culture --> http://www.domain.com/users/admin.aspx?id=3
http://www.domain.com/virtualDir/en-CA/users/admin.aspx?id=3  --> en-CA culture --> http://www.domain.com/virtualDir/users/admin.aspx?id=3
http://www.domain.com/virtualDir/fr-CA/users/admin.aspx?id=3  --> fr-CA culture --> http://www.domain.com/virtualDir/users/admin.aspx?id=3

We'll now replace the code previously put in the Global.asax's Application_BeginRequest method to use URL Rewriting. At the same time, we'll move the code out of the Global.asax and place it in a custom HttpModule. An HttpModule is basically a more portable Global.asax.

 1:     public class LocalizationHttpModule : IHttpModule {
 2:        public void Init(HttpApplication context) {
 3:           context.BeginRequest += new EventHandler(context_BeginRequest);
 4:        }
 5:        public void Dispose() {}
 6:        private void context_BeginRequest(object sender, EventArgs e) {
 7:           HttpRequest request = ((HttpApplication) sender).Request;
 8:           HttpContext context = ((HttpApplication)sender).Context;
 9:           string applicationPath = request.ApplicationPath;
10:           if(applicationPath == "/"){
11:              applicationPath = string.Empty;
12:           }
13:           string requestPath = request.Url.AbsolutePath.Substring(
                      applicationPath.Length);
14:           LoadCulture(ref requestPath);
15:           context.RewritePath(applicationPath + requestPath);
16:        }
17:        private void LoadCulture(ref string path) {
18:           string[] pathParts = path.Trim('/').Split('/');
19:           string defaultCulture =
      LocalizationConfiguration.GetConfig().DefaultCultureName;
20:           if(pathParts.Length > 0 && pathParts[0].Length > 0) {
21:              try {
22:                 Thread.CurrentThread.CurrentCulture =
                             new CultureInfo(pathParts[0]);
23:                 path = path.Remove(0, pathParts[0].Length + 1);
24:              }catch (Exception ex) {
25:                 if(!(ex is ArgumentNullException) &&
                      !(ex is ArgumentException)) {
26:                    throw;
27:                 }
28:                 Thread.CurrentThread.CurrentCulture =
                         new CultureInfo(defaultCulture);
29:              }
30:           }else {
31:              Thread.CurrentThread.CurrentCulture =
                             new CultureInfo(defaultCulture);
32:           }
33:           Thread.CurrentThread.CurrentUICulture =
                    Thread.CurrentThread.CurrentCulture;
34:        }
35:     }

The Init method is defined by the IHttpModule interface and lets us hook into a number of ASP.NET events. The only one we are interested in is BeginRequest [line: 3]. Once we've hooked into the BeginRequest event, the context_BeginRequest method [line: 6] will fire each time a new HTTP request is made to a .aspx page - just like before, this is the ideal place to set our thread's culture. context_BeginRequest takes the requested path and removes the application path from it [line: 13]. LoadCulture is then called [line: 14]. With the application path removed, the culture name should be in the first segment of our path. LoadCulture attempts to create a culture out of that first segment [line: 22]. If it succeeds, it removes the culture from the path [line: 23]. If it fails, or if the path has no first segment at all, the default culture is loaded [line: 28 and 31]. Finally, context_BeginRequest rewrites the URL to the path modified by LoadCulture [line: 15].
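One thing to keep in mind with the approach above is that any first segment which happens to be a valid culture name (a folder called "de", for instance) will be treated as a culture. A possible variation, sketched below in modern C#, checks the segment against an explicit list of supported cultures instead of relying on the CultureInfo constructor throwing. The CulturePathParser class, the ExtractCulture method and the hard-coded list are assumptions made for illustration; they are not part of the article's download.

```csharp
using System;
using System.Collections.Generic;

public static class CulturePathParser
{
    // Hypothetical allow-list; a real application would load it from configuration.
    private static readonly HashSet<string> SupportedCultures =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "en-CA", "fr-CA" };

    // Returns the culture name found in the first segment of 'path', or
    // 'defaultCulture' when that segment is not a supported culture.
    // 'remainingPath' receives the path with the culture segment removed.
    public static string ExtractCulture(string path, string defaultCulture,
                                        out string remainingPath)
    {
        string trimmed = path.TrimStart('/');
        int slash = trimmed.IndexOf('/');
        string first = slash >= 0 ? trimmed.Substring(0, slash) : trimmed;

        if (first.Length > 0 && SupportedCultures.Contains(first))
        {
            // Keep the leading slash of whatever follows the culture segment.
            remainingPath = slash >= 0 ? trimmed.Substring(slash) : "/";
            return first;
        }

        // Not a culture: leave the path alone and fall back to the default.
        remainingPath = path;
        return defaultCulture;
    }
}
```

The returned name could then be passed to new CultureInfo(...) and the remaining path to RewritePath, exactly as context_BeginRequest does.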

Once this HttpModule is loaded through the web.config, it can be used as-is as the core localization engine without any additional work.

Database Design

The main issue I want to address with database design deals with keeping a multilingual application properly normalized. I've run across the same pattern of denormalized databases due to localization far too often.

Sample Application Requirements

To better understand, we'll look at a very simple example: a database that'll be used to sell items (well, a small part of it). The basic requirements are:

  1. Each item has a single category which is selected from a lookup of values
  2. Each item will have a name, description, Seller ID and price, and
  3. The application must support English and French

The Bad Design

Coming up with a bad design really shouldn't take long. Here's the schema I came up with in a couple of seconds:

Image 1

As you can see, each item has a Seller Id which links to a mythical User table, it has an EnglishName and FrenchName column, as well as an EnglishDescription and FrenchDescription column and a price. It also has a CategoryId which links to the Category table. The Category table has an English and French name as well as an English and French description. This schema WILL work, but has some severe problems:

  • The schema violates the first normal form: it has a repeating group of columns, one set per language.
  • The schema puts an additional burden on the developer, who needs to know to ask for the column named "EnglishDescription" instead of simply asking for the description.
  • Many database engines have a maximum row size (as does every version of SQL Server before Yukon, which is only in beta). We'll quickly run into that limitation using the above schema.
  • Even though the requirement clearly stated only English and French had to be supported, requirements change, and this schema isn't at all flexible. If you have 200 tables like this, you'll need to modify each one and add the appropriate column. That'll make the above three points even worse.

The Good Design

Creating a clean and flexible schema is as simple as properly normalizing the tables - that is, removing the duplicate columns. Here's what the improved schema looks like:

Image 2

The trick is to identify the fields which are culture-specific, which was pretty easy because they began with either "EnglishXXXX" or "FrenchXXXX". These fields are extracted into a _Locale table (just a name I came up with) and stored as one row per culture instead of one column per culture. For this to work, we introduced a new Culture table. I realize the single-field Category table isn't very nice. Unfortunately, the Item's CategoryId can't join directly to the Category_Locale table because that table has a composite primary key (CategoryId and CultureId). Regardless, it's likely that non-culture-specific columns will end up in the Category table, such as "Enabled" and "SortOrder".

In case you are having difficulty seeing this model, let's populate our tables with sample data:

Culture

CultureId | Name  | DisplayName
1         | en-CA | English
2         | fr-CA | Français

Item

ItemId | CategoryId | SellerId | Price
20     | 2          | 54       | 20.00
23     | 1          | 54       | 25.00
24     | 1          | 34       | 2.00
25     | 3          | 543      | 2000.00

Item_Locale

ItemId | CultureId | Name | Description
20 | 1 | A first look at ASP.NET v 2.0 | Buy the first book that talks about the next version of ASP.NET
20 | 2 | Le premier livre sur la prochaine version de ASP.NET | Achetez ce livre incroyable pour devenir un expert sur la technologie de demain
23 | 1 | Duct tape | 50m of premium quality duct tape
23 | 2 | Bande de conduit | 50m de bande de conduit de haute qualité

Category

CategoryId
1
2

Category_Locale

CategoryId | CultureId | Name | Description
1 | 1 | Books | All types of reading material should be placed here
1 | 2 | Livres | Tous les types de matériel de lecture devraient être placés ici
2 | 1 | Tapes | Category for all tapes and other adhesive products
2 | 2 | Adhésifs | Catégorie pour toutes les bandes et autres produits adhésifs

As you can see, all non-culture-specific information is held in the main tables (Item, Category). All culture-specific information is held in a _Locale table (Item_Locale, Category_Locale). The _Locale tables have a row for each supported culture. The schema is now normalized, we've helped avoid the row size limit, and adding a culture is just a matter of adding a row to the Culture table and the translated rows to the _Locale tables, with no schema change. You might think that this schema makes querying the database more complicated. In the next section, we'll see just how easy it is.

Querying Our Localized Database

Believe it or not, querying the database is a lot easier with the localized schema. Let's look at an example. Say we wanted to get all the items sold by a specific user (@SellerId) localized in the language your site visitor was using (@CultureName). In the bad schema, we'd have to write:

IF @CultureName = 'en-CA' BEGIN

   SELECT I.ItemId, I.EnglishName, I.EnglishDescription, I.Price,
          C.EnglishName, C.EnglishDescription
      FROM Item I
         INNER JOIN Category C ON I.CategoryId = C.CategoryId
      WHERE I.SellerId = @SellerId

END ELSE BEGIN

   SELECT I.ItemId, I.FrenchName, I.FrenchDescription, I.Price,
          C.FrenchName, C.FrenchDescription
      FROM Item I
         INNER JOIN Category C ON I.CategoryId = C.CategoryId
      WHERE I.SellerId = @SellerId

END

If you've been following along, you'll quickly realize this isn't at all flexible and will be a nightmare to maintain. You could use dynamic SQL, but you'll be trading in one nightmare for another.

Using the new database schema, we'll be able to write a stored procedure that won't need to be modified whenever a new language is added. It'll also be more readable and need less maintenance. Check it out:

 1:  DECLARE @CultureId INT
 2:  SELECT @CultureId = CultureId
 3:     FROM Culture WHERE [Name] = @CultureName
 4:
 5:
 6:  SELECT I.ItemId, IL.[Name], IL.[Description],
            I.Price, CL.[Name], CL.[Description]
 7:     FROM Item I
 8:        INNER JOIN Item_Locale IL ON I.ItemId = IL.ItemId
               AND IL.CultureId = @CultureId
 9:        INNER JOIN Category C ON I.CategoryId = C.CategoryId
               -- could be removed since we aren't actually using it
10:        INNER JOIN Category_Locale CL ON C.CategoryId = CL.CategoryId
               AND CL.CultureId = @CultureId
11:     WHERE I.SellerId = @SellerId

Hopefully, you can see the huge advantage this has. It might be a little less performant, but there's a single query no matter how many languages you are going to support, which makes it very flexible and easy to maintain. I left the join to the Category table [line: 9] in because, as I've said before, we could definitely add some columns to Category which would make it desirable to have. Additionally, getting the @CultureId from the @CultureName [line: 2-3] is an excellent candidate for a user-defined function, as almost every sproc will need to do it. Other than that, the main difference is that we are joining the _Locale tables [line: 8 and 10] to their parents for the specified @CultureId.
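If it helps to see the shape of that join outside of SQL, here is a rough in-memory equivalent written with LINQ in modern C#. It is only a sketch: the LocalizedQuerySketch class and its ItemsForSeller method are hypothetical, and the rows are loosely based on the sample tables above, with category assignments chosen purely for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LocalizedQuerySketch
{
    // In-memory stand-ins for the Culture, Item, Item_Locale and Category_Locale tables.
    static readonly (int CultureId, string Name)[] Cultures =
        { (1, "en-CA"), (2, "fr-CA") };

    static readonly (int ItemId, int CategoryId, int SellerId, decimal Price)[] Items =
        { (20, 1, 54, 20.00m), (23, 2, 54, 25.00m), (24, 1, 34, 2.00m) };

    static readonly (int ItemId, int CultureId, string Name)[] ItemLocales =
    {
        (20, 1, "A first look at ASP.NET v 2.0"),
        (20, 2, "Le premier livre sur la prochaine version de ASP.NET"),
        (23, 1, "Duct tape"),
        (23, 2, "Bande de conduit")
    };

    static readonly (int CategoryId, int CultureId, string Name)[] CategoryLocales =
        { (1, 1, "Books"), (1, 2, "Livres"), (2, 1, "Tapes"), (2, 2, "Adhésifs") };

    // Returns "item name (category name)" for every item sold by 'sellerId',
    // localized for 'cultureName'. The same code serves every culture.
    public static List<string> ItemsForSeller(int sellerId, string cultureName)
    {
        // Same lookup as lines 2-3 of the stored procedure.
        int cultureId = Cultures.Single(c => c.Name == cultureName).CultureId;

        var query =
            from i in Items
            join il in ItemLocales
                on new { i.ItemId, CultureId = cultureId }
                equals new { il.ItemId, il.CultureId }
            join cl in CategoryLocales
                on new { i.CategoryId, CultureId = cultureId }
                equals new { cl.CategoryId, cl.CultureId }
            where i.SellerId == sellerId
            orderby i.ItemId
            select il.Name + " (" + cl.Name + ")";

        return query.ToList();
    }
}
```

As in the stored procedure, an item with no row in the _Locale data for the requested culture drops out of an inner join, so each supported culture needs its rows populated (or the joins need to become outer joins with a fallback).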

Wrapping It Up

There are only two things to add to our database design discussion. First off, in case you missed it, the @CultureName parameter which you'll need to pass into your stored procedures is actually a .NET CultureInfo.Name. This makes things really easy, since the user's current culture is available from System.Threading.Thread.CurrentThread.CurrentCulture.Name or, using our code from Part 1, from ResourceManager.CurrentCultureName.

The second issue is that our ResourceManager in Part 1 used XML files, but using a similar schema as above (with a Culture table), it could easily be modified to use a database. That exercise is left up to you.

Advanced Considerations

In this final section, I'd like to point out and address some advanced issues.

Basics of Placeholders

The first issue was pointed out to me by Frank Froese in the Code Project's publication of Part 1. It's an issue all multilingual application developers have had to face and I'm thankful that he reminded me of it. The issue is that you'll frequently want to have sentences with a placeholder (or two) in them. For example, say you wanted to say something along the lines of "{CurrentUserName}'s homepage". You might be tempted to do something like:

<asp:literal id="user" runat="server" />
<Localized:LocalizedLiteral id="passwordLabel"
      runat="server" Key="password" Colon="True" />

and set the "user" literal in your codebehind. However, not only does this become tedious when binding, it won't work in a lot of languages because their grammar is different. For example, in French, you'd want it to be "Page d'accueil de {CurrentUserName}". In the French example, the user's name comes at the end of the sentence rather than the beginning, so our above code won't work. The problem only gets worse when additional placeholders are needed.

What we want to do is have placeholders in the values of our XML files and replace them at runtime with actual values. While you could use numbered placeholders, such as {0}, I find using named placeholders, such as {Name}, conveys considerably more meaning and can really be helpful to a translator trying to understand the context.

Let's look at a basic example:

   1:  using System.Collections.Specialized;
   2:  using System.Web.UI;
   3:  using System.Web.UI.WebControls;
   4:   
   5:  namespace Localization {
   6:     public class LocalizedLiteral : Literal, ILocalized {
   7:        #region fields and properties
   8:        private string key;
   9:        private bool colon = false;
  10:        private NameValueCollection replacements;
  11:   
  12:        public NameValueCollection Replacements {
  13:           get {
  14:              if (replacements == null){
  15:                 replacements = new NameValueCollection();
  16:              }
  17:              return replacements;
  18:           }
  19:           set { replacements = value; }
  20:        }
  21:   
  22:        public bool Colon {
  23:           get { return colon; }
  24:           set { colon = value; }
  25:        }
  26:   
  27:        public string Key {
  28:           get { return key; }
  29:           set { key = value; }
  30:        }
  31:        #endregion
  32:   
  33:   
  34:        protected override void Render(HtmlTextWriter writer) {
  35:           base.Text = ResourceManager.GetString(key);
  36:           if (colon){
  37:              base.Text += ResourceManager.Colon;
  38:           }
  39:           if (replacements != null){
  40:              foreach (string placeholder in replacements.Keys) {
  41:                 string value = replacements[placeholder];
  42:                 base.Text = base.Text.Replace("{" 
               + placeholder + "}", value);
  43:              }
  44:           }
  45:           base.Render(writer);
  46:        }
  47:     }
  48:  }

Basically, we've added a replacements NameValueCollection [line: 10, 12-20] (a NameValueCollection is similar to a Hashtable but is specialized to have a string key and string value as opposed to objects). In our Render method [line: 34], we loop through the new collection [line: 39-44] and replace each placeholder with its specified value. (As an aside, you might want to consider using a System.Text.StringBuilder object for performance if you like this method).
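For what it's worth, the StringBuilder variant mentioned in that aside could look something like the following. It is a minimal sketch that pulls the replacement loop out into a standalone helper; the PlaceholderFormatter class and its Apply method are hypothetical names and not part of the article's download.

```csharp
using System.Collections.Specialized;
using System.Text;

public static class PlaceholderFormatter
{
    // Replaces every "{key}" in 'template' with the matching value from
    // 'replacements'. Matching is case-sensitive, as in the Render method above.
    public static string Apply(string template, NameValueCollection replacements)
    {
        // StringBuilder.Replace edits the buffer in place instead of
        // allocating a new string for every placeholder.
        StringBuilder result = new StringBuilder(template);
        foreach (string placeholder in replacements.AllKeys)
        {
            result.Replace("{" + placeholder + "}", replacements[placeholder]);
        }
        return result.ToString();
    }
}
```

Because the placeholder is named, the same replacements work whether the translator puts it at the start of the sentence ("{Name}'s homepage") or at the end ("Page d'accueil de {Name}").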

We can use the Replacements collection in our page's codebehind, like so:

usernameLabel.Replacements.Add("Name", CurrentUser.UserName);
usernameLabel.Replacements.Add("Email", CurrentUser.Email);

Advanced Placeholders Support

In order to have true support for placeholders, we really need to enable page developers to set values without having to write any code. This is especially true when databinding.

<Localized:LocalizedLiteral id="Literal1" key="LoginAudit" runat="server">
   <Localized:Parameter runat="Server" Key="Username"
       value='<%# DataBinder.Eval(Container.DataItem, "UserName")%>' />
   <Localized:Parameter runat="Server" Key="Name"
       value='<%# DataBinder.Eval(Container.DataItem, "Name")%>' />
   <Localized:Parameter runat="Server" key="Date"
       value='<%# DataBinder.Eval(Container.DataItem, "lastLogin")%>' />
</Localized:LocalizedLiteral>

Adding this very important feature is a fairly simple process. First, we'll create our new Parameter control:

 1:  using System.Web.UI;
 2:
 3:  namespace Localization {
 4:     public class Parameter: Control {
 5:        #region Fields and Properties
 6:        private string key;
 7:        private string value;
 8:
 9:        public string Key {
10:           get { return key; }
11:           set { key = value; }
12:        }
13:
14:        public string Value {
15:           get { return this.value; }
16:           set { this.value = value; }
17:        }
18:        #endregion
19:
20:
21:        public Parameter() {}
22:        public Parameter(string key, string value) {
23:           this.key = key;
24:           this.value = value;
25:        }
26:     }
27:  }

It's basically a control with two properties: a Key [line: 9-12] and a Value [line: 14-17].

There's only one more step to be able to use this new control as a child control of your other Localized controls (well, actually, you can use it now, but it won't do anything). Here's a new render method, with the explanation following it:

 1:        protected override void Render(HtmlTextWriter writer) {
 2:           string value = ResourceManager.GetString(key);
 3:           if (colon){
 4:              value += ResourceManager.Colon;
 5:           }
 6:           for (int i = 0; i < Controls.Count; i++){
 7:              Parameter parameter = Controls[i] as Parameter;
 8:              if(parameter != null){
 9:                 string k = parameter.Key;
10:                 string v = parameter.Value;
11:                 value = value.Replace('{' + k.ToUpper() + '}', v);
12:              }
13:           }
14:           base.Text = value;
15:           base.Render(writer);
16:        }

Basically, in the Render method of each of your controls, you need to loop through its Controls collection (a collection of all child controls) [line: 6-13]. If the child is a Localized.Parameter [line: 7-8], you replace [line: 11] any placeholder that matches the parameter's Key [line: 9] with the parameter's Value [line: 10]. Note that the key is upper-cased before the replace [line: 11], so the placeholders in your resource values must be written in upper case, such as {USERNAME}.
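If you'd rather not depend on the casing of the placeholders in your resource files, one alternative is a case-insensitive replace. The following is a sketch under that assumption, written in modern C#; the CaseInsensitivePlaceholders class and its Apply method are hypothetical, and this is not how the LocalizedUtility helper in the download is written.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CaseInsensitivePlaceholders
{
    // Replaces "{key}" in 'template' with the matching value, ignoring the
    // case of the key, so {Username}, {USERNAME} and {username} all match.
    public static string Apply(string template, IDictionary<string, string> parameters)
    {
        string result = template;
        foreach (KeyValuePair<string, string> parameter in parameters)
        {
            // Escape the braces and the key so they are matched literally.
            string pattern = Regex.Escape("{" + parameter.Key + "}");
            string replacement = parameter.Value ?? string.Empty;

            // A MatchEvaluator is used so that a '$' in the value is not
            // interpreted as a regex substitution token.
            result = Regex.Replace(result, pattern, m => replacement,
                                   RegexOptions.IgnoreCase);
        }
        return result;
    }
}
```

The trade-off is a regular expression per parameter instead of a plain string replace, which is likely to be slower but is more forgiving of how translators type the placeholder names.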

In the code that's included with this article, this looping functionality has been extracted to a helper class, LocalizedUtility, so that it isn't repeated for each control. Also, please note that you can still set the parameters in codebehind:

myLiteral.Controls.Add(new Parameter("url", "BindingSample.aspx"));

Still works.

I'd like to mention a couple of caveats which I ran into doing this. First off, the Literal control which LocalizedLiteral inherits from doesn't allow you to have child controls. As such, I had to change it to inherit from the Label control. I also noticed that if you set a property on the base class, the Controls collection would get wiped. For example, if at the beginning of the Render method you did base.Text = "";, the Controls collection would drop to 0 items, which is why the Render method above only assigns base.Text after the loop. I'm sure there's a good reason for this, but it did strike me as odd.

ASP.NET 2.0

The last thing to talk about is how the next version of ASP.NET will change how you develop multilingual applications. Unfortunately, I haven't spent a lot of time with the alpha and beta packages available thus far. Hopefully in the near future, I'll be in a position to write a follow-up. What I do know is promising. It looks like they've really beefed up the built-in support - namely by adding new ways to use resources (as opposed to the single ResourceManager way currently available). Fredrik Normén has done an excellent job of providing us some initial details in this blog entry.

Download

This download is very similar to the one from Part 1. The things to keep an eye out for are the removal of Global.asax and the introduction of LocalizationHttpModule (check out the HttpModule section in the web.config to see how it's hooked up). I really think this is a nice way to store the user's culture. Also, take a look at how the Localization.Parameter works and play with it. I hope you'll find that it meets your needs. Finally, I didn't include any database code in the sample. I'm hoping the above screenshots and code were self-sufficient. Enough rambling, download!

Conclusion

Hopefully, some of this tutorial will be helpful. The goal was to make things fairly copy and paste friendly while at the same time providing flexible code and some general guidelines. There are a number of enhancements that could be done, such as giving intellisense support to the ILocalized controls in the designer, expanding the ResourceManager to be more like a provider model, providing management facilities for resources and so on. The power Microsoft has given us with .NET makes all those things possible. This has been a real pleasure!

History

  • 26th August, 2004: Initial version

License

This article has no explicit license attached to it, but may contain usage terms in the article text or the download files themselves. If in doubt, please contact the author via the discussion board below. A list of licenses authors might use can be found here.
