Automatically localizing applications with Google Translate

21 Aug 2012
Localizing your apps with Google Translate

Introduction 

If you're like me, you find localizing applications a tedious task, one that is skipped when possible and otherwise left to the last possible moment. Once I get to the localization task, I open translate.google.com in a browser on one screen and begin to copy and paste, copy and paste, and copy and paste some more. It's not a difficult task, but boy, it would be nice if there were a way to do these translations automatically. That is the purpose behind this project.

How the code works

As you are probably aware, most development platforms keep the words and phrases that an application localizes in a simple resource file. In .NET this is a .resx file, which is just an XML file. In Java it is typically a .properties file.

The basics of how the application works are:

You provide:

  • The source resource file
  • The name of the desired resource file to create
  • The source and destination languages  

The application then:

  • Copies the source file to the new destination file
  • Determines if the file type is a .NET .resx or a Java .properties file based on the file extension (we need to know in order to determine how to iterate through the file)
  • Iterates through each setting, translates it to the desired output language, and updates the value in the file.
    • Note: for .NET .resx files, I provided an option to translate only the nodes whose name attribute ends with .Text. This avoids translating other things in the file, such as control names and types, that you may not want translated.
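
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code from the download: the class and method names (ResourceTranslationSketch, TranslateResx, TranslateProperties) are hypothetical, the translator is passed in as a delegate so the sketch runs without a network call, and the .properties handling assumes simple key=value lines with no escapes or line continuations.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical sketch; names are illustrative, not the article's Localizer class.
public static class ResourceTranslationSketch
{
    // Picks the parsing strategy from the file extension, as the steps describe.
    public static string Translate(string fileName, string content,
                                   Func<string, string> translate, bool textOnly)
    {
        switch (Path.GetExtension(fileName).ToLowerInvariant())
        {
            case ".resx": return TranslateResx(content, translate, textOnly);
            case ".properties": return TranslateProperties(content, translate);
            default: throw new NotSupportedException("Unsupported resource file: " + fileName);
        }
    }

    // Translates the <value> of each <data> element in .resx XML.
    // When textOnly is true, only entries whose name ends with ".Text" are touched.
    public static string TranslateResx(string resxXml, Func<string, string> translate, bool textOnly)
    {
        XDocument doc = XDocument.Parse(resxXml);
        foreach (XElement data in doc.Root.Elements("data"))
        {
            string name = (string)data.Attribute("name") ?? "";
            if (textOnly && !name.EndsWith(".Text", StringComparison.Ordinal))
            {
                continue; // skip control names, types, sizes, and so on
            }

            XElement value = data.Element("value");
            if (value != null && value.Value.Length > 0)
            {
                value.Value = translate(value.Value);
            }
        }
        return doc.ToString();
    }

    // Translates the right-hand side of simple "key=value" lines.
    // Assumption: no escapes, no ':' separators, no line continuations.
    // Line endings are normalized to "\n" in the result.
    public static string TranslateProperties(string text, Func<string, string> translate)
    {
        string[] lines = text.Split('\n');
        for (int i = 0; i < lines.Length; i++)
        {
            string line = lines[i].TrimEnd('\r');
            lines[i] = line;

            string trimmed = line.TrimStart();
            bool isComment = trimmed.StartsWith("#", StringComparison.Ordinal)
                          || trimmed.StartsWith("!", StringComparison.Ordinal);
            int eq = line.IndexOf('=');
            if (isComment || eq < 0)
            {
                continue; // comments and lines without a value are left alone
            }

            string value = line.Substring(eq + 1).Trim();
            if (value.Length > 0)
            {
                lines[i] = line.Substring(0, eq + 1) + translate(value);
            }
        }
        return string.Join("\n", lines);
    }
}
```

Passing the translator as a Func<string, string> keeps the file handling separate from the web request, so the iteration logic can be tried with a stand-in such as s => s.ToUpperInvariant() before pointing it at a real translation call.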

Although there are a bunch of moving parts here, the main piece I'd like to look at is the translatePhrase method in the Localizer class.

private string translatePhrase(string phrase, string sourceLangCode, string destLangCode)
{
    try
    {
        // Create the web client; the using block disposes it even if an exception is thrown
        using (WebClient client = new WebClient())
        {
            // Open the page and get the results
            //  The phrase is URL-encoded so spaces, '&', '#', etc. do not corrupt the query string
            string url = c_GoogleTranslateUrl + sourceLangCode + "|" + destLangCode
                + "&text=" + Uri.EscapeDataString(phrase);
            string sPage = client.DownloadString(url);

            // Parse the page as a string
            //  Page can have bad HTML causing problems if you try to parse as xml
            //  Find the span with the title of the original string
            int tagStart = sPage.IndexOf("<span title=\"" + phrase + "\"");
            int tagEnd = tagStart < 0 ? -1 : sPage.IndexOf("</span>", tagStart);
            if (tagEnd < 0)
            {
                // IndexOf returns -1 when the span is missing; fail with a clear message
                throw new FormatException("Could not find the translated text in the returned page.");
            }
            string resultsTag = sPage.Substring(tagStart, tagEnd - tagStart);

            // Get rid of the start tag
            resultsTag = resultsTag.Substring(resultsTag.IndexOf(">") + 1);

            // You now have the translated text
            return resultsTag.Trim();
        }
    }
    catch (Exception err)
    {
        // Keep the original exception as the inner exception for debugging
        throw new Exception("Failed to get the translation from Google Translate.\r\n" + err.Message, err);
    }
}

This method is where the magic happens. A web client object is created and downloads the Google Translate page for the specified phrase and languages. The returned page is then parsed as a string.

I was originally going to parse it as an XML document, but found that in many cases the tags are malformed (missing quotation marks, etc.), which causes parsing errors. Parsing the page as a string definitely isn't as pretty as dealing with it as XML, but it does the job.
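
To make the string-parsing approach easier to try without a network connection, here is a small stand-alone sketch of the extraction step. It is an illustration only: PageParsingSketch and ExtractTranslation are hypothetical names, the page layout (a span whose title attribute holds the source phrase) is taken from the method above, and the HTML encoding and decoding steps are my assumption about how such a page escapes special characters.

```csharp
using System;
using System.Net;

// Hypothetical sketch; not part of the article's download.
public static class PageParsingSketch
{
    // Returns the text inside the first <span title="{phrase}" ...> element,
    // or null when the page does not contain such a span.
    public static string ExtractTranslation(string page, string phrase)
    {
        // Assumption: the title attribute contains the HTML-encoded source phrase.
        string marker = "<span title=\"" + WebUtility.HtmlEncode(phrase) + "\"";
        int tagStart = page.IndexOf(marker, StringComparison.Ordinal);
        if (tagStart < 0)
        {
            return null;
        }

        // Skip the remaining attributes of the start tag, whatever they are.
        int contentStart = page.IndexOf('>', tagStart + marker.Length);
        if (contentStart < 0)
        {
            return null;
        }
        contentStart++;

        int tagEnd = page.IndexOf("</span>", contentStart, StringComparison.Ordinal);
        if (tagEnd < 0)
        {
            return null;
        }

        // Decode entities such as &#39; so the caller gets plain text.
        string inner = page.Substring(contentStart, tagEnd - contentStart);
        return WebUtility.HtmlDecode(inner).Trim();
    }
}
```

Because the function only searches for substrings, markup elsewhere on the page that would trip an XML parser (an unclosed br tag or an unquoted attribute, for example) does not affect it.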

Using the code

If you'd like to try the code, either install the application or run it from source. I've provided a sample C# application and a sample Java application that you can use to try it out. The steps below show you how to translate the included C# sample application.

  1. In the provided sample C# app, I have already enabled localization on the main form and provided a second resource file (ResourceFile.resx) that stores other message strings that might be translated. You'll also notice that in Program.cs I explicitly set the language of the application to Spanish.
  2. Launch the sample application; notice that none of the strings are translated. This is because no translated resource files are provided yet.
  3. Select Form1.resx as your source file. Set the output file to Form1.es-ES.resx (this is the name .NET expects for a Spanish version of the resource file).
  4. Since this resource file contains other strings we don't want translated (object names, types, etc.), select the "For .resx files only translate .Text properties" option.
  5. Set the source language to English and the output language to Spanish.
  6. Run the translation. When it finishes, you will see the message "Completed translation".

     [Screenshot: Translation complete]

  7. Now change the source file to the provided ResourceFile.resx. The output file should be set to ResourceFile.es-ES.resx.
  8. Because this file contains only resource strings we have added, uncheck the "For .resx files only translate .Text properties" option.
  9. Once you have translated both resource files, add them to the project using the Add Existing Item option.
  10. Compile and run the project again. Notice that all of the strings and messages are now translated into Spanish.

     [Screenshot: Translated application]
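
Two details in the steps above are the output file naming (Form1.resx becomes Form1.es-ES.resx) and the explicit language setting in Program.cs. The sketch below shows one way each could look. It is an assumption on my part rather than the code from the sample: SatelliteNamingSketch, GetLocalizedFileName, and UseSpanish are hypothetical names, and the sample's Program.cs may set the culture differently.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Threading;

// Hypothetical sketch; not the code shipped with the sample application.
public static class SatelliteNamingSketch
{
    // Inserts the culture name before the extension,
    // for example "Form1.resx" with "es-ES" gives "Form1.es-ES.resx".
    public static string GetLocalizedFileName(string sourcePath, string cultureName)
    {
        string extension = Path.GetExtension(sourcePath);                  // ".resx"
        string withoutExtension = Path.ChangeExtension(sourcePath, null);  // "Form1"
        return withoutExtension + "." + cultureName + extension;
    }

    // Asks .NET to look up resources for Spanish (Spain) on the current thread.
    // In a Windows Forms app this would need to run before the main form is
    // created, because the form applies its localized resources during construction.
    public static void UseSpanish()
    {
        Thread.CurrentThread.CurrentUICulture = new CultureInfo("es-ES");
    }
}
```

For example, GetLocalizedFileName("ResourceFile.resx", "es-ES") gives "ResourceFile.es-ES.resx", which matches the output file name used in the steps above.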

Points of Interest

Google does offer a web service that provides the ability to translate strings programmatically; however, it requires purchasing a license, whereas this application uses the web site for free.

I have been using Google Translate and other similar services for years to manually translate resource files and, as several of you pointed out, this does have its flaws. Translating single words and phrases loses context and can result in a translation that is not contextually correct. For this reason, as well as for possible UI adjustments (e.g. resizing controls to fit the translated text), QA should be done on the translated text. This application just gives developers like myself a quick way of producing a first pass at the string translations.

History

  • 08/20/2012:  
    • Initial creation and publishing of the code and article.
  • 08/21/2012: 
    • Minor updates to code (modified namespace, added image in installer)
    • Added section in Points of Interest about translation context

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
