Introduction
If you're like me, you find localizing your applications a tedious task, one that is skipped when possible and left to the last possible moment when it can't be.
Once I get to the localization task, I open translate.google.com in a browser on one screen and begin to copy and paste, copy and paste, and copy and paste some more.
It's not that this is a difficult task, but boy, it would be nice if there were a way to do these translations automatically. That is the purpose behind this project.
How the code works
As you are probably aware, most development platforms keep the words and phrases that an application localizes in a simple resource file.
In .NET this is a .resx file, which is just an XML file. In Java it is typically a .properties file.
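To make the second format concrete, here is a minimal sketch of reading a .properties file from C#. It assumes the simplest form of the format (one key=value or key:value pair per line, comments starting with # or !) and ignores line continuations and \uXXXX escapes. The class and method names are hypothetical and are not taken from the project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original project.
public static class PropertiesSketch
{
    // Parses "key=value" lines; skips blank lines and '#' / '!' comments.
    // Simplification: no support for line continuations or \uXXXX escapes.
    public static Dictionary<string, string> Parse(IEnumerable<string> lines)
    {
        var entries = new Dictionary<string, string>();
        foreach (string raw in lines)
        {
            string line = raw.Trim();
            if (line.Length == 0 || line[0] == '#' || line[0] == '!')
                continue;

            // The first '=' or ':' separates the key from the value.
            int sep = line.IndexOfAny(new[] { '=', ':' });
            if (sep <= 0)
                continue; // no key on this line; ignored in this sketch

            string key = line.Substring(0, sep).Trim();
            string value = line.Substring(sep + 1).Trim();
            entries[key] = value;
        }
        return entries;
    }
}
```

Only the value of each entry would be sent for translation; the keys have to stay unchanged so the application can still look the strings up.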
The basics of how the application works are:
You provide:
- The source resource file
- The name of the desired resource file to create
- The source and destination languages
The application then:
- Copies the source file to the new destination file
- Determines from the file extension whether the file is a .NET .resx or a Java .properties file (this decides how the file is iterated)
- Iterates through each entry, translates its value into the destination language, and updates the value in the file
- Note: for .NET .resx files there is an option to translate only the nodes whose name attribute ends with .Text. This avoids translating other things in the file, such as control names and types, that you may not want translated.
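The steps above can be sketched for the .resx case roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the class and method names are hypothetical, the translate delegate stands in for the real call to the translation service, and the file is assumed to follow the usual .resx layout of data nodes with a name attribute and a value child.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

// Hypothetical sketch of the workflow; names are illustrative, not the project's.
public static class ResxSketch
{
    // Chooses the handling by file extension (case-insensitive).
    public static bool IsResx(string path)
    {
        return string.Equals(Path.GetExtension(path), ".resx",
                             StringComparison.OrdinalIgnoreCase);
    }

    // Translates the <value> of each <data> node in place.
    // 'translate' stands in for the real call to the translation service.
    public static void TranslateValues(XDocument doc, Func<string, string> translate, bool textOnly)
    {
        foreach (XElement data in doc.Root.Elements("data"))
        {
            string name = (string)data.Attribute("name") ?? "";
            if (textOnly && !name.EndsWith(".Text", StringComparison.Ordinal))
                continue; // leave control names, types, etc. untouched

            XElement value = data.Element("value");
            if (value != null && value.Value.Length > 0)
                value.Value = translate(value.Value);
        }
    }

    // Copies the source file, then rewrites the copy with translated values.
    public static void TranslateFile(string sourcePath, string destPath,
                                     Func<string, string> translate, bool textOnly)
    {
        File.Copy(sourcePath, destPath, true);
        XDocument doc = XDocument.Load(destPath);
        TranslateValues(doc, translate, textOnly);
        doc.Save(destPath);
    }
}
```

Passing the translation step in as a delegate keeps the file handling separate from the web request, so the iteration and the .Text filter can be tried with a dummy function before any call to the translation site is made.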
Although there are a bunch of moving parts here, the main piece I'd like to look at is the translatePhrase method in the Localizer class.
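// Requires: using System; using System.Net;
private string translatePhrase(string phrase, string sourceLangCode, string destLangCode)
{
    try
    {
        // The using block disposes the client even if the download or parsing throws.
        using (WebClient client = new WebClient())
        {
            // Escape the phrase so spaces, '&' and '#' do not corrupt the query string.
            string url = c_GoogleTranslateUrl + sourceLangCode + "|" + destLangCode
                + "&text=" + Uri.EscapeDataString(phrase);
            string sPage = client.DownloadString(url);

            // The translation is the inner text of the span whose title is the source phrase.
            int tagStart = sPage.IndexOf("<span title=\"" + phrase + "\"");
            if (tagStart < 0)
                throw new Exception("The result span was not found in the returned page.");
            int tagEnd = sPage.IndexOf("</span>", tagStart);
            if (tagEnd < 0)
                throw new Exception("The result span is not closed in the returned page.");

            string resultsTag = sPage.Substring(tagStart, tagEnd - tagStart);
            // Drop the opening tag, keeping only the text between <span ...> and </span>.
            resultsTag = resultsTag.Substring(resultsTag.IndexOf(">") + 1);
            return resultsTag.Trim();
        }
    }
    catch (Exception err)
    {
        // Pass the original exception along as InnerException so its details are not lost.
        throw new Exception("Failed to download results from Google Translator.\r\n" + err.Message, err);
    }
}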
This method is where the magic happens. A WebClient object is created and downloads the Google Translate page for the specified phrase and languages.
The returned page is then parsed as a string: the code finds the span element whose title attribute is the source phrase and takes the text inside that element as the translation.
I was originally going to parse the page as an XML document, but found that in many cases the tags are malformed (missing quotation marks, etc.), which causes parsing errors.
Parsing the page as a string definitely isn't as pretty as dealing with it as XML, but it does the job.
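To try the string parsing on its own, without a network call, the extraction step can be pulled into a small pure function. The sketch below is a hypothetical helper, not the project's code; it follows the same idea as translatePhrase but returns null instead of throwing when the expected span is missing.

```csharp
using System;

// Hypothetical helper that isolates the parsing step so it can be tried offline.
public static class SpanSketch
{
    // Returns the inner text of the first <span title="phrase" ...> element,
    // or null when the page does not contain one.
    public static string ExtractTranslation(string page, string phrase)
    {
        int tagStart = page.IndexOf("<span title=\"" + phrase + "\"", StringComparison.Ordinal);
        if (tagStart < 0) return null;

        // Find the end of the opening tag and step past the '>'.
        int textStart = page.IndexOf('>', tagStart);
        if (textStart < 0) return null;
        textStart++;

        int tagEnd = page.IndexOf("</span>", textStart, StringComparison.Ordinal);
        if (tagEnd < 0) return null;

        return page.Substring(textStart, tagEnd - textStart).Trim();
    }
}
```

Called with a made-up fragment such as `<div id=result_box><span title="Hello" class=x> Hola </span></div>` and the phrase Hello, the helper returns Hola, even though the unquoted attributes would make an XML parser reject the fragment.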
Using the code
If you'd like to try the code, either install the application or run it from source. I've provided sample C# and Java applications that you can use to try the app out.
The steps below show you how to translate the included C# sample application.
- Notice that in the provided sample C# app I have already enabled localization on the main form and provided a second resource file (ResourceFile.resx) that stores other message strings that might be translated. You'll also notice that in Program.cs I explicitly set the language of the application to Spanish.
- Launch the sample application; notice that none of the strings are translated. This is because no translated resource files are provided.
- Select Form1.resx as your source file. Set the output file to Form1.es-ES.resx (this is the name .NET expects for a Spanish version of the resource file).
- Since this resource file contains other strings we don't want to translate (object names, types, etc.), select the "For .resx files only translate .Text properties" option.
- Set the source language to English and the output language to Spanish.
- Run the translation. When it finishes, you will see the message "Completed translation".
- Now change the source file to the provided ResourceFile.resx. The output file should be set to ResourceFile.es-ES.resx.
- Because this file only contains resource strings we have added, uncheck the "For .resx files only translate .Text properties" option.
- Once you have translated both resource files, you will need to add them to the project using the Add Existing Item option.
- Compile and run the project again. Notice all of the strings and messages are now translated into Spanish.
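Two details from the steps above, the output file naming and the explicit language setting in Program.cs, can be sketched as follows. The helper names are hypothetical, and the sample's Program.cs may set the culture differently; this only shows one common way to do it with the standard CultureInfo and Thread APIs.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Threading;

// Hypothetical helpers; names are illustrative, not the sample's.
public static class CultureSketch
{
    // Inserts the culture name before the extension:
    // "Form1.resx" + "es-ES" gives "Form1.es-ES.resx".
    public static string LocalizedFileName(string sourcePath, string cultureName)
    {
        string dir = Path.GetDirectoryName(sourcePath) ?? "";
        string baseName = Path.GetFileNameWithoutExtension(sourcePath);
        string ext = Path.GetExtension(sourcePath);
        return Path.Combine(dir, baseName + "." + cultureName + ext);
    }

    // Forces the culture of the current thread so that resources for that
    // culture are looked up first. Typically called at the start of Main,
    // before any form is created.
    public static void UseCulture(string cultureName)
    {
        CultureInfo culture = new CultureInfo(cultureName);
        Thread.CurrentThread.CurrentCulture = culture;
        Thread.CurrentThread.CurrentUICulture = culture;
    }
}
```

With the culture set to es-ES and no Spanish resource file present, .NET falls back to the default resources, which is why the untranslated strings appear in the first run of the sample.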
Points of Interest
Google does offer a web service that provides the ability to programmatically translate strings; however, it requires the purchase of a license, whereas this application uses the web site for free.
I have been using Google Translate and other similar services for years to manually translate resource files and, as several of you pointed out, this approach does have its flaws. Translating single words and phrases loses context and can produce a translation that is not contextually correct. For this reason, as well as for other possible UI adjustments (e.g. resizing controls to fit translated text), QA should be done on the translated text. This application just gives developers like me a quick way of producing first-pass string translations.
History
- 08/20/2012:
  - Initial creation and publishing of the code and article.
- 08/21/2012:
  - Minor updates to code (modified namespace, added image in installer)
  - Added section in Points of Interest about translation context