Introduction
Information technology applications increasingly have a global reach. Thus, it is essential for software developers to design applications with an eye towards localization into multiple languages. Ideally, releasing a well-designed application in a new language is simply a matter of creating a new set of localized resources and perhaps changing a setting at compile-time or runtime. The .NET Framework supports this scenario well: you can swap in XML-based .resx files or binary .resources files, and you can deploy and load satellite assemblies. For a good introduction to localization in the .NET Framework, see MSDN.
A key part of localization is the management of string resources.
The .NET Framework contains a ResourceManager class with a GetString method to which you pass the literal name of the string resource you want to retrieve, and in return, you get the appropriate localized string per the rules described in the article above.
ResourceManager rm =
    new System.Resources.ResourceManager("MyProjectName.MyResourceBaseName",
        Assembly.GetExecutingAssembly());
string localized = rm.GetString( "MyStringResource" );
Of course, should you misspell the resource name, GetString will return null at runtime. Depending on the application, this bug may be difficult to isolate. Since it is always preferable to detect errors at compile-time rather than runtime, in the past, I have manually added symbols for each string defined in a resource, like the following:
class MyStrings
{
    public const string MyStringResource = "MyStringResource";
}
Then I would use these instead:
ResourceManager rm =
    new System.Resources.ResourceManager("MyProjectName.MyResourceBaseName",
        Assembly.GetExecutingAssembly());
string localized = rm.GetString( MyStrings.MyStringResource );
This is a safer approach. However, once projects get into the hundreds or thousands of strings, the process of defining string constants becomes tedious and cries out for automation. Before building my own tool, I searched around and found two other attempts to address this problem: a command-line utility [1] and a Visual Studio .NET custom tool that makes use of the CodeDOM [2]. Both of these tools incorporate a number of good ideas, and I would encourage you to check them out. Rather than spending a great deal of time explaining them and why they did not meet my needs, I will move directly into discussing my tool, StringClassGen.
StringClassGen Features
StringClassGen.exe is a command-line tool written in Visual Basic .NET. The command syntax is explained below, in the section "StringClassGen Usage". As a command-line tool, depending on your build process, you can integrate it into your build batch files, NAnt scripts, or Visual Studio .NET pre-build events, to turn .resx and/or .resources files into source files containing properties and methods, to help you more easily and safely access your resource strings. StringClassGen has the following key features:
Like the approach described in [2], StringClassGen uses the CodeDOM to generate either C# or Visual Basic .NET code on demand. Because code that uses the CodeDOM is not particularly interesting (just tedious), I will not show any of it here. Suffice it to say that the generated code consists of a set of static properties and methods discussed in more detail below. When this code is integrated into a C# or Visual Basic .NET project, it provides type-safe access to strings stored in assembly resources and hides the details of loading them.
The top-level class generated by StringClassGen encapsulates a singleton instance of a ResourceManager that loads strings from the executing assembly. (A possible extension of StringClassGen would be to allow the ResourceManager to load from an assembly other than the one that is executing.) Note that the ResourceManager.GetString method is thread-safe according to the .NET Framework documentation.
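As a rough illustration of the shape described above, the following C# sketch wraps one shared ResourceManager in a top-level class. It is not StringClassGen's actual output: the class name GeneratedStringResource, the base name, and the property are borrowed from the examples in this article and are assumptions made for illustration only.

```csharp
using System.Reflection;
using System.Resources;

// Hypothetical stand-in for a generated top-level string class.
public static class GeneratedStringResource
{
    // A static field gives every caller the same ResourceManager instance.
    // The base name is an assumption; it must match the name of the
    // resource embedded in the executing assembly.
    private static readonly ResourceManager resources =
        new ResourceManager("MyProjectName.MyResourceBaseName",
                            Assembly.GetExecutingAssembly());

    // One typed accessor per string resource; a misspelled member name
    // now fails at compile time instead of returning null at runtime.
    public static string MyStringResource
    {
        get { return resources.GetString("MyStringResource"); }
    }
}
```

Callers would then write GeneratedStringResource.MyStringResource rather than passing a string literal to GetString.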
When creating identifiers for string resources in code, I have often found it useful to organize them into groups, such as by the web page or form in which they are used. Inner classes provide a nice encapsulation for this concept. This leads to the problem of how to identify the desired groups when doing string class generation.
If you have edited .resx files in Visual Studio, you may have noticed that, associated with each string entry, there is a column entitled "comment" in which you can place any text you want. If you adopt the convention that the contents of the "comment" column represent the desired group (inner class) for a string identifier, then when working with .resx files, StringClassGen will create the appropriate inner classes and populate them with the correct string identifiers. (Clearly, that implies that in this scheme, the contents of the "comment" column must be a valid identifier name in the target programming language.)
How this is accomplished and why it only works with .resx files is worth some discussion. The .NET Framework provides an interface, IResourceReader, and two implementations, ResourceReader and ResXResourceReader, that allow you to pull string resource name-value pairs out of .resources and .resx files, respectively. If you peek underneath the hood at the .resx file format, you'll see that it is just XML, with string elements that look like:
<data name="String1">
<value>This is the first string.</value>
<comment>Class1</comment>
</data>
Unfortunately, the IResourceReader interface provides no mechanism to access the <comment> element. Indeed, for .resources files (the compiled binary format of .resx files), the comment is no longer a part of the file's data. StringClassGen processes .resources files using ResourceReader, as shown in the following code snippet:
Protected Overrides Sub ProduceStrings()
    Dim reader As New ResourceReader(filename)
    Try
        ' Walk every name-value pair stored in the .resources file.
        Dim readerEnumerator As IDictionaryEnumerator = reader.GetEnumerator()
        While readerEnumerator.MoveNext
            AddString(readerEnumerator.Key.ToString(), _
                readerEnumerator.Value.ToString())
        End While
    Catch ex As Exception
        ' Note: errors are silently swallowed here; a build tool would
        ' normally want to report them.
    Finally
        reader.Close()
    End Try
End Sub
But for .resx files, we have another option. Since a .resx file is just XML, we can forgo the use of the IResourceReader interface entirely and process it as an ordinary XML file. For StringClassGen, I chose to use XPathNavigator and XPathDocument:
Protected Overrides Sub ProduceStrings()
    Dim doc As New XPathDocument(filename)
    Dim nav As XPathNavigator
    nav = doc.CreateNavigator
    ' Select every <data> element, sorted by its <comment> child so that
    ' strings belonging to the same inner class are processed together.
    Dim exp As XPathExpression
    exp = nav.Compile("//data")
    exp.AddSort("comment", XmlSortOrder.Ascending, _
        XmlCaseOrder.None, "", XmlDataType.Text)
    Dim nodes As XPathNodeIterator = nav.Select(exp)
    While nodes.MoveNext()
        Dim comment As String
        Dim value As String
        ' The first child of <data> is assumed to be <value>.
        nodes.Current.MoveToFirstChild()
        value = nodes.Current.Value
        nodes.Current.MoveToParent()
        Dim commentNodes As XPathNodeIterator = _
            nodes.Current.SelectDescendants("comment", "", False)
        If commentNodes.Count > 0 Then
            ' The second child is assumed to be <comment>.
            nodes.Current.MoveToFirstChild()
            nodes.Current.MoveToNext()
            comment = nodes.Current.Value
            nodes.Current.MoveToParent()
        Else
            comment = ""
        End If
        AddString(nodes.Current.GetAttribute("name", _
            nav.NamespaceURI), value, comment)
    End While
End Sub
The most interesting part of this routine is the sort on "comment". This allows us to process all strings belonging to a given inner class sequentially, so that only one inner class is "open for business" at a time as we use the CodeDOM to generate identifiers. Strings that have an empty comment defined in the .resx file will not be placed in an inner class but rather belong to the outermost generated class.
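To make the grouping concrete, here is a hedged C# sketch of what the resulting class layout could look like for the sample <data> element shown earlier (name "String1", comment "Class1"). The outer class name, the base name, and the uncommented string "String2" are hypothetical, and the real generated code may differ in detail.

```csharp
using System.Reflection;
using System.Resources;

// Hypothetical outer class; the base name below is an assumption.
public static class TestStringResource
{
    private static readonly ResourceManager resources =
        new ResourceManager("CSTestApplication.TestStringResource",
                            Assembly.GetExecutingAssembly());

    // Empty comment in the .resx file: the member stays on the outer class.
    // "String2" is an invented name used only for this illustration.
    public static string String2
    {
        get { return resources.GetString("String2"); }
    }

    // Comment "Class1" in the .resx file: the member is placed in an
    // inner class with that name.
    public static class Class1
    {
        public static string String1
        {
            // Nested types can reach the outer class's private static field.
            get { return resources.GetString("String1"); }
        }
    }
}
```

With this layout, a caller would reach the grouped string as TestStringResource.Class1.String1 and the ungrouped one as TestStringResource.String2.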
One of the more useful string manipulation features in the .NET Framework is the ability to embed "format item" tokens into strings and replace them at runtime using the String.Format method. Ordinarily, to take advantage of this feature while using string resources, one would write code like the following (where a resource manager has already been obtained):
formattedString =
    String.Format( resourceManager.GetString( "StringWithParams" ),
        "Param Value 1", "Param Value 2" );
where the value associated with StringWithParams might be: "This is the first param: {0}. And this is the second param: {1}".
To streamline this process, StringClassGen uses regular expression matching to attempt to detect format item tokens in the strings it processes and, if it finds any, generates wrapper methods similar to those in [1] that take the correct number of parameters. The wrapper methods handle both the retrieval of the string from the resource and the formatting. Thus, the generation of the following code:
Public Overloads Shared Function StringWithParams(ByVal param0 _
    As Object, ByVal param1 As Object) As String
    Return [String].Format(CultureInfo.InvariantCulture, _
        resources.GetString("StringWithParams"), param0, param1)
End Function
or:
public static string StringWithParams(object param0, object param1) {
    return String.Format(CultureInfo.InvariantCulture,
        resources.GetString("StringWithParams"), param0, param1);
}
allows us to reduce the above code to:
formattedString =
GeneratedStringResource.StringWithParams( "Param Value 1", "Param Value 2" );
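The detection of format item tokens mentioned above can be approximated with a regular expression. The following C# sketch is not the pattern StringClassGen itself uses (the article does not show it); the class name, method name, and pattern are assumptions. It derives the number of wrapper parameters from the highest format item index found, and it deliberately ignores escaped braces such as "{{0}}".

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class FormatItemCounter
{
    // Matches {index}, {index,alignment} and {index:format}.
    // Simplification: escaped braces ("{{" and "}}") are not handled.
    private static readonly Regex FormatItem =
        new Regex(@"\{([0-9]+)(,-?[0-9]+)?(:[^}]*)?\}");

    // Returns how many arguments String.Format would need for the value:
    // the highest index found plus one, or 0 if there are no format items.
    public static int CountParameters(string value)
    {
        int highest = -1;
        foreach (Match m in FormatItem.Matches(value))
        {
            int index = int.Parse(m.Groups[1].Value);
            if (index > highest)
            {
                highest = index;
            }
        }
        return highest + 1;
    }
}
```

A result of zero would mean the string needs no wrapper method, while a result of two would call for a wrapper taking param0 and param1.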
Furthermore, as FxCop is fond of pointing out, when you use String.Format, you ought to use the overload that accepts an IFormatProvider to provide culture-specific formatting information, especially in global-friendly applications. Thus, StringClassGen actually creates two overloads for each string method: one that accepts a client-supplied IFormatProvider and another that uses a default format provider that you specify via a command-line option, which may be one of InvariantCulture, CurrentUICulture, or CurrentCulture. If no format provider is specified, the default is InvariantCulture.
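A minimal C# sketch of such an overload pair follows. It is an assumption about the general shape, not a copy of the tool's output: in particular, the position of the IFormatProvider parameter is a guess, and a hard-coded template stands in for resources.GetString("StringWithParams") so that the sketch runs without an embedded resource.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for a generated class.
public static class GeneratedStringResourceSketch
{
    // Stand-in for resources.GetString("StringWithParams").
    private static string GetTemplate()
    {
        return "This is the first param: {0}. And this is the second param: {1}";
    }

    // Overload using the default provider chosen at generation time
    // (InvariantCulture here, matching the tool's stated default).
    public static string StringWithParams(object param0, object param1)
    {
        return StringWithParams(CultureInfo.InvariantCulture, param0, param1);
    }

    // Overload that lets the caller supply the format provider.
    public static string StringWithParams(IFormatProvider provider,
                                          object param0, object param1)
    {
        return String.Format(provider, GetTemplate(), param0, param1);
    }
}
```

A caller that needs culture-sensitive number or date formatting would pass, for example, CultureInfo.CurrentCulture to the second overload; everyone else can use the shorter form.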
As we saw in the last section, strings with format item tokens cause the generation of methods taking one or more parameters. If a string contains no format item tokens, there is no need to generate a method: a property is more appropriate. Thus, a string with no format tokens will trigger the generation of code like the following:
Public Shared ReadOnly Property StringWithNoParams As String
    Get
        Return resources.GetString("StringWithNoParams")
    End Get
End Property
or:
public static string StringWithNoParams {
    get {
        return resources.GetString("StringWithNoParams");
    }
}
When you edit a .resx file in Visual Studio .NET or your text editor of choice, there is nothing to prevent you from using the same string name for more than one string. However, this is almost certainly a bug, since there is no way to retrieve more than one string with a given name using a ResourceManager. Because of this, StringClassGen.exe tracks the list of used names and returns an error if the same name is used more than once in the resource file.
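The duplicate check described above can be sketched in a few lines of C#. This is an illustration under assumptions rather than the tool's own code: the helper name is invented, and the case-sensitive (ordinal) comparison is a guess at how names are compared.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class DuplicateNameCheck
{
    // Returns the first name that occurs more than once,
    // or null if every name is unique.
    public static string FindFirstDuplicate(IEnumerable<string> names)
    {
        // Assumption: names are compared case-sensitively.
        HashSet<string> seen = new HashSet<string>(StringComparer.Ordinal);
        foreach (string name in names)
        {
            // HashSet.Add returns false when the name is already present.
            if (!seen.Add(name))
            {
                return name;
            }
        }
        return null;
    }
}
```

A generator could call this on the list of resource names before emitting any code and stop with an error message when the result is not null.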
StringClassGen Usage
The behavior of StringClassGen.exe may be controlled by several command-line arguments:
- (-vb|-cs) - The language of the generated code, either VB.NET or C#. The default is C#.
- (-c) - Emits .resx file comments as <summary> comments for the generated properties and methods, instead of using them to group strings into inner classes. If you choose this option, all of your strings will be defined in the top-level class. The default is to have this option off.
- (-ns namespacename) - The namespace that the generated top-level string class lives in. The default is the name of the resource file (minus extension) suffixed with the string "Namespace".
- (-ic|-cuic|-cc) - Use InvariantCulture, CurrentUICulture, or CurrentCulture as default format provider (defaults to InvariantCulture).
- (-class classname) - The name of the generated string class, which defaults to the name of the resource file minus extension.
- (-out outfilename) - The name of the generated file. If this option is omitted, output is sent to standard out.
You should add the StringClassGen-generated source file(s) to your assembly, compiling them along with the rest of your source code. Depending on the build tools you are employing, you may want to establish a dependency so that StringClassGen only runs if the .resx file is newer than the source file.
If you are having difficulties using the generated source code, the most likely problem is that you are providing the wrong namespace (-ns) to StringClassGen. This has the effect of causing an exception (typically a MissingManifestResourceException) the first time you retrieve a string through the resource manager. Visual Studio .NET, by default, places resources from project-included .resx files into the default namespace for the project, which is configurable in the project properties. Chances are, you want to pass this as the -ns parameter to StringClassGen. (You can verify the name of your specific resource by examining your assembly manifest in ildasm.exe. An exact explanation of this process is beyond the scope of this article, however.)
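If you would rather not open ildasm.exe, the embedded resource names can also be listed from code with Assembly.GetManifestResourceNames. The small C# program below is a sketch, not part of StringClassGen; the class name is invented. Compiled into the project in question, it prints names of the form "SomeNamespace.SomeBaseName.resources", and the part before the base name is the namespace the resource actually lives in.

```csharp
using System;
using System.Reflection;

// Hypothetical diagnostic helper, for illustration only.
public static class ResourceNameLister
{
    public static void Main()
    {
        // Lists every resource embedded in the assembly that contains
        // this code, one name per line.
        string[] names = Assembly.GetExecutingAssembly().GetManifestResourceNames();
        foreach (string name in names)
        {
            Console.WriteLine(name);
        }
    }
}
```

Comparing the printed names with the value passed to -ns should reveal a mismatch quickly.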
Demo Project
Included in the download are sample applications in Visual Basic .NET and C# that load a few strings from the resource using StringClassGen-generated classes. In each application, you can switch between using inner classes or not by defining the conditional compilation symbol "INNERCLASSES" and including the matching version of the generated TestStringResource.cs/TestStringResource.vb. Here are the sample command lines for generating the files (relative to the build-output directory for StringClassGen):
- C#, INNERCLASSES: StringClassGen.exe ..\TestStringResource.resx -cs -ns CSTestApplication -out ..\CSTestApplication\TestStringResource
- C#, no INNERCLASSES: StringClassGen.exe ..\TestStringResource.resx -cs -c -ns CSTestApplication -out ..\CSTestApplication\TestStringResource
- VB, INNERCLASSES: StringClassGen.exe ..\TestStringResource.resx -vb -ns VBTestApplication -out ..\VBTestApplication\TestStringResource
- VB, no INNERCLASSES: StringClassGen.exe ..\TestStringResource.resx -vb -c -ns VBTestApplication -out ..\VBTestApplication\TestStringResource
History
- Initial release: 01/02/2005.
- Allows selection of default format provider: 01/25/2005.