Sunday 18 November 2012

Statically typed, non-static internationalisation with C#

Microsoft Visual Studio provides reasonable support for internationalising an application. Add a resource file (resx) containing a string table and attach a custom tool called ResXFileCodeGenerator (which the IDE does by default). For a C# project, ResXFileCodeGenerator produces C# code that defines a class providing static access to the strings:
  /// <summary>
  ///   Looks up a localized string similar to Invalid Username.
  /// </summary>
  internal static string ErrorInvalidUsername {
    get {
      return ResourceManager.GetString("ErrorInvalidUsername",
          resourceCulture);
    }
  }
This provides an easy way to reference the strings throughout your application: if a string is removed from the string table you get a compile-time error rather than a problem at run-time. If you want your strings accessible from another assembly you can use PublicResXFileCodeGenerator instead. The class generated by ResXFileCodeGenerator also contains a static Culture property that allows the application to set a CultureInfo instance that will be used for resolving the strings.

There are a few problems with this approach.

Non-static access

The static approach employed by (Public)ResXFileCodeGenerator is not appropriate for a multi-threaded application that needs to switch between cultures. A common case where this arises is a web application. Each request is processed in a separate thread, potentially requiring a different culture for each request. If one thread needs to resolve a string in one culture while another thread needs to resolve a string in another culture, then access to the shared static Culture property has to be serialised to avoid race conditions. That's just too much complexity. A better solution is to use a non-static class, with each thread holding an instance of the class and each instance holding its own resourceCulture field. For a recent project I implemented my own custom tool to achieve this. It started out modelling ResXFileCodeGenerator directly, just without all the static keywords. But I soon discovered there was much more I could do with this custom tool.
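As a rough illustration of the non-static shape described above, here is a minimal sketch of what such a generated class might look like. The class name Strings matches the one used later in this post, but the constructor signature and the way the ResourceManager is supplied are assumptions made for the example, not the actual output of the custom tool.

```csharp
using System.Globalization;
using System.Resources;

// Hypothetical non-static counterpart of the generated resource class.
// Each instance carries its own culture, so two threads holding two
// instances never share mutable culture state.
public class Strings
{
  private readonly ResourceManager _resourceManager;

  // Assumption: the ResourceManager and culture are passed in by the caller.
  public Strings(ResourceManager resourceManager, CultureInfo culture)
  {
    _resourceManager = resourceManager;
    Culture = culture;
  }

  // Per-instance culture, replacing the static Culture property.
  public CultureInfo Culture { get; set; }

  public ResourceManager ResourceManager
  {
    get { return _resourceManager; }
  }

  // Same accessor as the static version, minus the static keyword.
  public string ErrorInvalidUsername
  {
    get { return _resourceManager.GetString("ErrorInvalidUsername", Culture); }
  }
}
```

In a web application, one instance would typically be created per request with the culture chosen for that request, and then passed to whatever code needs to produce user-facing text.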

Format fields

In many cases the string contained in a resource file is a template that has place-holders replaced at run-time. Typically the string.Format method is used to achieve this:
  string.Format("The {0} is invalid: {1}", "username",
      "Only alphanumeric characters are permitted");
Placing the templated string in a resx file separates the definition of the string from its use, so the compiler cannot check that the number of arguments supplied in the call to string.Format matches the number of place-holders in the template. Supplying too few arguments only shows up at run-time as a FormatException. I figured that since I was using a Custom Tool to generate code for accessing the strings, it would be easy enough to turn those accessor properties into methods with arguments matching the template place-holders. Even better, those arguments could be statically typed. To make this work I used the comment column of the resource string table to define the expected types. For example, the template "Task failed at {0}% with error {1}" would have the following in the string comment:
  $params(int percent, string error)
The Custom Tool then generates a method as follows:
  public string TaskFailed(int percent, string error) { ... }
The rule I employed was that if a string contains place-holders then it must have a corresponding $params(...) comment with the same number of arguments; otherwise a warning is issued and no method is generated. Similarly, if a string has a $params(...) comment that does not match the number of place-holders in the string, a warning is issued and no method is generated. No comment is required for strings that contain no place-holders. Also, to recover the original ResXFileCodeGenerator behaviour, a string containing place-holders can have a comment of $params() to have a simple accessor created.
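The rule above can be sketched as a small check. This is not the code from the custom tool: the names ParamsCheck, Outcome, CountPlaceholders, CountDeclaredParams and Decide are made up for the illustration, the place-holder count is taken as the highest index plus one, and the parsing is deliberately simplified (for instance, a generic type containing a comma would be miscounted).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical outcome of examining one row of the string table.
public enum Outcome { Accessor, Method, Warning }

public static class ParamsCheck
{
  // Number of arguments string.Format would need: highest {n} index + 1.
  // Escaped braces ({{ and }}) are dropped first (simplified handling).
  public static int CountPlaceholders(string template)
  {
    string unescaped = template.Replace("{{", "").Replace("}}", "");
    int highest = -1;
    foreach (Match m in Regex.Matches(unescaped, @"\{(\d+)(,[^}:]*)?(:[^}]*)?\}"))
      highest = Math.Max(highest, int.Parse(m.Groups[1].Value));
    return highest + 1;
  }

  // Number of parameters declared in $params(...), or -1 if there is none.
  public static int CountDeclaredParams(string comment)
  {
    // Strip /* ... */ documentation comments before looking at the list.
    string stripped = Regex.Replace(comment ?? "", @"/\*.*?\*/", "",
        RegexOptions.Singleline);
    Match m = Regex.Match(stripped, @"\$params\(([^)]*)\)");
    if (!m.Success) return -1;
    string list = m.Groups[1].Value.Trim();
    return list.Length == 0 ? 0 : list.Split(',').Length;
  }

  public static Outcome Decide(string template, string comment)
  {
    int placeholders = CountPlaceholders(template);
    int declared = CountDeclaredParams(comment);
    if (declared == -1)   // no $params: fine only without place-holders
      return placeholders == 0 ? Outcome.Accessor : Outcome.Warning;
    if (declared == 0)    // $params(): plain accessor requested
      return Outcome.Accessor;
    return declared == placeholders ? Outcome.Method : Outcome.Warning;
  }
}
```

With this sketch, Decide("Task failed at {0}% with error {1}", "$params(int percent, string error)") yields Outcome.Method, while the same template with an empty comment yields Outcome.Warning.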

Just for fun I extended the comment syntax to optionally include documentation comments for the arguments and a summary comment for the method (ignore the line wrapping):
  $params(int percent /* The task completion percent */,
           string error /* Error message */)
           /// Task percent failure
For strings without any place-holders a documentation summary comment can be used without $params().

Returning objects instead of strings

For a web application, localising error messages presents a challenge for logging. The localised error message needs to be generated and presented to the user, but it is typically desirable to record a non-localised message in the log file. I wanted to avoid the overhead of maintaining two sets of error messages and of writing additional code to select both the localised and the non-localised error message.

The approach I settled on was for the string accessor methods and properties generated by my Custom Tool to return instances of a Result class rather than the string itself:
  public class Result
  {
    private readonly Strings _parent;
    private readonly string _name;
    private readonly object[] _params;

    internal Result(Strings parent, string name,
        params object[] parms)
    {
      _parent = parent;
      _name = name;
      _params = parms;
    }

    // Resolves against the culture set on the parent instance.
    public static implicit operator string(Result result)
    {
      return result.Value(result._parent.Culture);
    }

    // Looks up the resource for an explicit culture and fills in
    // any place-holders, formatting the arguments for that culture.
    public string Value(System.Globalization.CultureInfo culture)
    {
      string value = _parent.ResourceManager.GetString(_name, culture);
      if (_params.Length > 0)
        value = string.Format(culture, value, _params);
      return value;
    }

    // The resource name, independent of culture.
    public string Name
    {
      get { return _name; }
    }
  }
The string property and method accessors each just construct and return a Result object:
  public Result ErrorInvalidUsername(string username)
  {
    return new Result(this, "ErrorInvalidUsername", username);
  }
The Result object records the string resource name and any place-holder parameters supplied. The string resource lookup is deferred until actually required and can be performed against an explicit culture using the Value method. An implicit conversion to string is included so that the Result object can be treated as a string and resolved using the default culture set for the parent resource class instance.

A nice side-effect of this is that the Result class can be used to enforce the use of string resources: write functions to accept a Result parameter rather than a string. In particular, the function used for reporting errors (or perhaps the constructor of an exception object) accepts a Result, so it can use the implicit string conversion to report the error to the user in the appropriate culture, and also resolve the string in a common culture for logging.
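To make the exception idea concrete, here is a minimal sketch. The LocalisedException class and its LogMessage property are hypothetical names, and the invariant culture is only assumed as the "common culture" for logging. So that the example runs on its own, Result is replaced by a small stand-in backed by a delegate rather than the ResourceManager-backed class shown earlier; only its Value method and implicit string conversion matter here.

```csharp
using System;
using System.Globalization;

// Stand-in for the generated Result class (hypothetical): resolves its
// text through a delegate instead of a ResourceManager.
public class Result
{
  private readonly Func<CultureInfo, string> _resolve;
  private readonly CultureInfo _defaultCulture;

  public Result(Func<CultureInfo, string> resolve, CultureInfo defaultCulture)
  {
    _resolve = resolve;
    _defaultCulture = defaultCulture;
  }

  public string Value(CultureInfo culture)
  {
    return _resolve(culture);
  }

  public static implicit operator string(Result result)
  {
    return result.Value(result._defaultCulture);
  }
}

// Hypothetical exception type that can only be built from a Result,
// so callers cannot pass a hard-coded string.
public class LocalisedException : Exception
{
  private readonly string _logMessage;

  public LocalisedException(Result message)
    : base(message)  // implicit conversion: text in the user's culture
  {
    // Assumption: the invariant culture is the common culture for logs.
    _logMessage = message.Value(CultureInfo.InvariantCulture);
  }

  // Non-localised text intended for the log file.
  public string LogMessage
  {
    get { return _logMessage; }
  }
}
```

The handler that catches the exception can then show Message to the user and write LogMessage to the log, without a second set of messages being maintained by hand.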
