Introduction
Is this a famous quote from Shakespeare? Is there a virus asking for my permission to be installed? Is there a quick brown fox jumping over a Chinese dog? Most programmers pretend to speak a couple of languages, programming languages, but are ignorant of the spelling rules in Kannada (spoken somewhere in India), the strokes of Mandarin Chinese, or the German dative. A lot of users act as if they understand English, but this knowledge is frequently limited to the words: one, two, three...
In a small attempt to reconcile both groups, here's a library that can localize an application at runtime. The system folder has more than 35 MB of text available, hidden in menus, dialog boxes, message tables, and string tables. If we can extract this information in our language in one program, it is possible to extract the same information in another language in another program.
Collecting resources
Collecting
To use or collect the resources hidden in a file, we load the file as a data file (in Vista as an image) and make sure that our library will be closed properly:
public class qResourceReader : SafeHandleZeroOrMinusOneIsInvalid
{
    public qResourceReader(string fileName)
        : base(true)
    {
        if (Environment.OSVersion.Version.Major > 5)
            // Vista and later: 0x20 = LOAD_LIBRARY_AS_IMAGE_RESOURCE
            base.handle = LoadLibraryEx(fileName, IntPtr.Zero, 0x20);
        else
            // 0x01 = DONT_RESOLVE_DLL_REFERENCES, 0x02 = LOAD_LIBRARY_AS_DATAFILE
            base.handle = LoadLibraryEx(fileName, IntPtr.Zero, 0x01 | 0x02);
    }

    // Called when the SafeHandle is disposed or finalized.
    protected override bool ReleaseHandle()
    {
        return FreeLibrary(base.handle);
    }
}
Every resource is uniquely identified by three attributes :
- Type : 1 (Cursor) / 2 (Bitmap) / 3 (Icon) / "Avi" / "MUI" / ...
- Name : 1 / 2 / 3 / "Dialog_Open" / ...
- Language : 1033 (English) / 2052 (Chinese) / ...
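A language identifier such as 1033 packs two values into one 16-bit number: the low 10 bits hold the primary language and the high 6 bits hold the sublanguage (the regional variant). The following is a minimal sketch of that split; the class name LangId and its method names are made up for illustration and are not part of the library.

```csharp
using System;

public static class LangId
{
    // Low 10 bits: primary language (for example 0x09 = English).
    public static int Primary(ushort langId)
    {
        return langId & 0x3FF;
    }

    // High 6 bits: sublanguage, the regional variant of the primary language.
    public static int Sub(ushort langId)
    {
        return langId >> 10;
    }
}
```

With this split, 1033 (0x0409) is primary language 9 with sublanguage 1, and 2052 (0x0804) is primary language 4 with sublanguage 2.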
The type and name are either strings or unsigned integers. Microsoft uses a simple trick to do this: both are passed as an IntPtr, and when the value of this IntPtr is less than or equal to ushort.MaxValue, it is the number itself; otherwise, it points to a place in memory where the string can be read with one of the Marshal.PtrToString functions (for example, Marshal.PtrToStringUni).
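The integer-or-string test can be sketched as follows. The class ResourceId and its methods are hypothetical helpers, not part of the library, and the sketch assumes that the Unicode (W) versions of the enumeration functions are used, so that a string pointer refers to UTF-16 text.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ResourceId
{
    // Values up to ushort.MaxValue are integer IDs; anything larger is a pointer.
    public static bool IsInteger(IntPtr value)
    {
        return (ulong)value.ToInt64() <= ushort.MaxValue;
    }

    // Returns "#<number>" for integer IDs, or the string the pointer refers to.
    public static string Describe(IntPtr value)
    {
        if (IsInteger(value))
            return "#" + value.ToInt64();
        return Marshal.PtrToStringUni(value);   // assumes UTF-16 text
    }
}
```

Inside the callbacks, something like ResourceId.Describe(lpType) would then give a printable label for both kinds of identifier.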
Resources can be collected using three callback functions, revealing an extra characteristic every time:
public bool StartCollect()
{
    // Level 1: enumerate every resource type in the module.
    return EnumResourceTypes(base.handle,
        new TypeDelegate(EnumTypes), IntPtr.Zero);
}

private bool EnumTypes(IntPtr hModule, IntPtr lpType, IntPtr lParam)
{
    // Level 2: enumerate every name within this type.
    return EnumResourceNames(hModule, lpType,
        new NameDelegate(EnumNames), lParam);
}

private bool EnumNames(IntPtr hModule, IntPtr lpType,
    IntPtr lpName, IntPtr lParam)
{
    // Level 3: enumerate every language of this type/name pair.
    return EnumResourceLanguages(hModule, lpType, lpName,
        new LanguageDelegate(EnumLanguages), lParam);
}

private bool EnumLanguages(IntPtr hModule, IntPtr lpType,
    IntPtr lpName, ushort langID, IntPtr lParam)
{
    IntPtr hRes = FindResource(hModule, lpType, lpName, langID);
    IntPtr hResData = LoadResource(hModule, hRes);
    IntPtr result = LockResource(hResData);
    ...
    return true;    // true = continue the enumeration
}
Now, the IntPtr result is a locked resource that can be analyzed.
Strings
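If the lpType IntPtr has a value of 6, the result is a block of sixteen ordered pairs of length and text. Using a simple formula, we calculate the ID of each string: ((lpName - 1) * 16) + place, where place is the zero-based position (0 to 15) of the pair within the block. A length of 0 means that the ID is not used.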
lpType : | 6 | String |
lpName : | 4 | (4-1) * 16 = 48 |
langId : | 1033 | English |
result : | 7aaaaaaa6bbbbbb005ccccc4dddd3eee2ff1g01i2jj2kk3lll3mmm3nnn |
((4 - 1) * 16) + 0 = 48 : aaaaaaa (Language English) |
((4 - 1) * 16) + 1 = 49 : bbbbbb (Language English) |
((4 - 1) * 16) + 2 = 50 : |
((4 - 1) * 16) + 3 = 51 : |
((4 - 1) * 16) + 4 = 52 : ccccc (Language English) |
... |
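The walk through such a block can be sketched as below. This is only an illustration under stated assumptions: StringTableBlock is a hypothetical helper, not part of the library, and it expects the locked resource to have been copied into a char array first, with each length prefix stored as one 16-bit value counting characters.

```csharp
using System;
using System.Collections.Generic;

public static class StringTableBlock
{
    // block: the resource data viewed as UTF-16 code units.
    // blockName: the integer value of lpName for this block.
    public static Dictionary<uint, string> Parse(char[] block, uint blockName)
    {
        var result = new Dictionary<uint, string>();
        int pos = 0;
        for (uint place = 0; place < 16 && pos < block.Length; place++)
        {
            int length = block[pos++];      // length prefix, in characters
            if (length == 0)
                continue;                   // empty slot: this ID is not used
            if (pos + length > block.Length)
                break;                      // truncated data: stop reading
            uint id = ((blockName - 1) * 16) + place;
            result[id] = new string(block, pos, length);
            pos += length;
        }
        return result;
    }
}
```

For the block above (lpName 4), this yields the entries 48, 49 and 52, and skips the empty slots 50 and 51. The char array could be filled with Marshal.Copy from the locked pointer, using the size reported by the Win32 function SizeofResource.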
Message tables
Messages are stored in triple values. The first DWORD of the resource is the count of these triples. The first DWORD in each triple is the start numbering ID, the second DWORD is the end numbering ID, and the last DWORD is the offset of the first string.
lpType : | 11 | Message Table |
lpName : | 1 | Always 1 |
langId : | 1030 | Danish |
result : | 3,11,15,22,17,17,99,20,21,144,14,1,aaaaaaa,12,1,bbbbbb.....Message17....Message20... |
3 triples of Danish messages |
first triple has messages 11, 12, 13, 14 and 15 at offset 22 |
next triple has message 17 at offset 99 |
last are messages 20 and 21 at offset 144 |
14 unicode bytes aaaaaaa |
12 unicode bytes bbbbbb |
... |
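Reading the triples and their strings can be sketched as follows. MessageTable is a hypothetical helper, not part of the library. The sketch follows the Win32 layout as I understand it, which is slightly more detailed than the simplified table above: each string is preceded by two 16-bit values, the length in bytes of the whole entry (these four header bytes included) and a flag that is 1 for Unicode text. It also assumes a little-endian machine and that the resource has already been copied into a byte array.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class MessageTable
{
    public static Dictionary<uint, string> Parse(byte[] data)
    {
        var result = new Dictionary<uint, string>();
        uint blockCount = BitConverter.ToUInt32(data, 0);
        for (int b = 0; b < blockCount; b++)
        {
            int header = 4 + (b * 12);      // each triple is three DWORDs
            uint lowId = BitConverter.ToUInt32(data, header);
            uint highId = BitConverter.ToUInt32(data, header + 4);
            int offset = (int)BitConverter.ToUInt32(data, header + 8);
            // The entries of one triple follow each other in memory.
            for (long id = lowId; id <= highId; id++)
            {
                ushort length = BitConverter.ToUInt16(data, offset);
                ushort flags = BitConverter.ToUInt16(data, offset + 2);
                if (length < 4)
                    break;                  // malformed entry: stop this triple
                Encoding enc = (flags == 1) ? Encoding.Unicode : Encoding.Default;
                string text = enc.GetString(data, offset + 4, length - 4);
                // Entries are usually padded with zeros and end with a line break.
                result[(uint)id] = text.TrimEnd('\0', '\r', '\n');
                offset += length;
            }
        }
        return result;
    }
}
```

The result maps every message ID to its text, so a lookup such as messages[17] needs no further arithmetic.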
Menus and Popups
We can examine a menu or popup without showing the menu. Because the handle has to be closed properly, we use a SafeHandle.
public class qMenu : SafeHandleZeroOrMinusOneIsInvalid
{
    public qMenu(qResourceReader reader, IntPtr result)
        : base(true)    // we own the handle, so ReleaseHandle will be called
    {
        base.handle = LoadMenu(reader.DangerousGetHandle(), result);
    }

    protected override bool ReleaseHandle()
    {
        return DestroyMenu(this.handle);
    }
}
Now, using a recursive function (some menus have sub sub sub sub sub-items), all items can be enumerated and uniquely identified.
private Dictionary<uint, string> _sortdict = new Dictionary<uint, string>();
// Key generator for items without an ID of their own. The declaration is not
// shown in the article; the start value is an assumption that matches the
// key 4294967294 used in the example further down.
private uint _counter = uint.MaxValue;

private void CollectAllIds(IntPtr ptr)
{
    int count = GetMenuItemCount(ptr);
    if (count < 0)
        return;
    StringBuilder sb = new StringBuilder(500);
    qMenuItemInfo inf = new qMenuItemInfo();
    inf.cbSize = (uint)Marshal.SizeOf(inf.GetType());
    inf.fMask = 0x02 | 0x04 | 0x40;     // MIIM_ID | MIIM_SUBMENU | MIIM_STRING
    for (uint i = 0; i < count + 5; i++)
    {
        uint ui = GetMenuItemID(ptr, i);
        if (GetMenuString(ptr, ui, sb, sb.Capacity, 0) > 0)
            _sortdict.Add(ui, sb.ToString());
        else
        {
            // First call with an empty buffer: only the text length is returned.
            inf.cch = 0;
            inf.dwTypeData = null;
            if (GetMenuItemInfoW(ptr, i, true, ref inf))
            {
                if (inf.cch > 0)
                {
                    // Second call with a buffer large enough for the text.
                    inf.dwTypeData = new string(' ', (int)++inf.cch);
                    if (GetMenuItemInfoW(ptr, i, true, ref inf))
                        _sortdict.Add(--_counter, inf.dwTypeData);
                }
                if (inf.hSubMenu == IntPtr.Zero)
                    continue;
                if (inf.hSubMenu.ToInt64() < Int32.MaxValue)
                    CollectAllIds(inf.hSubMenu);
            }
            else
                CollectAllIds(GetSubMenu(ptr, i));
        }
    }
}
Dialogs
Dialogs are quite interesting for finding plurals. But, it is not easy to use dialog functions without actually showing the dialog on a screen. Furthermore, in Vista, you need administrative rights to use certain dialogs. So, we have to analyze the data manually.
Unicode allows control codes ("\r\n") and almost everything else as a legal character, even the identifiers of a button (0x8000) or a static control (0x8200). So, we cannot use the Char.IsControl functions to determine if we are dealing with an identifier or text. The only sure thing is: 0x00 is at the end and 0xFF is at the beginning of a string. An UnmanagedMemoryStream can read the result backwards. Every time it reads two consecutive 0 bytes, it is a possible end of a string. Every time it reads two consecutive bytes with the value 255, it has perhaps encountered the start of a new string. For the actual zigzagging code, I refer to the source code provided.
Extracting resources
Strings and messages
Windows provides two very fast functions, LoadString and FormatMessage:
// sb is a reusable StringBuilder field of the reader (declaration not shown).
public bool TryFindStringResource(uint resourceId, out string result)
{
    if (LoadStringW(base.handle, resourceId, sb, sb.Capacity) > 0)
    {
        result = sb.ToString();
        return true;
    }
    result = null;
    return false;
}

public bool TryFindMessageResource(uint resourceId,
    ushort resourceLangId, out string result)
{
    // 0xA00 = FORMAT_MESSAGE_FROM_HMODULE (0x800) | FORMAT_MESSAGE_IGNORE_INSERTS (0x200)
    if (FormatMessageW(0xA00, base.handle, resourceId, resourceLangId,
        sb, sb.Capacity, IntPtr.Zero) > 0)
    {
        result = sb.ToString().Trim(null);
        return true;
    }
    result = null;
    return false;
}
Dialogs and Menus
We load the complete resource, and a TryGetValue on the Dictionary returns the string we are searching for.
public bool TryFindDialogString(uint dialogId, uint itemId, out string result)
{
    IntPtr ptr = IntPtr.Zero;
    int size = 0;
    if (TryLockResource(new qResource(qResourceType.Dialogs, dialogId),
        ref size, out ptr))
        return new qDialog(ptr, size).Items.TryGetValue(itemId, out result);
    result = null;
    return false;
}

public bool TryFindMenuString(uint menuId, uint itemId, out string result)
{
    IntPtr ptr = IntPtr.Zero;
    int size = 0;
    if (TryLockResource(new qResource(qResourceType.Menus, menuId),
        ref size, out ptr))
        return new qMenu(ptr).Items.TryGetValue(itemId, out result);
    result = null;
    return false;
}
Bitmaps - Icons - Cursors
Because an image says a thousand words, extraction of bitmaps, icons, and cursors has to be provided as well. Individual icons and cursors are loaded with the CreateIconFromResource function; the others with the LoadImage function. Because each image has to be closed properly, we once again use a SafeHandle.
Localizing an application
For each targeted Windows version, we search in the local language for possible candidates.
Then, we provide a link to the qResourceReader to extract the resources:
qResourceReader _rr;
string s;
ToolStripMenuItem tsmi = new ToolStripMenuItem("For testing purpose only");
if (Environment.OSVersion.Version.Major > 5)
{
    // Vista and later: the text is a string resource in User32.dll.
    _rr = new qResourceReader("User32.dll");
    if (_rr.TryFindStringResource(718, out s))
        tsmi.Text = s;
}
else
{
    // Earlier versions: the text is a message resource in Win32k.sys.
    _rr = new qResourceReader("Win32k.sys");
    if (_rr.TryFindMessageResource(213, out s))
        tsmi.Text = s;
}
If different files have to be opened, we only change the FileName property of the reader. It is also possible to extract a complete menu, dialog, or string resource.
_rr.FileName = "hhctrl.ocx";
Dictionary<uint, string> hh = _rr.CollectMenuResources(6000);
if (hh.TryGetValue(4294967294, out s))
fileToolStripMenuItem.Text = s;
if (hh.TryGetValue(6002, out s))
exitToolStripMenuItem.Text = s;
Points of interest
A better approach to localizing an application would be to write the results to a XAML file during the setup or modification of the program. At this point, we know if localization is really necessary, we can ask to have administrative rights, and/or require a specific application to be pre-installed.
In Vista, the file shell32.dll hides a lot of interesting information in the odd-numbered string resources between 24069 and 25065:
24837, print; print out; printer; printers; printing; printner; ... ;uninstalls;
unistall; printen; afdrukken; druk; af; afdruk;
verwijder; verwijderen; ...; deactiveren; deactiveer;
History
- 22 January 2008: Initial version.
- 11 February 2008: Minor text updates.