I’m going to start with a simple code snippet that sorts an array of strings using LINQ.
IEnumerable<string> line = new[] {"Z","A","Ä"};
var result = line.OrderBy(letter => letter);
Console.WriteLine("{0}", string.Join(" ", result));
The result might look like this:
A Ä Z
… or not. It depends on the culture of the thread the sort runs on. String ordering is culture-aware (unlike char ordering, which is culture-invariant), so if we switch to, for instance, one of the Norwegian cultures by adding this line...
Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("nn-NO");
...before calling sort, we will get the following output instead:
A Z Ä
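To see the different orders side by side without touching the thread culture, a small sketch like the following may help. It builds one comparer per culture with StringComparer.Create and passes it to OrderBy. The class name and the SortWith helper are made up for illustration, and the ordinal comparer is included only to show the culture-independent order; the exact culture-aware results depend on the collation data of the platform you run it on.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class CultureSortDemo
{
    // Hypothetical helper: sorts the sample letters with the given comparer
    // and joins them into a single string.
    public static string SortWith(StringComparer comparer)
    {
        string[] letters = { "Z", "A", "Ä" };
        return string.Join(" ", letters.OrderBy(letter => letter, comparer));
    }

    static void Main()
    {
        // Culture-aware comparers; the second argument is ignoreCase.
        var german = StringComparer.Create(CultureInfo.GetCultureInfo("de-DE"), false);
        var norwegian = StringComparer.Create(CultureInfo.GetCultureInfo("nn-NO"), false);

        Console.WriteLine("de-DE:   {0}", SortWith(german));
        Console.WriteLine("nn-NO:   {0}", SortWith(norwegian));

        // Ordinal comparison ignores culture and compares UTF-16 code units.
        Console.WriteLine("Ordinal: {0}", SortWith(StringComparer.Ordinal));
    }
}
```

Because the comparer is passed in explicitly, the output of each line no longer depends on which thread happens to run the sort.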
Next, I extended my code snippet to create 4 arrays and sort each of them in parallel.
Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo("nn-NO");
Console.WriteLine("Main thread-{0} \t Culture-'{1}'",
Thread.CurrentThread.ManagedThreadId, Thread.CurrentThread.CurrentCulture);
Console.WriteLine(new string('-', 80));
List<string[]> list = new List<string[]>();
for (int i = 0; i < 4; i++) // 4 arrays, matching the four result lines below
{
list.Add(new[] { "Ä", "A", "Z" });
}
var result =
list
.Select(
line => line
.OrderBy(letter => letter));
Parallel.ForEach(result,
line =>
Console.WriteLine(
"Thread-{0} \t Culture-'{1}' \t {2}",
Thread.CurrentThread.ManagedThreadId,
Thread.CurrentThread.CurrentCulture,
string.Join(" ", line)));
Console.WriteLine();
Console.WriteLine("Press any key to quit");
Console.ReadKey();
The result looks like this:
Main thread-1 Culture-'nn-NO'
------------------------------------------------
Thread-1 Culture-'nn-NO' A Z Ä
Thread-5 Culture-'de-DE' A Ä Z
Thread-3 Culture-'de-DE' A Ä Z
Thread-4 Culture-'de-DE' A Ä Z
Press any key to quit
The sort order produced on the main thread (Thread-1) differs from the order produced on the other threads. The sorting was split across 4 threads: the main thread and 3 new ones.
All three newly created threads got the default culture of my system, not the culture of the main thread, which was set manually.
The culture is a property of the executing thread. When a thread is started, its culture is initially determined by calling GetUserDefaultLCID from the Windows API. I know of no way to change this. See the CultureInfo.CurrentCulture property on MSDN.
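One option that may be worth checking on newer frameworks: .NET Framework 4.5 and later expose the static property CultureInfo.DefaultThreadCurrentCulture, which sets the default culture for threads in the current application domain that have not had a culture assigned explicitly. The sketch below assumes that property is available on your target framework; it was not part of the experiment above, and the class and helper names are illustrative only.

```csharp
using System;
using System.Globalization;
using System.Threading;

static class DefaultCultureDemo
{
    // Hypothetical helper: returns the culture name seen by a brand-new thread.
    public static string CultureOnNewThread()
    {
        string name = null;
        var worker = new Thread(() => name = CultureInfo.CurrentCulture.Name);
        worker.Start();
        worker.Join();
        return name;
    }

    static void Main()
    {
        // Assumption: .NET Framework 4.5 or later, where this property exists.
        CultureInfo.DefaultThreadCurrentCulture = CultureInfo.GetCultureInfo("nn-NO");

        Console.WriteLine("New thread culture: {0}", CultureOnNewThread());
    }
}
```

This changes a process-wide default, so it is a much blunter tool than passing a culture to the individual calls that need it.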
You get the same result with PLINQ syntax:
list
.AsParallel()
.Select(
line => line
.OrderBy(letter => letter))
.ForAll(
line =>
Console.WriteLine(
"Thread-{0} \t Culture-'{1}' \t {2}",
Thread.CurrentThread.ManagedThreadId,
Thread.CurrentThread.CurrentCulture,
string.Join(" ", line)));
The same query without parallel execution delivers consistent output: all four sequences are sorted in the same order.
The solution is to pass a specific culture-aware comparer into the OrderBy method.
// The second argument is ignoreCase; false keeps the comparison case-sensitive.
var norwegianComparer = StringComparer.Create
(CultureInfo.GetCultureInfo("nn-NO"), false);
list
.AsParallel()
.Select(
line => line
.OrderBy(letter => letter, norwegianComparer))
.ForAll(
line =>
Console.WriteLine(
"Thread-{0} \t Culture-'{1}' \t {2}",
Thread.CurrentThread.ManagedThreadId,
Thread.CurrentThread.CurrentCulture,
string.Join(" ", line)));
Well, but what about legacy foreach and LINQ code, which can be parallelized by simply replacing a single line with Parallel.ForEach() or adding AsParallel()? The result might be unpredictable and difficult to figure out. If I were the author of .NET or PLINQ, I would carry the culture of the main thread over into the child threads: the data comes from the main thread, the split-up takes place implicitly, and in most cases the results are merged back into the main thread to be consumed there.
Similar issues might occur in queries using any culture-aware calculation, for instance DateTime formatting and parsing.
So if you are targeting systems with different regional settings, it is a good idea to pass a CultureInfo or culture-specific objects (like comparers) into every PLINQ query and parallel call.
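Applied to DateTime, that advice could look roughly like the sketch below. The class name, the Reformat helper and the sample dates are invented for illustration; the point is only that DateTime.Parse and ToString both receive the culture as an argument, so the worker threads' own culture never comes into play.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class ExplicitCultureDemo
{
    // Hypothetical helper: parses and re-formats dates in parallel,
    // passing the culture explicitly to every culture-sensitive call.
    public static string[] Reformat(string[] inputs, CultureInfo culture)
    {
        return inputs
            .AsParallel()
            .AsOrdered() // keep the results in input order
            .Select(text => DateTime.Parse(text, culture))
            .Select(date => date.ToString("d", culture))
            .ToArray();
    }

    static void Main()
    {
        var german = CultureInfo.GetCultureInfo("de-DE");
        string[] inputs = { "01.02.2013", "31.12.2013" };

        foreach (var line in Reformat(inputs, german))
        {
            Console.WriteLine(line);
        }
    }
}
```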