Introduction
This is an advanced article.
In many situations, you simply don't need the volatile keyword or the Thread.Volatile* methods, as the synchronization primitives will do the job for you and are usually easier to use.
But if you really care about performance and want to use volatile fields, you should understand that:
- If you mark a field as volatile, all its reads and writes will be volatile, even if you access it inside a lock
- If you decide not to mark the field as volatile and then use Thread.VolatileRead() or Thread.VolatileWrite(), you will be doing full fences when only a half fence is required
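The two options in the list can be sketched as follows. This is only an illustration, assuming .NET Framework 4.0; the class and member names (FenceOptions, _flag, _plain) are hypothetical and are not part of the library described in this article.

```csharp
using System;
using System.Threading;

// Hypothetical class used only to contrast the two options.
public class FenceOptions
{
    private readonly object _sync = new object();

    // Option 1: the field itself is volatile, so EVERY access is volatile.
    private volatile int _flag;

    // Option 2: a plain field; volatile semantics only where we ask for them.
    private int _plain;

    public void SetFlagUnderLock(int value)
    {
        // The lock already orders this write, but because the field is
        // marked volatile, the write is still emitted as a volatile write.
        lock (_sync)
        {
            _flag = value;
        }
    }

    public int ReadFlag()
    {
        return _flag; // volatile read through the keyword
    }

    public void WritePlain(int value)
    {
        // Thread.VolatileWrite is the .NET 4.0 API the article describes
        // as using a full fence.
        Thread.VolatileWrite(ref _plain, value);
    }

    public int ReadPlain()
    {
        // Same remark for Thread.VolatileRead.
        return Thread.VolatileRead(ref _plain);
    }

    public static void Main()
    {
        var options = new FenceOptions();
        options.SetFlagUnderLock(1);
        options.WritePlain(2);
        Console.WriteLine(options.ReadFlag());
        Console.WriteLine(options.ReadPlain());
    }
}
```

Neither option gives a plain field that is read with a half fence only at selected places, which is the gap the rest of the article tries to fill.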
Note: I wrote this class before knowing that .NET 4.5 has a Volatile class that also works as expected. But I am still using .NET 4.0, so this class is useful to me, and I think it is at least an interesting topic even if you don't need it.
Understanding the Missing Information
Maybe the first item didn't make things really clear. I did my tests to be sure. The volatile keyword makes all reads and writes volatile, but they use half fences (that is, a read-only [acquire] fence and a write-only [release] fence), while Thread.VolatileRead() and Thread.VolatileWrite() always use full fences. For a long time, I thought that was a bug in .NET itself. I really thought that the volatile modifier marked the field as volatile in the IL and that all reads and writes were normal IL reads and writes over a volatile field.
But that's not what happens. In fact, the problem is in C# (which does not allow us to tell which action is volatile, making volatile an all-or-nothing modifier) and in the Thread.Volatile* implementations. At the IL level, we can prefix ldfld and stfld (for example) with the volatile. prefix, and that prefix applies only the right half fence, not the full fence.
I must say that I discovered that by accident: I was looking at IL instructions for other reasons when I saw the volatile. prefix. I really wanted to use such correct behavior in C#, so I decided to do some tests.
The First Test - It Was Not Useful
In my first test, I wrote a DynamicMethod that used the volatile. prefix. I then generated a delegate like this:
public delegate int ReadDelegate(ref int variable);
And I did the tests. In fact, I generated a non-volatile and a volatile delegate and, from the difference in performance, I thought it was working fine. Yet the call through the delegate made the half-fence code slower than the full fence of the Thread.VolatileRead() method, so I decided to abandon the idea.
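The first test can be sketched roughly as below. This is a reconstruction under assumptions, not the original test code: the names DynamicVolatile and CreateReader are hypothetical, and the sketch only shows how the volatile. prefix can be emitted in front of ldind.i4, without the timing loop.

```csharp
using System;
using System.Reflection.Emit;

// Delegate type matching the signature shown above.
public delegate int ReadDelegate(ref int variable);

// Hypothetical helper; not part of the article's library.
public static class DynamicVolatile
{
    public static ReadDelegate CreateReader(bool useVolatilePrefix)
    {
        // int Read(ref int variable)
        var method = new DynamicMethod(
            "Read",
            typeof(int),
            new[] { typeof(int).MakeByRefType() });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);          // load the address of the variable
        if (useVolatilePrefix)
            il.Emit(OpCodes.Volatile);     // "volatile." prefix for the next load
        il.Emit(OpCodes.Ldind_I4);         // read the int32 at that address
        il.Emit(OpCodes.Ret);

        return (ReadDelegate)method.CreateDelegate(typeof(ReadDelegate));
    }

    public static void Main()
    {
        int value = 42;
        ReadDelegate plainRead = CreateReader(false);
        ReadDelegate volatileRead = CreateReader(true);

        // Both delegates return the current value; only the fence differs.
        Console.WriteLine(plainRead(ref value));
        Console.WriteLine(volatileRead(ref value));
    }
}
```

Every read goes through a delegate invocation here, which is the overhead that made this approach lose to Thread.VolatileRead() in the test described above.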
The Second Test - IL + C#
A great thing in .NET is that we can write a library in one language and access it from another language. Surely, it would be better to avoid generating an entire library for a single class, but as that would solve the problem, I decided to try it. As I had never created a library using only IL, I decided to create a new class library in C#, with a single class and a single method, compile it, and then use ildasm to get the IL for that library.
My initial code was something like:
public static class Volatile
{
    public static int Read(ref int variable)
    {
        return variable;
    }
}
And when I decompiled it, I got this code:
.class public abstract auto ansi sealed beforefieldinit Pfz.Volatile
       extends [mscorlib]System.Object
{
  .method public hidebysig static int32 Read(int32& variable) cil managed
  {
    .maxstack 8
    IL_0000: ldarg.0
    IL_0001: ldind.i4
    IL_0002: ret
  }
}
In fact, I used ildasm to dump the entire library code, but I decided not to put the entire code in the article as it is too long.
Then, I changed the IL code:
.class public abstract auto ansi sealed beforefieldinit Pfz.Volatile
       extends [mscorlib]System.Object
{
  .method public hidebysig static int32 Read(int32& variable) cil managed
  {
    .maxstack 1
    IL_0000: ldarg.0
    volatile.
    IL_0001: ldind.i4
    IL_0002: ret
  }
}
Finally, I compiled this code using ilasm with the /DLL parameter. Then I used the library in a test application, and the performance was effectively the same as that of the volatile keyword for doing the reads. But now I have the option to use a non-volatile variable with normal reads or with half-fence reads. That's what I wanted.
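To illustrate what "a non-volatile variable with half-fence reads" buys, here is a small publication sketch. It is an assumption-laden example: it uses System.Threading.Volatile from .NET 4.5 as a stand-in so that it compiles on its own, whereas with the library built in this article the calls would be Pfz.Volatile.Read and Pfz.Volatile.Write. The Publisher class and its members are hypothetical.

```csharp
using System;
using System.Threading;

// Hypothetical example class; not part of the article's library.
public class Publisher
{
    private int _data;   // plain field, normal reads and writes
    private int _ready;  // plain field, half fences only where needed

    public void Publish(int value)
    {
        _data = value;
        // Release write: the write to _data cannot move after this store.
        Volatile.Write(ref _ready, 1);
    }

    public bool TryRead(out int value)
    {
        // Acquire read: the read of _data cannot move before this load.
        if (Volatile.Read(ref _ready) == 1)
        {
            value = _data;
            return true;
        }

        value = 0;
        return false;
    }

    public static void Main()
    {
        var publisher = new Publisher();
        var writer = new Thread(() => publisher.Publish(123));
        writer.Start();

        int result;
        while (!publisher.TryRead(out result))
            Thread.Sleep(0); // give the writer thread a chance to run

        writer.Join();
        Console.WriteLine(result);
    }
}
```

Code that already holds a lock, or that runs before the object is shared, can keep touching _ready with ordinary reads and writes, which is not possible when the field is declared with the volatile keyword.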
So, the final step was to create equivalents of all the VolatileRead() and VolatileWrite() overloads. To do that, I looked at all the overloads of the Thread.VolatileRead() method to get all the types that should be supported, then I wrote all the methods in C#, and I replaced the object one with a generic method over reference types, finishing with this class:
using System;

namespace Pfz
{
    public static class Volatile
    {
        public static byte Read(ref byte variable)
        {
            return variable;
        }
        public static double Read(ref double variable)
        {
            return variable;
        }
        public static float Read(ref float variable)
        {
            return variable;
        }
        public static int Read(ref int variable)
        {
            return variable;
        }
        public static IntPtr Read(ref IntPtr variable)
        {
            return variable;
        }
        public static long Read(ref long variable)
        {
            return variable;
        }
        public static T Read<T>(ref T variable)
            where T: class
        {
            return variable;
        }
        public static sbyte Read(ref sbyte variable)
        {
            return variable;
        }
        public static short Read(ref short variable)
        {
            return variable;
        }
        public static uint Read(ref uint variable)
        {
            return variable;
        }
        public static UIntPtr Read(ref UIntPtr variable)
        {
            return variable;
        }
        public static ulong Read(ref ulong variable)
        {
            return variable;
        }
        public static ushort Read(ref ushort variable)
        {
            return variable;
        }
        public static void Write(ref byte variable, byte value)
        {
            variable = value;
        }
        public static void Write(ref double variable, double value)
        {
            variable = value;
        }
        public static void Write(ref float variable, float value)
        {
            variable = value;
        }
        public static void Write(ref int variable, int value)
        {
            variable = value;
        }
        public static void Write(ref IntPtr variable, IntPtr value)
        {
            variable = value;
        }
        public static void Write(ref long variable, long value)
        {
            variable = value;
        }
        public static void Write<T>(ref T variable, T value)
            where T: class
        {
            variable = value;
        }
        public static void Write(ref sbyte variable, sbyte value)
        {
            variable = value;
        }
        public static void Write(ref short variable, short value)
        {
            variable = value;
        }
        public static void Write(ref uint variable, uint value)
        {
            variable = value;
        }
        public static void Write(ref UIntPtr variable, UIntPtr value)
        {
            variable = value;
        }
        public static void Write(ref ulong variable, ulong value)
        {
            variable = value;
        }
        public static void Write(ref ushort variable, ushort value)
        {
            variable = value;
        }
    }
}
And finally, I repeated the process: I compiled the code, ran ildasm to dump all the library code, corrected the .maxstack value of all the methods, put the volatile. prefix on all loads and stores, and compiled the library again.
So, considering how easy it is, I don't know why it took so long to have an equivalent class in .NET (which only appeared in .NET 4.5). And, if I saw things correctly, the .NET implementation is not written in IL; it has all methods marked as external. I don't think that's necessary. My only real problem is that I can't compile an IL unit as part of a C# library. I don't like having an entire library for a single class, and I don't want to hack the .targets or use ILMerge to put the Volatile class in my main library. Yet I will not need this class anymore in .NET 4.5, so I will live with it as a separate assembly for now.
Points of Interest
There are many things that IL allows us to do that C# simply can't do. I really don't understand that, as C# is the main .NET language. In this case, I was able to solve the problem, but I think it would be simpler if the volatile keyword could be used as int x = volatile(variable); or volatile(variable) = x;.
Unfortunately, we can't always use IL and build another library to solve our problems (for example, it is impossible to create a module initializer in C#, and that's something that can't live in another DLL).
Well, I hope this article is at least interesting to those that want to explore the limits of .NET. So, if something seems impossible in C#, look at the IL, maybe it is only a C# limitation, not a .NET limitation.