Abstract
Explicitly initializing class member variables and method variables to their default values actually reduces performance in .NET. This article explores the variable initialization schemes available in .NET and the performance impact of each. In the measurements reported here, explicit initialization slowed object creation by roughly 11 to 16% and method calls by about 25%.
Introduction
If you come from a C/C++ background (as I do) you are probably in the habit of always initializing your variables. The classic approach is to initialize when defining the variable (as recommended by Scott Meyers in Effective C++).
class B
{
    private int varA = 0;
    private string varB = null;
    private DataSet varC = null;
    private DataSet varD = null;
    private string varE = null;
    ...
}
Another approach is to explicitly initialize the variables in the constructor:
class C
{
    private int varA;
    private string varB;
    private DataSet varC;
    private DataSet varD;
    private string varE;

    public C()
    {
        varA = 0;
        varB = null;
        varC = null;
        varD = null;
        varE = null;
    }
    ...
}
Well, if you follow either of the above approaches, you are hurting performance in .NET!
The case for initialization
In .NET, the Common Language Runtime (CLR) expressly initializes all variables as soon as they are created. Value types are initialized to 0 and reference types are initialized to null. When you expressly initialize your variables, the compiler creates code to set the value of the variables and includes that code as part of your object initialization (for classes) or method call (for methods). The compiler is not smart enough (Microsoft, take a hint...) to discover this duplicate initialization. Granted, if you initialize a variable to a non-default value (e.g., int A = 3;) the compiler is perfectly justified in creating the extra code.
When you instantiate an object (using the new keyword) or make a method call, the following sequence happens:
- Your variables are allocated (based on instructions from the compiler).
- The variables are initialized by the CLR.
- The code for the object constructor or method is executed.
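The second step can be seen directly. The following is a minimal sketch, not part of the test code in this article; the class name DefaultsDemo and its members are made up for illustration. It reads fields that were never assigned and reports what the runtime put in them.

```csharp
using System;

class DefaultsDemo
{
    // Fields deliberately left without initializers (hypothetical names).
    // The compiler warns that they are never assigned, which is the point.
    private int number;
    private string text;
    private object reference;

    // Creates an instance and reports what the untouched fields contain.
    public static string Describe()
    {
        DefaultsDemo d = new DefaultsDemo();
        return string.Format("{0} {1} {2}",
            d.number, d.text == null, d.reference == null);
    }

    static void Main()
    {
        // Prints "0 True True": the int is 0 and both references are null.
        Console.WriteLine(Describe());
    }
}
```

Because the fields already hold these values when the constructor starts, assigning 0 or null again changes nothing observable.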
The problem here is: if you explicitly initialized your variables, the constructor or method code will initialize your variables again. Is this really a problem? In many cases, the double initialization is not a problem:
- If the variables are static (class variables), the initialization is done once during the entire life of the process.
- If the class is instantiated only a few times or the method is rarely called, the extra initialization time can be safely ignored.
But what about those objects instantiated many times in rapid succession? What about those methods called thousands of times? What is the performance impact?
Designing to measure performance
Classes and Methods to test
In order to measure the performance impact of initialization, I created three almost identical classes:
- Class A does not initialize anything.
- Class B uses initialization-on-definition.
- Class C uses constructor initialization.
class A
{
    private int varA;
    private string varB;
    private DataSet varC;
    private DataSet varD;
    private string varE;

    public A()
    {
    }
    ...
}

class B
{
    private int varA = 0;
    private string varB = null;
    private DataSet varC = null;
    private DataSet varD = null;
    private string varE = null;

    public B()
    {
    }
    ...
}

class C
{
    private int varA;
    private string varB;
    private DataSet varC;
    private DataSet varD;
    private string varE;

    public C()
    {
        varA = 0;
        varB = null;
        varC = null;
        varD = null;
        varE = null;
    }
    ...
}
In addition, each class contains two methods:
public void set(int a, string b, DataSet c, DataSet d, string e)
{
    varA = a;
    varB = b;
    varC = c;
    varD = d;
    varE = e;
}

public void print()
{
    Console.WriteLine("This is class {0}", this.ToString());
    Console.WriteLine("varA is {0}", varA);
    Console.WriteLine("varB is {0}", varB);
    Console.WriteLine("varC is {0}", varC);
    Console.WriteLine("varD is {0}", varD);
    Console.WriteLine("varE is {0}", varE);
}
The above methods are never called; they exist only to convince the compiler that the variables are set and read. If the compiler believes variables are unused, it is allowed to optimize them away, and if that happens the whole test is useless.
To measure the impact of initialization on method calls, I created two static methods:
static int methodUnInit(string s, DataSet d, int i)
{
    string var1;
    string var2;
    string var3;
    DataSet var4;
    DataSet var5;
    DataSet var6;
    int var7;
    int var8;
    int var9;
    ...
}

static int methodInit(string s, DataSet d, int i)
{
    string var1 = null;
    string var2 = null;
    string var3 = null;
    DataSet var4 = null;
    DataSet var5 = null;
    DataSet var6 = null;
    int var7 = 0;
    int var8 = 0;
    int var9 = 0;
    ...
}
The body of both methods is identical:
    ...
    var1 = s;
    var2 = s;
    var3 = s;
    var4 = d;
    var5 = d;
    var6 = d;
    var7 = i;
    var8 = i;
    var9 = i;
    if ((null != var1) && (null != var2) && (null != var3)
        && (null != var4) && (null != var5) && (null != var6)
        && (0 != var7) && (0 != var8) && (0 != var9))
    {
        var9 = 0;
    }
    return var9;
}
I had to add code that sets and then uses the variables so that the compiler would not discard them. Note that this results in double initialization in the second method, but the impact of initialization is exactly what we are here to measure.
The test harness
The test class does the following:
- Pre-allocates the test result containers.
- Explicitly performs garbage collection (by calling GC.Collect()) before executing each test. Explicit garbage collection is used to minimize the effect of garbage collection on the time (and performance) measurements.
- Loops, and on each iteration:
  - Randomly decides which test to run.
  - Runs the selected test and stores the result.
- Once testing is complete, calculates the average performance for each test based on the collected data.
Why introduce randomness into the test?
If the test harness were to always perform the tests in a fixed order, the order of execution might affect the measured performance. For some reason, performing test A, then B, and then C always resulted in lower performance on the B test compared to running A, then C, and then B.
To avoid order problems, the test harness randomly chooses which test to perform. As the random distribution used is uniform, each test is as likely to be executed as any other test. Given enough repetitions, each test is executed (on average) about the same number of times.
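To illustrate the claim about uniform selection, here is a small sketch, separate from the actual harness. The class RandomOrderDemo, the method CountSelections and the fixed seed are assumptions made for this example only. It counts how often each of five tests would be picked over 300 iterations.

```csharp
using System;

class RandomOrderDemo
{
    // Counts how often each of testCount tests is picked
    // over the given number of iterations.
    public static int[] CountSelections(int iterations, int testCount, int seed)
    {
        Random r = new Random(seed);
        int[] counts = new int[testCount];
        for (int i = 0; i < iterations; ++i)
        {
            // Next(n) returns a value in the range 0 .. n-1.
            counts[r.Next(testCount)]++;
        }
        return counts;
    }

    static void Main()
    {
        // The seed is arbitrary; a fixed value only makes the run repeatable.
        int[] counts = CountSelections(300, 5, 12345);
        for (int t = 0; t < counts.Length; ++t)
        {
            Console.WriteLine("Test {0} selected {1} times", t, counts[t]);
        }
    }
}
```

With 300 iterations and five tests, each test should be selected around 60 times, although any single run will deviate from that somewhat.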
The test harness measureABC() method executes 300 tests, randomly deciding at each iteration which test to run.
static void measureABC(int repetitions)
{
    Random r = new Random((int)DateTime.Now.Ticks);
    for (int i = 0; i < 300; ++i)
    {
        // Collect garbage up front so a collection is less likely during the timed test.
        GC.Collect();
        // Progress indicator: one dot every five tests.
        if (0 == i % 5)
        {
            Console.Write(".");
        }
        // Pick one of the five tests uniformly at random.
        switch (r.Next(5))
        {
            case 0:
                measureA(repetitions);
                break;
            case 1:
                measureB(repetitions);
                break;
            case 2:
                measureC(repetitions);
                break;
            case 3:
                measureD(repetitions);
                break;
            case 4:
                measureE(repetitions);
                break;
        }
    }
    Console.WriteLine();

    // Sum the collected timings for each test.
    double totalA = 0.0;
    double totalB = 0.0;
    double totalC = 0.0;
    double totalD = 0.0;
    double totalE = 0.0;
    foreach (double t in Atime)
    {
        totalA += t;
    }
    foreach (double t in Btime)
    {
        totalB += t;
    }
    foreach (double t in Ctime)
    {
        totalC += t;
    }
    foreach (double t in Dtime)
    {
        totalD += t;
    }
    foreach (double t in Etime)
    {
        totalE += t;
    }

    // Report the average time per test.
    Console.WriteLine("No init: {0:f2} Init on Def: {1:f2}" +
        " Init in Const: {2:f2}", totalA / Atime.Count,
        totalB / Btime.Count, totalC / Ctime.Count);
    Console.WriteLine("Method No init: {0:f2} Method Init on Def: {1:f2}",
        totalD / Dtime.Count, totalE / Etime.Count);
}
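The individual measure methods are not listed in this article. Purely as an illustration, one of them could look roughly like the sketch below. Everything in it is an assumption: the names Sample, TimingSketch, Times and Sink are hypothetical, and System.Diagnostics.Stopwatch only exists in .NET 2.0 and later, so it is not what the original test, built with Visual Studio .NET 2003, could have used.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for one of the classes under test.
class Sample
{
    private int number;
    public int Number { get { return number; } }
}

class TimingSketch
{
    // Hypothetical result container, playing the role of a list such as Atime.
    public static readonly List<double> Times = new List<double>();

    // Accumulates values read from the created objects so that
    // the objects are actually used inside the timed loop.
    public static int Sink;

    // Creates `repetitions` objects and records the elapsed milliseconds.
    public static void Measure(int repetitions)
    {
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; ++i)
        {
            Sample s = new Sample();
            Sink += s.Number;
        }
        watch.Stop();
        Times.Add(watch.Elapsed.TotalMilliseconds);
    }

    static void Main()
    {
        GC.Collect();
        Measure(1000000);
        Console.WriteLine("Elapsed: {0:f2} ms", Times[0]);
    }
}
```

Each call adds one timing to the container, which matches the way measureABC() later averages the stored values by dividing the total by the container's Count.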
To get a better perspective, the entire test suite is run 10 times by the Main() method.
The results
Using Visual Studio .NET 2003 in "Release" mode (with the optimizer using default settings) and running on my Win2K 1.6GHz Pentium-M machine:
- Creating an object and not initializing variables: ~503 ms (100%)
- Creating an object and initializing on definition: ~557 ms (111%)
- Creating an object and initializing in the constructor: ~582 ms (116%)
- Calling a method and not initializing local variables: ~253 ms (100%)
- Calling a method and initializing variables: ~316 ms (125%)
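The percentages are each time expressed relative to the matching "no initialization" baseline. The short sketch below reproduces that arithmetic from the figures listed above; the class RelativeCost and its method name are made up for this example.

```csharp
using System;

class RelativeCost
{
    // Returns the measured time as a rounded percentage of the baseline time.
    public static int PercentOfBaseline(double measuredMs, double baselineMs)
    {
        return (int)Math.Round(100.0 * measuredMs / baselineMs);
    }

    static void Main()
    {
        Console.WriteLine(PercentOfBaseline(557, 503)); // 111 (init on definition)
        Console.WriteLine(PercentOfBaseline(582, 503)); // 116 (init in constructor)
        Console.WriteLine(PercentOfBaseline(316, 253)); // 125 (method with init)
    }
}
```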
Conclusions
If an object is heavily used (created many times a second), don't initialize the variables. Not initializing shaves a nice 10% off the initialization. Obviously, if you do time-consuming and lengthy initialization in the constructor, the variable initialization time will become insignificant and lost in the noise.
For short and quick methods which are heavily used, avoid initializing local variables to default values; just trust the CLR to do it for you.
Surprisingly, initializing variables in the constructor is a little slower than initializing on definition, but this may be explained by the fact that class B has an empty constructor, so the constructor calling code might be eliminated by the optimizer, saving a little time.
Further work
Lobby Microsoft to improve the optimizer: have the optimizer strip away unnecessary initializations.
If anybody has access to an alternative .NET compiler (Mono, Borland), please publish the results of your tests.