Gain performance by not initializing variables

gtamir

19 May 2005

Explicitly initializing variables to default values might be reducing your performance.

Download initialization test source - 4.59 Kb

[Figure: screenshot of the running benchmark]

Abstract

Explicitly initializing class member variables and local variables to their default values can reduce performance in .NET. This article compares the variable initialization schemes available in .NET and measures the performance impact of each. In the benchmark below, explicit initialization slowed object creation by roughly 11% to 16% and slowed method calls by about 25%.

Introduction

If you come from a C/C++ background (as I do) you are probably in the habit of always initializing your variables. The classic approach is to initialize when defining the variable (as recommended by Scott Meyers in Effective C++).

class B
{
    // Member variables initialized explicitly

    private int varA = 0;
    private string varB = null;
    private DataSet varC = null;
    private DataSet varD = null;
    private string varE = null;

    ...

Another approach is to explicitly initialize the variables in the constructor:

class C
{
    // Member variables not initialized on definition

    private int varA;
    private string varB;
    private DataSet varC;
    private DataSet varD;
    private string varE;
    
    // Member variables explicitly initialized in the constructor

    public C()
    {
        varA = 0;
        varB = null;
        varC = null;
        varD = null;
        varE = null;
    }
    
    ...

Well, if you follow either of the above approaches, you are hurting performance in .NET!

The case for initialization

In .NET, the Common Language Runtime (CLR) automatically initializes all variables as soon as they are created. Value types are zeroed (0 for numeric types) and reference types are set to null. When you explicitly initialize your variables, the compiler generates code to set their values and includes that code as part of your object initialization (for classes) or method call (for methods). The compiler is not smart enough (Microsoft, take a hint...) to detect this duplicate initialization. Granted, if you initialize a variable to a non-default value (e.g., int A = 3;) the compiler is perfectly justified in generating the extra code. When you instantiate an object (using the new keyword) or make a method call, the following sequence happens:

Your variables are allocated (based on instructions from the compiler).
The variables are initialized by the CLR.
The code for the object constructor or method is executed.
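The CLR's default initialization described above can be observed directly. The following is a minimal sketch; the Defaults class and its field names are hypothetical and are not part of the benchmark source.

```csharp
using System;

// Hypothetical class for illustration only; not part of the article's benchmark.
class Defaults
{
    // No explicit initializers: the CLR zeroes these when the object is allocated.
    private int number;
    private string text;
    private double ratio;
    private bool flag;

    // True when every field still holds its CLR-assigned default value.
    public bool AllDefault()
    {
        return number == 0 && text == null && ratio == 0.0 && !flag;
    }

    static void Main()
    {
        Defaults d = new Defaults();
        Console.WriteLine("number = {0}", d.number);            // number = 0
        Console.WriteLine("text is null: {0}", d.text == null); // text is null: True
        Console.WriteLine("all default: {0}", d.AllDefault());  // all default: True
    }
}
```

The compiler may warn that the fields are never assigned; that warning is expected here, because the point is that they are never assigned by user code.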

The problem here is: if you explicitly initialized your variables, the constructor or method code will initialize your variables again. Is this really a problem? In many cases, the double initialization is not a problem:

If the variables are static (class variables), the initialization is done once during the entire life of the process.
If the class is instantiated only a few times or the method is rarely called, the extra initialization time can be safely ignored.

But what about those objects instantiated many times in rapid succession? What about those methods called thousands of times? What is the performance impact?

Designing to measure performance

Classes and Methods to test

In order to measure the performance impact of initialization, I created three almost identical classes:

Class A does not initialize anything.
Class B uses initialization-on-definition.
Class C uses constructor initialization.

class A
{
    // References not initialized, trust the CLR

    private int varA;
    private string varB;
    private DataSet varC;
    private DataSet varD;
    private string varE;

    // References are not initialized, trust the CLR

    public A()
    {
    }

    ...
}

/// <summary>
/// This class initializes references on definition
/// </summary>

class B
{
    // Member variables initialized explicitly

    private int varA = 0;
    private string varB = null;
    private DataSet varC = null;
    private DataSet varD = null;
    private string varE = null;

    // Trust the compiler to initialize the variables

    public B()
    {
    }

    ...
}

/// <summary>
/// This class initializes references in the constructor
/// </summary>

class C
{
    // Member variables not initialized on definition

    private int varA;
    private string varB;
    private DataSet varC;
    private DataSet varD;
    private string varE;

    // Member variables explicitly initialized in the constructor

    public C()
    {
        varA = 0;
        varB = null;
        varC = null;
        varD = null;
        varE = null;
    }
    
    ...
}

In addition, each class contains two methods:

// Method added to force the compiler to think that the variables are set

public void set(int a, string b, DataSet c, DataSet d, string e)
{
    varA = a;
    varB = b;
    varC = c;
    varD = d;
    varE = e;
}

// Method added to force the compiler to think that the variables are used

public void print()
{
    Console.WriteLine("This is class {0}", this.ToString());
    Console.WriteLine("varA is {0}", varA);
    Console.WriteLine("varB is {0}", varB);
    Console.WriteLine("varC is {0}", varC);
    Console.WriteLine("varD is {0}", varD);
    Console.WriteLine("varE is {0}", varE);
}

The above methods are never called; they exist only to fool the compiler into believing that the variables are set and read. If the compiler believes variables are unused, it is allowed to optimize them away, and if it does, the whole test is useless.

To measure the impact of initialization on method calls, I created two static methods:

/// <summary>
/// Method to measure execution time with uninitialized local variables
/// </summary>

static int methodUnInit(string s, DataSet d, int i)
{
    // Declare local variables

    string var1;
    string var2;
    string var3;
    DataSet var4;
    DataSet var5;
    DataSet var6;
    int var7;
    int var8;
    int var9;
    
    ...
}


/// <summary>
/// Method to measure execution time with initialized local variables
/// </summary>

static int methodInit(string s, DataSet d, int i)
{
    // Declare local variables

    string var1 = null;
    string var2 = null;
    string var3 = null;
    DataSet var4 = null;
    DataSet var5 = null;
    DataSet var6 = null;
    int var7 = 0;
    int var8 = 0;
    int var9 = 0;
    
    ...
}

The body of both methods is identical:

    ...
    
    // Fool compiler to think variables are set

    var1 = s;
    var2 = s;
    var3 = s;
    var4 = d;
    var5 = d;
    var6 = d;
    var7 = i;
    var8 = i;
    var9 = i;

    // Fool compiler to think variables are used

    if ((null != var1) && (null != var2) && (null != var3)
        && (null != var4) && (null != var5) && (null != var6)
        && (0 != var7) && (0 != var8) && (0 != var9)) 
    {
        var9 = 0;
    }

    return var9;
}

I had to add the setting and use code to force the compiler not to ignore the variables. Note that the above code results in double initialization for the second method, but initialization impact is what we are here to measure.
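One C# detail is worth keeping in mind when reading the two methods: the compiler's definite-assignment rule rejects any read of a local variable that has not been assigned, so the uninitialized version compiles only because the shared body assigns every variable before reading it. The sketch below reduces the two variants to a single local; the class and method names are hypothetical.

```csharp
using System;

// Hypothetical reduction of methodUnInit/methodInit to one local variable.
static class LocalInitSketch
{
    // Local declared without an initializer, assigned before its first read.
    public static int WithoutInit(int i)
    {
        int result;
        result = i * 2;
        return result;
    }

    // Local explicitly set to its default, then overwritten:
    // the first store is the redundant initialization being measured.
    public static int WithInit(int i)
    {
        int result = 0;
        result = i * 2;
        return result;
    }

    static void Main()
    {
        // Both variants compute the same value; only the extra store differs.
        Console.WriteLine(WithoutInit(21) == WithInit(21)); // True
    }
}
```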

The test harness

The test class does the following:

Pre-allocates the test result containers.
Explicitly performs garbage collection (by calling GC.Collect()) before executing each test. Explicit garbage collection is used to minimize the effect of garbage collection on the time (and performance) measurements.
Loop
- Randomly decides which test to run.
- Runs the selected test and stores the result.
Once testing is complete, calculates the average performance for each test based on the collected data.
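The individual measurement routines are not shown in this article, so the following is only a sketch of what one might look like under stated assumptions. It uses System.Diagnostics.Stopwatch, which arrived in .NET 2.0 and was therefore not available in Visual Studio .NET 2003; the names HarnessSketch, Sample, MeasureCreation and Average are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical harness fragment; the article's own timing code may differ.
static class HarnessSketch
{
    // Stand-in for a class under test (fields left to the CLR defaults).
    class Sample
    {
        private int value;
        private string name;
        public int Value { get { return value; } }
        public string Name { get { return name; } }
    }

    // Collects garbage first, then times `repetitions` object creations.
    // Returns the elapsed time in milliseconds.
    public static double MeasureCreation(int repetitions)
    {
        GC.Collect();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; ++i)
        {
            Sample s = new Sample();
            GC.KeepAlive(s); // keep the allocation from being optimized away
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    // Averages the collected samples; returns 0 for an empty list.
    public static double Average(List<double> samples)
    {
        if (samples.Count == 0) return 0.0;
        double total = 0.0;
        foreach (double t in samples) total += t;
        return total / samples.Count;
    }

    static void Main()
    {
        // Pre-allocate the result container before any timing starts.
        List<double> times = new List<double>(10);
        for (int run = 0; run < 10; ++run)
        {
            times.Add(MeasureCreation(1000000));
        }
        Console.WriteLine("Average: {0:f2} ms", Average(times));
    }
}
```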

Why introduce randomness into the test?

If the test harness were to always perform the tests in a fixed order, the order of execution might affect the measured performance. For some reason, performing test A, then B, then C always resulted in lower performance on the B test than running A, then C, then B.

To avoid order problems, the test harness randomly chooses which test to perform. As the random distribution used is uniform, each test is as likely to be executed as any other test. Given enough repetitions, each test is executed (on average) about the same number of times.
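To get a feel for how evenly a uniform random choice spreads a limited number of runs, the selection can be simulated on its own. This is a small sketch with hypothetical names (SelectionSketch, CountSelections) and an arbitrary fixed seed chosen only to make runs repeatable.

```csharp
using System;

// Hypothetical helper that counts how often each test index is selected.
static class SelectionSketch
{
    public static int[] CountSelections(int iterations, int testCount, int seed)
    {
        Random r = new Random(seed);
        int[] counts = new int[testCount];
        for (int i = 0; i < iterations; ++i)
        {
            // Next(n) returns a value in [0, n), each equally likely.
            counts[r.Next(testCount)]++;
        }
        return counts;
    }

    static void Main()
    {
        // 300 iterations over 5 tests: each test is expected about 60 times,
        // but any single run will deviate somewhat from that figure.
        int[] counts = CountSelections(300, 5, 12345);
        for (int i = 0; i < counts.Length; ++i)
        {
            Console.WriteLine("test {0}: {1} runs", i, counts[i]);
        }
    }
}
```

Because the counts differ from test to test, each average has to be divided by that test's own number of runs rather than by a fixed share of the iterations.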

The test harness measureABC() method executes 300 tests, randomly deciding at each iteration which test to run.

static void measureABC(int repetitions)
{
    Random r = new Random((int)DateTime.Now.Ticks);

    for (int i = 0; i < 300; ++i) 
    {
        // Perform garbage collection to remove GC from the time measurement

        GC.Collect();

        // Show progress...

        if (0 == i % 5) 
        {
            Console.Write(".");
        }

        // Randomly choose which class creation to measure

        switch (r.Next(5))
        {
            case 0:
                measureA(repetitions);
                break;
            case 1:
                measureB(repetitions);
                break;
            case 2:
                measureC(repetitions);
                break;
            case 3:
                measureD(repetitions);
                break;
            case 4:
                measureE(repetitions);
                break;
        }
    }

    Console.WriteLine();

    // Calculate average times by dividing total time by number of occurrences.
    // Note that according to my own assertion I should not be initializing here
    // but old habits are very hard to break and anyway, this is only executed
    // once every couple of minutes.

    double totalA = 0.0;
    double totalB = 0.0;
    double totalC = 0.0;
    double totalD = 0.0;
    double totalE = 0.0;

    foreach (double t in Atime) 
    {
        //Console.WriteLine("A {0}", t);

        totalA += t;
    }
    foreach (double t in Btime) 
    {
        //Console.WriteLine("B {0}", t);

        totalB += t;
    }
    foreach (double t in Ctime) 
    {
        //Console.WriteLine("C {0}", t);

        totalC += t;
    }
    foreach (double t in Dtime) 
    {
        //Console.WriteLine("D {0}", t);

        totalD += t;
    }
    foreach (double t in Etime) 
    {
        //Console.WriteLine("E {0}", t);

        totalE += t;
    }

    // Output results

    Console.WriteLine("No init: {0:f2}    Init on Def: {1:f2}" + 
       "    Init in Const: {2:f2}", totalA / Atime.Count, 
       totalB / Btime.Count, totalC / Ctime.Count);

    Console.WriteLine("Method No init: {0:f2}    Method Init on Def: {1:f2}",
        totalD / Dtime.Count, totalE / Etime.Count);
}

To get a better perspective, the entire test suite is run 10 times by the Main() method.

The results

Using Visual Studio .NET 2003 in "Release" mode (with the optimizer using default settings) and running on my Win2K 1.6GHz Pentium-M machine:

Creating an object and not initializing variables ~503mSec (100%)
Creating an object and initializing on definition ~557mSec (111%)
Creating an object and initializing in the constructor ~582mSec (116%)
Calling a method and not initializing local variables ~253mSec (100%)
Calling a method and initializing variables ~316mSec (125%)
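The percentages in the list are each timing divided by the matching uninitialized baseline. A short sketch of that arithmetic, using the measured values above (the class and method names are hypothetical):

```csharp
using System;

// Hypothetical helper reproducing the percentages from the measured timings.
static class OverheadSketch
{
    // Expresses `measured` as a percentage of `baseline`.
    public static double RelativePercent(double baseline, double measured)
    {
        return measured / baseline * 100.0;
    }

    static void Main()
    {
        // Object creation, relative to the 503 ms baseline.
        Console.WriteLine("{0:f0}%", RelativePercent(503, 557)); // 111%
        Console.WriteLine("{0:f0}%", RelativePercent(503, 582)); // 116%

        // Method call, relative to the 253 ms baseline.
        Console.WriteLine("{0:f0}%", RelativePercent(253, 316)); // 125%
    }
}
```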

Conclusions

If an object is heavily used (created many times a second), don't initialize the variables. In this benchmark, not initializing shaved roughly 10% off object creation time compared with initializing on definition, and about 14% compared with initializing in the constructor. Obviously, if you do lengthy, time-consuming initialization in the constructor, the variable initialization time becomes insignificant and is lost in the noise.

For short, quick methods that are heavily used, avoid initializing local variables to default values; just trust the CLR to do it for you.

Surprisingly, initializing variables in the constructor is a little slower than initializing on definition but this may be explained by the fact that class B has an empty constructor so the constructor calling code might be eliminated by the optimizer, saving a little time.

Further work

Lobby Microsoft to improve the optimizer: have the optimizer strip away unnecessary initializations.

If anybody has access to an alternative .NET compiler (Mono, Borland), please publish the results of your tests.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.