
Accelerated .NET Types

Dmitriy Gakh


28 Mar 2016

Intensive Big Data processing and mobile applications require fast calculations and compact data storage. Designing new fast and safe .NET types with small overhead is not a simple task. This article describes how to create such a type with no overhead in size or speed.

Download examples - 7.6 KB (see chapter "Using the Code" below)

Introduction

Development of the large online mapping project www.GoMap.Az required intensive processing of web requests and geographical data. To simplify and accelerate calculations, the author decided to develop specialized data types. One of the types needed to improve the code was a new integral type named SmartInt, equivalent to int. This brief article explains the design and implementation of that type. Many other types can be created in a similar manner.

This article describes:

Creating a new type with a specific meaning and no penalty in size or speed
Simplifying and enriching the usage of standard types
Creating structures similar to Nullable types, but without size overhead
Unit tests for creating highly reliable code
Performance measurements
Benefits of studying the open source code of the .NET Framework
An open source project containing enterprise-level code

This article should not be read as an improvement on or replacement for .NET components, nor as an attempt to find weaknesses in standard .NET. .NET is a general-purpose framework developed for universal use. This means it is compact, has a reasonable number of functions and methods, and is fast enough and frugal enough with resources for many use cases. SSTypes, in turn, is specialized for a few use cases (mainly string parsing and input sanitation). This narrower scope makes it possible to optimize for performance. The SSTypesLT project is under development, and I hope the published information will be useful. It can also serve as an example of how to create custom types.

The Problem

Intensive data transformations and calculations require fast and easy-to-use code. The required improvements were:

Faster conversion from a text string to a numeric value was the main requirement. The new Parse method should convert input values to SmartInt quickly and without throwing exceptions.
SmartInt should also be able to represent NoValue or BadValue. In other words, it should provide functionality similar to Nullable<int>.
The new type should work wherever System.Int32 (Int32 in further text) works: in calculations, as an array index, and as a member of structures and classes.

Cost-Free Improvement

The main idea rests on the fact that a type derived from another type keeps the features of its parent and specializes them. If no new data members are added, the derived type only narrows the parent's functionality. This makes it possible to accelerate existing functionality while keeping the occupied space unchanged.

Another fact is the ability of modern compilers to optimize code. This gives us the possibility to add semantics that place no additional burden on the final native code.

There are two main groups of types in C#: reference types and value types. All classes are reference types. Creating a new type based on a class brings in OOP support as an unnecessary burden. Value types do not add such extras.

The source code of Int32 is an excellent starting point, and it is now available (for example, by Googling “System.Int32 source code”). That code uses a relationship known in the COM world as “Containment” (as described in the old book “Programming Visual C++”, Microsoft Press, 1998). Containment is one way to implement inheritance for binary COM objects. This technique lets a new type inherit its parent’s features and be used wherever the parent type is used (see chapter “Nonconventional OOP” below). Let’s consider this technique in detail.

Create SmartInt Structure

The simplest code for SmartInt structure will be as follows:

public struct SmartInt
{
    private System.Int32 m_v;

    public static readonly SmartInt BadValue = System.Int32.MinValue;
    public static readonly SmartInt MaxValue = System.Int32.MaxValue;
    public static readonly SmartInt MinValue = System.Int32.MinValue + 1;

    // Constructs from int value
    public SmartInt(System.Int32 value)
    {
        m_v = value;
    }

    // Constructs from int? value 
    public SmartInt(System.Int32? value)
    {
        if (value.HasValue)
            m_v = value.Value;
        else
            m_v = BadValue.m_v;
    }

    // Checks validity of the value
    public bool isBad()
    {
        return m_v == BadValue.m_v;
    }

    public static implicit operator SmartInt(System.Int32 value)
    {
        return new SmartInt(value);
    }

    public static implicit operator System.Int32(SmartInt value)
    {
        return value.m_v;
    }
}

The full code for SmartInt is available at https://github.com/dgakh/SSTypes/blob/master/SSTypesLT/SSTypesLT/Native/SmartInt.cs.

Specific features of the code example include:

m_v is contained within SmartInt, which forms the “Containment” relationship.
Constructors allow a SmartInt to be built from int and from int? (Nullable<int>) values.
implicit operator SmartInt(System.Int32 value) allows implicit conversion from System.Int32 to SmartInt.
implicit operator System.Int32(SmartInt value) allows implicit conversion from SmartInt to System.Int32.

As a result, the following code compiles and runs:

SmartInt a = 35;
int b = a;
int c = b * 2;
a = c;
a = 3;

string h = "Hello, World !";
char ch = h[a]; // Use SmartInt as an array index

The JIT optimizes this code so that SmartInt behaves exactly like int. No performance loss or size overhead was observed (confirmed by tests on .NET Framework 4.0 and higher).

Nonconventional OOP

Deriving the new structure SmartInt from the existing Int32 can be viewed in terms of OOP. Let’s review the main features.

Encapsulation

SmartInt encapsulates fields and methods with public and private levels of access (a struct cannot declare protected members). Otherwise there is no difference from encapsulation in classes.

Inheritance

SmartInt has the features of its parent type Int32. It provides overridden functions such as ToString, Equals, and GetHashCode, and exposes Int32 functions such as GetType and the formatted ToString.

Polymorphism

Polymorphism is present at the value level: SmartInt can be used wherever its parent Int32 is used, for example:

int a = SmartInt.Parse("38405"); // Parsed SmartInt value is assigned to int
SmartInt ind = 7;
SmartInt len = 5;
String hw = "Hello, World !";
Console.WriteLine(hw[ind]); // SmartInt value is used as index
String ww = hw.Substring(ind, len); // SmartInt values are used as function's arguments

Abstraction

SmartInt reduces the abstraction level of Int32, but only slightly. The type Age, mentioned in chapter “Semantics” below, shows deeper specialization and a further reduction of abstraction.

Semantics

Creating another structure based on Int32 can introduce a type with a different meaning.

public struct Age
{
    private System.Int32 m_v;

    public static readonly Age MaxValue = 150;
    public static readonly Age MinValue = 0;

    // Constructs from int value
    public Age(System.Int32 value)
    {
        m_v = value;
    }

    public static implicit operator Age(System.Int32 value)
    {
        return new Age(value);
    }

    public static implicit operator System.Int32(Age value)
    {
        return value.m_v;
    }
}

Although the SmartInt and Age types are identical at the binary level (both are based on the same Int32 type), they are incompatible at the level of the C# language.

In the following code, only the direct assignment of an Age to a SmartInt fails to compile:

Age a = 7; // Ok - int can be converted to Age

// Compilation error - no operator for conversion from Age to SmartInt
SmartInt s1 = a;

// Ok - value will be assigned through conversion Age->int and int->SmartInt
SmartInt s2 = (int)a;

In the code that does compile, JIT optimization removes the cost of these conversions.
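The compile-time separation of semantic types can also drive overload resolution. The following is a minimal sketch under stated assumptions: Age is redeclared here so the example is self-contained, and HeightCm, SemanticsDemo, and Describe are hypothetical names that are not part of SSTypesLT.

```csharp
using System;

// Hypothetical semantic types for illustration only; not part of SSTypesLT.
public struct Age
{
    private int m_v;
    public Age(int value) { m_v = value; }
    public static implicit operator Age(int value) { return new Age(value); }
    public static implicit operator int(Age value) { return value.m_v; }
}

public struct HeightCm
{
    private int m_v;
    public HeightCm(int value) { m_v = value; }
    public static implicit operator HeightCm(int value) { return new HeightCm(value); }
    public static implicit operator int(HeightCm value) { return value.m_v; }
}

public static class SemanticsDemo
{
    // Two overloads distinguished only by the semantic type of the argument.
    public static string Describe(Age age) { return "age " + (int)age; }
    public static string Describe(HeightCm height) { return "height " + (int)height + " cm"; }

    public static void Main()
    {
        Age age = 42;
        HeightCm height = 180;

        Console.WriteLine(Describe(age));    // prints "age 42"
        Console.WriteLine(Describe(height)); // prints "height 180 cm"

        // Describe(42) would be rejected by the compiler as ambiguous,
        // because a plain int converts equally well to Age and to HeightCm.
    }
}
```

Both structures wrap a single int, so the distinction exists only for the compiler; the stored data is the same 32-bit value in each case.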

Nullable Types

Nullable types are very convenient when NoValue must be represented. But making a value type Nullable increases its size, because extra storage is required for the status flag. In general, the size of this extra space equals the alignment (32 bits on many platforms). This makes Int32? twice as large as Int32.

This size increase has several drawbacks:

It increases memory use, especially for arrays
It worsens the cache hit ratio
It affects the garbage collector (allocation in a longer-lived generation)

SmartInt has features similar to Nullable<Int32>, but keeps the same size as Int32. In other words, the functionality of Nullable<Int32> is added to Int32 without any size cost.
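The size difference can be checked directly. The sketch below assumes a runtime where System.Runtime.CompilerServices.Unsafe is available (built into .NET Core 3.0 and later; a separate NuGet package on .NET Framework). SentinelInt is a hypothetical stand-in for SmartInt that reserves int.MinValue as the "no value" marker.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for SmartInt: one int field, int.MinValue means "no value".
public struct SentinelInt
{
    private int m_v;
    public SentinelInt(int value) { m_v = value; }
    public bool HasValue { get { return m_v != int.MinValue; } }
}

public static class SizeDemo
{
    public static void Main()
    {
        // Size in bytes of each value type as laid out by the runtime.
        Console.WriteLine(Unsafe.SizeOf<int>());         // 4
        Console.WriteLine(Unsafe.SizeOf<int?>());        // 8: value plus flag, padded to alignment
        Console.WriteLine(Unsafe.SizeOf<SentinelInt>()); // 4
    }
}
```

The sentinel approach trades one representable value (here int.MinValue) for the absence of a separate flag field.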

Adding the following members to SmartInt makes it compatible with Int32? (Nullable<Int32>):

//Converts the value of System.Int32? to SmartInt.
public static implicit operator SmartInt(System.Int32? value)
{
    if (!value.HasValue)
        return SmartInt.BadValue;

    return new SmartInt(value.Value);
}

// Converts the value of SmartInt to System.Int32?.
public static implicit operator System.Int32?(SmartInt value)
{
    if (value.isBad())
        return null;

    return new System.Int32?(value.m_v);
}

public bool HasValue
{
    get { return !isBad(); }
}

public SmartInt Value
{
    get
    {
        if (!HasValue)
            throw new System.InvalidOperationException("Must have a value.");

        return this;
    }
}

public SmartInt GetValueOrDefault()
{
    return HasValue ? this : default(SmartInt);
}

These additions allow SmartInt to be used like Int32? in many cases. For example:

// Assign null to structure
SmartInt si = null;

// Checks if structure is null
if (si == null)
    return 0;

Unfortunately, the following C# code does not compile, because the left operand of the ?? operator must be a reference type or a nullable value type:

SmartInt x = null;
int y = x ?? -1;

A slightly modified version compiles successfully:

SmartInt x = null;
int y = (int?)x ?? -1;

Method Parse()

Parse is the most improved method in SmartInt. It is specialized for parsing only Int32 values and performs input sanitation: it accepts many input data types and value ranges, and input that cannot be parsed does not throw but yields BadValue. Although the tests show a significant performance improvement, development, testing, and documentation of the method continue.

It is about 4 times faster than the standard Int32.Parse (see chapter “Performance Tests” below)
It can parse a substring of a string without splitting it, which improves performance further
There are several overloads of Parse that take different types of data
It does not throw exceptions
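The properties listed above can be illustrated with a minimal sketch. This is not the SSTypesLT implementation; ParseSketch, ParseOrBad, and Bad are hypothetical names, and the sketch assumes only optionally signed decimal digits with no whitespace or culture-specific formatting.

```csharp
using System;

public static class ParseSketch
{
    // Sentinel meaning "could not parse"; mirrors the BadValue idea from the article.
    public const int Bad = int.MinValue;

    // Parses an optionally signed decimal integer from s[start .. start + length).
    // Returns Bad instead of throwing on null, empty, malformed, or out-of-range input.
    public static int ParseOrBad(string s, int start, int length)
    {
        if (s == null || start < 0 || length <= 0 || start > s.Length - length)
            return Bad;

        int i = start;
        int end = start + length;
        bool negative = false;

        if (s[i] == '+' || s[i] == '-')
        {
            negative = s[i] == '-';
            i++;
            if (i == end)
                return Bad; // a sign with no digits
        }

        long acc = 0; // a long accumulator keeps the overflow check simple
        for (; i < end; i++)
        {
            int d = s[i] - '0';
            if (d < 0 || d > 9)
                return Bad; // not a decimal digit
            acc = acc * 10 + d;
            if (acc > int.MaxValue)
                return Bad; // out of range; also keeps int.MinValue reserved
        }

        return (int)(negative ? -acc : acc);
    }

    public static void Main()
    {
        string query = "id=12345&page=-7";
        Console.WriteLine(ParseOrBad(query, 3, 5));        // 12345
        Console.WriteLine(ParseOrBad(query, 14, 2));       // -7
        Console.WriteLine(ParseOrBad("12x", 0, 3) == Bad); // True
    }
}
```

Passing an offset and a length lets the caller parse a field in place, so no temporary substring is allocated.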

Method ToString()

The ToString method was extended to write its output to a StringBuilder. This technique avoids creating temporary strings when composing complex textual structures such as XML or JSON.
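The idea can be sketched as follows. This is an illustration under stated assumptions, not the SSTypesLT code: AppendSketch and AppendInt are hypothetical names, and only plain decimal formatting is handled.

```csharp
using System;
using System.Text;

public static class AppendSketch
{
    // Appends the decimal form of value to sb digit by digit,
    // without creating an intermediate string.
    public static StringBuilder AppendInt(StringBuilder sb, int value)
    {
        // Work in long so that int.MinValue can be negated safely.
        long v = value;
        if (v < 0)
        {
            sb.Append('-');
            v = -v;
        }

        // Find the largest power of ten that does not exceed v.
        long div = 1;
        while (v / div >= 10)
            div *= 10;

        // Emit digits from most to least significant.
        for (; div > 0; div /= 10)
            sb.Append((char)('0' + (int)(v / div % 10)));

        return sb;
    }

    public static void Main()
    {
        var sb = new StringBuilder();
        sb.Append("{\"id\":");
        AppendInt(sb, -1203);
        sb.Append('}');
        Console.WriteLine(sb.ToString()); // {"id":-1203}
    }
}
```

Returning the StringBuilder allows calls to be chained in the same style as the built-in Append methods.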

Usage for Input Sanitation

SmartInt is useful for input sanitation. For example, extracting a web request value can be coded in three lines:

// Try parse value for parameter "id" without throwing
// Context.Request is not null, but QueryString can return any value

SmartInt id = SmartInt.Parse(context.Request.QueryString["id"]);

// Return if the value was not extracted or is less than 0
if (id.isBadOrNegative())
    return;

Another method, SmartInt.isBad(), can be used when any successfully parsed value (negative, zero, or positive) is acceptable.

Throwing exceptions can significantly reduce performance and resistance to DDoS attacks. SmartInt.Parse does not throw exceptions and quickly parses many types of input values.
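The cost of throwing can be observed with the standard library alone. The sketch below is a rough Stopwatch comparison, not a rigorous benchmark; it contrasts Int32.Parse inside try/catch with the non-throwing Int32.TryParse on invalid input. ThrowCostSketch and its methods are hypothetical names, and the timings depend on the machine and runtime.

```csharp
using System;
using System.Diagnostics;

public static class ThrowCostSketch
{
    // Counts invalid inputs using the throwing API.
    public static int CountBadWithParse(string[] inputs)
    {
        int bad = 0;
        foreach (string s in inputs)
        {
            try { int.Parse(s); }
            catch (FormatException) { bad++; }
        }
        return bad;
    }

    // Counts invalid inputs using the non-throwing API.
    public static int CountBadWithTryParse(string[] inputs)
    {
        int bad = 0;
        int ignored;
        foreach (string s in inputs)
        {
            if (!int.TryParse(s, out ignored))
                bad++;
        }
        return bad;
    }

    public static void Main()
    {
        // Simulate a burst of malformed request parameters.
        var inputs = new string[10000];
        for (int i = 0; i < inputs.Length; i++)
            inputs[i] = "not-a-number";

        Stopwatch sw = Stopwatch.StartNew();
        int badParse = CountBadWithParse(inputs);
        long parseMs = sw.ElapsedMilliseconds;

        sw.Restart();
        int badTryParse = CountBadWithTryParse(inputs);
        long tryParseMs = sw.ElapsedMilliseconds;

        Console.WriteLine("Parse + catch: {0} bad, {1} ms", badParse, parseMs);
        Console.WriteLine("TryParse:      {0} bad, {1} ms", badTryParse, tryParseMs);
    }
}
```

Both methods report the same number of invalid inputs; only the time spent differs, which is the effect a non-throwing Parse is designed to avoid.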

Unit Tests

Unit testing is an important component of the SDLC and is used to ensure correct behavior of changing code. SmartInt is a small and simple type, and it should be tested as intensively as possible before use. The type described here was intensively tested, and its correctness has been confirmed by use in a production environment.

For instance, one of the test cases is a brute-force loop that parses random string values with SmartInt.Parse and compares the results with the original values. The following procedure tests unsigned, negative, and plus-signed values:

[TestMethod]
public void Test_SmartInt_Parse_BruteForce()
{
    int test_count = 10000000;
    Random rnd = new Random();

    for (int i = 0; i < test_count; i++)
    {
        int v = rnd.Next();

        string s = v.ToString();
        string sp = "+" + s;
        string sn = "-" + s;

        SmartInt siv = SmartInt.Parse(s);
        SmartInt sipv = SmartInt.Parse(sp);
        SmartInt sinv = SmartInt.Parse(sn);

        Assert.IsTrue( (siv == v) && (sipv == v) 
        && (sinv == -v), "Parsing " + v.ToString());
    }
}

There are other tests that ensure the quality of the code. The author keeps adding tests to ensure its reliability.

Performance Tests

Performance tests were implemented with BenchmarkDotNet (https://github.com/PerfDotNet). They show that SmartInt.Parse is more than 4 times faster than Int32.Parse. Development continues, to assure that the code is bug-free.

Test environment:

BenchmarkDotNet=v0.9.1.0
OS=Microsoft Windows NT 6.1.7601 Service Pack 1
Processor=Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz, ProcessorCount=4
Frequency=2241298 ticks, Resolution=446.1700 ns
HostCLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]

Type=SmartIntBM_Parse_IntSmartInt_9_Digit Mode=Throughput

Method                  Median      StdDev
----------------------  ----------  ---------
Parse_Int_9_Digit       16.1033 ms  0.3239 ms
Parse_SmartInt_9_Digit   3.4312 ms  0.0584 ms

Many other performance improvements were observed. The gain is present on modern versions of the .NET Framework. Versions prior to 4.0 should not be used, because optimization in those versions is not effective enough.

Although the observed results show a performance advantage, work on improving and testing the code continues.

Size Tests

A simple size test shows that SmartInt uses the same amount of memory as int, while Int32? uses twice as much. The observed results match expectations.

public static void ArraySize()
{
    int objects_count = 1000;

    long memory1 = GC.GetTotalMemory(true);

    int[] ai = new int[objects_count];
    long memory2 = GC.GetTotalMemory(true);

    int?[] ani = new int?[objects_count];
    long memory3 = GC.GetTotalMemory(true);

    SmartInt[] asi = new SmartInt[objects_count];
    long memory4 = GC.GetTotalMemory(true);

    // The compiler could optimize away arrays that are never used,
    // so we print their lengths

    Console.WriteLine("Array sizes {0}, {1}, {2}", ai.Length, ani.Length, asi.Length);
    Console.WriteLine("Memory for int \t {0}", memory2 - memory1);
    Console.WriteLine("Memory for int? \t {0}", memory3 - memory2);
    Console.WriteLine("Memory for SmartInt \t {0}", memory4 - memory3);
}

Output:

Array sizes 1000, 1000, 1000
Memory for int       4024
Memory for int?      8024
Memory for SmartInt  4024
Press any key to exit.

Using the Code

The download contains a small C# project created in Microsoft Visual Studio Community 2015. Before running the examples, you need to install the SSTypesLT NuGet package (for example, as described at https://www.nuget.org/packages/SSTypesLT). The project contains just a small portion of code for a quick start. Additional examples are available on the project site at https://github.com/dgakh/SSTypes.

Points of Interest

The techniques touched on in this article are interesting and deserve further study. Developers should not ignore opportunities to improve code, especially the following:

Containment within a structure adds no burden to the executed code if a modern compiler is used.
It is possible to create safe types whose semantics are checked by the compiler without any runtime overhead (except possibly at first start).
Logic almost identical to that of Nullable types can be implemented without overhead. Nullable consumes an extra 32 bits for the status flag on many platforms.
The tests show that it is possible to write methods that are faster than the standard ones.
Studying open source code can yield many interesting ideas.

Practical uses of the described technique include:

Development of mobile applications, where resources are scarce
Big Data, where each saved byte reduces storage space, increases throughput, and saves time
Research on the C# language, the CLR, the .NET Framework, and so on

References

NuGet SSTypesLT package: https://www.nuget.org/packages/SSTypesLT
SSTypes project: https://github.com/dgakh/SSTypes
PerfDotNet project: https://github.com/PerfDotNet
typedef in C#: http://www.codeproject.com/Questions/141385/typedef-in-C
Introducing Semantic Types in .NET: http://www.codeproject.com/Articles/860646/Introducing-Semantic-Types-in-Net
Lightweight Semantic Types in .NET: http://www.codeproject.com/Articles/1036239/Lightweight-Semantic-Types-in-NET#xx5141198xx
Strong Type Checking with Semantic Types: http://www.codeproject.com/Articles/1031504/Strong-Type-Checking-with-Semantic-Types
Nullable Types in C#.NET: http://www.codeproject.com/Articles/275471/Nullable-Types-in-Csharp-Net

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
