Introduction
Development of the large online mapping project www.GoMap.Az required intensive processing of web requests and geographical data. To simplify and accelerate calculations, the author decided to develop specialized data types. One of the types needed to improve the code was a new integral type named SmartInt, equivalent to int. The design and implementation of this type are explained in this brief article. Many other types can be created in a similar manner.
This article describes:
- Creating a new type with a specific meaning and no penalty in size or speed
- Simplifying and enriching the usage of standard types
- Creating structures similar to Nullable types, but without the size overhead
- Unit tests for creating highly reliable code
- Performance measurements
- Benefits of studying the open source code of the .NET Framework
- An open source project containing enterprise-level code
This article should not be considered an improvement on or a replacement for .NET components, nor an attempt to find weaknesses or disadvantages in standard .NET. .NET is a general-purpose framework developed for universal use. This means it is compact, has a reasonable number of functions and methods, and is fast enough and economical enough with resources for many use cases. SSTypes, in turn, is specialized for a few cases (the main purpose was string parsing and input sanitation). This narrower scope allows it to be optimized for performance. The SSTypesLT project is under development, and I hope the published information will be useful. Besides, it can serve as an example of creating custom types.
The Problem
Intensive data transformations and calculations require code that is fast and easy to use. The required improvements include:
- Faster conversion from a text string to a numeric value was the main required improvement. The new Parse method should convert input values to SmartInt quickly and without throwing exceptions.
- SmartInt should also be able to represent NoValue or BadValue. In other words, it should provide functionality like Nullable<int>.
- The new type should work wherever System.Int32 (Int32 in the rest of the text) works: in calculations, as an array index, and as a member of structures and classes.
Cost-Free Improvement
The main idea of the improvement is based on the fact that a class derived from another class keeps the features of its parent and specializes it. If no new members or methods are added, the derived type only narrows the functionality. So it is possible to accelerate existing functionality while keeping the occupied space intact.
Another fact is the ability of modern compilers to optimize code. This gives us the possibility to add semantics that place no additional burden on the final native code.
There are two main groups of types in C#: reference types and value types. All classes are reference types. Creating a new type based on a class would add OOP support as an unnecessary burden. Value types do not add such overhead.
The source code of Int32 is an excellent starting point and is now available (for example, by googling “System.Int32 source code”). It shows the use of a relationship that the COM world calls “Containment” (as described in the old book “Programming Visual C++”, Microsoft Press, 1998). Containment is one way inheritance can be implemented for binary COM objects. This technique is a way to inherit the parent’s features and create a new type that can be used wherever the parent type is used (see the chapter Nonconventional OOP below). Let’s consider this technique in detail.
Create SmartInt Structure
The simplest code for the SmartInt structure is as follows:
public struct SmartInt
{
    // The contained Int32 value; this is the only instance field.
    private System.Int32 m_v;

    // Int32.MinValue is reserved as the "bad value" marker.
    public static readonly SmartInt BadValue = System.Int32.MinValue;
    public static readonly SmartInt MaxValue = System.Int32.MaxValue;
    public static readonly SmartInt MinValue = System.Int32.MinValue + 1;

    public SmartInt(System.Int32 value)
    {
        m_v = value;
    }

    // A null input is stored as BadValue.
    public SmartInt(System.Int32? value)
    {
        if (value.HasValue)
            m_v = value.Value;
        else
            m_v = BadValue.m_v;
    }

    public bool isBad()
    {
        return m_v == BadValue.m_v;
    }

    public static implicit operator SmartInt(System.Int32 value)
    {
        return new SmartInt(value);
    }

    public static implicit operator System.Int32(SmartInt value)
    {
        return value.m_v;
    }
}
The full code for SmartInt is available at https://github.com/dgakh/SSTypes/blob/master/SSTypesLT/SSTypesLT/Native/SmartInt.cs.
Specific features of the code example include:
- A System.Int32 value is stored in the field m_v and is contained within SmartInt, which forms a “Containment” relationship.
- Constructors allow a SmartInt to be built from int and from int? (Nullable<int>) values.
- implicit operator SmartInt(System.Int32 value) allows implicit conversion from System.Int32 to SmartInt.
- implicit operator System.Int32(SmartInt value) allows implicit conversion from SmartInt to System.Int32.
As a result, the following code compiles and runs:
SmartInt a = 35;
int b = a;
int c = b * 2;
a = c;
a = 3;
string h = "Hello, World !";
char ch = h[a];
The JIT compiler optimizes this code so that SmartInt behaves just like int. No performance loss or size overhead was observed (this is confirmed by the tests for .NET Framework 4.0 and higher).
Nonconventional OOP
Producing the new structure SmartInt from the existing Int32 can be considered in terms of OOP. Let’s review the main features.
Encapsulation
SmartInt encapsulates fields and methods with public and private levels of access (a struct cannot declare protected members, since it cannot be inherited from). Otherwise, there is no difference from encapsulation in classes.
Inheritance
SmartInt has the features of its parent type Int32. You can see overridden functions such as ToString, Equals, and GetHashCode, and exposed Int32 functions such as GetType and the formatted ToString.
Polymorphism
Polymorphism is present at the value level, where SmartInt can be used wherever its parent Int32 is used, for example:
int a = SmartInt.Parse("38405");
SmartInt ind = 7;
SmartInt len = 5;
String hw = "Hello, World !";
Console.WriteLine(hw[ind]);
String ww = hw.Substring(ind, len);
Abstraction
SmartInt reduces the abstraction level of Int32, but not by much. The Age type mentioned in the chapter “Semantics” below shows deeper specialization and a further reduction of abstraction.
Semantics
Creating another structure based on Int32 can introduce a type with a different meaning.
public struct Age
{
    private System.Int32 m_v;

    public static readonly Age MaxValue = 150;
    public static readonly Age MinValue = 0;

    public Age(System.Int32 value)
    {
        m_v = value;
    }

    public static implicit operator Age(System.Int32 value)
    {
        return new Age(value);
    }

    public static implicit operator System.Int32(Age value)
    {
        return value.m_v;
    }
}
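Although the SmartInt and Age types are binary-identical (both are based on the same Int32 type), at the level of the C# language they are incompatible.
In the following code, the direct assignment to s1 does not compile, because C# applies at most one user-defined conversion implicitly; going through an explicit cast to int, as for s2, works:
Age a = 7;
SmartInt s1 = a;      // compile error: no conversion from Age to SmartInt
SmartInt s2 = (int)a; // compiles: Age -> int, then int -> SmartInt
For all compiled code, JIT optimization will remove the burden.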
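The compile-time separation of the two types can be illustrated with a minimal sketch. The structs below are reduced stand-ins for SmartInt and Age, and the IsAdult helper with its threshold of 18 is hypothetical, introduced only for illustration:

```csharp
using System;

// Reduced stand-in for SmartInt: only the int conversions are kept.
public struct SmartInt
{
    private readonly int m_v;
    public SmartInt(int value) { m_v = value; }
    public static implicit operator SmartInt(int value) { return new SmartInt(value); }
    public static implicit operator int(SmartInt value) { return value.m_v; }
}

// Reduced stand-in for Age with the same binary layout as SmartInt.
public struct Age
{
    private readonly int m_v;
    public Age(int value) { m_v = value; }
    public static implicit operator Age(int value) { return new Age(value); }
    public static implicit operator int(Age value) { return value.m_v; }
}

public static class SemanticsDemo
{
    // Hypothetical helper: accepts an Age, or an int through the implicit conversion.
    public static bool IsAdult(Age age)
    {
        return age >= 18; // Age converts implicitly to int for the comparison
    }

    public static void Main()
    {
        Age age = 42;
        SmartInt count = 42;

        Console.WriteLine(IsAdult(age));        // an Age is accepted as-is
        Console.WriteLine(IsAdult((int)count)); // a SmartInt must go through int explicitly
        // IsAdult(count);                      // would not compile: no SmartInt -> Age conversion
    }
}
```

The commented-out call is the point of the sketch: a value with the wrong meaning is rejected by the compiler, even though both structs hold nothing but a single Int32.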
Nullable Types
Nullable types are very convenient where NoValue should be represented. But making a value type Nullable increases its size, because extra storage is required for the status. In general, the size of this extra space equals the alignment (32 bits on many platforms). This makes Int32? twice the size of Int32.
This size increase has three drawbacks:
- Increased size, especially for arrays
- Worse cache hit ratio
- Influence on the garbage collector (allocation in a longer-lived generation)
SmartInt has features similar to Nullable<Int32>, but keeps its original size, equal to the size of Int32. In other words, the functionality of Nullable<Int32> was added to Int32 without any burden.
Adding the following methods to SmartInt makes it compatible with Int32? (Nullable<Int32>):
public static implicit operator SmartInt(System.Int32? value)
{
    if (!value.HasValue)
        return SmartInt.BadValue;
    return new SmartInt(value.Value);
}

public static implicit operator System.Int32?(SmartInt value)
{
    if (value.isBad())
        return null;
    return new System.Int32?(value.m_v);
}

public bool HasValue
{
    get { return !isBad(); }
}

public SmartInt Value
{
    get
    {
        if (!HasValue)
            throw new System.InvalidOperationException("Must have a value.");
        else
            return this;
    }
}

public SmartInt GetValueOrDefault()
{
    return HasValue ? this : default(SmartInt);
}
These improvements allow SmartInt to be used like Int32? in many cases. For example:
SmartInt si = null;
if (si == null)
    return 0;
Unfortunately, the following C# code does not compile, because the ?? operator requires a nullable or reference type on its left side:
SmartInt x = null;
int y = x ?? -1;
But slightly corrected code compiles successfully:
SmartInt x = null;
int y = (int?)x ?? -1;
Method Parse()
Parse is the most improved method in SmartInt. The improvements include specialization for parsing only Int32 values, combined with input sanitation. Input sanitation accepts many input data types and value ranges; values that cannot be parsed do not cause an exception but set the output value to BadValue. While tests show a significant performance improvement, development, testing, and documentation of the method continue.
- It shows a more than 4x performance improvement over the standard Int32.Parse (see the chapter “Performance Tests” below)
- It can parse a substring of a string without the need to split it, which gives an additional performance improvement
- There are different overloads of Parse that take different types of data
- It does not throw an exception
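The properties listed above (no exceptions, parsing a part of a string in place, a sentinel for bad input) can be sketched as follows. This is not the SSTypes implementation: ParseOrBad, its parameters, and the overflow handling are assumptions chosen for illustration, and a plain int sentinel stands in for SmartInt.BadValue.

```csharp
using System;

public static class ParseSketch
{
    // Sentinel standing in for SmartInt.BadValue (Int32.MinValue in the article).
    public const int BadValue = int.MinValue;

    // Hypothetical helper: parses an optionally signed decimal integer from
    // s[start .. start + length) without allocating a substring and without
    // throwing. Returns BadValue for any invalid or out-of-range input.
    public static int ParseOrBad(string s, int start, int length)
    {
        if (s == null || start < 0 || length <= 0 || start > s.Length - length)
            return BadValue;

        int pos = start;
        int end = start + length;
        bool negative = false;

        if (s[pos] == '+' || s[pos] == '-')
        {
            negative = s[pos] == '-';
            pos++;
            if (pos == end)
                return BadValue; // a sign with no digits
        }

        long value = 0; // accumulate in 64 bits so overflow is easy to detect
        for (; pos < end; pos++)
        {
            int digit = s[pos] - '0';
            if (digit < 0 || digit > 9)
                return BadValue; // not a decimal digit

            value = value * 10 + digit;
            if (value > int.MaxValue)
                return BadValue; // magnitude does not fit; Int32.MinValue stays reserved
        }

        return negative ? (int)-value : (int)value;
    }

    public static void Main()
    {
        string query = "id=38405&zoom=-3";
        Console.WriteLine(ParseOrBad(query, 3, 5));             // 38405
        Console.WriteLine(ParseOrBad(query, 14, 2));            // -3
        Console.WriteLine(ParseOrBad(query, 0, 2) == BadValue); // True
    }
}
```

Returning a sentinel instead of throwing keeps the failure path as cheap as the success path, which is the property the input sanitation scenario relies on.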
Method ToString()
The ToString method was improved to write its output to a StringBuilder. This technique avoids creating temporary strings when composing complex textual structures such as XML or JSON.
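The idea of writing digits straight into a StringBuilder can be sketched like this. AppendInt is a hypothetical helper written for this illustration, not the SSTypes method, and it handles only plain decimal formatting:

```csharp
using System;
using System.Text;

public static class AppendSketch
{
    // Hypothetical helper: appends the decimal representation of value to sb
    // without creating an intermediate string.
    public static StringBuilder AppendInt(StringBuilder sb, int value)
    {
        // Work with a non-positive number so that Int32.MinValue needs no special case.
        if (value < 0)
            sb.Append('-');
        else
            value = -value;

        int start = sb.Length;
        do
        {
            // value % 10 is zero or negative here, so subtracting it yields the digit character.
            sb.Append((char)('0' - value % 10));
            value /= 10;
        } while (value != 0);

        // Digits were appended least significant first; reverse them in place.
        for (int i = start, j = sb.Length - 1; i < j; i++, j--)
        {
            char t = sb[i];
            sb[i] = sb[j];
            sb[j] = t;
        }

        return sb;
    }

    public static void Main()
    {
        StringBuilder sb = new StringBuilder("{\"id\":");
        AppendInt(sb, 38405).Append('}');
        Console.WriteLine(sb); // {"id":38405}
    }
}
```

Returning the builder allows calls to be chained in the same way as the standard Append overloads.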
Usage for Input Sanitation
SmartInt is useful for input sanitation. For example, extracting a web request value can be coded in three lines:
SmartInt id = SmartInt.Parse(context.Request.QueryString["id"]);
if (id.isBadOrNegative())
    return;
Another method, SmartInt.isBad(), can be used to check only whether the value is bad; any other value (negative, 0, or positive) passes the check.
Throwing exceptions can significantly reduce performance and resistance to DDoS attacks. SmartInt.Parse does not throw exceptions and quickly parses many types of input values.
Unit Tests
Unit testing is an important component of the SDLC and is used to ensure correct behavior of changing code. SmartInt is a small and simple type and should be tested as intensively as possible before use. The type described here was intensively tested, and its correctness is confirmed by use in a production environment.
For instance, one of the test cases is brute-force, looped parsing of random string values by SmartInt.Parse and Int32.Parse, and comparing the outputs. The following procedure tests plain, negative, and plus-signed values:
[TestMethod]
public void Test_SmartInt_Parse_BruteForce()
{
    int test_count = 10000000;
    Random rnd = new Random();
    for (int i = 0; i < test_count; i++)
    {
        int v = rnd.Next();
        string s = v.ToString();
        string sp = "+" + s;
        string sn = "-" + s;
        SmartInt siv = SmartInt.Parse(s);
        SmartInt sipv = SmartInt.Parse(sp);
        SmartInt sinv = SmartInt.Parse(sn);
        Assert.IsTrue((siv == v) && (sipv == v)
            && (sinv == -v), "Parsing " + v.ToString());
    }
}
There are other tests that ensure code quality. The author keeps increasing the number of tests to ensure the reliability of the code.
Performance Tests
Performance tests were implemented with BenchmarkDotNet (https://github.com/PerfDotNet). The tests show a performance improvement of more than 4 times. Development has not stopped and continues in order to ensure that the code is bug-free.
Test environment:
BenchmarkDotNet=v0.9.1.0
OS=Microsoft Windows NT 6.1.7601 Service Pack 1
Processor=Intel(R) Core(TM) i7-3610QM CPU @ 2.30GHz, ProcessorCount=4
Frequency=2241298 ticks, Resolution=446.1700 ns
HostCLR=MS.NET 4.0.30319.42000, Arch=64-bit RELEASE [RyuJIT]
Type=SmartIntBM_Parse_IntSmartInt_9_Digit Mode=Throughput
-----------------------------------------------------------------------------------------------
Method                   Median       StdDev
Parse_Int_9_Digit        16.1033 ms   0.3239 ms
Parse_SmartInt_9_Digit   3.4312 ms    0.0584 ms
-----------------------------------------------------------------------------------------------
Many other performance improvements were observed. The positive difference is observed for modern versions of the .NET Framework. Versions prior to 4.0 should not be used, because optimization in these versions is not effective enough.
Although the observed results show an advantage in performance, work on improving and testing the code continues.
Size Tests
A simple size test shows that SmartInt uses the same amount of memory as int. Int32? uses twice as much. The observed outcomes match the expected ones.
public static void ArraySize()
{
    int objects_count = 1000;
    long memory1 = GC.GetTotalMemory(true);
    int[] ai = new int[objects_count];
    long memory2 = GC.GetTotalMemory(true);
    int?[] ani = new int?[objects_count];
    long memory3 = GC.GetTotalMemory(true);
    SmartInt[] asi = new SmartInt[objects_count];
    long memory4 = GC.GetTotalMemory(true);
    Console.WriteLine("Array sizes {0}, {1}, {2}", ai.Length, ani.Length, asi.Length);
    Console.WriteLine("Memory for int \t {0}", memory2 - memory1);
    Console.WriteLine("Memory for int? \t {0}", memory3 - memory2);
    Console.WriteLine("Memory for SmartInt \t {0}", memory4 - memory3);
}
Output:
Array sizes 1000, 1000, 1000
Memory for int 4024
Memory for int? 8024
Memory for SmartInt 4024
Press any key to exit.
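As a complementary check, the per-element size can also be queried directly. The following is a minimal sketch, assuming a runtime where System.Runtime.CompilerServices.Unsafe is available (built into .NET Core 3.0 and later; on .NET Framework it comes from a separate NuGet package). SentinelInt is a hypothetical stand-in for SmartInt that keeps only the contained field:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for SmartInt: a struct that contains a single Int32.
public struct SentinelInt
{
    private readonly int m_v;
    public SentinelInt(int value) { m_v = value; }
    public int RawValue { get { return m_v; } }
}

public static class SizeSketch
{
    public static void Main()
    {
        // Unsafe.SizeOf<T>() reports the managed size of a value of type T in bytes.
        Console.WriteLine("int:         {0}", Unsafe.SizeOf<int>());         // 4
        Console.WriteLine("int?:        {0}", Unsafe.SizeOf<int?>());        // 8
        Console.WriteLine("SentinelInt: {0}", Unsafe.SizeOf<SentinelInt>()); // 4
    }
}
```

These per-element sizes agree with the array measurements above: 1000 elements of 4 or 8 bytes plus a fixed array overhead of 24 bytes.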
Using the Code
There is a small C# project created in Microsoft Visual Studio Community 2015. Before running the examples, you need to install the SSTypesLT NuGet package (as described, for example, at https://www.nuget.org/packages/SSTypesLT). The project contains just a small portion of code for a quick start. Additional examples are available on the project site at https://github.com/dgakh/SSTypes.
Points of Interest
The techniques touched on in this article are interesting and deserve further study. Opportunities to improve code should not be ignored by developers, especially the following:
- Containment within a structure does not add a burden to the executed code if a modern compiler is used.
- It is possible to create safe types with semantics controlled by the compiler without any runtime overhead (except, perhaps, at first start).
- Logic that is almost the same as for Nullable types can be created without overhead. Nullable consumes an extra 32 bits for the status on many platforms.
- The tests show that it is possible to write methods that are faster than the standard ones.
- Studying open source code can give many interesting ideas.
Practical uses of the described technique can include:
- Development of mobile applications, where resources are scarce
- Big Data, where each saved byte can save storage space, increase throughput, and save time
- Research into the C# language, the CLR, the .NET Framework, and so on
References