Introduction
This article is an attempt to compare the general performance of STL/CLR's
sequence containers with the .NET generic List<T>
collection class. Before I
began work on the article, I strongly believed that the STL/CLR containers would
be yards faster. To my utmost surprise, I found that this was not so and that
List<T>
surpassed the STL/CLR collections with ease.
How I compared performance
I wanted to keep things simple and used the common technique
of repeating a specific operation several times. To smoothen the design, I have an
interface as follows :-
namespace STLCLRTests
{
public interface class IMeasurable
{
Int64 RunCode(int iterations);
};
}
RunCode
would run a specific piece of code as many
times as specified by iterations
, and would return the time taken in milliseconds.
And I have the following abstract
class that implements this
interface.
namespace STLCLRTests
{
public ref class MeasurableDoubleOp abstract : IMeasurable
{
private:
static Stopwatch^ stopWatch = gcnew Stopwatch();
public:
virtual Int64 RunCode(int iterations)
{
Initialize();
stopWatch->Reset();
stopWatch->Start();
RunCodeFirstOp(iterations);
RunCodeSecondOp(iterations);
stopWatch->Stop();
return stopWatch->ElapsedMilliseconds;
}
protected:
virtual void Initialize() {}
virtual void RunCodeFirstOp(int iterations) abstract;
virtual void RunCodeSecondOp(int iterations) abstract;
};
}
To profile a certain collection class, I just derive from
this
abstract class and implement RunCodeFirstOp
and RunCodeSecondOp
. I also have a
MeasurableSingleOp
class for doing tests that do not
involve a two-step operation.
STL vector vs List<T> - basic insertion/removal
Here are the implementations of the vector
specific and List<T>
specific classes.
namespace STLCLRTests
{
public ref class VectorInsertRemove : MeasurableDoubleOp
{
private:
cliext::vector<int> vector;
protected:
IEnumerable<int>^ GetVector()
{
return %vector;
}
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
vector.push_back(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
vector.pop_back();
}
}
};
}
namespace STLCLRTests
{
public ref class GenericListInsertRemove : MeasurableDoubleOp
{
private:
List<int> list;
protected:
IEnumerable<int>^ GetList()
{
return %list;
}
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
list.Add(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
list.RemoveAt(list.Count - 1);
}
}
};
}
And here are my test results. As you can see, the BCL class (List<T>
)
completely outperformed the STL/CLR vector
class.
Iterations | STL/CLR | BCL |
---|
100000 | 15 | 3 |
500000 | 63 | 32 |
1000000 | 122 | 21 |
10000000 | 1311 | 299 |
Here's a graphical plot of how the two containers performed. Clearly, the BCL class's performance was quite superior to the STL vector's.
As you can imagine I was quite surprised by this result. Just
for the heck of it I thought I should also compare the standard STL vector with
the STL/CLR vector implementation. Note than I am still using managed code
(/clr) - the standard STL code is also compiled as /clr. Here are my surprising
results.
Iterations | STL/CLR | Standard STL |
---|
100000 | 11 | 39 |
500000 | 58 | 202 |
1000000 | 117 | 391 |
10000000 | 1161 | 3919 |
Based on that result, you should absolutely avoid compiling
native STL code using /clr. Merely porting to STL/CLR would help performance in
a big way. You might find that all you need is a namespace change (cliext
to std
) and you may not have to change too much code elsewhere. And no, I
did not conclude this merely on my test results with vector
, I compared the
standard list
and the STL/CLR list
containers with the following results.
Iterations | STL/CLR | Std list |
---|
100000 | 33 | 101 |
500000 | 63 | 175 |
1000000 | 274 | 349 |
10000000 | 2969 | 3663 |
As you can see, the difference in performance is non-trivial.
Please note that we are not comparing the native performance of STL here. We are
comparing how the native implementation when compiled under /clr compares with
the CLR implementation of STL.
STL list vs List<T> - basic insertion/removal
Here's my implementation for the STL list
specific class.
namespace STLCLRTests
{
public ref class StlListInsertRemove : MeasurableDoubleOp
{
private:
cliext::list<int> list;
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
list.push_back(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
list.pop_back();
}
}
};
}
And here are my test results. Here, the contrast is even more
- not surprising really, as the STL list
will always be slower than the
STL vector
for straight inserts and removals.
Iterations | STL/CLR | BCL |
---|
100000 | 32 | 2 |
500000 | 149 | 11 |
1000000 | 332 | 23 |
10000000 | 3719 | 331 |
And here's a graphical plot of the results.
STL deque vs List<T> - basic insertion/removal
Here's the deque
implementation.
namespace STLCLRTests
{
public ref class DequeInsertRemove : MeasurableDoubleOp
{
private:
cliext::deque<int> deque;
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
deque.push_back(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
deque.pop_back();
}
}
};
}
Here are my results. Nothing's changed in the pattern - the
BCL class is way faster here too.
Iterations | STL/CLR | BCL |
---|
100000 | 33 | 2 |
500000 | 66 | 13 |
1000000 | 83 | 26 |
10000000 | 1061 | 251 |
And here's the graph.
The BCL equivalent of a queue is the Queue<T>
class - so
just to be sure we are comparing apples with apples, I went ahead and ran tests
comparing the STL/CLR deque
with the BCL Queue<T>
. My results and
the corresponding graph follow.
Iterations | STL/CLR | BCL |
---|
100000 | 12 | 6 |
500000 | 49 | 15 |
1000000 | 89 | 28 |
10000000 | 1044 | 335 |
The Queue<T>
class seems to be marginally slower than
List<T>
but is still way faster than the STL/CLR deque
container.
STL vector vs List<T> - basic iteration
This time, I wanted to test the speed with which we can
iterate over a linear collection. Here are the vector
and List<T>
specific iteration test
implementations.
namespace STLCLRTests
{
public ref class VectorIterate : MeasurableDoubleOp
{
private:
cliext::vector<int> vector;
public:
virtual void RunCodeFirstOp(int iterations) override
{
vector.clear();
for(int count=0; count<iterations; count++)
{
vector.push_back(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
for(cliext::vector<int>::iterator it = vector.begin(); it != vector.end(); it++)
{
}
}
};
}
namespace STLCLRTests
{
public ref class GenericListIterate : MeasurableDoubleOp
{
private:
List<int> list;
public:
virtual void RunCodeFirstOp(int iterations) override
{
list.Clear();
for(int count=0; count<iterations; count++)
{
list.Add(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
for each(int x in list)
{
}
}
};
}
Here are my test results. The results further
proved the superior efficiency of the List<T>
class.
Iterations | STL/CLR | BCL |
---|
100000 | 24 | 2 |
500000 | 93 | 16 |
1000000 | 194 | 31 |
10000000 | 2009 | 394 |
And here's the corresponding graph.
STL vector vs List<T> - Linq access (where)
For the Linq tests, I used a C# project (for easier syntax). I
derived from the insert tester and merely overrode the RunCodeSecondOp
method as I wanted to keep the insertion code intact.
namespace LinqTests
{
public class VectorLinqWhere : VectorInsertRemove
{
public override void RunCodeSecondOp(int iterations)
{
IEnumerable<int> _vector = GetVector();
var newVector = _vector.Where(x => x % 2 == 0);
}
}
}
namespace LinqTests
{
public class GenericListLinqWhere : GenericListInsertRemove
{
public override void RunCodeSecondOp(int iterations)
{
IEnumerable<int> _list = GetList();
var newList = _list.Where(x => x % 2 == 0);
}
}
}
Here are the results of my test runs. The results here are
partially contaminated by the fact that the insertion code speed differences
would also come into play. But the difference in performance is large enough to
safely ignore that for now, and again LINQ works much faster on the BCL class as
compared to the STL/CLR version.
Iterations | STL/CLR | BCL |
---|
100000 | 18 | 1 |
500000 | 44 | 7 |
1000000 | 79 | 11 |
10000000 | 842 | 168 |
And here's the graph.
STL vector vs List<T> - Linq access (take)
This is similar to the previous one except I use Take
instead of Where
.
namespace LinqTests
{
public class VectorLinqTake : VectorInsertRemove
{
public override void RunCodeSecondOp(int iterations)
{
IEnumerable<int> _vector = GetVector();
var newVector = _vector.Take(_vector.Count() / 2);
}
}
}
namespace LinqTests
{
public class GenericListLinqTake : GenericListInsertRemove
{
public override void RunCodeSecondOp(int iterations)
{
IEnumerable<int> _list = GetList();
var newList = _list.Take(_list.Count() / 2);
}
}
}
Here's the result of my tests. These results are very similar
to the previous test.
Iterations | STL/CLR | BCL |
---|
100000 | 7 | 0 |
500000 | 35 | 4 |
1000000 | 70 | 10 |
10000000 | 865 | 205 |
And the corresponding graph.
Sorting
I ran tests comparing sorting speeds of the List<T>
class with
the STL/CLR vector
and list
containers. The code used follows.
namespace STLCLRTests
{
public ref class GenericListSort : MeasurableDoubleOp
{
private:
List<int> list;
protected:
IEnumerable<int>^ GetList()
{
return %list;
}
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
list.Add(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
list.Sort();
}
};
}
namespace STLCLRTests
{
public ref class StlListSort : MeasurableDoubleOp
{
private:
cliext::list<int> list;
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
list.push_back(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
list.sort();
}
};
}
namespace STLCLRTests
{
public ref class VectorSort : MeasurableDoubleOp
{
private:
cliext::vector<int> vector;
protected:
IEnumerable<int>^ GetVector()
{
return %vector;
}
public:
virtual void RunCodeFirstOp(int iterations) override
{
for(int count=0; count<iterations; count++)
{
vector.push_back(10);
}
}
virtual void RunCodeSecondOp(int iterations) override
{
sort(vector.begin(), vector.end());
}
};
}
Here are the results for vector
versus List<T>
.
Iterations | STL/CLR | BCL |
---|
100000 | 37 | 7 |
500000 | 136 | 53 |
1000000 | 325 | 137 |
10000000 | 2695 | 1088 |
And here are my results for stl list
versus List<T>
.
Iterations | STL/CLR | BCL |
---|
100000 | 138 | 7 |
500000 | 1162 | 51 |
1000000 | 5355 | 128 |
10000000 | 31985 | 1095 |
Conclusion
One of the features that was strongly marketed before STL/CLR was released was
its performance benefits over regular .NET collections. But the .NET generic
List<T>
seems to be much faster. At this stage all I can think of as a valid
case for using STL/CLR would be when doing a first-level port of existing C++
code ( that heavily uses STL) to managed code.