
Native C++ and C# - Which is the Fastest? (Part 1)

6 Jan 2015 | Ms-PL | 6 min read
A tip that compares the performance of a simple Mandelbrot generator in C# against native C++

Introduction

What is the performance difference between C# and native C++? C# is a much quicker environment for developing usable utility apps, but it is generally accepted that for true performance, C++ is the way to go. This article investigates that premise; I expect the additional effort of developing in C++ to yield significant runtime performance benefits.

Disclaimer

  • The program is not representative of a typical application; it is meant to be a computationally intensive test case.
  • I opted for testing all the default release settings, which hampers C++ more than C#. This will be addressed in part 2.
  • There's a disproportionate amount of time spent in the square root library function; I'll change this to an approximation in part 2 (e.g. Carmack's approach), or remove it as it's not strictly required.
  • The code is native C++ talking to Win32, not CLR C++.
  • None of the rendering or window handling is timed; just the Mandelbrot creation.
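On the square root point above: if the escape test is written as a magnitude check against 2 (an assumption on my part about the shape of the routine, since the listing is not reproduced here), the square root can be dropped by comparing the squared magnitude against 4 instead. A minimal sketch, with hypothetical function names:

```cpp
#include <cmath>

// Hypothetical escape tests for a point z = (x, y); the names are
// illustrative and are not taken from the article's source code.

// Variant using the library square root: |z| > 2.
inline bool escaped_with_sqrt(double x, double y) {
    return std::sqrt(x * x + y * y) > 2.0;
}

// Variant without the square root: |z|^2 > 4.
inline bool escaped_without_sqrt(double x, double y) {
    return x * x + y * y > 4.0;
}
```

The two tests are mathematically equivalent. In floating point they could disagree for a value that sits within a rounding step of the boundary, which would not matter for a rendered image.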

Using the Code

I used Visual Studio 2013 for this project, but there's nothing special in the projects that should stop it running in Visual Studio 2012. The C# project is compiled against .NET4.5, and the C++ has all the default optimization settings. The code in both languages does the following:

  • Creates a simple borderless window
  • Generates a 640x640x256 Mandelbrot set 20 times
  • Displays the last generated result as the background image of the window (to make sure it is calculated correctly)
  • Waits for a mouse click
  • Displays a message box containing the average time taken to generate a Mandelbrot set in milliseconds
  • Closes

The code is not meant to be optimal; it is meant to be as simple and as performance-stressing as possible. The actual Mandelbrot routine itself is identical in both C++ and C#.
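For readers without the download to hand, the kind of routine being timed looks roughly like the sketch below. This is not the article's actual source: the function name, the view rectangle, and the pixel encoding are all assumptions made for illustration. It is templated on the floating point type to mirror the float/double switch, and it keeps the square root in the escape test.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative escape-time Mandelbrot generator (hypothetical, not the
// article's code). Real is float or double.
template <typename Real>
std::vector<std::uint8_t> generate_mandelbrot(int width, int height, int max_iterations) {
    assert(width > 0 && height > 0);
    assert(max_iterations > 0 && max_iterations <= 256);
    std::vector<std::uint8_t> pixels(static_cast<std::size_t>(width) * height);
    for (int py = 0; py < height; ++py) {
        for (int px = 0; px < width; ++px) {
            // Assumed view: real axis [-2, 1), imaginary axis [-1.5, 1.5).
            const Real cr = Real(-2) + Real(3) * px / width;
            const Real ci = Real(-1.5) + Real(3) * py / height;
            Real zr = 0, zi = 0;
            int n = 0;
            // Iterate z = z^2 + c until |z| exceeds 2 or the limit is hit.
            while (n < max_iterations && std::sqrt(zr * zr + zi * zi) <= Real(2)) {
                const Real next_zr = zr * zr - zi * zi + cr;
                zi = Real(2) * zr * zi + ci;
                zr = next_zr;
                ++n;
            }
            // Points that never escape are stored as 0; others store the
            // iteration count at which they escaped.
            pixels[static_cast<std::size_t>(py) * width + px] =
                static_cast<std::uint8_t>(n == max_iterations ? 0 : n);
        }
    }
    return pixels;
}
```

Calling generate_mandelbrot<float>(640, 640, 256) or generate_mandelbrot<double>(640, 640, 256) would correspond to the two precision settings tested.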

The window width and height, the number of iterations, and whether to use floats or doubles are easily configurable with a cursory glance at the code.

The Mandelbrot image is smaller than the smallest L2 cache among the processors tested, to try to avoid any cache complexity.
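The arithmetic behind that claim can be checked at compile time. The sketch below assumes one byte per pixel for the 256-level image and takes the smallest cache to be the 0.5 MB listed for the Celeron 450 in the results; both are my reading of the figures rather than something stated outright, and the constant names are hypothetical.

```cpp
#include <cstddef>

// Assumption: one byte per pixel (256 levels) and a 0.5 MB smallest L2 cache.
constexpr std::size_t kWidth = 640;
constexpr std::size_t kHeight = 640;
constexpr std::size_t kBytesPerPixel = 1;
constexpr std::size_t kImageBytes = kWidth * kHeight * kBytesPerPixel;  // 409,600 bytes
constexpr std::size_t kSmallestL2Bytes = 512 * 1024;                    // 524,288 bytes

static_assert(kImageBytes < kSmallestL2Bytes,
              "the image is expected to fit in the smallest L2 cache");
```

Under those assumptions the image takes up about 78% of the smallest cache, so it fits, but without much headroom for anything else.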

There are three configurations to compile for the C# project: Any CPU, x86 (Win32), and x64.

There are two configurations to compile for the C++ project: Win32 and x64.

The projects are configured to output a different executable for each configuration, so the five resulting binaries do not overwrite each other.

I then ran all five executables with both single and double precision settings on five different machines to gauge performance. I also tried on my Surface Pro 1 (Core i5), and my Server 2012 (Core i3) machines, but the results were too inconsistent.

All the machines run Windows Update, so all had .NET 4.5 installed and ran the C# versions out of the box. Only my machine had the VS2013 CRT redist installed, so I had to install it on all the other machines.

Caveats:

  • The machines were not 'clean' (they had background processes running, it wasn't a clean OS install, etc.); they are all machines that are in use. However, all the tests were done at the same time, so they were all equally 'unclean'.
  • The tests were run several times to validate the results; repeated runs normally agreed to within 2%-3%.

Results

The times are the average number of milliseconds taken to create the Mandelbrot set, so lower is quicker.
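To make the measurement concrete, an average like this can be taken as sketched below. The helper name is hypothetical and std::chrono::steady_clock is my assumption; the timer actually used in the downloadable code may differ.

```cpp
#include <cassert>
#include <chrono>

// Runs `work` `runs` times back to back and returns the mean wall-clock
// time per run in milliseconds. Hypothetical helper for illustration.
template <typename Work>
double average_milliseconds(Work&& work, int runs) {
    assert(runs > 0);
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int i = 0; i < runs; ++i) {
        work();
    }
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count() / runs;
}
```

With 20 runs, as in the test procedure, the call would take a callable that generates one Mandelbrot set and nothing else, so that rendering and window handling stay outside the timed region.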

     

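640x640, average time per set in milliseconds (the L2 figures in the column headings are cache sizes in MB):

| Precision | Language | Config | Xeon E5-1620 v2 @ 3.7GHz (L2:10) | Celeron 450 @ 2.2GHz (L2:.5) | Xeon E31245 @ 3.3GHz (L2:8) | Celeron 743 @ 1.3GHz (L2:1) | Xeon E5420 @ 2.5GHz (L2:12) |
|---|---|---|---|---|---|---|---|
| Float | C# | AnyCPU | 105 | 790 | 136 | 363 | 186 |
| Float | C# | x86 | 107 | 791 | 136 | 363 | 185 |
| Float | C# | x64 | 122 | 579 | 135 | 346 | 176 |
| Float | C++ | x86 | 150 | 874 | 158 | 933 | 473 |
| Float | C++ | x64 | 116 | 288 | 117 | 370 | 188 |
| Double | C# | AnyCPU | 98 | 570 | 136 | 331 | 168 |
| Double | C# | x86 | 97 | 570 | 135 | 331 | 169 |
| Double | C# | x64 | 121 | 581 | 135 | 346 | 177 |
| Double | C++ | x86 | 137 | 848 | 147 | 894 | 450 |
| Double | C++ | x64 | 135 | 584 | 136 | 391 | 198 |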

Points of Interest

The results are not what I expected at all, even with this very synthetic test.

  • The C++ x64 was always quicker than its x86 equivalent, and sometimes very much quicker.
  • The double precision version was slightly quicker overall, except in the case of C++ x64 code.
  • The AnyCPU config has the 'Prefer x86' setting, so I would expect it to perform the same as the Win32 version. This is indeed the case.
  • C# was quickest overall in 80% of the tests (8 of 10), and the AnyCPU config matched or beat both C++ configurations in all of those cases (even though the C# x64 was the quickest C# build in a couple of them).
  • The Xeon processors had performance proportional to their clock speeds. This is the expected result.
  • The Celeron 450 at 2.2GHz ran C# significantly more slowly than the Celeron 743 at 1.3GHz. This is very confusing.
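The 80% figure above can be re-derived from the 640x640 table. The sketch below copies the published timings into an array and counts the rows in which the quickest C# build beats the quickest C++ build; the struct and function names are hypothetical.

```cpp
#include <algorithm>

// One row per machine and precision; all times in milliseconds.
struct Result {
    int cs_anycpu, cs_x86, cs_x64;  // C# builds
    int cpp_x86, cpp_x64;           // C++ builds
};

// Values copied from the 640x640 results table.
constexpr Result kResults640[] = {
    // Float
    {105, 107, 122, 150, 116},  // Xeon E5-1620 v2
    {790, 791, 579, 874, 288},  // Celeron 450
    {136, 136, 135, 158, 117},  // Xeon E31245
    {363, 363, 346, 933, 370},  // Celeron 743
    {186, 185, 176, 473, 188},  // Xeon E5420
    // Double
    {98, 97, 121, 137, 135},    // Xeon E5-1620 v2
    {570, 570, 581, 848, 584},  // Celeron 450
    {136, 135, 135, 147, 136},  // Xeon E31245
    {331, 331, 346, 894, 391},  // Celeron 743
    {168, 169, 177, 450, 198},  // Xeon E5420
};

// Counts the rows where the best C# time is strictly lower than the best C++ time.
inline int count_csharp_wins() {
    int wins = 0;
    for (const Result& r : kResults640) {
        const int best_cs = std::min({r.cs_anycpu, r.cs_x86, r.cs_x64});
        const int best_cpp = std::min(r.cpp_x86, r.cpp_x64);
        if (best_cs < best_cpp) {
            ++wins;
        }
    }
    return wins;
}
```

count_csharp_wins() returns 8 for these ten rows; the two exceptions are the single precision runs on the Celeron 450 and the Xeon E31245, where the x64 C++ build is quickest.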

I did further comparisons of the Celeron processors with smaller Mandelbrot sets.

Results

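Average time per set in milliseconds:

| Size | Precision | Language | Config | Celeron 450 @ 2.2GHz (L2:.5) | Celeron 743 @ 1.3GHz (L2:1) |
|---|---|---|---|---|---|
| 400x400 | Float | C# | AnyCPU | 309 | 142 |
| 400x400 | Float | C# | x86 | 358 | 145 |
| 400x400 | Float | C# | x64 | 225 | 135 |
| 400x400 | Float | C++ | x86 | 343 | 359 |
| 400x400 | Float | C++ | x64 | 114 | 145 |
| 400x400 | Double | C# | AnyCPU | 226 | 128 |
| 400x400 | Double | C# | x86 | 225 | 128 |
| 400x400 | Double | C# | x64 | 225 | 135 |
| 400x400 | Double | C++ | x86 | 336 | 356 |
| 400x400 | Double | C++ | x64 | 231 | 153 |
| 200x200 | Float | C# | AnyCPU | 77 | 35 |
| 200x200 | Float | C# | x86 | 77 | 35 |
| 200x200 | Float | C# | x64 | 56 | 33 |
| 200x200 | Float | C++ | x86 | 86 | 89 |
| 200x200 | Float | C++ | x64 | 28 | 37 |
| 200x200 | Double | C# | AnyCPU | 55 | 32 |
| 200x200 | Double | C# | x86 | 55 | 32 |
| 200x200 | Double | C# | x64 | 56 | 34 |
| 200x200 | Double | C++ | x86 | 84 | 89 |
| 200x200 | Double | C++ | x64 | 58 | 39 |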

Points of Interest

  • C# was consistently about twice as quick on the 1.3GHz machine as on the 2.2GHz machine.
  • The x64 C++ was always quicker than the x86 C++.
  • Double precision x86 C++ tends to be a fraction quicker than the single precision. This is consistent with the previous tests.
  • Double precision x64 C++ tends to be slower than the single precision. This is also consistent with the previous tests.

Conclusions

This does not show that C# is quicker than C++: there isn't a wide enough sampling of processors, the test is not generic enough, and the testing procedure is too ad hoc. However, it does show that C# is potentially a performance competitor, and if different tests show similar results, C++ will become even less desirable to use.

Additional data points and/or critique would be greatly appreciated!

Further Work

  • Experiment with C++ optimization settings to see what difference they make (will be addressed in part 2).
  • Explain why the Penryn Celeron outperforms the Conroe Celeron, which has nearly twice the clock speed.
  • Run other tests that better emulate a real world application.
  • Find Core i3, i5, and i7 processors in desktop machines to compare against.

History

  • 5th January, 2015: First created
  • 8th January, 2015: Added notes, added explanation of timings. Removed the smiley.

License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)