Introduction
The purpose of this article is to examine different ways to iterate over a list of items, and see the comparative time it takes to iterate over the lists using either one thread or multi threads in different ways.
The envisioned use of such a loop is to price a set of trades, work out some complicated mathematical problems, download files from a series of remote machines or other such applications.
Visual application that need a multi threaded approach, e.g., a Winforms application that has a background worker thread, isn't covered by this article as the background thread is needed in this case to make sure that the interface stays responsive while doing some work in the background, e.g., copying large files.
The methods under examination are:
- A traditional
for
loop on a single thread - Traditional
for
loop starting a new thread for each item - Using Traditional task pooling
- Using the new task parallel library
for
loop
Background
The background for this article is that I looked on the internet for an article that would give me statistics on comparing how long each of the above methods would take to run instead of using my gut feel. Due to lack of such an article, I wrote a program to do it myself and decided to share.
Test for Thread Looping Test Class
The first thing that we need to do is write a test that will show how we are going to execute a test method in each of the four ways.
The test is located in the project called LoopExecution.Test
.
The test is as follows:
1. [TestMethod]
2. public void Assert_That_Loops_Run()
3. {
4. List<int> listOfInt = new List<int>();
5. int numberOfTimesToRun = 32;
6. for (int i = 0; i < numberOfTimesToRun; i++)
7. {
8. listOfInt.Add(i);
9. }
10.
11. LoopExecution<int> loopExecution = new LoopExecution<int>();
12.
13. loopExecution.ExecutionMethod +=
new LoopExecution<int>.MethodToExecute(loopExecution_ExecutionMethod);
14. loopExecution.CollectionToIterateOver = listOfInt;
15.
16. loopExecution.Execute();
17.
18. Assert.IsTrue(loopExecution.NormalLoopTime > 0);
19. Assert.IsTrue(loopExecution.ThreadLoopTime > 0);
20. Assert.IsTrue(loopExecution.ThreadPoolLoopTime > 0);
21. Assert.IsTrue(loopExecution.TaskLoopTime > 0);
22. }
The first part of the test (lines 4-9) creates the list we are going to iterative over. I've chosen to use a simple list of INT
so as to make the test more easier to understand and simple.
On line 11, we create the object that is going to do the loop iteration. This is a generic type so that it can be used to test all different classes.
Line 13 assigns an event to the class that will be called by each iteration of the loop.
Below is the code for the method that is used in the test. It use the standard .NET event signature and all the method does is write a Debug
message.
1. void loopExecution_ExecutionMethod
(object sender, LoopExecution<int>.LoopExecutionEventArgs e)
2. {
3. Debug.WriteLine("{0}number{1}.Threadname:{2}.Objectoperatingon{3}",
e.MethodName,e.LoopExecutionNumber,Thread.CurrentThread.ManagedThreadId,
e.ObjectToOperateOn);
4. }
One interesting point to note is that as part of the event args class, I pass in the object that is for that iteration of the loop. This means that you can access that object to perform operations on.
On line 14 of the test, the list that will be iterated over is assigned to the looping class and then on line 16, we call execute.
To check that all of the different looping methods have been executed and the time it took to execute set for each method, at the end of the test I do an assert that the time for each one is greater than zero. The asserts don't have a specific value as on different machines and depending on what is running in the background you can get different results, however all the results should be greater then zero. The asserts could be improved by using the delegate to set a flag that for that particular method and iteration the method had been called.
Thread Looping Class
The main code for the execution of the Loop execution is located in the project called LoopExecution.Library
.
Before we look at the actual main part of the class that runs the looping test, I am going to cover first the event that is fired by each of the methods.
1. public delegate void MethodToExecute(object sender, LoopExecutionEventArgs e);
2. public event MethodToExecute ExecutionMethod;
3.
4. protected void OnExecuteMethod(object sender, LoopExecutionEventArgs e)
5. {
6. if (ExecutionMethod != null)
7. {
8. ExecutionMethod(sender, e);
9. }
10. }
This is a a pretty standard way of handling events in .NET so I won't go into any detail on this. The only interesting part is the Event args. This is defined as an inner class. The reason for this is that I wanted to make sure that it uses the same type defined in the Generic of the main class.
The Loop Execution Event Args class looks like:
1. public class LoopExecutionEventArgs : EventArgs
2. {
3. public String MethodName { get; set; }
4. public T ObjectToOperateOn { get; set; }
5. public int LoopExecutionNumber { get; set; }
6. }
On line 4, the type T
is defied in the main class. This is just a normal event class with the additional info of the method that is executing it, the current object form the loop and the current position in the loop.
The Main Thread Looping Class
The class signature for the main class is:
1. public class LoopExecution<t>
2. {
3. …
4. }
This allows us to set what type of object we will be interacting with by use of the generic type.
Now we get to the crunch. The main method that will run each of our loops and give us the times for each loop.
1. public void Execute()
2. {
3. NormalExecute();
4. ThreadExecute();
5. ThreadPoolExecute();
6. TaskParallelExecute();
7. }
So nothing really exciting here. All this method does is call the four methods for each type of loop. This is formatted in a clean code way so you can see at a glance what the purpose of the method is.
Normal Loop
The first part to examine is the Normal loop. This consists of two methods which are as below:
1. private void NormalExecute()
2. {
3. watch.Start();
4. NormalWhileLoop();
5. NormalLoopTime = watch.ElapsedMilliseconds;
6. watch.Reset();
7. }
8.
9. private void NormalWhileLoop()
10. {
11. for (int i = 0; i < this.CollectionToIterateOver.Count; i++)
12. {
13. LoopExecutionEventArgs e = new LoopExecutionEventArgs();
14. StackFrame stackFrame = new StackFrame();
15. e.MethodName = stackFrame.GetMethod().ToString();
16. e.LoopExecutionNumber = i;
17. e.ObjectToOperateOn = CollectionToIterateOver[i];
18.
19. OnExecuteMethod(this, e);
20. }
21. }
This should be fairly straight forward to understand. The first method used a field called watch
which is just a System.Diagnostics.Stopwatch
to mark the start and end of how long this loop takes to run.
The second method is just a stand loop that iterates over all the items in the list and using the current item, then creates a LoopExecutionEventArgs
with the current method and iteration number. All the loop then does is execute the event callback via the OnExecuteMethod
we saw above.
Thread Loop
The next part of the program is similar to the normal loop but instead of the event callback being executed on the same thread, a new thread is created and the event callback is run on that.
1. private void ThreadExecute()
2. {
3. watch.Start();
4. ThreadLoop();
5. ThreadLoopTime = watch.ElapsedMilliseconds;
6. watch.Reset();
7. }
8.
9. private void ThreadLoop()
10. {
11.
12. for (int i = 0; i < this.CollectionToIterateOver.Count; i++)
13. {
14. Thread t = new Thread(x =>
15. {
16. LoopExecutionEventArgs e = new LoopExecutionEventArgs();
17. StackFrame stackFrame = new StackFrame();
18. e.MethodName = stackFrame.GetMethod().ToString();
19. e.LoopExecutionNumber = (int)x;
20. e.ObjectToOperateOn = CollectionToIterateOver[(int)x];
21. OnExecuteMethod(this, e);
22. });
23. t.Start(i);
24. }
25. }
As you can see, the first method is the same as the normal loops first method with the only differences being setting it for the ThreadLoop
. The second method just uses an anonymous method which will call the event and then starts the thread.
Thread Pool
The next method makes use of a thread pool. The methods used for this are:
1. private void ThreadPoolExecute()
2. {
3. watch.Start();
4. ThreadPoolLoop();
5. ThreadPoolLoopTime = watch.ElapsedMilliseconds;
6. watch.Reset();
7. }
8.
9. private void ThreadPoolLoop()
10. {
11. int toProcess = this.CollectionToIterateOver.Count;
12. using (ManualResetEvent resetEvent = new ManualResetEvent(false))
13. {
14. for (int i = 0; i < this.CollectionToIterateOver.Count; i++)
15. {
16. ThreadPool.QueueUserWorkItem(new WaitCallback(x =>
17. {
18. LoopExecutionEventArgs e = new LoopExecutionEventArgs();
19. StackFrame stackFrame = new StackFrame();
20. e.MethodName = stackFrame.GetMethod().ToString();
21. e.LoopExecutionNumber = (int)x;
22. e.ObjectToOperateOn = CollectionToIterateOver[(int)x];
23. OnExecuteMethod(this, e);
24. if (Interlocked.Decrement(ref toProcess) == 0)
25. resetEvent.Set();
26.
27. }
28. ), i);
29. }
30. resetEvent.WaitOne();
31. }
32. }
The first method is the same as we have seen in each of the other cases. The second method gets more interesting. As we are using the standard .NET Thread pool, there is a potential race condition. The Thread pool starts each worker thread off as a background process. This means we need to enforce a stop condition on all the threads completing before the application carries on executing, else we will get an incorrect reading on the amount of time it takes to execute using this method.
This is done by using a ManualResetEvent
and forcing it to wait on line 30. This is set to run once all the threads have completed running by the set event on line 25. Line 25 is only called once toProcess
reaches zero. toProcess
is set to be the number of items in the list on line 11. This is one of the advantages of using an anonymous method in this situation as you don't need to pass in a ref to toProcess
as the anonymous method can make use of it.
Task Parallel Library
The final method uses the new Task Parallel Library from .NET 4.0.
The methods look like:
1. private void TaskParallelExecute()
2. {
3. watch.Start();
4. TaskParallelLibraryLoop();
5. TaskLoopTime = watch.ElapsedMilliseconds;
6. watch.Reset();
7. }
8.
9. private void TaskParallelLibraryLoop()
10. {
11. Parallel.For(0, this.CollectionToIterateOver.Count, i =>
12. {
13. LoopExecutionEventArgs e = new LoopExecutionEventArgs();
14. StackFrame stackFrame = new StackFrame();
15. e.MethodName = stackFrame.GetMethod().ToString();
16. e.LoopExecutionNumber = i;
17. e.ObjectToOperateOn = CollectionToIterateOver[i];
18. OnExecuteMethod(this, e);
19. })
20. ;
21. }
The first method is the same as the other three however it is with the second method that we use that may be unfamiliar to people. According to MSDN, the method signature for a Parrallel.For
is:
public static ParallelLoopResult For(
int fromInclusive,
int toExclusive,
Action<int> body
)
where the body is a delegate that is executed once per iteration.
The method passes in to the delegate an int
representing the current iteration that will be executed. In this case, the delegate is an anonymous method that is used to call the delegate that will do our work.
Matrices of Execution
All of the code used to run an application that outputted the results for the Matrices is located in LoopExecution.Application
. This is currently a console application which prompts for the number of times you wish to run the loop in incremental steps of one and the two current types of delegates used to test the application. Future version will hopefully have a GUI and be able to dynamically load up custom objects and methods to work on.
Computer Specification
For the sake of completeness, the test machine being used to run this application on is an Intel Core 2 Duo T600 @ 2.00GHz 2.00GHz with 3.00GB of RAM and Windows 7 Home Premium SP1 32 bit OS.
Now the class that will be used to perform the matrix has been explained, let's run it a few times and see what results we get.
Simple Test
The first test will be run using a method that only writes to the console. This method is shown below:
1. static void loopExecution_ExecutionMethodConsoleWriteLine
(object sender, LoopExecution<int>.LoopExecutionEventArgs e)
2. {
3. Console.WriteLine("{0} number {1}. Thread name : {2}",
e.MethodName, e.LoopExecutionNumber, Thread.CurrentThread.ManagedThreadId);
4. }
All this does is a simple print line.
If we run this for 128 times, we get the following results:
Total run time for Normal loop 2210ms
Total run time for Thread loop 53663ms
Total run time for Thread Pool 3255 ms
Total run time for Task Parallel Library 3250ms
The full results for this can be found in the ThreadReultsWriteLine.csv.
There are a few interesting points to notice from these results.
The first is that although it isn't shown in the results on some of the runs, the Task Parallel Library did run quicker than the Normal loop but over the course of the entire run, the Normal loop ran quicker.
The second and most obvious point is that it doesn’t always make sense to run a process on separate threads. The reason is that the overhead of starting a new thread is greater than the time it takes to actually execute the delegate.
Simulating Work Test
So as to have an example of a delegate that stimulates actually doing some work that can take a variety of time, the following delegate was used to run the test 128 times again.
1. static void loopExecution_ExecutionMethodSleep
(object sender, LoopExecution<int>.LoopExecutionEventArgs e)
2. {
3.
4. if (e.LoopExecutionNumber % 11 == 0)
5. {
6. Thread.Sleep((int)(e.LoopExecutionNumber * 0.149) + 1);
7. }
8. else if (e.LoopExecutionNumber % 23 == 0)
9. {
10. Thread.Sleep((int)(e.LoopExecutionNumber * 0.344));
11. }
12. Console.WriteLine("{0} number {1}. Thread name : {2}",
e.MethodName, e.LoopExecutionNumber, Thread.CurrentThread.ManagedThreadId);
13. }
The above method just sends the current thread of execution to sleep depending on the current iteration. These numbers were picked at random however I did them at such intervals and with the multiplication factor in order that the test doesn’t take too long to run. Also when running a live application, the time it takes to complete each task for an object could be different depending on whether it was accessing a network share, database, internet, local disk or memory object.
If we run this 128 times, we get the following results:
Total run time for Normal loop 11838ms
Total run time for Thread loop 63148ms
Total run time for Thread Pool 6150ms
Total run time for Task Parallel Library 5110ms
The full results for this test can be found in ThreadReultsSleep.csv.
The most interesting statistic to notice here is that in this case, the two methods used pre created threads or task was the quickest. This was because off the obvious reason that when we sent the thread to sleep, other threads will be operated on whereas for the normal loop as it is only one thread, it has to wait for the sleep to complete before it can move onto the next iteration.
The other statistic of note if that there isn't much difference between the Thread Pool and the Task Parallel Library, although that will grow if more threads are used. However apart from the fact that the Task Parallel Library was quicker if you re examine the code the Task Parallel Library is much easier to read, understand and use. The other benefit of using this approach is you don't need to worry about all threads completing before you move on to the next stage of the application. So from a maintenance point of view, Task Parallel Library wins.
Conclusion
So in conclusion, for a real world program where the operation being performed is a quick operation, it doesn’t make sense to go straight for a multi threaded option.
If however the operation being performed is going to be processes intensive or depends on other external factors that mean the thread will be in a blocked state, then it is worth using a multi threaded application.
History
- 8th March, 2011: Initial post