Preface
The trend towards going parallel means that .NET Framework developers should learn about the Task Parallel Library (TPL). In general terms, data parallelism uses the input data of an operation as the means of partitioning the work into smaller pieces. The data is divided among the available hardware processors in order to achieve parallelism, and the same independent operation is then applied concurrently to the elements of each partition.
Task parallelism relies on the fact that a program is already decomposed into individual parts (statements, methods, and so on) that can be run in parallel. More to the point, task parallelism views a problem as a stream of instructions that can be broken into sequences, called tasks, that can execute simultaneously. For the computation to be efficient, the operations that make up a task should be largely independent of the operations taking place inside other tasks. The data-decomposition view focuses instead on the data required by the tasks and how it can be decomposed into distinct chunks. The computation associated with the data chunks will only be efficient if the chunks can be operated upon relatively independently. While these two views are obviously interdependent when deciding to go parallel, they are best learned separately. A powerful reference about tasks and compute-bound asynchronous operations is Jeffrey Richter's book, "CLR via C#, 3rd Edition". It is a good read.
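To make the two views concrete, here is a minimal sketch. It assumes Parallel.For for the data-parallel half, which is part of the TPL but is not otherwise covered in this article, and the class and method names (ParallelismSketch, SquareAll, MinAndMax) are hypothetical names chosen for illustration.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelismSketch {
    // Data parallelism: the same operation (squaring) is applied to every
    // element, and the runtime partitions the index range across threads.
    // Each iteration writes to its own slot, so no locking is needed.
    public static Int32[] SquareAll(Int32[] input) {
        var output = new Int32[input.Length];
        Parallel.For(0, input.Length, i => output[i] = input[i] * input[i]);
        return output;
    }

    // Task parallelism: two different, independent operations run as
    // separate tasks over the same (read-only) input.
    public static Tuple<Int32, Int32> MinAndMax(Int32[] input) {
        var minTask = new Task<Int32>(() => input.Min());
        var maxTask = new Task<Int32>(() => input.Max());
        minTask.Start();
        maxTask.Start();
        // Reading Result blocks until the corresponding task has finished.
        return Tuple.Create(minTask.Result, maxTask.Result);
    }

    public static void Main() {
        Int32[] squares = SquareAll(new[] { 1, 2, 3, 4 });
        Console.WriteLine(String.Join(", ", squares));   // 1, 4, 9, 16

        Tuple<Int32, Int32> minMax = MinAndMax(new[] { 7, 3, 9, 1 });
        Console.WriteLine("Min: " + minMax.Item1 + ", Max: " + minMax.Item2);
    }
}
```

In both halves the work items are independent of one another, which is the condition the paragraphs above describe for the computation to be efficient.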
In this brief article, we will focus on some of the characteristics of the System.Threading.Tasks.Task object. To perform a simple task, create a new instance of the Task class, passing in as a constructor argument a System.Action delegate that represents the workload you want performed. You can explicitly create the Action delegate so that it refers to a named method, use an anonymous method, or use a lambda expression. Once you have created an instance of Task, call the Start() method; your Task is then passed to the task scheduler, which is responsible for assigning threads to perform the work. Here is an example:
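using System;
using System.Threading.Tasks;
public class Program {
    public static void Main() {
        // Four equivalent ways to supply the task body:
        // 1. an explicit Action delegate that refers to a named method
        Task task1 = new Task(new Action(printMessage));
        // 2. an anonymous method
        Task task2 = new Task(delegate { printMessage(); });
        // 3. a lambda expression
        Task task3 = new Task(() => printMessage());
        // 4. a lambda with a statement body
        Task task4 = new Task(() => { printMessage(); });
        // Start() hands each task to the task scheduler.
        task1.Start();
        task2.Start();
        task3.Start();
        task4.Start();
        Console.WriteLine("Main method complete. Press <enter> to finish.");
        // Keeps the process alive so the tasks have time to run.
        Console.ReadLine();
    }
    private static void printMessage() {
        Console.WriteLine("Hello, world!");
    }
}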
To get a result from a task, create an instance of Task<T>, where T is the data type of the result that will be produced, and return an instance of that type in your Task body. To read the result, you read the Result property of the Task you have created. For example, let's say that we have a method called Sum. We can construct a Task<TResult> object, passing the operation's return data type as the generic TResult argument:
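using System;
using System.Threading.Tasks;
public class Program {
    private static Int32 Sum(Int32 n)
    {
        Int32 sum = 0;
        for (; n > 0; n--)
            checked { sum += n; }   // throws OverflowException if the sum overflows
        return sum;
    }
    public static void Main() {
        // The second constructor argument (1000) is passed to the lambda as its state object.
        Task<Int32> t = new Task<Int32>(n => Sum((Int32)n), 1000);
        t.Start();
        // Wait() is shown for clarity; reading Result also blocks until the task completes.
        t.Wait();
        Console.WriteLine("The sum is: " + t.Result);
    }
}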
Produces:
The sum is: 500500
If the compute-bound operation throws an unhandled exception, the exception will be swallowed and stored in a collection, and the thread pool thread is allowed to return to the thread pool. When the Wait method or the Result property is invoked, these members will throw a System.AggregateException object.
You can use a CancellationTokenSource to cancel a Task. We must rewrite our Sum method so that it accepts a CancellationToken, after which we can write the code that creates a CancellationTokenSource object:
using System;
using System.Threading;
using System.Threading.Tasks;
public class Program {
    private static Int32 Sum(CancellationToken ct, Int32 n) {
        Int32 sum = 0;
        for (; n > 0; n--) {
            // Throws OperationCanceledException once cancellation has been requested.
            ct.ThrowIfCancellationRequested();
            checked { sum += n; }
        }
        return sum;
    }
    public static void Main() {
        CancellationTokenSource cts = new CancellationTokenSource();
        // Passing cts.Token to the constructor as well associates the token with
        // the task, so a cancellation is reported as Canceled rather than Faulted.
        Task<Int32> t = new Task<Int32>(() => Sum(cts.Token, 1000), cts.Token);
        t.Start();
        cts.Cancel();
        try {
            Console.WriteLine("The sum is: " + t.Result);
        }
        catch (AggregateException ae) {
            // Mark cancellation exceptions as handled; anything else is rethrown.
            ae.Handle(e => e is OperationCanceledException);
            Console.WriteLine("Sum was canceled");
        }
    }
}
Because Cancel is called right after Start, this typically outputs that the task was canceled:
Sum was canceled
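The exception behavior described before the cancellation example can be sketched in the same way. The listing below is a minimal illustration rather than code from the referenced book: it reuses the Sum method with an argument large enough to overflow an Int32, and the class name FaultedTaskSketch and the helper RunSum are hypothetical names introduced here.

```csharp
using System;
using System.Threading.Tasks;

public static class FaultedTaskSketch {
    public static Int32 Sum(Int32 n) {
        Int32 sum = 0;
        for (; n > 0; n--)
            checked { sum += n; }   // throws OverflowException on overflow
        return sum;
    }

    // Runs Sum on a task and reports either the result or the failure.
    public static String RunSum(Int32 n) {
        Task<Int32> t = new Task<Int32>(state => Sum((Int32)state), n);
        t.Start();
        try {
            // Result blocks until the task ends and rethrows any stored
            // exception wrapped in an AggregateException.
            return "The sum is: " + t.Result;
        }
        catch (AggregateException ae) {
            // The original exception is available through InnerExceptions.
            return "Task failed: " + ae.InnerExceptions[0].GetType().Name;
        }
    }

    public static void Main() {
        Console.WriteLine(RunSum(1000));     // The sum is: 500500
        Console.WriteLine(RunSum(100000));   // Task failed: OverflowException
    }
}
```

The sum of 1 through 100000 is 5,000,050,000, which exceeds Int32.MaxValue, so the checked block throws inside the task and the exception only surfaces when Result is read.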
There is a better way to find out when a task has completed running. When a task completes, it can start another task. In the code below, when the task executing Sum completes, it will start another task (also on some thread pool thread) that displays the result. The call to ContinueWith does not block waiting for either of these two tasks to complete; the calling thread is allowed to execute other code or, if it is a thread pool thread itself, it can return to the pool to perform other operations.
using System;
using System.Threading.Tasks;
public class Program {
    private static Int32 Sum(Int32 n)
    {
        Int32 sum = 0;
        for (; n > 0; n--)
            checked { sum += n; }
        return sum;
    }
    public static void Main() {
        Task<Int32> t = new Task<Int32>(n => Sum((Int32)n), 1000);
        t.Start();
        // ContinueWith returns immediately; the continuation runs after t completes.
        Task cwt = t.ContinueWith(task => Console.WriteLine(
            "The sum is: " + task.Result));
        // Waiting here only keeps the console program alive until the result is printed.
        cwt.Wait();
    }
}
Produces a similar result:
The sum is: 500500
If the task executing Sum completes before ContinueWith is called, this is not a problem: the ContinueWith method will see that the Sum task is complete and will immediately start the task that displays the result. Tasks also, by the way, support parent/child relationships. Examine the code below:
using System;
using System.Threading;
using System.Threading.Tasks;
public class Program {
    private static Int32 Sum(Int32 n)
    {
        Int32 sum = 0;
        for (; n > 0; n--)
            checked { sum += n; }
        return sum;
    }
    public static void Main() {
        Task<Int32[]> parent = new Task<Int32[]>(() => {
            var results = new Int32[3];
            // Each child is attached to the parent and fills in its own slot.
            new Task(() => results[0] = Sum(100),
                TaskCreationOptions.AttachedToParent).Start();
            new Task(() => results[1] = Sum(200),
                TaskCreationOptions.AttachedToParent).Start();
            new Task(() => results[2] = Sum(300),
                TaskCreationOptions.AttachedToParent).Start();
            return results;
        });
        // Runs once the parent, including its attached children, has completed.
        var cwt = parent.ContinueWith(parentTask =>
            Array.ForEach(parentTask.Result, Console.WriteLine));
        parent.Start();
        cwt.Wait();
    }
}
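Because the children are created with TaskCreationOptions.AttachedToParent, the parent task is not considered complete until all three children have completed, so the continuation sees every result. This produces the parent/child task results: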
5050
20100
45150
Internally, Task objects contain a collection of ContinueWith tasks, meaning you can call ContinueWith several times on a single Task object. When the task completes, all the ContinueWith tasks will be queued to the thread pool. It is important to note that tasks do not replace threads: they run on threads.
Recall that when the CLR initializes, the thread pool has no threads in it. Internally, the thread pool maintains a queue of operation requests. When your application wants to perform an asynchronous operation, you call some method that appends an entry to the thread pool's queue. The thread pool's code will extract entries from this queue and dispatch each entry to a thread pool thread. If there are no threads in the thread pool, a new thread will be created. This is why, when writing managed applications, you needn't create threads by hand. The system manages one pool per process, and therefore the ThreadPool class offers only static methods. To schedule a work item for execution, you call the QueueUserWorkItem method, passing in a WaitCallback delegate. But recall again that we can avoid the limitations of calling ThreadPool's QueueUserWorkItem (it offers no built-in way to find out when the operation has completed or to get a return value back) by creating a Task object.
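The contrast between QueueUserWorkItem and Task, and the ability to register several continuations on one Task, can be sketched as follows. This is an illustrative example under stated assumptions, not code from the referenced book: the class name ThreadPoolVersusTaskSketch and its helper methods are hypothetical, and ManualResetEventSlim is just one of several ways to signal completion by hand.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadPoolVersusTaskSketch {
    public static Int32 Sum(Int32 n) {
        Int32 sum = 0;
        for (; n > 0; n--)
            checked { sum += n; }
        return sum;
    }

    // QueueUserWorkItem cannot return a value or report completion, so the
    // result is stored in a captured variable and an event is set by hand.
    public static Int32 SumWithThreadPool(Int32 n) {
        Int32 result = 0;
        using (var done = new ManualResetEventSlim(false)) {
            ThreadPool.QueueUserWorkItem(state => {
                result = Sum((Int32)state);
                done.Set();
            }, n);
            done.Wait();   // block until the work item signals completion
        }
        return result;
    }

    // With a Task, completion and the return value are built in.
    public static Int32 SumWithTask(Int32 n) {
        var t = new Task<Int32>(state => Sum((Int32)state), n);
        t.Start();
        return t.Result;
    }

    // Two continuations registered on the same task; both run after it completes.
    public static String[] RunTwoContinuations(Int32 n) {
        var messages = new String[2];
        var t = new Task<Int32>(state => Sum((Int32)state), n);
        Task first = t.ContinueWith(task => { messages[0] = "Sum: " + task.Result; });
        Task second = t.ContinueWith(task => { messages[1] = "Double: " + (task.Result * 2); });
        t.Start();
        Task.WaitAll(first, second);
        return messages;
    }

    public static void Main() {
        Console.WriteLine(SumWithThreadPool(100));   // 5050
        Console.WriteLine(SumWithTask(100));         // 5050
        foreach (String message in RunTwoContinuations(100))
            Console.WriteLine(message);              // Sum: 5050, then Double: 10100
    }
}
```

The thread pool version needs a captured variable and an event to do what the Task version gets from Result alone. Each continuation writes to its own array slot, so the order in which the two continuations happen to run does not affect the output.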
References
- Jeffrey Richter, CLR via C#, 3rd Edition.