In my previous article, I mentioned a new feature of C# 5.0: the async and await keywords. They are syntactic sugar that simplifies writing asynchronous code. When the C# compiler sees an await expression, it generates code that starts the awaited operation and, if the operation has not already completed, immediately returns control to the caller so that the caller can continue executing without blocking. After the asynchronous operation finishes, execution resumes at the code following the await expression and continues sequentially until an exit point is reached (for example, the end of the method or another await expression). I emphasize that await is only syntactic sugar: the compiler generates the equivalent code so that you do not have to write it by hand. Before you can understand what C# 5.0 does for you with the async and await keywords, you should first understand the asynchronous programming models (APM) that the Microsoft .NET Framework provides.
In the .NET Framework, there are many ways to implement an asynchronous operation: a dedicated thread, the thread pool, BeginXxx and EndXxx methods, the event-based APM, or the task-based APM. The first way, creating a thread yourself, is not recommended because creating a thread is expensive* and requires a lot of manual control to work well, so I will skip it. The second way, the thread pool, is the easiest and most commonly used. The BeginXxx and EndXxx methods declared on certain types provide the standard way to perform an asynchronous operation. The event-based asynchronous programming model is less popular than the BeginXxx and EndXxx methods; the .NET Framework provides only a small set of types that support it. The last one, the task-based APM, was introduced in .NET Framework 4 as part of the Task Parallel Library (TPL); it dispatches asynchronous operations through a task scheduler and offers many features that extend task parallelism. The default task scheduler is built on the thread pool, the .NET Framework also provides a task scheduler built on synchronization contexts, and you can implement your own task scheduler and use it with tasks.
* A new thread is costly: by default about 1 MB is reserved for its user-mode stack, and Windows also creates additional data structures for it, such as a Thread Environment Block (TEB) and a kernel-mode stack. More threads also mean more context switching, which hurts performance further. So avoid creating additional threads as much as possible.
In this article, I will go through the different ways to perform asynchronous operations and show examples that guide you in using each of them.
The Thread Pool APM
When you want to perform an asynchronous operation, the thread pool is an easy way to do it: call the static QueueUserWorkItem method of System.Threading.ThreadPool, passing a WaitCallback delegate and, optionally, an Object that is handed to the WaitCallback as its state argument. The following example shows how to use the thread pool to queue asynchronous operations.
using System;
using System.Threading;

namespace ApmDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            WaitCallback writeCallback = state => Console.WriteLine(state);
            ThreadPool.QueueUserWorkItem(writeCallback, "This is the first line");
            ThreadPool.QueueUserWorkItem(writeCallback, "This is the second line");
            ThreadPool.QueueUserWorkItem(writeCallback, "This is the third line");
            Console.ReadKey();
        }
    }
}
In the above example, I created a WaitCallback delegate from a lambda expression, then called the static ThreadPool.QueueUserWorkItem method, passing the delegate as the first argument and a string as the second. When this method is called, the thread pool queues the work item; a free thread in the pool picks it up and executes the delegate at some later time. If there is no free thread in the pool, the thread pool may create a new thread to execute it. I queued three work items to the thread pool by calling QueueUserWorkItem three times.
When I try to run this program, I may get the following output:
This is the first line
This is the second line
This is the third line
But sometimes, I also get the following output:
This is the second line
This is the first line
This is the third line
Please note that the execution order of the queued work items is unpredictable, because there is no way to know when a thread in the thread pool will be scheduled to execute the code. As shown above, the work items may complete in the order they were queued, but they may just as well complete in a different order. Therefore, do not write asynchronous code that relies on the execution order.
I highly recommend that you use the thread pool APM as much as possible. Here are some reasons:
- The thread pool is managed automatically by the CLR. When you queue a work item to the thread pool, you do not need to care which thread it will run on or when it will be executed; the CLR handles everything for you. This pattern enables you to write easy-to-read, straightforward and less buggy code.
- The thread pool manages threads wisely. An asynchronous operation needs another thread so that it can run without blocking the current one, but creating a thread is expensive, and creating one for every work item would be a waste of resources. The thread pool instead keeps a set of threads. When a work item is queued, the thread pool adds it to a global work item queue; a free pool thread then picks it up and executes it, and if no thread is free, the thread pool may create a new one. The thread pool tries to use as few threads as possible to serve all queued work items. Hence, by using the thread pool, the CLR uses fewer system resources and schedules asynchronous operations effectively and efficiently.
- The thread pool has better performance. It is designed to make full use of the available (or configured) CPU resources to serve work items. On a multi-core machine, the thread pool by default keeps a minimum number of worker threads equal to the number of logical processors, and when scheduling work items it balances the threads so that every logical CPU core can be used to serve them. This gives flexibility in dispatching CPU resources and helps to improve overall system performance.
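To see the numbers behind the last point on your own machine, a small sketch like the following can help. The ThreadPoolInfo class and its method names are hypothetical helpers written for this illustration; ThreadPool.GetMinThreads, ThreadPool.GetMaxThreads and Environment.ProcessorCount are the framework members it relies on. The values printed depend on the machine and runtime, so no particular output is assumed.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration; not part of the original article.
static class ThreadPoolInfo
{
    // Minimum number of worker threads the pool creates on demand without throttling.
    public static int GetMinWorkerThreads()
    {
        int worker, io;
        ThreadPool.GetMinThreads(out worker, out io);
        return worker;
    }

    // Upper bound on the number of worker threads the pool will ever create.
    public static int GetMaxWorkerThreads()
    {
        int worker, io;
        ThreadPool.GetMaxThreads(out worker, out io);
        return worker;
    }

    static void Main()
    {
        Console.WriteLine("Logical processors: {0}", Environment.ProcessorCount);
        Console.WriteLine("Min worker threads: {0}", GetMinWorkerThreads());
        Console.WriteLine("Max worker threads: {0}", GetMaxWorkerThreads());
    }
}
```

If the defaults do not suit a workload, ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads can change them, although the defaults are usually a sensible starting point.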
Although the thread pool has many benefits, it also has limits:
- The thread pool queues a work item and executes it at an uncertain time, and it gives the calling code no notification when the work item completes, so it is difficult to write continuation code that runs after the work item. Some operations in particular, such as reading a number of bytes from a file stream, need a completion notification: only then can the calling code determine how many bytes were read and use them for further work.
- ThreadPool’s QueueUserWorkItem method only takes a delegate that receives one parameter. If your code needs more than one parameter, you cannot pass them all directly to this method; instead, you have to create an additional data structure that wraps the parameters and pass an instance of that wrapper type. This reduces the readability and maintainability of your code.
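Both limits can be worked around by hand, as the following minimal sketch illustrates. It is only one possible approach under my own assumptions: the WorkItemDemo class and AddPairs method are hypothetical names, a Tuple is used as the wrapper for several parameters, and a CountdownEvent supplies the completion notification that the thread pool itself does not provide.

```csharp
using System;
using System.Threading;

// Hypothetical example class; not part of the original article.
static class WorkItemDemo
{
    // Queues one work item per pair of inputs and blocks until all of them have finished.
    public static int[] AddPairs(int[] left, int[] right)
    {
        int[] sums = new int[left.Length];
        using (CountdownEvent done = new CountdownEvent(left.Length))
        {
            for (int i = 0; i < left.Length; i++)
            {
                // Wrap everything the callback needs into a single state object.
                Tuple<int, int, int> state = Tuple.Create(i, left[i], right[i]);
                ThreadPool.QueueUserWorkItem(s =>
                {
                    Tuple<int, int, int> args = (Tuple<int, int, int>)s;
                    sums[args.Item1] = args.Item2 + args.Item3;
                    done.Signal(); // manual completion notification
                }, state);
            }
            done.Wait(); // returns once every work item has signaled
        }
        return sums;
    }

    static void Main()
    {
        int[] sums = AddPairs(new[] { 1, 2, 3 }, new[] { 10, 20, 30 });
        Console.WriteLine(string.Join(", ", sums)); // prints "11, 22, 33"
    }
}
```

The extra wrapper type, the cast inside the callback and the manual signaling are exactly the kind of plumbing that the patterns in the following sections take care of for you.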
To solve these problems, you may use the following standard way to perform asynchronous operations.
The Standard APM
The Framework Class Library (FCL) ships various types that have BeginXxx and EndXxx methods, which are designed to perform asynchronous operations. For example, the System.IO.FileStream type defines Read, BeginRead and EndRead methods. Read is a synchronous method: it reads a number of bytes from a file stream and does not return until the read operation is completed. BeginRead and EndRead form a pair. When BeginRead is called, the CLR queues the operation to the hardware device (in this case, the hard disk) and immediately returns control, so the next line of code continues to execute. When the device completes the asynchronous read, it notifies the Windows kernel, and the kernel in turn notifies the CLR, which executes the delegate that was passed to BeginRead. In this delegate, the code must call EndRead to obtain the number of bytes read into the buffer; it can then access the bytes read from the file stream.
Here is how the BeginRead, EndRead and Read methods are declared.
public override IAsyncResult BeginRead(byte[] array, int offset,
int numBytes, AsyncCallback userCallback, object stateObject);
public override int EndRead(IAsyncResult asyncResult);
public override int Read(byte[] array, int offset, int count);
Usually, a BeginXxx method has the same parameters as the corresponding Xxx method plus two additional parameters: userCallback and stateObject. The userCallback parameter is of type AsyncCallback, a delegate that takes one parameter of type IAsyncResult, which carries additional information about the asynchronous operation. The stateObject parameter is the instance that you want to pass to the userCallback delegate; the delegate can access it through the AsyncState property of its asyncResult argument.
The following code demonstrates how to use BeginXxx and EndXxx methods to perform asynchronous operations.
using System;
using System.IO;
using System.Text;
using System.Threading;

namespace ApmDemo
{
    internal class Program
    {
        private const string FilePath = @"c:\demo.dat";

        private static void Main(string[] args)
        {
            Program.TestWrite();
            // Keep the process alive while the asynchronous callbacks run.
            Thread.Sleep(60000);
        }

        private static void TestWrite()
        {
            FileStream fs = new FileStream(Program.FilePath, FileMode.OpenOrCreate,
                FileAccess.Write, FileShare.None, 8, FileOptions.Asynchronous);
            string content = "A quick brown fox jumps over the lazy dog";
            byte[] data = Encoding.Unicode.GetBytes(content);
            Console.WriteLine("Begin to write");
            // Pass the stream as the state object so the callback can call EndWrite on it.
            fs.BeginWrite(data, 0, data.Length, Program.OnWriteCompleted, fs);
            Console.WriteLine("Write queued");
        }

        private static void OnWriteCompleted(IAsyncResult asyncResult)
        {
            FileStream fs = (FileStream)asyncResult.AsyncState;
            fs.EndWrite(asyncResult);
            fs.Close();
            Console.WriteLine("Write completed");
            Program.TestRead();
        }

        private static void TestRead()
        {
            FileStream fs = new FileStream(Program.FilePath, FileMode.OpenOrCreate,
                FileAccess.Read, FileShare.None, 8, FileOptions.Asynchronous);
            byte[] data = new byte[1024];
            Console.WriteLine("Begin to read");
            // The callback needs both the stream and the buffer, so wrap them in one state object.
            fs.BeginRead(data, 0, data.Length, Program.OnReadCompleted,
                new { Stream = fs, Data = data });
            Console.WriteLine("Read queued");
        }

        private static void OnReadCompleted(IAsyncResult asyncResult)
        {
            dynamic state = asyncResult.AsyncState;
            int bytesRead = state.Stream.EndRead(asyncResult);
            byte[] data = state.Data;
            string content = Encoding.Unicode.GetString(data, 0, bytesRead);
            Console.WriteLine("Read completed. Content is: {0}", content);
            state.Stream.Close();
            Console.ReadKey();
        }
    }
}
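This program tests asynchronous read/write operations on a file stream by using the BeginRead, EndRead, BeginWrite and EndWrite methods defined on the System.IO.FileStream class. When I run this program against a file that does not exist yet, I get output like the following (the relative order of the "queued" and "completed" lines can vary between runs):
Begin to write
Write queued
Write completed
Begin to read
Read queued
Read completed. Content is: A quick brown fox jumps over the lazy dog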
Now you know how to use the standard way to perform an asynchronous operation by calling BeginXxx and EndXxx methods. In fact, this pattern supports more than I demonstrated here, such as polling the returned IAsyncResult or waiting on its AsyncWaitHandle instead of supplying a callback. By using this pattern, you can solve some of the problems I listed for the thread pool APM, and you also gain additional benefits, which I summarize below.
- Supports continuation. When an asynchronous operation is completed, the userCallback delegate is invoked, so the calling code can perform other operations based on the result of this asynchronous operation.
- Supports I/O based asynchronous operations. The standard APM works with kernel objects to perform I/O based asynchronous operations. When an I/O based asynchronous operation is requested by calling the BeginXxx method, the CLR does not dedicate a thread pool thread to the task; instead, it relies on the Windows kernel to be notified by the hardware I/O device (through its driver software) when the task is finished. Because only the device driver and kernel objects are involved while the operation is in flight, no managed thread is tied up waiting, which improves system performance by saving CPU time slices and threads.
- Cancellation is limited. The BeginXxx and EndXxx pattern itself defines no standard way to cancel an operation in flight; some types offer their own mechanism, such as an Abort method. General-purpose cancellation through System.Threading.CancellationTokenSource and its Cancel method belongs to the task-based model; I will introduce this class in later articles.
However, the standard APM makes your code more complicated, because all the continuation logic runs outside the calling context. For example, in the file stream read/write example above, OnReadCompleted and OnWriteCompleted are separate methods that are invoked on a different thread from the calling thread. This behavior can confuse developers and makes the code logic harder to follow.
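One way to soften this, sketched here under my own assumptions, is to pass a lambda expression as the callback so that the continuation sits next to the BeginRead call and captures local variables instead of unpacking a state object. The InlineCallbackDemo class and RoundTrip method are hypothetical names; the sketch blocks on a ManualResetEventSlim only so that it can return the result, and it assumes the content fits into the 1024-byte buffer and is returned by a single read.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical example class; not part of the original article.
static class InlineCallbackDemo
{
    // Writes content to the file, then reads it back with BeginRead/EndRead and a lambda callback.
    public static string RoundTrip(string path, string content)
    {
        File.WriteAllBytes(path, Encoding.Unicode.GetBytes(content));

        string result = null;
        using (ManualResetEventSlim done = new ManualResetEventSlim(false))
        {
            FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                FileShare.Read, 4096, FileOptions.Asynchronous);
            byte[] buffer = new byte[1024];

            // The lambda captures fs, buffer, result and done, so no state object is needed.
            fs.BeginRead(buffer, 0, buffer.Length, asyncResult =>
            {
                try
                {
                    int bytesRead = fs.EndRead(asyncResult);
                    result = Encoding.Unicode.GetString(buffer, 0, bytesRead);
                }
                finally
                {
                    fs.Close();
                    done.Set();
                }
            }, null);

            done.Wait(); // demo only: block until the callback has run
        }
        return result;
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        Console.WriteLine(RoundTrip(path, "A quick brown fox")); // prints "A quick brown fox"
        File.Delete(path);
    }
}
```

The callback still runs on a different thread, so the underlying model is unchanged; the lambda only keeps the related code in one place.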
Note: async methods and await expressions bring a clear, logical and organized code structure to asynchronous programming.
The Event-based APM
The Framework Class Library (FCL) also ships some types that support the event-based APM. For example, the System.Net.WebClient class defines a DownloadDataAsync method and a DownloadDataCompleted event. Calling DownloadDataAsync begins an asynchronous download of the data at the specified URL; when the download completes, the DownloadDataCompleted event is raised, and its argument e, of type System.Net.DownloadDataCompletedEventArgs, contains the result and additional information about the operation. The following code demonstrates how to use the event-based APM to perform an asynchronous operation.
using System;
using System.Net;
using System.Text;

namespace ApmDemo
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            WebClient wc = new WebClient();
            // Subscribe before starting the download so the completion event cannot be missed.
            wc.DownloadDataCompleted += (s, e) => Console.WriteLine
                (Encoding.UTF8.GetString(e.Result));
            wc.DownloadDataAsync(new Uri("http://www.markzhou.com"));
            Console.ReadKey();
        }
    }
}
In effect, this works the same way as the BeginXxx and EndXxx methods. The difference is that the event-based APM is closer to the object model layer: you can even drag the component onto a design surface and set the event handler through the property window. The standard APM, by contrast, provides no events to subscribe to, which helps performance because implementing events may require additional system resources.
Only a small set of types in the FCL support the event-based APM. Personally, I suggest avoiding this pattern where possible. It may suit application developers, because they are component consumers rather than component designers; designer support is not mandatory for component designers (library developers).
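If you do consume an event-based component, the completion handler should check the Error property of its event arguments before reading Result, because reading Result after a failure throws. The following sketch illustrates this with System.ComponentModel.BackgroundWorker instead of WebClient, purely so that it runs without network access; the EventBasedDemo class and its Run method are hypothetical names, and the sketch assumes a console application with no synchronization context, where the completion event is raised on a thread pool thread.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical example class; BackgroundWorker stands in for WebClient here.
static class EventBasedDemo
{
    // Squares the input on a background thread and reports the outcome as text.
    public static string Run(int input)
    {
        string summary = null;
        using (ManualResetEventSlim done = new ManualResetEventSlim(false))
        using (BackgroundWorker worker = new BackgroundWorker())
        {
            worker.DoWork += (s, e) =>
            {
                int n = (int)e.Argument;
                if (n < 0) throw new ArgumentOutOfRangeException("n");
                e.Result = n * n;
            };
            worker.RunWorkerCompleted += (s, e) =>
            {
                // Check Error first; accessing e.Result after a failure would throw.
                summary = e.Error != null
                    ? "failed: " + e.Error.GetType().Name
                    : "result: " + e.Result;
                done.Set();
            };
            worker.RunWorkerAsync(input); // handlers are attached before the operation starts
            done.Wait();                  // demo only: block until the completion event has fired
        }
        return summary;
    }

    static void Main()
    {
        Console.WriteLine(Run(7));  // prints "result: 49"
        Console.WriteLine(Run(-1)); // prints "failed: ArgumentOutOfRangeException"
    }
}
```

The same shape applies to WebClient: DownloadDataCompletedEventArgs also exposes Error and Cancelled, which are worth checking before e.Result is used.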
The Task-based APM
Microsoft .NET Framework 4.0 introduces the Task Parallel Library (TPL) for parallel computing and asynchronous programming. Its central type is the Task class, defined in the System.Threading.Tasks namespace, which represents a work item to complete. To use the task-based APM, you create a new instance of the Task or Task<T> class, passing an Action delegate (for Task) or a Func<T> delegate (for Task<T>) as the first constructor argument, and then call the instance method Start, which asks the task scheduler to schedule the task as soon as possible.
The following code shows how to use task based APM to perform a compute-bound asynchronous operation.
using System;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            Task t = new Task(() =>
            {
                Console.WriteLine("This test is output asynchronously");
            });
            t.Start();
            Console.WriteLine("Task started");
            Task.WaitAll(t);
        }
    }
}
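If I run this program, I get output like the following (the last two lines may appear in either order, because the task runs on another thread):
Task based APM demo
Task started
This test is output asynchronously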
Alternatively, if the task delegate returns a value, you can use Task<T> instead of Task; after the task is complete, you can query the result through the Result property of Task<T>. The following code shows how to use Task<T> to calculate 2 to the power of n (n must be positive).
using System;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            Func<int, int> calc = (n) => { return 2 << (n - 1); };
            Task<int> t = new Task<int>(() =>
            {
                return calc(10);
            });
            t.Start();
            Console.WriteLine("Task started");
            Task.WaitAll(t);
            Console.WriteLine(t.Result);
        }
    }
}
When I run this program, I get the following output:
Task based APM demo
Task started
1024
The static Task.WaitAll method synchronously waits for all the tasks passed in its parameter array, meaning that the current thread is blocked until all of them are complete. If you do not want to block the current thread and you want to do something after a certain task completes, you can use the instance method ContinueWith. The following code shows how to attach a continuation task:
using System;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            Func<int, int> calc = (n) => { return 2 << (n - 1); };
            Task<int> t = new Task<int>(() =>
            {
                return calc(10);
            });
            t.ContinueWith(task => { Console.WriteLine(task.Result); return task.Result; });
            t.Start();
            Console.WriteLine("Task started");
            Console.ReadKey();
        }
    }
}
The task-based APM has many features; I list some of the important ones below:
- You can specify TaskCreationOptions when creating a task to indicate how the task scheduler should schedule it.
- You can specify a CancellationToken (obtained from a CancellationTokenSource) when creating a task, which associates the token with the task so that it can be canceled.
- You can use the ContinueWith or ContinueWith<TResult> method to run continuation tasks.
- You can synchronously wait for all specified tasks to complete by calling the static Task.WaitAll method, or for any one of them by calling the static Task.WaitAny method.
- If you want to create a number of tasks with the same creation/continuation settings, you can use a TaskFactory instance and its StartNew method.
- The task-based APM requires a task scheduler to work. The default task scheduler is implemented on top of the thread pool; however, you can associate a task with a synchronization context task scheduler or a custom task scheduler instead.
- You can easily convert a BeginXxx and EndXxx asynchronous operation into a task by calling the TaskFactory instance method FromAsync or FromAsync<TResult>.
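As an illustration of the last point, the following sketch wraps FileStream’s BeginRead and EndRead pair in a Task<int> with FromAsync and then decodes the bytes in a continuation. The FromAsyncDemo class and ReadTextAsync method are hypothetical names, and the sketch assumes that the file content fits into the 1024-byte buffer and is returned by a single read.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical example class; not part of the original article.
static class FromAsyncDemo
{
    // Reads a small UTF-16 encoded file and returns its text as a task.
    public static Task<string> ReadTextAsync(string path)
    {
        FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read,
            FileShare.Read, 4096, FileOptions.Asynchronous);
        byte[] buffer = new byte[1024];

        // FromAsync pairs BeginRead with EndRead and exposes the byte count as a Task<int>.
        Task<int> read = Task<int>.Factory.FromAsync(
            fs.BeginRead, fs.EndRead, buffer, 0, buffer.Length, null);

        // The continuation runs after the read completes and produces the final string.
        return read.ContinueWith(t =>
        {
            fs.Close();
            return Encoding.Unicode.GetString(buffer, 0, t.Result);
        });
    }

    static void Main()
    {
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, Encoding.Unicode.GetBytes("A quick brown fox"));
        Console.WriteLine(ReadTextAsync(path).Result); // prints "A quick brown fox"
        File.Delete(path);
    }
}
```

Compared with the callback version in the standard APM section, no state object is needed, and the caller gets a single Task<string> that it can wait on or continue from.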
Task, Async Method and Await Expression
I would like to point out that async methods and await expressions in C# 5.0 are implemented at the compiler level on top of the task-based APM. An async method must have a return type of void, Task or Task<T>. The reason is that await expressions build on the task-based APM: a caller can only await an async method, or otherwise observe its completion, if the method returns a Task or Task<T> instance. A void return type is allowed as well, mainly for fire-and-forget cases such as event handlers, but a caller cannot await such a method. If an async method contains no await expression, it simply runs synchronously like a normal method (and the compiler issues a warning); if it contains at least one await expression, the method returns to its caller at the first await whose operation has not yet completed, and the rest of the method runs when that operation finishes.
To make this clear, I modified the code that calculates 2 to the power of n to use async and await:
using System;
using System.Threading.Tasks;

namespace Demo
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Task based APM demo");
            // The async lambda is converted to an Action, so t itself completes at the first await;
            // Console.ReadKey below keeps the process alive until the result is printed.
            Task t = new Task(async () =>
            {
                int result = await Program.Exponent(10);
                Console.WriteLine(result);
            });
            t.Start();
            Console.ReadKey();
        }

        static async Task<int> Exponent(int n)
        {
            Console.WriteLine("Task started");
            // Task.Run requires .NET Framework 4.5; with the Async CTP, use TaskEx.Run instead.
            return await Task.Run<int>(() => 2 << (n - 1));
        }
    }
}
When I run this program, I get exactly the same result as the example shown in the Task-based APM section.
You may still find the above code confusing and wonder how it works and what the exact control flow is. I will discuss this in detail in the articles that follow.
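In the meantime, here is a more compact variant of the same calculation, offered only as a sketch under my own assumptions: it targets .NET Framework 4.5 (for Task.Run), calls the async method directly instead of wrapping it in a new Task, and uses the hypothetical names AsyncExponentDemo and ExponentAsync. Blocking on Result is acceptable here because a console application has no synchronization context.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical example class; not part of the original article.
static class AsyncExponentDemo
{
    // Computes 2 to the power of n on a thread pool thread; valid for 1 <= n <= 30 with int arithmetic.
    public static async Task<int> ExponentAsync(int n)
    {
        int result = await Task.Run(() => 2 << (n - 1));
        return result; // the compiler wraps this value in the returned Task<int>
    }

    static void Main()
    {
        Console.WriteLine("Task based APM demo");
        Task<int> t = ExponentAsync(10);
        Console.WriteLine("Task started");
        Console.WriteLine(t.Result); // blocks until the task completes, then prints 1024
    }
}
```

Because ExponentAsync returns Task<int>, another async method could simply write int value = await ExponentAsync(10); instead of blocking on Result.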
Conclusion
The Microsoft .NET Framework provides many ways to perform asynchronous operations, and you can choose one or more of them depending on your case. Although there are various ways, some of them are not recommended, such as using the System.Threading.Thread class directly or the event-based APM. The most popular ways are the thread pool and the task-based APM. In addition, the task-based APM is what async methods and await expressions in C# 5.0 are built on.
Finally, I summarize the different asynchronous models in the following table for reference.
Pattern | Description | Based On | Notes |
Thread based | By creating System.Threading.Thread instance | Managed Thread | Expensive, not recommended |
Standard BeginXxx and EndXxx methods | By calling BeginXxx method with a user callback; calling EndXxx inside that user callback | Thread pool | Widely used, standard, recommended, supports continuation |
ThreadPool | By calling ThreadPool’s static QueueUserWorkItem method | Thread pool | Widely used, recommended; use as much as possible |
Delegate | By calling Delegate’s BeginInvoke and EndInvoke instance methods | Thread pool | Less used |
Event based | By subscribing to the appropriate event and calling the appropriate method | Thread pool | Not recommended; avoid where possible |
Task based | By creating System.Threading.Tasks.Task instance | A specified task scheduler | Recommended, supports all features of a thread pool pattern, and has many other features |
async method and await expression | By using async and await keywords | Task based pattern | The new C# 5.0 asynchronous pattern |