Introduction
This is a quick tour of the new language support for multi-threaded programming coming in .NET 4.5. These examples were programmed using the Visual Studio 11
developer preview released back in September.
Multi-Threaded Evolution
Like Java, C# has had threading support since the beginning. In the early days, this meant having a set of synchronization classes in the CLR, along with the lock
statement.
While this was better than having to call a system library, the difficulty of writing threaded code was still too high for most developers and projects.
But multi-threaded programming is increasingly important because computers are evolving toward more parallel cores rather than faster clock speeds.
As a result, Microsoft has continually enhanced threading support in .NET in almost every release. .NET 4 introduced the Task Parallel Library (TPL),
a major conceptual step forward. .NET 4.5 builds on this by integrating tasks directly into the language.
Tasks not Threads
This latest approach to threading is to not think about or deal with threads at all! Instead, you program with tasks, where a task is simply a value (data)
that will be available at a later time (technically, a “future”). How does this help?
The idea is to make parallel code look and act, as much as possible, like sequential code. Consider this function, which calls two other long running functions
before displaying the result:
..
Function();
..
public void Function()
{
    string s1 = GetExpensiveString();
    string s2 = GetAnotherExpensiveString();
    Console.WriteLine(s1 + s2);
}
The problem with this code is that GetAnotherExpensiveString can’t begin until GetExpensiveString completes. Also, the calling code (and the rest of the program) is blocked until both long-running functions return.
For the purposes of this simple example, the 'expensive' methods just mark time:
private static string GetExpensiveString()
{
    // Simulate five seconds of work.
    for (int i = 0; i < 5; i++)
        Thread.Sleep(1000);
    return DateTime.Now.ToLongTimeString();
}
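The second time-consuming method isn’t shown here, but presumably it looks much the same; something along these lines:
private static string GetAnotherExpensiveString()
{
    // Presumed counterpart (not shown in the original); it also just marks time.
    for (int i = 0; i < 5; i++)
        Thread.Sleep(1000);
    return " / " + DateTime.Now.ToLongTimeString();
}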
In .NET 4.5, this code becomes:
..
FunctionAsync();
..
public async Task FunctionAsync()
{
    string s1 = await GetExpensiveStringAsync();
    // control yields; s1 is assigned sometime later
    string s2 = await GetAnotherExpensiveStringAsync();
    // more time passes
    Console.WriteLine(s1 + s2);
}
private static Task<string> GetExpensiveStringAsync()
{
    return Task<string>.Factory.StartNew( () => GetExpensiveString() );
}
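GetAnotherExpensiveStringAsync would presumably wrap its synchronous counterpart in exactly the same way. (.NET 4.5 also adds a Task.Run convenience method that should do the same job as the factory call, if you prefer it.)
private static Task<string> GetAnotherExpensiveStringAsync()
{
    // Presumed wrapper for the second method, mirroring the one above.
    return Task<string>.Factory.StartNew( () => GetAnotherExpensiveString() );
}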
We’ve introduced two keywords, and one naming convention, to indicate the places where control can switch to another thread and then revert back.
Let’s look at the time-consuming methods first. They have been changed to return Task<string> instead of string, and by convention, their names are now suffixed with ‘Async’. So instead of returning the string we want, they return a Task object, a kind of proxy or promise that says “I’ll give you the string later.” Since my long-running functions don't access any shared variables, there are no locking issues, so the asynchronous versions just call the synchronous functions inside a task.
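To make that concrete, a Task<string> is something you can hold on to and inspect before the string actually exists; a small sketch (not part of the example itself):
// A task is a handle to a value that isn't necessarily ready yet.
Task<string> pending = GetExpensiveStringAsync();
Console.WriteLine(pending.IsCompleted);   // almost certainly False at this point
string value = pending.Result;            // Result blocks until the value is available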
But I’m still assigning the return values to string variables. This works because of the await keyword in front of each call. This essentially says “whenever this task completes, come back here, assign its value, and continue on.”
The other change is to the function signature: we’ve decorated the function with the async keyword, changed the return type to Task, and appended the “Async” suffix, again by convention. These changes tell the compiler and other programmers that this function contains asynchronous control flow.
Asynchronous Control Flow
Our function looks pretty similar (that’s the idea) but behaves quite differently. When you called Function, you sat for a while and it didn’t return until the line was written to the console. In FunctionAsync, the call to GetExpensiveStringAsync starts a task on another thread and then immediately returns, and control flow on this thread continues.
Sometime later, the task completes, and the CLR resumes execution back in FunctionAsync. In this case, the process repeats: the CLR starts another background task, and still later returns to finally execute the console write.
Note that unlike with Function, the caller of FunctionAsync continues executing after the quick return. You can now do other work in parallel with FunctionAsync, and sync up with it if needed. Here is the revised calling code:
Task task = FunctionAsync();
task.Wait();
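For instance, the caller might do something useful before synchronizing; DoSomeOtherWork below is just a placeholder:
// Kick off the asynchronous work; FunctionAsync returns almost immediately.
Task task = FunctionAsync();

// Placeholder for whatever else this thread needs to do in the meantime.
DoSomeOtherWork();

// Synchronize: block until FunctionAsync has written its line to the console.
task.Wait();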
So what’s going on to make all of this work?
Under the covers, C# has turned our simple function into a state machine, and created hidden classes that essentially save the call stack on the heap.
Each await expression marks a place where the function can switch threads or “pause”, and then resume later on.
This is all conceptually simple but quite complex in the details. This is especially true when resuming complex flow of control (think nested ifs and loops) and propagating exceptions out. There are similarities to how the yield statement works in iterators, and the feature also leverages the capture of local variables in closures, first introduced with lambda expressions.
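To see the analogy, here is a trivial iterator; each yield return is a point where the method is suspended and later resumed with its locals intact, which is roughly the trick the compiler reuses for await:
// Each yield return suspends this method; the next MoveNext call resumes it
// with its local state (the loop counter) preserved, much like resuming after an await.
static IEnumerable<int> CountSlowly()
{
    for (int i = 0; i < 3; i++)
        yield return i;
}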
Even More Asynchronous
We’ve achieved some parallel execution here, but we can do better. We’ve moved GetExpensiveString to the background so that other processing can go on in parallel. But as currently structured, GetAnotherExpensiveString can’t start executing until after GetExpensiveString finishes.
This final version fixes that problem:
public async Task FunctionAsync()
{
    Task<string> t1 = GetExpensiveStringAsync();
    Task<string> t2 = GetAnotherExpensiveStringAsync();
    await Task.WhenAll(t1, t2);
    string s1 = t1.Result;
    string s2 = t2.Result;
    Console.WriteLine(s1 + s2);
}
In order to get both time-consuming functions to run in parallel, we have to deal more directly with the task objects. Here, we start both tasks and then await Task.WhenAll, which completes only when all of the supplied tasks have finished. The .NET 4.5 Task API adds several new methods like this that make it very convenient to compose and synchronize tasks.
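For example, Task.WhenAny is the counterpart that completes as soon as the first of the supplied tasks finishes. A sketch, using the same two tasks inside an async method:
// React to whichever expensive string is ready first.
Task<string> t1 = GetExpensiveStringAsync();
Task<string> t2 = GetAnotherExpensiveStringAsync();

Task<string> first = await Task.WhenAny(t1, t2);
Console.WriteLine("First one back: " + first.Result);

await Task.WhenAll(t1, t2);   // then wait for the slower one before continuing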
Summing Up
Don’t be fooled: there is no magic here. Asynchronous programming is still difficult and complex. This simple example program ends up running on four different threads.
Its behavior is also much less repeatable, as it’s subject to timing variations from run to run, especially on multi-core processors.
Sequential code execution is a fundamental expectation burned into the brain of every programmer. Asynchronous programming will always be difficult
precisely because it violates this basic assumption. It forces you to think differently about your programs. It took me about a day of messing around (struggling)
with these new features before they started to click into place. In that sense, it is probably similar to learning LINQ or lambda expressions. As always, Microsoft’s tool,
sample, and documentation support is there to help get you up to speed.
It’s worth the effort because the benefits are so significant: in this example, each of the expensive functions takes five seconds to execute, and the final parallel
program takes slightly longer than five seconds total, as compared to ten for the original version. Besides the speed up, the real win in a production program
is the ability to keep the UI live and responsive.
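A quick way to verify those timing numbers yourself (a sketch, not part of the original example) is to wrap each version in a Stopwatch:
var sw = Stopwatch.StartNew();        // requires System.Diagnostics
Function();                           // sequential version: roughly ten seconds
Console.WriteLine(sw.Elapsed);

sw.Restart();
FunctionAsync().Wait();               // final parallel version: a bit over five seconds
Console.WriteLine(sw.Elapsed);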
The new .NET asynchronous programming model is a big step forward. As with other areas, it allows you to think and program at a higher level while it manages the details.
There’s a lot of sophisticated machinery down in the compiler and run-time to pull this off, and that’s as it should be.
History
- October 27, 2011 - Revised to incorporate feedback.
- October 26, 2011 - Original article.