
Await Tasks in C#4 using Iterators

Keith L Robertson

29 Aug 2013

Write synchronous-looking asynchronous methods without async/await in Visual Studio 2010.

Download AwaitTasksInCSharp4.zip - 11.3 KB

Introduction

So, you've read about the C#5 async and await keywords and how they help simplify asynchronous programming. Alas, your employer (or you) upgraded to Visual Studio 2010 just two years ago and isn't ready to shell out another round for VS 2012. You are stuck with VS 2010 and C#4 where that feature is unsupported. (This article also applies to VB.NET 2010; different syntax, but same approach.) You're left pining, "Oh, how much clearer my code would be if only I could write methods in VS 2010 which look synchronous but perform asynchronously."

After reading this article, you will be able to do just that. We will develop a small piece of infrastructure code which does the heavy lifting, allowing us to write synchronous-looking asynchronous methods (SLAMs) in a manner like that enjoyed in C#5. (Note: If you are already using C#5, or if you're satisfied using Microsoft's unsupported Async CTP, then this article does not apply to you.)

We must admit at the start that async/await is a fine topping of syntactic sugar which we don't have, so our code will be a little more salty than it would be with them. But it will be far more palatable than the bitter taste of writing our own IAsyncResult callbacks! And when you finally upgrade to VS 2012 (or beyond), it will be a trivial matter to convert your methods to take advantage of the C#5 keywords; it will require simple syntactic changes, not a laborious structural rewrite.

Overview

The async/await keywords are built on the Task-based Asynchronous Pattern (TAP). TAP is well-documented elsewhere, so I won't cover it here. I must add as a personal note: TAP is super cool! You can create lots of little units of work (tasks) to be completed at some time; tasks can start other (nested) tasks and/or set up continuation tasks to begin only when one or more antecedent tasks have completed. A task doesn't necessarily tie up a thread (a heavyweight resource) while nested tasks complete. And you don't have to worry about scheduling threads to execute tasks; this is handled automatically by the framework with minimal helpful hints from you. Then when you set your program running, all the tasks trickle down to completion, bouncing off each other like steel balls in a virtual Pachinko Machine!

In C#4/.NET 4 we don't have async and await, but we do have the Task types minus only a few .NET 4.5 additions which we can do without or build ourselves.

In a C#5 async method, you would await a Task. This does not cause the thread to block; instead, the method returns a Task to its caller, on which it can await (if itself async) or attach continuations. (A non-async caller could also Wait() on the task or its Result, but this will tie up the thread, so avoid that.) When the awaited task completes successfully, your async method continues where it left off.

As you may know, the C#5 compiler rewrites its async methods into a generated nested class which implements a state machine. There is another feature of C# (since 2.0) which does exactly that: iterators (with yield return). The idea here is to use an iterator method to build the state machine in C#4, returning a sequence of Tasks which are the steps to await in the overall process. We will develop a method which accepts an enumeration of tasks returned by the iterator, returning a single overall Task which represents the completion of the entire sequence and provides its final Result (if any).
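
To see why an iterator can stand in for a compiler-generated state machine, here is a minimal sketch that drives one by hand. It does not involve tasks at all, and every name in it (IteratorStateMachineDemo, Steps, Run) is made up for illustration. Each MoveNext() call runs the method body only up to the next yield return, and the local variable survives between calls.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorStateMachineDemo
{
    // Records each step so we can see when the iterator body actually runs.
    public static readonly List<string> Log = new List<string>();

    private static IEnumerable<int> Steps()
    {
        Log.Add("start");
        int total = 1;                  // Locals are preserved across yields.
        yield return total;
        Log.Add("resumed after first yield");
        total += 10;
        yield return total;
        Log.Add("finished");
    }

    public static List<string> Run()
    {
        Log.Clear();
        using (IEnumerator<int> e = Steps().GetEnumerator())
        {
            Log.Add("enumerator created");      // Nothing inside Steps() has run yet.
            while (e.MoveNext())                // Each call resumes the body where it stopped.
                Log.Add("yielded " + e.Current);
        }
        return Log;
    }

    public static void Main()
    {
        foreach (string line in Run()) Console.WriteLine(line);
    }
}
```

The log shows "enumerator created" before "start": the body does not begin until the first MoveNext(). That laziness is what lets a driver decide when, and on which thread, each segment of the method runs.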

The End Goal

Stephen Covey advised us to begin with the end in mind. That's what we'll do here. Examples abound of how to write SLAMs with async/await. How will we write them without those keywords? Let's start with a simple C#5 async method and see how to represent it in C#4. Then we'll discuss more generally how to transform any code segments which need it.

Here is how we might write Stream.CopyToAsync() in C#5 using asynchronous reads and writes, if it weren't already available in .NET 4.5. (We can actually use the transformed version in .NET 4 which doesn't have it! Download the sample code for ReadAsync() and WriteAsync().)

public static async Task CopyToAsync(
    this Stream input, Stream output,
    CancellationToken cancellationToken = default(CancellationToken))
{
    byte[] buffer = new byte[0x1000];   // 4 KiB
    while (true) {
        cancellationToken.ThrowIfCancellationRequested();
        int bytesRead = await input.ReadAsync(buffer, 0, buffer.Length);
        if (bytesRead == 0) break;

        cancellationToken.ThrowIfCancellationRequested();
        await output.WriteAsync(buffer, 0, bytesRead);
    }
}

For C#4, we'll break this into two: one method with the same signature and accessibility, and one private method with identical arguments but a different return type. The private method is the iterator implementing the same process, resulting in a sequence of tasks (IEnumerable<Task>) to await. The actual tasks in the sequence can be non-generic or generic of varying types, in any combination. (Fortunately, generic Task<T> types are subtypes of the non-generic Task type.)

The same-accessibility (here "public") method returns the same type as the corresponding async method would: void, Task, or a generic Task<T>. It is a simple one-liner which invokes the private iterator and transforms it into a Task or Task<T> using an extension method.

public static /*async*/ Task CopyToAsync(
    this Stream input, Stream output,
    CancellationToken cancellationToken = default(CancellationToken))
{
    return CopyToAsyncTasks(input, output, cancellationToken).ToTask();
}
private static IEnumerable<Task> CopyToAsyncTasks(
    Stream input, Stream output,
    CancellationToken cancellationToken)
{
    byte[] buffer = new byte[0x1000];   // 4 KiB
    while (true) {
        cancellationToken.ThrowIfCancellationRequested();
        var bytesReadTask = input.ReadAsync(buffer, 0, buffer.Length);
        yield return bytesReadTask;
        if (bytesReadTask.Result == 0) break;

        cancellationToken.ThrowIfCancellationRequested();
        yield return output.WriteAsync(buffer, 0, bytesReadTask.Result);
    }
}
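
Because Task<T> derives from Task, one IEnumerable<Task> can carry steps of several result types. The following sketch illustrates just that point; the class and method names are hypothetical, and the driver loop blocks with Wait() purely to keep the demonstration short, which is exactly what the real infrastructure must avoid.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class MixedTaskSequenceDemo
{
    private static IEnumerable<Task> Steps()
    {
        // A generic task; Task<int> derives from Task, so it can be yielded as-is.
        Task<int> compute = Task.Factory.StartNew(() => 6 * 7);
        yield return compute;

        // A non-generic task in the same sequence.
        yield return Task.Factory.StartNew(() => { });

        // Another generic task with a different result type. By the time this line
        // runs, 'compute' has completed, so reading its Result does not block.
        yield return Task.Factory.StartNew(() => "value was " + compute.Result);
    }

    // Demonstration-only driver: it blocks on each step.
    public static List<string> Run()
    {
        var seen = new List<string>();
        foreach (Task step in Steps())
        {
            step.Wait();
            var withInt = step as Task<int>;
            var withString = step as Task<string>;
            if (withInt != null) seen.Add("int: " + withInt.Result);
            else if (withString != null) seen.Add("string: " + withString.Result);
            else seen.Add("no result");
        }
        return seen;
    }

    public static void Main()
    {
        foreach (string line in Run()) Console.WriteLine(line);
    }
}
```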

The asynchronous method name usually ends with "Async" (unless it's an event handler, e.g. startButton_Click). Give its iterator the same name appending "Tasks" (e.g. startButton_ClickTasks). If the asynchronous method returns void, it still calls ToTask() but doesn't return the Task. If the asynchronous method returns a Task<X>, then it invokes a generic ToTask<X>() extension method. For the three return types, the async-replacement method looks as follows:

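// Three alternatives, one per return type. A class would declare only one of them,
// because overloads cannot differ by return type alone.
public /*async*/ void DoSomethingAsync() {
    DoSomethingAsyncTasks().ToTask();
}
public /*async*/ Task DoSomethingAsync() {
    return DoSomethingAsyncTasks().ToTask();
}
public /*async*/ Task<String> DoSomethingAsync() {
    return DoSomethingAsyncTasks().ToTask<String>();
}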

The paired iterator method isn't much more complicated. Where the async method would await a non-generic Task, the iterator simply yields it. Where the async method would await a task result, the iterator saves the task in a variable, yields it, then uses its Result afterward. Both cases are shown in the CopyToAsyncTasks() example above.

For a SLAM with a generic result Task<X>, the iterator must yield a final task of that exact type. ToTask<X>() will typecast the final task to that type to extract its Result. Often your iterator will calculate the value from intermediate task results and just needs to wrap it in a Task<T>. .NET 4.5 provides a convenient static method for this, Task.FromResult<T>(). We don't have it in .NET 4, so we will implement it as TaskEx.FromResult<T>(value).

The last thing you need to know is how to handle a return from the middle. An async method can return from an arbitrarily nested block; our iterator mimics this by ending the iteration after yielding the return value (if any).

// C#5
public async Task<String> DoSomethingAsync() {
    while (…) {
        foreach (…) {
            return "Result";
        }
    }
}

// C#4;  DoSomethingAsync() is necessary but omitted here.
private IEnumerable<Task> DoSomethingAsyncTasks() {
    while (…) {
        foreach (…) {
            yield return TaskEx.FromResult("Result");
            yield break;
        }
    }
}

Now we know how to write a SLAM in C#4, but we can't actually do it until we implement FromResult<T>() and two ToTask() extension methods. Let's get to it.

An Easy Start

We will implement our 3 methods in a class System.Threading.Tasks.TaskEx, starting with the two which are straightforward. FromResult<T>() creates a TaskCompletionSource<T>, populates its result, and returns its Task.

public static Task<TResult> FromResult<TResult>(TResult resultValue) {
    var completionSource = new TaskCompletionSource<TResult>();
    completionSource.SetResult(resultValue);
    return completionSource.Task;
}
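
As a quick check of what FromResult<T>() gives us, the sketch below repeats the method in a stand-alone demo class (FromResultDemo and Describe are illustrative names, not part of the download). It shows that the task is already complete when it is returned, and that a value can be recovered from a plain Task reference by casting back to the exact generic type, which is how the text describes ToTask<X>() extracting the final result.

```csharp
using System;
using System.Threading.Tasks;

public static class FromResultDemo
{
    // Same shape as the article's TaskEx.FromResult<T>(), repeated so this sketch stands alone.
    public static Task<TResult> FromResult<TResult>(TResult resultValue)
    {
        var completionSource = new TaskCompletionSource<TResult>();
        completionSource.SetResult(resultValue);
        return completionSource.Task;
    }

    public static string Describe()
    {
        // Held as a plain Task, the way an IEnumerable<Task> iterator would yield it.
        Task finalStep = FromResult("Result");

        // The task completed before anyone observed it, so nothing ever waits on it.
        string status = finalStep.Status.ToString();

        // Recover the value by casting back to the exact generic type.
        string value = ((Task<string>) finalStep).Result;
        return status + ": " + value;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```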

Clearly, the two ToTask() methods are essentially identical; the only difference is whether the returned task has a result value. We don't want to code and maintain the same process twice, so we will implement one using the other. The generic implementation will look for a "marker type" to know that we don't really care about a result value, and it will avoid typecasting the final task. Then we can implement the non-generic version using the marker type.

private abstract class VoidResult { }

public static Task ToTask(this IEnumerable<Task> tasks) {
    return ToTask<VoidResult>(tasks);
}
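
The marker-type idea can be tried in isolation before tackling the real method. This is only a sketch under simplifying assumptions: FinishCore, Finish and Completed are hypothetical helpers that deal with a single, already-completed task rather than a whole sequence.

```csharp
using System;
using System.Threading.Tasks;

public static class MarkerTypeDemo
{
    // Never instantiated; it only signals "no result wanted" when used as a type argument.
    private abstract class VoidResult { }

    // Generic core: copies the last task's Result unless TResult is the marker type.
    private static Task<TResult> FinishCore<TResult>(Task last)
    {
        var source = new TaskCompletionSource<TResult>();
        if (typeof(TResult) == typeof(VoidResult))
            source.SetResult(default(TResult));                 // null; nobody reads it
        else
            source.SetResult(((Task<TResult>) last).Result);    // cast only when a result is wanted
        return source.Task;
    }

    // The non-generic entry point is a one-liner over the generic core.
    public static Task Finish(Task last) { return FinishCore<VoidResult>(last); }
    public static Task<TResult> Finish<TResult>(Task last) { return FinishCore<TResult>(last); }

    private static Task<int> Completed(int value)
    {
        var source = new TaskCompletionSource<int>();
        source.SetResult(value);
        return source.Task;
    }

    public static string Run()
    {
        Task<int> last = Completed(42);
        Task plain = Finish(last);              // No cast of 'last' is attempted here.
        Task<int> typed = Finish<int>(last);    // Here the cast succeeds and the value is copied.
        return plain.Status + " / " + typed.Result;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Callers of the non-generic Finish() see only a Task, so the private marker type never leaks out of the class.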

So far, so good. Now all that's left is to implement the generic ToTask<T>(). Hang on, guys, we goin' for a ride.

A Naïve First Attempt

For our first attempt at implementing the method, we'll enumerate the returned tasks, Wait() for each to complete, then set the result from the final task (if appropriate). Of course, we don't want to tie up the current thread during this process, so we'll start another task to perform this loop.

// BAD CODE !
public static Task<TResult> ToTask<TResult>(this IEnumerable<Task> tasks)
{
    var tcs = new TaskCompletionSource<TResult>();
    Task.Factory.StartNew(() => {
        Task last = null;
        try {
            foreach (var task in tasks) {
                last = task;
                task.Wait();
            }

            // Set the result from the last task returned, unless no result is requested.
            tcs.SetResult(
                last == null || typeof(TResult) == typeof(VoidResult)
                    ? default(TResult) : ((Task<TResult>) last).Result);

        } catch (AggregateException aggrEx) {
            // If task.Wait() threw an exception it will be wrapped in an Aggregate; unwrap it.
            if (aggrEx.InnerExceptions.Count != 1) tcs.SetException(aggrEx);
            else if (aggrEx.InnerException is OperationCanceledException) tcs.SetCanceled();
            else tcs.SetException(aggrEx.InnerException);
        } catch (OperationCanceledException cancEx) {
            tcs.SetCanceled();
        } catch (Exception ex) {
            tcs.SetException(ex);
        }
    });
    return tcs.Task;
}

There are some good things here, and it actually works as long as it doesn't touch a User Interface:

It correctly returns a TaskCompletionSource's Task and sets completion status via the Source.
It shows how we can set the task's final Result using the iterator's last task, avoiding that when no result is desired.
It catches exceptions from the iterator to set Canceled or Faulted status. It also propagates the enumerated task's status (here via Wait() which may throw an AggregateException wrapping the cancellation or fault exception).

But there are major problems here. The most egregious are:

For the iterator to live up to its "synchronous-looking" promise, then when it's initiated from a UI thread, the iterator method should be able to access UI controls. You can see here that the foreach loop (which calls into the iterator) runs in the background; don't touch the UI from there! This approach does not respect the SynchronizationContext.
We have problems outside of a UI, too. We may want to spawn many, many Tasks in parallel implemented by a SLAM. But look at that Wait() inside the loop! While waiting on a nested task, possibly a long time for a remote operation to complete, we are tying up a thread. We would starve ourselves of thread pool threads.
Unwrapping the AggregateException this way is just plain hokey. We need to capture and propagate its completion status without it throwing an exception.
Sometimes the SLAM can determine its completion status immediately. In that case, a C#5 async method would operate synchronously and efficiently. We always schedule a background task here, so we've lost that possibility.
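
The third point is easy to see in a small experiment. The sketch below (hypothetical names throughout) builds an already-faulted task and reads the failure two ways: by blocking with Wait() and unwrapping the AggregateException, and by inspecting Status and Exception directly, which never throws.

```csharp
using System;
using System.Threading.Tasks;

public static class FaultInspectionDemo
{
    private static Task<int> FaultedTask()
    {
        var source = new TaskCompletionSource<int>();
        source.SetException(new InvalidOperationException("boom"));
        return source.Task;
    }

    // Blocking style: the fault surfaces as a thrown AggregateException that must be unwrapped.
    public static string ViaWait()
    {
        try
        {
            FaultedTask().Wait();
            return "no exception";
        }
        catch (AggregateException ex)
        {
            return ex.InnerException.GetType().Name;
        }
    }

    // Non-blocking style: read the completion status and exceptions straight off the task.
    public static string ViaStatus()
    {
        Task task = FaultedTask();
        if (task.Status == TaskStatus.Canceled) return "canceled";
        if (task.Status == TaskStatus.Faulted)
            return task.Exception.InnerExceptions[0].GetType().Name;
        return "ok";
    }

    public static void Main()
    {
        Console.WriteLine(ViaWait());
        Console.WriteLine(ViaStatus());
    }
}
```

Both routes report the same InvalidOperationException, but only the second can run inside a continuation without a try/catch around a blocking call.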

It's time to get creative!

Looping by Continuation

The big idea is to obtain the first task yielded from the iterator immediately and synchronously. We set up a continuation so that when it completes, the continuation checks the task's status and (if it was successful) obtains the next task and sets up another continuation; and so on until finished. (If ever it does, that is; there's no requirement that an iterator ends.)

// Pretty cool, but we're not there yet.
public static Task<TResult> ToTask<TResult>(this IEnumerable<Task> tasks)
{
    var taskScheduler =
        SynchronizationContext.Current == null
            ? TaskScheduler.Default : TaskScheduler.FromCurrentSynchronizationContext();
    var tcs = new TaskCompletionSource<TResult>();
    var taskEnumerator = tasks.GetEnumerator();
    if (!taskEnumerator.MoveNext()) {
        tcs.SetResult(default(TResult));
        return tcs.Task;
    }

    taskEnumerator.Current.ContinueWith(
        t => ToTaskDoOneStep(taskEnumerator, taskScheduler, tcs, t),
        taskScheduler);
    return tcs.Task;
}
private static void ToTaskDoOneStep<TResult>(
    IEnumerator<Task> taskEnumerator, TaskScheduler taskScheduler,
    TaskCompletionSource<TResult> tcs, Task completedTask)
{
    var status = completedTask.Status;
    if (status == TaskStatus.Canceled) {
        tcs.SetCanceled();

    } else if (status == TaskStatus.Faulted) {
        tcs.SetException(completedTask.Exception);

    } else if (!taskEnumerator.MoveNext()) {
        // Set the result from the last task returned, unless no result is requested.
        tcs.SetResult(
            typeof(TResult) == typeof(VoidResult)
                ? default(TResult) : ((Task<TResult>) completedTask).Result);

    } else {
        taskEnumerator.Current.ContinueWith(
            t => ToTaskDoOneStep(taskEnumerator, taskScheduler, tcs, t),
            taskScheduler);
    }
}

There is a lot to appreciate here:

Our continuations use a TaskScheduler which respects the SynchronizationContext, if there is one. This allows our iterator, invoked immediately or from a continuation, to access UI controls when initiated from the UI thread.
The process runs by continuations, so no threads are tied up waiting! Incidentally, that call within ToTaskDoOneStep() to itself is not a recursive call; it's in a lambda which is invoked after the taskEnumerator.Current task completes. The current activation exits almost immediately after calling ContinueWith(), and it does so independently of the continuation.
We check each nested task's status directly within its continuation, not by inspecting an exception.
The first iteration occurs synchronously.
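
The "loop by continuation" shape is easier to study with the iterator taken out. In this sketch, which is an illustration and not the article's implementation, CountTo and DoOneStep are made-up names and an empty StartNew() task stands in for a real asynchronous operation. Each step schedules the next one and returns at once, so the call stack never grows and no thread waits.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ContinuationLoopDemo
{
    // Completes the returned task after 'steps' chained work items.
    public static Task<int> CountTo(int steps)
    {
        // Same scheduler choice as in the article: use the UI context when there is one.
        TaskScheduler scheduler = SynchronizationContext.Current == null
            ? TaskScheduler.Default
            : TaskScheduler.FromCurrentSynchronizationContext();
        var source = new TaskCompletionSource<int>();
        DoOneStep(0, steps, scheduler, source);     // First step runs synchronously.
        return source.Task;
    }

    private static void DoOneStep(
        int done, int steps, TaskScheduler scheduler, TaskCompletionSource<int> source)
    {
        if (done == steps)
        {
            source.SetResult(done);
            return;
        }

        // Stand-in for an asynchronous operation.
        Task work = Task.Factory.StartNew(() => { });

        // Not a recursive call: the lambda is queued to run after 'work' completes,
        // by which time this activation has normally already returned.
        work.ContinueWith(t => DoOneStep(done + 1, steps, scheduler, source), scheduler);
    }

    public static void Main()
    {
        // Blocking here only because a console Main has nothing else to do.
        Console.WriteLine(CountTo(1000).Result);
    }
}
```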

However, there is at least one huge problem here and some lesser ones.

1. If the iterator throws an unhandled exception, or cancels by throwing an OperationCanceledException, we neither handle it nor set the master task's status. This is something we had previously but lost in this version.
2. To fix problem #1, we would have to introduce identical exception handlers in both methods where we call MoveNext(). Even as it is now, we have identical continuations set up in both methods. We are violating the "Don't Repeat Yourself" rule.
3. What if the async method's task is expected to provide a Result, but our iterator exits without providing any tasks? Or what if its final task is of the wrong type? In the first case, we silently return the default result type; in the second, we throw an unhandled InvalidCastException. Since this exception is never observed, the task system would abort our entire process!
4. Finally, what if a nested task cancels or faults? We set the master task status and never invoke the iterator again. It may have been inside a using block or try block with a finally and have some cleaning up to do. We should Dispose() the iterator when it terminates, not wait for the garbage collector to do it. (Previously I used a continuation for this, but it didn't handle exceptions. I found a lighter-weight alternative which does.)
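
The last point, about disposing an abandoned iterator, can be verified without any tasks. In this sketch (illustrative names only) an iterator is suspended inside a try block that has a finally; the cleanup code runs only if someone calls Dispose() on the enumerator.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorDisposeDemo
{
    public static bool CleanedUp;

    private static IEnumerable<int> Steps()
    {
        try
        {
            yield return 1;     // Legal: this try has a finally and no catch.
            yield return 2;
        }
        finally
        {
            CleanedUp = true;   // Stands in for closing a stream, releasing a lock, etc.
        }
    }

    // Abandon the enumerator part-way through without disposing it.
    public static bool AbandonWithoutDispose()
    {
        CleanedUp = false;
        IEnumerator<int> e = Steps().GetEnumerator();
        e.MoveNext();           // Now suspended inside the try block.
        return CleanedUp;       // The finally block has not run.
    }

    // Abandon it part-way through, but dispose it first.
    public static bool AbandonWithDispose()
    {
        CleanedUp = false;
        IEnumerator<int> e = Steps().GetEnumerator();
        e.MoveNext();
        e.Dispose();            // Runs the pending finally block immediately.
        return CleanedUp;
    }

    public static void Main()
    {
        Console.WriteLine(AbandonWithoutDispose());
        Console.WriteLine(AbandonWithDispose());
    }
}
```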

To fix these issues, we'll remove the MoveNext() call from ToTask() and instead make an initial synchronous call to ToTaskDoOneStep(). Then we can add appropriate exception handling in one place.

The Final Version

Here is the final implementation of ToTask<T>(). It

returns a master task using a TaskCompletionSource,
performs the first iteration synchronously/efficiently,
respects the SynchronizationContext if any,
never blocks a thread,
handles exceptions from the iterator,
propagates nested task completion directly (without AggregateException),
returns a value to the master task when appropriate,
faults with a helpful exception when the SLAM iterator doesn't end with a valid result, and
disposes the enumerator when it completes.

public static Task<TResult> ToTask<TResult>(this IEnumerable<Task> tasks) {
    var taskScheduler =
        SynchronizationContext.Current == null
            ? TaskScheduler.Default : TaskScheduler.FromCurrentSynchronizationContext();
    var taskEnumerator = tasks.GetEnumerator();
    var completionSource = new TaskCompletionSource<TResult>();

    ToTaskDoOneStep(taskEnumerator, taskScheduler, completionSource, null);
    return completionSource.Task;
}

private static void ToTaskDoOneStep<TResult>(
    IEnumerator<Task> taskEnumerator, TaskScheduler taskScheduler,
    TaskCompletionSource<TResult> completionSource, Task completedTask)
{
    try {
        // Check status of previous nested task (if any), and stop if Canceled or Faulted.
        // In these cases, we are abandoning the enumerator, so we must dispose it.
        TaskStatus status;
        if (completedTask == null) {
            // This is the first task from the iterator; skip status check.
        } else if ((status = completedTask.Status) == TaskStatus.Canceled) {
            taskEnumerator.Dispose();
            completionSource.SetCanceled();
            return;
        } else if (status == TaskStatus.Faulted) {
            taskEnumerator.Dispose();
            completionSource.SetException(completedTask.Exception.InnerExceptions);
            return;
        }
    } catch (Exception ex) {
        // Return exception from disposing the enumerator.
        completionSource.SetException(ex);
        return;
    }

    // Find the next Task in the iterator; handle cancellation and other exceptions.
    Boolean haveMore;
    try {
        // Enumerator disposes itself if it throws an exception or completes (returns false).
        haveMore = taskEnumerator.MoveNext();

    } catch (OperationCanceledException) {
        completionSource.SetCanceled();
        return;
    } catch (Exception exc) {
        completionSource.SetException(exc);
        return;
    }

    if (!haveMore) {
        // No more tasks; set the result (if any) from the last completed task (if any).
        // We know it's not Canceled or Faulted because we checked at the start of this method.
        if (typeof(TResult) == typeof(VoidResult)) {        // No result
            completionSource.SetResult(default(TResult));

        } else if (!(completedTask is Task<TResult>)) {     // Wrong result
            completionSource.SetException(new InvalidOperationException(
                "Asynchronous iterator " + taskEnumerator +
                    " requires a final result task of type " + typeof(Task<TResult>).FullName +
                    (completedTask == null ? ", but none was provided." :
                        "; the actual task type was " + completedTask.GetType().FullName)));

        } else {
            completionSource.SetResult(((Task<TResult>) completedTask).Result);
        }

    } else {
        // When the nested task completes, continue by performing this function again.
        taskEnumerator.Current.ContinueWith(
            nextTask => ToTaskDoOneStep(taskEnumerator, taskScheduler, completionSource, nextTask),
            taskScheduler);
    }
}
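
One detail worth isolating is that the final version passes completedTask.Exception.InnerExceptions to SetException(), where the earlier version passed completedTask.Exception itself. The sketch below, with hypothetical names, compares the two calls on a TaskCompletionSource and reports what a consumer of the master task finds one level inside its AggregateException.

```csharp
using System;
using System.Threading.Tasks;

public static class ExceptionPropagationDemo
{
    private static Task FaultedNestedTask()
    {
        var source = new TaskCompletionSource<object>();
        source.SetException(new InvalidOperationException("boom"));
        return source.Task;
    }

    // Passing the AggregateException itself nests it one level deeper.
    public static string PropagateWhole()
    {
        Task nested = FaultedNestedTask();
        var master = new TaskCompletionSource<object>();
        master.SetException(nested.Exception);
        return master.Task.Exception.InnerException.GetType().Name;
    }

    // Passing the inner exceptions keeps the master task's exception flat.
    public static string PropagateInner()
    {
        Task nested = FaultedNestedTask();
        var master = new TaskCompletionSource<object>();
        master.SetException(nested.Exception.InnerExceptions);
        return master.Task.Exception.InnerException.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(PropagateWhole());    // AggregateException
        Console.WriteLine(PropagateInner());    // InvalidOperationException
    }
}
```

With the flat form, callers of a SLAM see the same exception shape they would get from the nested task directly, no matter how many SLAMs are layered on top of it.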

Await Within a Try/Catch Block

In C#5 async methods you can await a task within the try block of a try-catch; its state machine supports that scenario. The C#2 iterator state machine we are using here does not allow yield return within such a try block (it is permitted only in a try that has a finally and no catch); therefore we cannot easily perform our awaits in equivalent locations. Handling the general case with multiple or nested try-catch blocks will require some manual effort.

There is an easy way to deal with one common special case, where the try-catch encompasses the entire method. The private iterator contains just the try body. In the master asynchronous method, attach a continuation after the ToTask() call and handle the exception there. It would look something like this:

public Task<TheResult> DoSomethingAsync(TheArgs args) {
    return DoSomethingAsyncTasks(args).ToTask<TheResult>().ContinueWith(t => {
        try {
            if (t.IsFaulted)
                throw t.Exception.Flatten().InnerException;

        } catch (Exception ex) {    // One per handled exception.
            // Handle it.
        } finally {
            // Wrap up, if you need it.
        }
        // t is returned unchanged, so a fault still reaches the caller after the catch runs;
        // return a different task here if the handler should substitute a result.
        return t;
    }).Unwrap();
}
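
The ContinueWith(...).Unwrap() combination deserves a closer look, because the lambda returns a task and ContinueWith() therefore produces a Task<Task<T>>. The following sketch is a variation on the pattern, not the article's code: WithFallback, Completed and the recover delegate are hypothetical, and the handler recovers from a fault by substituting a value instead of passing the faulted task through.

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    private static Task<int> Completed(int value)
    {
        var source = new TaskCompletionSource<int>();
        source.SetResult(value);
        return source.Task;
    }

    // Returns the operation's own outcome, or a recovered value if it faulted.
    public static Task<int> WithFallback(Task<int> operation, Func<Exception, int> recover)
    {
        // The lambda returns a Task<int>, so ContinueWith yields a Task<Task<int>>.
        Task<Task<int>> nested = operation.ContinueWith(t => {
            if (t.IsFaulted)
            {
                // Reading t.Exception marks the fault as observed.
                return Completed(recover(t.Exception.InnerException));
            }
            return t;   // Success or cancellation passes through untouched.
        });

        // Unwrap() collapses the two levels into one task that mirrors the inner task.
        return nested.Unwrap();
    }

    public static void Main()
    {
        var failing = new TaskCompletionSource<int>();
        failing.SetException(new InvalidOperationException("boom"));
        Console.WriteLine(WithFallback(failing.Task, ex => -1).Result);
        Console.WriteLine(WithFallback(Completed(7), ex => -1).Result);
    }
}
```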

For arguments and locals used in both the try and catch blocks, create a private nested class with them as public fields. Create an instance, copy the arguments, and pass an instance to the iterator. This is some of the magic done for us by the state machine builders, but here we must do it ourselves.

public Task<TheResult> DoSomethingAsync(TheArgs args) {
    var locals = new DoSomethingAsyncLocals();
    locals.Fields = args;           // Copy each argument into a field of the nested class.
    return DoSomethingAsyncTasks(locals).ToTask<TheResult>().ContinueWith(t => {
        //...as above...

To handle the general case, break each try body with a catch into its own AsyncTasks method. Where the try-catch appears, yield return that method's iterator, converted to a task, with a continuation, with result unwrapped as shown above. Alas, this loses the simplicity of async methods. If only iterators supported yield return from a try with a catch! Fortunately, many if not most real-world scenarios can be implemented without having to create a nested class for locals. In 9 months using the technique described in this article, I haven't needed to do that even once. (I have only used the whole-method exception handling pattern shown above.)

Voila! Now you can write SLAMs (synchronous-looking asynchronous methods) in Visual Studio 2010 with C#4 (or VB.NET 2010), where async and await are not supported.

About the Download Sample

The downloaded project contains two infrastructure files which you can compile into your assembly to support asynchronous programming in .NET 4: TaskEx.cs has the methods developed in this article; AsyncIoEx.cs provides some methods added in .NET 4.5 to support asynchronous stream and web operations. (It is surely a simple matter to translate them for use in VB.NET 2010.)

As examples, MainWindow.xaml.cs implements two asynchronous methods as described in this article, and it makes productive use of a continuation in an event handler. The sample is derived from an Async/Await Walkthrough project. For an exercise, remove the ToTask() methods and try re-implementing the asynchronous methods only with task continuations or other callbacks. If the process is linear and all waits are at the top level, the code is ugly but not too difficult to write. As soon as the needed await falls within a nested block, it becomes virtually impossible to keep the same semantics and remain asynchronous (i.e. to never Wait() on a Task) without using the method described in this article.

Points of Interest

Up until the final versions, I was passing a CancellationToken into ToTask() and propagating it into the ToTaskDoOneStep() continuations. (It was irrelevant noise for this article, so I removed it.) This was for two reasons. First, when handling OperationCanceledException I would check its CancellationToken to be sure it matched the one for this operation. If not, it would be a fault instead of a cancellation. While technically correct, it's so unlikely that cancellation tokens would get mixed up that it wasn't worth the trouble of passing it into the ToTask() call and between continuations. (If you Task experts can give me a good use case in the comments where this might validly happen, I'll reconsider.)

The second reason was that I could check if the token was canceled before each MoveNext() call into the iterator, cancel the master task immediately, and exit the process. This provides cancellation behavior without your iterator having to check the token. I am now convinced it is the wrong thing to do, since cancellation at some given yield return may be inappropriate for an asynchronous process; better, perhaps, that it's fully under the iterator's control. But I wanted to try it. It didn't work. I found that in some cases, the task would cancel and its continuations would not fire. In the sample code I'm depending on a continuation to re-enable the buttons, but it wasn't happening reliably, so sometimes the buttons were left disabled after the process was canceled. (If any Task experts can explain why this problem occurs, I'll appreciate it!)
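
For readers curious about the first reason, the token-matching check could look roughly like the sketch below. It is a reconstruction under assumptions, not the removed code: Classify, OwnToken and ForeignToken are made-up names, and the only API relied on is the CancellationToken property of OperationCanceledException.

```csharp
using System;
using System.Threading;

public static class CancellationTokenMatchDemo
{
    // Reports "canceled" only when the exception carries the token we were given;
    // a cancellation raised for some other token is treated as a fault.
    public static string Classify(Action body, CancellationToken expected)
    {
        try
        {
            body();
            return "completed";
        }
        catch (OperationCanceledException ex)
        {
            return ex.CancellationToken == expected ? "canceled" : "faulted";
        }
    }

    public static string OwnToken()
    {
        var source = new CancellationTokenSource();
        source.Cancel();
        return Classify(() => source.Token.ThrowIfCancellationRequested(), source.Token);
    }

    public static string ForeignToken()
    {
        var ours = new CancellationTokenSource();
        var theirs = new CancellationTokenSource();
        theirs.Cancel();
        return Classify(() => theirs.Token.ThrowIfCancellationRequested(), ours.Token);
    }

    public static void Main()
    {
        Console.WriteLine(OwnToken());      // canceled
        Console.WriteLine(ForeignToken());  // faulted
    }
}
```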

History

2012-12-06

Initial version

2012-12-11

Added "Differences from Async/Await" section

2013-08-29

Replaced "Differences from Async/Await" with "Await Within a Try/Catch Block" section

Updated ToTask implementation

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
