PLINQ is Coming Up Soon (PFX)

24 Oct 2007, CPOL
Multicore hardware is spreading fast, but programming for multicore is not an easy job. Parallel FX (PFX) comes to the rescue.

Introduction

I've been hearing about PLINQ (Parallel LINQ) since LINQ was first announced. The idea is to use the new functional programming style provided in .NET 3.5 to get better performance on multicore machines. It sounded cool from the first day, and it is now coming true under a new name: Parallel FX, or PFX.

Note: At the time this article was written, Microsoft had not released any version of PFX. The article just gives you a vision of the future, collected from many sources inside and outside Microsoft. Once the CTP is released, the article will be updated, so keep the link in your bookmarks.

Background

LINQ (Language Integrated Query) is a new feature in C# 3.0, VB 9 and .NET 3.5 that makes queries a first-class citizen in the next versions of the .NET programming languages. The idea is to provide a better abstraction for the way programs handle data, so that the compiler and runtime can help in many ways. One of these ways is optimizing for performance.

PLINQ is a way to make your code take advantage of multicore machines (which are already common and will only become more so) without explicitly defining threads, locks, etc.

The programming model is quite simple and reuses the same LINQ model. The new assembly is called System.Concurrency.dll, and it contains the new interface IParallelEnumerable<T>. It also adds an extension method to all collections and arrays that implement the existing IEnumerable<T>. The extension method is called AsParallel<T>, and it converts any collection to a parallel-enabled collection of type IParallelEnumerable<T>.

Using the Code

Take a look at the following code:

C#
// or whatever complex data collection
IEnumerable<int> data = new int[] {1, 2, 3, 4, 5, 6}; 
var q = data.Where(x => x > 4).OrderBy(x=>x).Select(x => x);
foreach (var i in q) ....

This is code you can already write in C# 3.0. To add PLINQ support, all you have to do is call AsParallel before any query operator, so the previous code becomes:

C#
IEnumerable<int> data = new int[] {1, 2, 3, 4, 5, 6}; // or whatever complex data collection
var q = data.AsParallel().Where(x => x > 4).OrderBy(x=>x).Select(x => x);
foreach (var i in q) ....

Or you can write it in the LINQ query style:

C#
IEnumerable<int> data = new int[] {1, 2, 3, 4, 5, 6}; // or whatever complex data collection
var q = from i in data.AsParallel()
        where i > 4
        orderby i
        select i;
foreach(var i in q) .... 
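
To make the snippets above concrete, here is a minimal, complete sketch of the same query. PFX was not released when this article was written, so the sketch assumes the API in the form it later shipped with .NET Framework 4, where AsParallel lives in the System.Linq namespace. The names PlinqQueryDemo and FilterAndSort are hypothetical and used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq; // assumption: AsParallel as shipped in .NET 4, not System.Concurrency

public static class PlinqQueryDemo
{
    // Returns the elements greater than 'threshold', sorted ascending.
    // AsParallel() opts the rest of the query into parallel execution.
    public static List<int> FilterAndSort(IEnumerable<int> data, int threshold)
    {
        return data.AsParallel()
                   .Where(x => x > threshold)
                   .OrderBy(x => x)
                   .ToList();
    }

    public static void Main()
    {
        IEnumerable<int> data = new int[] { 1, 2, 3, 4, 5, 6 };
        foreach (int i in FilterAndSort(data, 4))
        {
            Console.WriteLine(i); // prints 5, then 6
        }
    }
}
```

For a six-element array the parallel version will not be faster than the sequential one; the benefit only appears with larger inputs or more expensive predicates.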

Once you've added the AsParallel call, PLINQ will transparently execute all the OrderBy, Where, Select, GroupBy, etc. operators on all the available processors. You don't need to explicitly create threads, take locks, or manage concurrent execution (unless you are building something big). This doesn't mean that you can't use the power of PLINQ on anything other than queries. The ParallelEnumerable class also adds some extra extension methods, such as ForAll. ForAll is useful when you want to apply some operation to every member of a collection: it performs that operation in parallel across all the members.

C#
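IEnumerable<int> data = new int[] {1, 2, 3, 4, 5, 6}; // or whatever complex data collection
// ForAll is defined on the parallel-enabled collection, so convert with AsParallel first
data.AsParallel().ForAll(i => Console.WriteLine(i));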

The previous code sample prints all the members of the array, though not necessarily in their original order, since the calls run in parallel. If you imagine calling a more complex function that does some heavy work on each array member, ForAll lets you finish the job faster by processing the data in parallel.

This is not all that is new in the System.Concurrency library. The new Parallel class is also a nice addition. It provides general-purpose parallel execution that is not tied to LINQ. The most important part is the Parallel.For function which, as you would expect from the name, executes a loop in parallel. Check the following code:

C#
void ParMatrixMult(int size, double[,] m1, double[,] m2, double[,] result)
{
  Parallel.For( 0, size, delegate(int i) {
    for (int j = 0; j < size; j++) {
      result[i, j] = 0;
      for (int k = 0; k < size; k++) {
        result[i, j] += m1[i, k] * m2[k, j];
      }
    }
  });
}

This is an example I got from here. It illustrates matrix multiplication using Parallel.For. As you can see, Parallel.For accepts the start index and the end index (exclusive, which here equals the length because the loop starts at 0), followed by a delegate to execute for each index. There is also the Parallel.Aggregate function, which can be used to safely aggregate a value over a parallel loop. This is all that I could write in one post; however, System.Concurrency contains more cool APIs.
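
As a rough illustration of what safe aggregation over a parallel loop involves, here is a minimal sketch. It does not use Parallel.Aggregate, whose final shape was not known at the time of writing. Instead, it assumes the Parallel.For overload with thread-local state that later shipped in System.Threading.Tasks with .NET Framework 4. The names ParallelSumDemo and Sum are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks; // assumption: Parallel as shipped in .NET 4

public static class ParallelSumDemo
{
    // Sums the array in parallel. Each worker accumulates into its own
    // local subtotal, and the subtotals are merged with Interlocked.Add,
    // so the shared total is never updated by an unsynchronized write.
    public static long Sum(int[] values)
    {
        long total = 0;
        Parallel.For<long>(
            0, values.Length,                                 // start (inclusive), end (exclusive)
            () => 0L,                                         // initial subtotal for each worker
            (i, loopState, subtotal) => subtotal + values[i], // per-index work, no shared writes
            subtotal => Interlocked.Add(ref total, subtotal)  // merge once per worker
        );
        return total;
    }

    public static void Main()
    {
        int[] values = new int[1000];
        for (int i = 0; i < values.Length; i++) values[i] = i + 1;
        Console.WriteLine(Sum(values)); // prints 500500
    }
}
```

Writing `total += values[i]` directly inside the loop body would be a data race. Keeping a subtotal per worker and merging at the end avoids both the race and the cost of synchronizing on every iteration.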

Points of Interest

PLINQ != PFX. PLINQ was just an early vision of what LINQ can bring to software development. PFX is a bigger concept, of which PLINQ is just a subset; it comes with lots of general-purpose APIs that help with the different concurrency problems you might face.

Resources

I originally wrote this article on my blog. Here are some resources for further reading:

  1. Optimize Managed Code for Multi-Core Machines
  2. Running Queries on Multi-Core processors
  3. Channel9 Video: Programming in the Age of Concurrency (Anders Hejlsberg and Joe Duffy)

History

  • 24th October, 2007: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)