Easy Programming for Multi-Core Processors

12 Feb 2008
Multi-core processors are becoming ubiquitous, but due to the complexities of multithreaded programming few programmers exploit their potential. Jibu is a library for .NET, C++, Java and Delphi that makes concurrent and parallel programming easy for experts and beginners alike.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Multi-core computers are becoming ubiquitous, and almost every new laptop, desktop or server machine is equipped with multiple cores. The shift from single-core to multi-core is mainly due to the difficulty of scaling processors to ever higher clock speeds, but the impact on the programming community is enormous. Gone are the days when new processor generations automatically made existing applications run faster; the free lunch is over.

On the other hand, the unprecedented rise in computing power afforded by simply adding cores to a chip also gives application developers new opportunities, provided they can harness the power of multiple cores in their applications. There are plenty of reasons why developers should consider writing applications that exploit multiple cores:

  1. Get the work done faster or better. In computationally heavy areas like multimedia, AI, optimization, scientific computing, game programming, data mining etc. an application exploiting all available power will either get the work done faster or be able to deliver better results in the same time.
  2. In areas where lots of tasks are being run concurrently like embedded systems, robotics, GUI apps etc. exploiting multiple cores will yield more responsive and more effective systems.
  3. In apps that currently have no need for extra power, new features might be feasible with the extra processing power available on a multi-core machine.
  4. The competitive advantage. When your clients know that the applications they buy are designed not only to run well on the dual-core or quad-core machines of today but also to scale well to the 16-core machines of tomorrow, they will rest easy.

The Barriers to Multithreading

The standard method for writing programs that can exploit multiple cores is multithreading. Multithreading has a reputation, and rightly so, for being very complicated and error-prone compared to normal single-threaded programming. Problems such as deadlocks, livelocks and data races only occur in multithreaded programs and can be extremely hard to identify, reproduce and fix.

At the same time, multithreaded programming is usually done with a set of very low-level constructs to manage threads, ensure exclusive access to data and synchronize different threads. These constructs are not even standardized, which means that multithreaded code is very hard to port to different platforms.

Introducing Jibu

Jibu is a library for .NET, Java, C++ and Delphi that makes multithreaded programming easy for experts and beginners alike.

It is equally suited to concurrent applications like robotics and embedded systems and to traditional parallel programs where it is essential that programs exploit the full processing power of the computer.

Jibu lets the programmer focus on what he wants done instead of how he wants it done. In most cases the programmer never has to think in terms of threads, monitors, locks, mutexes, semaphores, condition variables and signals, but can use a handful of high-level Jibu constructs to accomplish the same tasks with less code written and fewer errors made.

Central to Jibu is an advanced work-stealing scheduler that ensures dynamic load balancing between the available cores at runtime. The programmer never deals directly with the Jibu scheduler but can concentrate on specifying the tasks to be executed. Tasks are able to exchange data and synchronize using either channels or mailboxes. At the topmost level, Jibu features constructs that make parallel for-loops and reductions very easy to implement.
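To make the idea of work stealing concrete, here is a minimal sketch in Java, one of the languages Jibu targets. It uses the JDK's own `ForkJoinPool`, which is a work-stealing pool, and not Jibu itself; the class name `ForkJoinSumSketch` and the cut-off value are illustrative assumptions, and nothing here should be read as Jibu's API or implementation.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Illustrative only: the JDK fork/join framework, not the Jibu scheduler.
final class ForkJoinSumSketch {
    // Ranges at or below this size are summed directly (arbitrary cut-off).
    static final int THRESHOLD = 10_000;

    static final class SumTask extends RecursiveTask<Long> {
        private final int[] data;
        private final int from; // inclusive
        private final int to;   // exclusive

        SumTask(int[] data, int from, int to) {
            this.data = data;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += data[i];
                }
                return sum;
            }
            int mid = from + (to - from) / 2;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                     // queued on this worker; an idle worker may steal it
            long rightSum = right.compute(); // meanwhile, work on the other half ourselves
            return left.join() + rightSum;
        }
    }

    static long sum(int[] data) {
        // Default parallelism is the number of available processors.
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        java.util.Arrays.fill(data, 1);
        System.out.println(sum(data));
    }
}
```

The caller only describes how the work splits into tasks; which core runs which piece is decided at runtime by the pool, which is the division of labour the paragraph above describes.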

Making Programs Parallel

The best way to get an application to exploit concurrency is to rewrite it completely. This requires a lot of time and effort, so many companies choose to shift their codebase gradually in order to support multiple cores. The high-level constructs in Jibu make it very easy to introduce concurrency into existing sequential programs by using the For, ForEach and Reduce constructs. To illustrate the parallel for-loop take a look at the following piece of C# code:

// Multiply square matrices
public static double[,] SeqMult(double[,] m1, double[,] m2)
{
    int size = m1.GetLength(0);
    double[,] m3 = new double[size, size];

    for (int i = 0; i < size; i++)
        for (int j = 0; j < size; j++)
            for (int k = 0; k < size; k++)
                m3[i,j] += m1[i, k] * m2[k, j];
    return m3;
}

The code multiplies two square matrices using three nested for-loops. On a quad-core machine this piece of code would only utilize a single core resulting in poor performance. Using the Jibu for-loop the corresponding code would look like this:

// Multiply square matrices
public static double[,] ParMult(double[,] m1, double[,] m2)
{
    int size = m1.GetLength(0);
    double[,] m3 = new double[size, size];

    Parallel.For(0, size, delegate(int i)
    {
        for (int j = 0; j < size; j++)
            for (int k = 0; k < size; k++)
                m3[i,j] += m1[i, k] * m2[k, j];
    });

    return m3;
}

Notice that the outermost for-loop is replaced with the Jibu Parallel.For, which takes a delegate; the rest of the code is identical to the sequential version. So how about performance? The graph below shows the execution time for the sequential and parallel versions of the matrix multiplication. The test was done on a quad-core machine. As the size of the matrices increases, so does the speedup achieved using Jibu. Getting a speedup of more than 3.6 by replacing a single line of code is not bad.

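[Figure: execution time of the sequential and parallel matrix multiplication on a quad-core machine (image001.gif)]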
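For readers who want to try the same transformation without Jibu, the following is a rough Java equivalent built on the JDK's parallel streams. It is a sketch under stated assumptions, not Jibu's For or Reduce: the class name `ParallelLoopSketch` and the `sum` reduction are illustrative additions, and no timing claims are made for it.

```java
import java.util.stream.IntStream;

// Illustrative only: JDK parallel streams, not Jibu's Parallel.For or Reduce.
final class ParallelLoopSketch {
    // Multiply square matrices; each parallel iteration owns one row of the result.
    static double[][] parMult(double[][] m1, double[][] m2) {
        int size = m1.length;
        double[][] m3 = new double[size][size];
        IntStream.range(0, size).parallel().forEach(i -> {
            for (int j = 0; j < size; j++) {
                for (int k = 0; k < size; k++) {
                    m3[i][j] += m1[i][k] * m2[k][j];
                }
            }
        });
        return m3;
    }

    // A reduction: per-row partial sums are computed in parallel, then combined.
    static double sum(double[][] m) {
        return IntStream.range(0, m.length).parallel()
                .mapToDouble(i -> {
                    double rowSum = 0;
                    for (double v : m[i]) {
                        rowSum += v;
                    }
                    return rowSum;
                })
                .sum();
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}};
        double[][] b = {{5, 6}, {7, 8}};
        double[][] c = parMult(a, b);
        System.out.println(java.util.Arrays.deepToString(c));
        System.out.println(sum(c));
    }
}
```

The loop parallelizes safely for the same reason the C# version does: iteration `i` writes only to row `i` of the result, so no two iterations touch the same element and no locking is needed.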

Task Based Parallelism and the Jibu Scheduler

Tasks are the central concept when writing programs using Jibu. The two types of tasks in Jibu are called Async and Future. The only difference between them is that a Future returns a value when done whereas an Async does not. Here is what an Async looks like:

class DemoTask : Async
{
    public override void Run()
    {
        for (int i = 0; i < 100000; i++)
        {
            // Do something
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        // Create a million tasks and hand them all to the scheduler
        DemoTask[] tasks = new DemoTask[1000000];
        for (int i = 0; i < tasks.Length; i++)
            tasks[i] = new DemoTask();
        Parallel.Run(tasks);
    }
}

The small sample creates an Async task called DemoTask that executes a for-loop. The Main method creates a million new DemoTasks and executes them using the Parallel.Run construct from Jibu. Notice that no threading or synchronization constructs are used in the code, and yet this code will execute as fast as or faster than a much more complex thread-based program.

The task scheduler continuously monitors the number of tasks lined up for execution, the number of threads currently executing tasks, the number of blocked tasks and the number of available cores in the system.

All the information is analyzed and the scheduler steps in when necessary to ensure a balanced workload, an optimum number of threads and a responsive program. The Jibu scheduler works in conjunction with the Jibu thread pool to ensure that threads are started when needed and stopped when they are no longer required.

If the example program with the 1 million DemoTasks were run on a dual-core machine, only two threads would be created to execute the tasks, whereas eight threads would be created on an eight-core machine.
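The relationship between many small tasks and a handful of threads can be sketched in Java with the JDK's `ExecutorService`. This is an analogy rather than Jibu code: the class name `TaskPoolSketch` is hypothetical, the pool is simply fixed at one thread per core, and it does not start and stop threads dynamically as the Jibu scheduler is described as doing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative only: a JDK thread pool, not the Jibu scheduler or its task types.
final class TaskPoolSketch {
    // Runs `count` small value-returning tasks on one thread per core and sums the results.
    static long runTasks(int count) {
        int cores = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int t = 0; t < count; t++) {
                final long id = t;
                tasks.add(() -> id * 2); // a task that returns a value when done
            }
            long total = 0;
            // invokeAll waits for every task; each Future then holds one result.
            for (Future<Long> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runTasks(1000));
    }
}
```

However many tasks are submitted, the number of threads stays at the number of cores; the tasks queue up and are picked off by whichever thread becomes free.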

More information about the Jibu scheduler

Task Communication

One of the biggest advantages of Jibu compared to standard threading and other high-level libraries is the simplicity with which different tasks can exchange data, coordinate events and synchronize with each other.

Only three different Jibu constructs (Channel, MailBox & Choice) are required in order to write concurrent programs with arbitrarily complex communication and synchronization patterns.

Every task has an integrated mailbox in which other tasks can put data of any type without worrying about synchronization or locking. Mailboxes are buffered and allow for easy one-to-one communication between tasks.
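The mailbox idea can be approximated in plain Java by giving each worker its own buffered queue. The sketch below is an assumption-laden illustration, not Jibu's MailBox: `MailboxSketch`, `Worker`, `send`, `close` and the `STOP` sentinel are all hypothetical names, and the buffering comes from the JDK's `LinkedBlockingQueue`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Illustrative only: a per-worker queue from the JDK, not Jibu's MailBox.
final class MailboxSketch {
    static final class Worker extends Thread {
        private static final Object STOP = new Object(); // hypothetical shutdown sentinel
        private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
        private final List<Object> received = new ArrayList<>();

        // Senders never block: the mailbox is an unbounded buffer.
        void send(Object message) {
            mailbox.add(message);
        }

        void close() {
            mailbox.add(STOP);
        }

        @Override
        public void run() {
            try {
                // take() sleeps until a message arrives, so no CPU is spent waiting.
                for (Object m = mailbox.take(); m != STOP; m = mailbox.take()) {
                    received.add(m);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        // Only safe to call after join(), once the worker has finished.
        List<Object> received() {
            return received;
        }
    }

    static List<Object> demo() {
        Worker worker = new Worker();
        worker.start();
        worker.send("hello"); // messages of any type
        worker.send(42);
        worker.close();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return worker.received();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The sender's code contains no lock or monitor; all synchronization is hidden inside the queue, which is the property the mailbox description emphasizes.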

Channels abstract low-level synchronization mechanisms and provide an extremely simple way for multiple tasks to exchange data. There is no need to be concerned with traditional constructs like locks, mutexes, semaphores, critical regions or condition variables.
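As a rough illustration of channel-style communication, the Java sketch below lets two producer threads hand values to one consumer through a JDK `SynchronousQueue`. Whether Jibu channels are unbuffered like this is an assumption made for the example; `ChannelSketch`, `producer` and the `DONE` marker are hypothetical names, not part of Jibu.

```java
import java.util.concurrent.SynchronousQueue;

// Illustrative only: a JDK rendezvous queue standing in for a channel.
final class ChannelSketch {
    private static final int DONE = -1; // hypothetical end-of-stream marker

    // Starts a thread that sends 1..5 over the channel, then the DONE marker.
    static Thread producer(SynchronousQueue<Integer> channel) {
        Thread t = new Thread(() -> {
            try {
                for (int i = 1; i <= 5; i++) {
                    channel.put(i); // blocks until the consumer takes the value
                }
                channel.put(DONE);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    // Receives from both producers until each has signalled DONE; returns the total.
    static int demo() {
        SynchronousQueue<Integer> channel = new SynchronousQueue<>();
        Thread first = producer(channel);
        Thread second = producer(channel);
        int total = 0;
        try {
            int finished = 0;
            while (finished < 2) {
                int value = channel.take();
                if (value == DONE) {
                    finished++;
                } else {
                    total += value;
                }
            }
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Neither side names a lock or a condition variable; `put` and `take` are the only coordination points, and the order in which the two producers interleave does not affect the total.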

The Choice construct makes it easy to coordinate communication between multiple tasks. Choice enables both fair and prioritized communication between multiple tasks, eliminating the risk of starvation. It also ensures that CPU cycles are not wasted while waiting for specific events to occur.
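One way to picture a prioritized choice is a receiver that sleeps until any of several inputs has a message and then takes from the highest-priority one. The Java sketch below builds that from JDK primitives only; `ChoiceSketch`, `send` and `select` are hypothetical names chosen for illustration and are not Jibu's Choice API. It shows the prioritized variant only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;

// Illustrative only: a prioritized select over several inputs, not Jibu's Choice.
final class ChoiceSketch<T> {
    private final List<Queue<T>> inputs = new ArrayList<>();
    private final Semaphore pending = new Semaphore(0); // one permit per unread message

    ChoiceSketch(int inputCount) {
        for (int i = 0; i < inputCount; i++) {
            inputs.add(new ConcurrentLinkedQueue<>());
        }
    }

    // Called by sender threads; input 0 has the highest priority.
    void send(int input, T message) {
        inputs.get(input).add(message);
        pending.release(); // wakes a receiver blocked in select()
    }

    // Sleeps (no spinning) until some input has a message, then returns the
    // message from the highest-priority non-empty input.
    synchronized T select() throws InterruptedException {
        pending.acquire();
        for (Queue<T> queue : inputs) {
            T message = queue.poll();
            if (message != null) {
                return message;
            }
        }
        throw new IllegalStateException("permit acquired but no message found");
    }

    static List<String> demo() {
        ChoiceSketch<String> choice = new ChoiceSketch<>(2);
        Thread sender = new Thread(() -> {
            choice.send(1, "low-1");
            choice.send(0, "high");
            choice.send(1, "low-2");
        });
        List<String> order = new ArrayList<>();
        try {
            sender.start();
            sender.join(); // all three messages are now waiting
            for (int i = 0; i < 3; i++) {
                order.add(choice.select());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Strict priority on its own can starve the lower-priority inputs if input 0 is never empty; a fair variant could, for instance, rotate the index at which the scan starts after each `select`.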

The structure of a Jibu program using tasks, channels, choices and mailboxes is shown below:

[Figure: structure of a Jibu program using tasks, channels, choices and mailboxes (Jibu_diagram.gif)]

For a more detailed description of mailbox, channel and choice please visit the Jibu documentation center.

Jibu on Multiple Platforms

If you have ever had the misfortune of porting a concurrent or parallel application from one programming language or operating system to another, you know that although threading and synchronization constructs are available on most platforms and in most programming languages, they differ widely in syntax and semantics.

Jibu is currently available for .NET 2.0 or later, Java 5.0 or later, C++ for Windows and Delphi 2007 for Windows. C++ support for Linux and Solaris is in the works and support for additional languages is likely.

The Jibu API is as uniform as possible across the supported languages and platforms, which means that the concurrent and parallel parts of a Jibu application are easily ported to other Jibu supported languages and platforms.

Another advantage of the uniform API is that developers can prototype in one language and subsequently port easily to other languages. A developer proficient in C# might for example do a prototype of a Jibu application and then port it to C++, Delphi and Java without knowing any of the threading constructs native to those languages.

Download Jibu

A free trial download of Jibu, intended only for non-commercial use, is available at the download center.

The commercial versions of Jibu are available here.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.