Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / Java

Think Like a Bird for Better Parallel Programming

4.71/5 (12 votes)
12 Feb 2018CPOL12 min read 13.9K  
Avian Computing provides new tools for thinking and designing parallel programs

Introduction

Coding an application to run in parallel is hard, right? I mean, it must be hard or we’d see parallel programs everywhere. All we'd see are slick parallel apps that use every available core effortlessly. Instead multi-threaded apps are the exception instead of the rule.

It seems that there are two major obstacles to writing parallel programs:

  • Learning the constructs and/or conventions for parallel programming available in your chosen language

  • Visualizing how your parallel program will function

The first item seems obvious: take an afternoon off and learn the parallel features in your chosen programming language and then away you go - parallel programs will start popping out of your compiler. Except that afternoon usually turns into a couple of days which usually turns into a much longer period while the implications, complications, and ramifications of the parallel features of your chosen language are tamed.

The second item seems like it should be almost too trivial to mention. After all, the first step in developing a new program is to visualize its major components and how they will operate. Except we usually visualize our new programs as sequential code components that we’ll “bolt on” some parallel stuff onto later (if we have to). And it’s that mental framework puts us on the wrong path from the beginning.

Instead we need to think about programs like birds. Or rather, flocks of birds.

Background

The Avian Computing project was started as a way to improve parallel programming by improving how we think about parallel programming. Avian Computing encourages us to visualize the parallel program as a flock of birds where each bird (thread) acts independently and asynchronously but they work concurrently as a flock to accomplish the program’s goals.

We are all familiar with birds: they are hatched, they fly around looking for food, they lay eggs and hatch their own birds, and then they die. Avian Computing translates these basic bird behaviors into a coding framework that developers can leverage to rapidly prototype working parallel programs. The depth-of-knowledge and insights gained from building this working prototype can streamline the development of the final software product.

Implementation of Avian Computing

The Concurrency Explorer (ConcX) is a free open source implementation of the concepts of Avian Computing written in Java. ConcX provides the user with a GUI screen where new birds can be created and configured. Once configured, the bird(s) can be launched and their activities monitored on progress bars. Groups of configured birds can be saved as a “flock” that can be reloaded and run as desired.

Image 1Once hatched (started), each bird (thread) follows a standard lifecycle; it looks for food, digests any matching food found, stores any resulting food and then takes a little nap. Birds that were unsuccessful in finding food within their configured Stamina time limit will die from starvation. Birds that live too long will die of old age. Birds that eat successfully enough to meet a configurable setting will reproduce (duplicate themselves). This standard lifecycle allows the complexities of thread management to be handled with a simple and natural vocabulary that is conceptually understood almost intuitively.

ConcX depends on the Linda coordination language to put and retrieve objects from shared virtual associative memory known as tuplespaces. Linda originated in 1986 by Sudhir Ahuja, David Gelernter, and Nicholas Carriero. Linda was the basis for several major products, including Sun’s JavaSpaces, IBM’s TSpaces, and many more.

ConcX uses a simplified version of Linda tuplespaces called the TupleTree. Birds typically are configured to eat and store food pods in the TupleTree. For example, a RedPod can be eaten by any bird that was configured to eat RedPods and will be ignored by all other birds. After digesting its food pod, a bird will store the processed object back in the TupleTree, typically as a different kind of food pod (for example BluePod) that a different bird can eat.

Linda (and its derivative, the TupleTree) provides a safe, simple and robust message-passing mechanism. In ConcX, the TupleTree allows objects to be shared without any special code or considerations in the code that users write because the TupleTree synchronizes (locks or otherwise provides exclusive access to) all methods that interact with its underlying data store.

Making the TupleTree responsible for all locking ensures that any food pod (object) that a bird receives is owned exclusively by that bird without cluttering up user code with locking and mutexes, etc. This means that any bird that has received a food pod is free to make any changes to that food pod without contention or interference from any other birds. After making any necessary changes, that bird can put its food pod back into the TupleTree where a different kind of bird will eat it and make its changes.

A Simple Example of Thinking in Parallel the Avian Way

The Addx scenario provided with ConcX is a straightforward example of how the concepts of Avian Computing can produce simpler parallel code. The goal of the Addx scenario is to process a sequence of values by performing a series of math operations on each value until a final value is calculated and then saved. The math operations must always be performed in the same sequence.

In standard, sequential coding, we would visualize a loop that begins by getting the next value, performing the first math operation, then the second math operation, then the third math operation, etc, until the final value has been calculated, the value is saved and then the loop repeats. It seems relatively simple and will run as fast as is possible. Its maximum throughput will depend on the speed of the one processor executing its single thread.

To process more values more quickly, it is desireable to use multiple processors and that's when it gets tricky. The normal solution is to create a pool of threads with each thread performing the sequential code described above. But that complicates processing because the parallel code must insure that the input values are processed by only one thread and no input values are skipped, while guaranteeing that deadlocks and livelocks don't happen.  It is extremely difficult to imagine all possilbe ways that the threads can interfere with each other - too many times new and surprising failure modes are demonstrated at the client's site. Not to mention having to provide multiple versions of the app pre-configured and compiled for each run time environment (laptop vs mainframe, etc).

The Addx (Avian) solution is configure an arbitrarily large flock of birds where each bird eats only one type of food and performs only one math operation on that food before putting it back as a different type of food. By properly configuring the foods eaten and stored, the proper order of math operations is insured. For example, Bird1 eats Food1, performs its math operations on it, and stores it as Food2.  Bird2 eats Food2, performs its math operations on it, and stores it as Food3.  Bird3 eats Food3, and so on. This could also be expressed more concisely as:

Food1-->Bird1-->Food2-->Bird2-->Food3-->Bird3. . . .Foodn-->Birdn.

The following simplified diagram illustrates Avian parallelism.

  • In Lifecycle 1 below, all five birds start flying at about the same time but only Add1Bird finds any food. It performs its operation on the value and then puts it back into the TupleTree as a kind of food that only the Add2Bird eats.
  • In Lifecycle 2, both Add1Bird and Add2Bird find food so they both perform their respective operations and then store their foods back in the TupleTree. 
  • In Lifecycle 3, Add1Bird, Add2Bird, and Add3Bird all find their kind of food, process them, and put them back into the TupleTree.

Image 2

By about the fifth loop thru their life cycle (find food, digest it, store it, and nap) all five birds are concurrently eating from the TupleTree, processing their pod, and storing their updated pod back in the tree. Only 5 birds are diagrammed above, but it is simple to imagine scaling up this diagram to include 20 or 50 or 100 birds that would all be running (flying) concurrently, all following the same simple and natural pattern. And just like real birds, if any bird looks for food and doesn't find it, it just waits a little while and tries again. 

IMPORTANT: the birds do NOT operate in lockstep as shown in this simplified diagram.  Each bird lives its lifecycle at its own individual rate because every time it naps, it sleeps for a random length of time (within configurable limits). This means that a bird that randomly takes short naps will more quickly complete its lifecycles than a bird that randomly takes longer naps. Over time the nap lengths tend to even out so one that was fast for a while will be slower later on. In real life, it might take 3 or 10 or 15 cycles before the Add5Bird (or any other bird) begins to eat.

Configuring Birds

ConcX provides GUI screens to add, configure, and launch birds. The GUI screens also provide dynamic, real time status updating of the birds while they're flying. The following screenshot shows five birds running the Addx scenario. The progress bars on the right side of the screen display each bird's individual success in real time. The longer each bird's progress bar is, the more times the bird has successfully eaten.

Image 3

Because each bird has user selectable food types, it is simple to rearrange the order that the math operations are performed in. Just change their selected foods and re-run. If any bird is improperly configured, it won't be able to find food and its progress bar won't grow.

The Food Supply tab contains food pod progress bars that dynamically display the number of food pods available, again in real time. When the run is over, the TupleTree tab displays a timestamped list of food pods it contains as well as brief summaries of every food pod's contents as well as the transactions that were performed on each food pod and by which bird. 

The features described above were all incorporated into the Concurrency Explorer to allow you to interactively explore and develop parallel programs. And once you understand how to break your program down into substeps that can be run in parallel, you can code your application in the programming language of your choice.

Using the code

The complete Java code for the Add3Bird is shown below. It is only 43 lines long with almost half (19) of those lines either comments or whitespace (either blank or single curly braces for visual separation). The only code required by Add3Bird is an override of the afterDigestion method and all it does is add 3 to any non-empty food pod it found. Finding the pod and storing it back into the tree is all handled by the BasicBird framework, making life easier for the developer.

Java
public class <code>Add3Bird</code> extends BasicBird {
    /**
     * Overrides afterDigestion in BasicBird with the instructions on how to add
     * 3 to the value of the food that it ate. This bird expects the value being
     * summed to be kept in the pod’s desc instead of the pod’s seed kernel.
     * If mPod1 is non-null, it tries to add 3 to that one's desc. If mPod2 is non-null,
     * it tries to add 3 to that one’s desc.
     */
    @Override
    protected void afterDigestion(int eatResults) {
        int contentsValue;

        if(eatResults == PUT_BACK_1 || eatResults == PUT_BACK_FILE_1) {
            return;  //don't update the value
        }

        if (!mPod1.isEmpty()) {
            try {
                String localContents = mPod1.getDesc();
                contentsValue = Integer.parseInt(localContents);
            } catch (NumberFormatException nfe) {
                mPod1.addToPodHistory("Non-numeric value in contents");
                contentsValue = 0;
            }

            contentsValue += 3;
            mPod1.setDesc(Integer.toString(contentsValue));
        }

        if (!mPod2.isEmpty()) {
            try {
                String localContents = mPod2.getDesc();
                contentsValue = Integer.parseInt(localContents);
            } catch (NumberFormatException nfe) {
                mPod2.addToPodHistory("Non-numeric value in contents");
                contentsValue = 0;
            }

            contentsValue += 3;
            mPod2.setDesc(Integer.toString(contentsValue));
        }
    }

Most importantly about the above code, there is no locking or synchronizing or mutexes anywhere in this code because it is all handled by the TupleTree and the ConcX framework. All of the multi-threaded parallel code is managed in the background so the user can focus on how to divide up the main tasks into bite-sized pieces that birds can handle atomically in parallel.

While the above task may seem overly simplistic, it is really just a template for more complicated scenarios. For example, what if the Add1Bird was replaced by a bird that grabbed a 10 ms chunk of sound and Add2Bird was replaced by a bird that performed a Fast Fourier Transform on it and Add3Bird was replaced by a bird that tried to match the resulting output with other previously processed results, etc. And if the second bird couldn’t keep up with the first bird, instead of trying to modify it to run faster (and possibly introduce errors), the preferred Avian solution is to just add another instance (or 10 more instances) of the second bird.

Or what if the AddxBirds were replaced by Invoice birds that independently and concurrently found customers to bill, found their usage charges, found their recurring charges, found their billing address, and formatted those results into a single invoice. By making those steps separate with their interim results stored in food pods, it is now suddenly easy to perform invoicing in parallel.

Points of Interest

Avian Computing was developed to encourage users to visualize their parallel programs based on inherently parallel models such as flocks of birds. Hives of bees, schools of fish or herds of horses would also have worked as models because they all contain multiple actors that operate independently and asynchronously while also working together.

ConcX was created with the developer/experimenter in mind. It is an interactive environment that allows users to start and stop various combinations and  configurations of birds. While flying each bird has a progress bar that is continuously updated to show how successful it is and their available food supply. After flying, the flock’s results can be examined as well as each bird’s timestamped history of its events.

The concepts of Avian Computing and its ConcX implementation together are intended to be training wheels to help our single-threaded minds think and talk about parallel programs.

To learn more about the basic concepts of Avian Computing and why we need help to think about parallel programs, go to the Avian Computing Web Site. If you’re ready to try it out, you can download ConcX-2.x.zip (jar file, lib files, and flock files) from the Avian web site or from SourceForge. Be sure to download the “Getting Started with Avian Computing” user guide at the same time because it includes installation information as well as about a dozen parallel scenarios, such as calculating Pi in parallel, the Dining Philosopher’s Probelm, the BarberShop scenario, and more. 

History

1/17/2018: Original submittal

2/12/2018: Un-commented code sample

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)