Making Tasks in a Thread Pool Collaborate

Jek Platform

23 Jun 2005

This article discusses how to use a thread pool to execute a job that consists of a list of collaborating tasks.

Introduction

I want to design a job execution program with the following goals:

A generic task execution scheduling kernel.
An object model that allows easy system analysis and abstraction of jobs and tasks.
Isolation of application layer code from the task synchronization implementation, so that you can focus on implementing the application rather than on thread synchronization techniques.
An architecture flexible enough to adapt to constant system configuration changes in the application layer.

Several articles on this site describe how to build a thread pool that executes a job in a single step. If a job needs to go through a sequence of executions, each of which is called a task in this article, and these tasks need to collaborate during execution, it becomes a different challenge.

Most of the thread pool designs on this site assume that a job executed by a pool thread has only one step. In other words, once a job gets a chance to execute, it produces the desired product. A total of n jobs (0 < n < maximum number of threads allowed in the pool) are active in the pool at the same time. In this case, a job's lifetime can be summarized as:

Job is ready;
Job is submitted into thread pool;
Job is executed;
Job is done and the desired result is produced.

This life cycle is illustrated in figure 1.

Figure 1. A simple thread pool assumes job just needs one step of execution.
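
The one-step life cycle above can be sketched in standard C++ as follows. This is only an illustration of the simple pool model, not the CThreadManager class from the sample project; the names SimplePool, Submit and RunFiveJobs are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical minimal pool: each submitted job runs exactly once on
// whichever worker becomes free first (the one-step model of figure 1).
class SimplePool {
public:
    explicit SimplePool(std::size_t workerCount) {
        assert(workerCount > 0);
        for (std::size_t i = 0; i < workerCount; ++i)
            workers_.emplace_back([this] { WorkerLoop(); });
    }

    // Lets the workers drain the queue, then stops and joins them.
    ~SimplePool() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (std::thread& w : workers_) w.join();
    }

    // "Job is submitted into thread pool."
    void Submit(std::function<void()> job) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            jobs_.push(std::move(job));
        }
        wake_.notify_one();
    }

private:
    void WorkerLoop() {
        for (;;) {
            std::function<void()> job;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
                if (jobs_.empty()) return;  // stopping and nothing left to run
                job = std::move(jobs_.front());
                jobs_.pop();
            }
            job();  // "Job is executed" and its result is produced.
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::queue<std::function<void()>> jobs_;
    bool stopping_ = false;
    std::vector<std::thread> workers_;  // declared last: started after the rest
};

// Usage sketch: submit five one-step jobs and count how many completed.
int RunFiveJobs() {
    std::atomic<int> done{0};
    {
        SimplePool pool(2);
        for (int i = 0; i < 5; ++i)
            pool.Submit([&done] { ++done; });
    }  // the destructor waits for all queued jobs to finish
    return done.load();
}
```

Note that nothing in this pool knows about ordering between jobs: a job starts as soon as any worker is free.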

If a job must be executed by more than one thread to generate the desired result, and these threads must run in a predefined order, a simple thread pool that starts a task whenever a free thread is available is not enough.

Figure 2 shows such an example. An "unprocessed" job needs to go through two steps of execution to become a "processed" job. Two threads are available for step 1, but only one is available for step 2. Threads 1.1 and 1.2 represent two identical resources A, and thread 2.1 represents resource B, which is different from A.

Figure 2. Jobs need to be executed in two steps.

There are different ways to solve this problem.

One is to use thread synchronization objects. For example, an event object can synchronize thread 1.1 (or thread 1.2) with thread 2.1. But this approach makes the implementations of threads 1.1 and 1.2 differ from that of thread 2.1, since thread 2.1 has to be notified by threads 1.1 and 1.2 when it is not busy. Such an implementation therefore scales poorly. Imagine adding a step 3 with threads 3.1 and 3.2, as in figure 3: their implementation would differ from that of all the other threads.
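
To make the scalability problem concrete, here is a rough sketch of the hand-written approach for the two-step job of figure 2. It uses a standard C++ condition variable in place of a Win32 event object, and all names (HandOff, StepOneWorker, StepTwoWorker, RunTwoStepJobs) and the arithmetic standing in for real work are assumptions made for illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical hand-off channel between step 1 and step 2.
struct HandOff {
    std::mutex mutex;
    std::condition_variable ready;  // plays the role of the event object
    std::queue<int> halfDone;       // jobs that have finished step 1
    int producersLeft = 0;          // step-1 threads still running
};

// Step-1 thread body: it has to know about the step-2 channel.
void StepOneWorker(const std::vector<int>& jobs, HandOff& out) {
    for (int job : jobs) {
        int afterStep1 = job + 1;   // placeholder for real step-1 work
        {
            std::lock_guard<std::mutex> lock(out.mutex);
            out.halfDone.push(afterStep1);
        }
        out.ready.notify_one();     // signal thread 2.1
    }
    {
        std::lock_guard<std::mutex> lock(out.mutex);
        --out.producersLeft;
    }
    out.ready.notify_one();         // let thread 2.1 notice the shutdown
}

// Step-2 thread body: a different shape, because it waits for step 1.
void StepTwoWorker(HandOff& in, std::vector<int>& processed) {
    for (;;) {
        std::unique_lock<std::mutex> lock(in.mutex);
        in.ready.wait(lock, [&in] {
            return !in.halfDone.empty() || in.producersLeft == 0;
        });
        if (in.halfDone.empty()) return;  // all step-1 threads have finished
        int job = in.halfDone.front();
        in.halfDone.pop();
        lock.unlock();
        processed.push_back(job * 10);    // placeholder for real step-2 work
    }
}

// Wires threads 1.1, 1.2 and 2.1 together as in figure 2.
std::vector<int> RunTwoStepJobs(const std::vector<int>& a,
                                const std::vector<int>& b) {
    HandOff channel;
    channel.producersLeft = 2;
    std::vector<int> processed;
    std::thread t21(StepTwoWorker, std::ref(channel), std::ref(processed));
    std::thread t11(StepOneWorker, std::cref(a), std::ref(channel));
    std::thread t12(StepOneWorker, std::cref(b), std::ref(channel));
    t11.join();
    t12.join();
    t21.join();
    assert(processed.size() == a.size() + b.size());
    return processed;
}
```

The two worker functions already have different shapes, and a third step would need another channel plus changes to StepTwoWorker, which is the coupling described above.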

Is it possible to make the thread implementation uniform, so that you can focus on the task execution sequence and the implementation of each task, which are the core business logic of your project?

Figure 3. A more complicated job which needs three steps of executions.

Architecture

Based on the discussion above, we want to separate the application layer and the task execution layer. Figure 4 shows the architecture of a sample program, which consists of a generic job execution scheduler JobScheduler.dll, and an application AppJob.exe based on the generic job execution scheduler.

Figure 4. Application layer and job scheduler layer.

The job scheduler layer consists of two important base classes, CExeScheduler and CTask, which the application layer can subclass. A job in the application layer consists of a list of ordered tasks. The job scheduler observes the relationships between tasks while executing a job. A thread pool, implemented in CThreadManager, is the foundation for job execution.

The class CTask introduces a task parallel property that defines the relationship between tasks, so that CExeScheduler knows when to execute each task in a job. The property defines whether two adjacent tasks in a job can be executed at the same time.

typedef struct _TASK_BIND
{
    BOOL     m_bParallel;
} TASK_BIND, *PTASK_BIND;

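class AFX_EXT_CLASS CTask : public CObject
{
    ...
    BOOL     IsParallel_Prev();    // TRUE if this task may run alongside the previous task
    BOOL     IsParallel_Next();    // TRUE if this task may run alongside the next task
private:
    TASK_BIND    m_bPrevBind;      // binding to the previous task in the job
    TASK_BIND    m_bNextBind;      // binding to the next task in the job
    ...
};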

The main execution scheduling algorithm is implemented in the following function:

void CExeScheduler::ExecuteTask(long lLoopCount)
{
    ...
}
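
The body of ExecuteTask() is not reproduced in the article, so the following is only a guess at the kind of algorithm the parallel property enables, written in standard C++ rather than MFC. It assumes that adjacent tasks linked by the flag run concurrently and that the scheduler waits for the whole batch before moving on. SketchTask, ExecuteJobSketch and RunSampleJobOnce are hypothetical names, and the sketch starts a fresh thread per task instead of reusing pool threads.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for CTask: the work to run plus the
// "parallel with next task" flag described in the article.
struct SketchTask {
    std::function<void()> run;
    bool parallelWithNext;
};

// Runs the ordered task list loopCount times. Adjacent tasks linked by
// parallelWithNext run concurrently, and each batch is joined before the
// next one starts. Assumed behavior, not the original ExecuteTask().
void ExecuteJobSketch(const std::vector<SketchTask>& tasks, long loopCount) {
    assert(loopCount >= 0);
    for (long loop = 0; loop < loopCount; ++loop) {
        std::size_t first = 0;
        while (first < tasks.size()) {
            // Extend the batch while the current task allows overlap
            // with its successor.
            std::size_t last = first;
            while (last + 1 < tasks.size() && tasks[last].parallelWithNext)
                ++last;

            std::vector<std::thread> batch;
            for (std::size_t i = first; i <= last; ++i)
                batch.emplace_back(tasks[i].run);
            for (std::thread& t : batch) t.join();  // barrier between batches

            first = last + 1;
        }
    }
}

// Usage sketch: six tasks that record their own number when they run,
// with flags TRUE, FALSE, TRUE, TRUE, FALSE, FALSE.
std::vector<int> RunSampleJobOnce() {
    std::mutex logMutex;
    std::vector<int> finished;
    auto make = [&](int id, bool parallelWithNext) {
        return SketchTask{[&logMutex, &finished, id] {
            std::lock_guard<std::mutex> lock(logMutex);
            finished.push_back(id);
        }, parallelWithNext};
    };
    std::vector<SketchTask> tasks = {
        make(1, true), make(2, false), make(3, true),
        make(4, true), make(5, false), make(6, false)};
    ExecuteJobSketch(tasks, 1);
    return finished;
}
```

The order inside a batch is not deterministic, but every task of one batch finishes before any task of the next batch starts. The task code itself contains no synchronization, which is the uniformity the architecture aims for.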

The application layer is designed to solve real problems, and it uses the object model provided by the job scheduler layer. It derives its job scheduler, CAppScheduler, from the base class CExeScheduler, and derives six custom tasks from the base class CTask. This is shown in figure 5.

Figure 5. Additional class association information.

The job execution sequence diagram in figure 6 illustrates how the tasks in a job are executed. In this job scheduler kernel, the parallel property is the only task property that CExeScheduler uses during execution, and it determines when each task in a job is executed.

Figure 6. Job execution sequence diagram.

Review the demo program

We already know that JobScheduler.dll is a generic job scheduler. AppJob.exe is a real application built on top of it. It is designed to execute a job for N loops. The job has six steps, and it is considered to have executed one loop when the six steps have been executed in order. The length of each step is random and differs from step to step. For better throughput, the steps have the following constraints:

Steps 1 and 2 can run in parallel.
Steps 3, 4 and 5 can run in parallel.

In AppJob.exe, the relationship between tasks is defined in AppScheduler.cpp as follows:

void CAppScheduler::CreateSampleTasks()
{
     CTask1* pTask1 = new CTask1(m_pDlg, 1, TRUE);   //Parallel with next task
     AddTask(pTask1);

     CTask2* pTask2 = new CTask2(m_pDlg, 2, FALSE);  //Not parallel with next task
     AddTask(pTask2);

     CTask3* pTask3 = new CTask3(m_pDlg, 3, TRUE);   //Parallel with next task
     AddTask(pTask3);

     CTask4* pTask4 = new CTask4(m_pDlg, 4, TRUE);   //Parallel with next task
     AddTask(pTask4);

     CTask5* pTask5 = new CTask5(m_pDlg, 5, FALSE);  //Not parallel with next task
     AddTask(pTask5);

     CTask6* pTask6 = new CTask6(m_pDlg, 6, FALSE);  //Last task, no next task
     AddTask(pTask6);

     PrepareTasks();
}

As the code above shows, changing the relationship between two tasks only requires changing the flag passed to the task constructor.
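
To check which tasks the flags in CreateSampleTasks() allow to overlap, the small standalone helper below groups task numbers by their "parallel with next task" flag. It is not part of JobScheduler.dll; GroupParallelTasks and SampleJobGroups are hypothetical names used only for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Input: one "parallel with next task" flag per task, in job order.
// Output: groups of 1-based task numbers that may run at the same time.
std::vector<std::vector<int>> GroupParallelTasks(
        const std::vector<bool>& parallelWithNext) {
    std::vector<std::vector<int>> groups;
    bool joinPrevious = false;  // did the previous task allow overlap?
    for (std::size_t i = 0; i < parallelWithNext.size(); ++i) {
        if (!joinPrevious) groups.emplace_back();  // start a new group
        groups.back().push_back(static_cast<int>(i) + 1);
        joinPrevious = parallelWithNext[i];
    }
    return groups;
}

// Flags copied from CreateSampleTasks():
// TRUE, FALSE, TRUE, TRUE, FALSE, FALSE.
std::vector<std::vector<int>> SampleJobGroups() {
    std::vector<std::vector<int>> groups =
        GroupParallelTasks({true, false, true, true, false, false});
    assert(groups.size() == 3);  // {1, 2}, {3, 4, 5} and {6}
    return groups;
}
```

The three groups match the constraints listed earlier: steps 1 and 2 together, steps 3, 4 and 5 together, and step 6 on its own. Flipping a single flag changes the grouping without touching any task code.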

In this demo program, the task activities in the application layer are displayed in a dialog box, shown in figure 7.

Figure 7. Task activities.

The example demonstrates the following:

Separation of the business problem in the application layer from the generic job execution layer. The application layer is AppJob.exe, and the job execution scheduling layer is JobScheduler.dll.
The user can focus on implementing application code rather than task execution scheduling. This can be observed in the project AppJob.exe: you model the real problem in AppTask.cpp, and AppJob.exe does not contain any task execution scheduling code.
The job scheduling layer provides an object model with which the user can model a real problem by identifying the job and its tasks. It also provides generic task execution scheduling, of which the application layer has no knowledge.
In the GUI dialog, you can see clearly how and when each task is executed after clicking the "Run" button. The generic job scheduler guarantees the task relationships defined in the application layer. As the code above indicates, the job scheduler object model makes it easy to modify the task relationships in the application layer function CAppScheduler::CreateSampleTasks().

Conclusion

Job execution means making the tasks in a job collaborate, and the example in this article demonstrates the idea. Topics for further discussion and improvement include resource sharing between tasks, task priorities, and a more robust job scheduler object model. A well-designed object model allows the user to model real problems effectively.

In addition, a more sophisticated design environment is needed for the user to model, analyze and simulate complex real-world problems. System analysis, system modeling and system simulation could be performed in such an environment, making users more productive. This is beyond the scope of this article and could be discussed in the future.

Future actions

Based on this article, a design tool has been prototyped for readers with further interest and more advanced requirements. Please visit: http://www.jekplatform.com/. The site also presents more complex problems with solutions implemented in C++.

Acknowledgement

The sample project uses CLabel, written by norm.net and described in the article "Extended Use of CStatic Class - CLabel 1.6" published on Code Project.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.