Introduction
This project is a task processing server implements a processing pipeline to allow multiple handlers plugged in to handle task from tasks table acting as requesting queue. All task handlers are implemented a common Interface so that they can be added to task processing pipeline during runtime.
A simple RDLC (Microsoft SQL Client Report) handler is included in the project which makes task processing server acted like batch report generator.
Background
I started this project as users want to submit long processing reports on website in batch so that the webpage is not locked up to wait for completion.
As it turns out I created a generic batch processing server with design that can be extended to handle different jobs using customized handlers. Depending on this design, we can add Email, Zip and any task handler easily in future.
Using the code
Architechture
The server is a console program and can be added to Windows Task Scheduler to run as system startup. It takes task entry from Task table and passed it for processing using configured handler or handlers in each task setting and processing logs are written to TaskLog table.
Database Objects
As MS SQL tables are required, we need to create them using script DBSetupForTask.sql provied in download. The script also create a stored procedure for task allocation. As the task server application does not define separated connection string in connection strings section of application configuration, the RDLC report handler implemented shared the same database connection setting. That means you need to apply the script to the database that you want the report handler retrieves data from.
Task Table
I do not make much assumption how the task settings in the table stored. As the settings should be parsed by handler(s) configured in same task record, I choose to use XML data type to store the settings to allow task requester and handler define thier own format.
Although this project is initiated by requirement from website project, no restriction is putted on how task table entries is inserted. That means apart from website requester, any system which wants to offload batch jobs to this task processing server can simply insert tasks to TaskTable
Task Table Columns
Column |
Type |
Nullable |
PK/FK |
Comment |
Id |
int |
NO |
PK |
Auto generated task id |
Title |
nvarchar(255) |
NO |
|
Task title entered by requester |
SubmittedDT |
datetime |
NO |
|
Request submitted date time |
Status |
nvarchar(20) |
YES |
|
New task will be in null value |
StatusDT |
datetime |
YES |
|
Status date timee |
SubmittedUser |
nvarchar(50) |
NO |
|
User who submitted this request |
Handler |
nvarchar(255) |
NO |
|
.NET type full name with assembly for task server to load correct handler |
Settings |
xml(max) |
YES |
|
Handler specific settings in XML format. Please refer to RDLC Implementation for full example. |
ActualOutput |
nvarchar(max) |
YES |
|
The actual output information updated by task handler(s). For example, the physcial file generated from RDLC Handler. |
ModifiedDateTime |
datetime |
NO |
|
Record last updated date time |
CreatedDateTime |
datetime |
NO |
|
Record created date time |
In the Task table, Handler column will store the assigned handler(s) in full .NET type name. If multiple handlers are assigned, their type names are separated by verticle bar | character.
TaskLog Table Columns
Column |
Type |
Nullable |
PK/FK |
Comment |
Id |
int |
NO |
PK |
Auto generated log id |
TaskId |
int |
YES |
FK |
Task id this log record refferring. It can be null if no task referred by the log record. |
LogDateTime |
datetime |
NO |
|
Log entry date time |
CheckingItem |
nvarchar(255) |
NO |
|
The module or function that creates this log record |
EntryType |
int |
NO |
|
4 - Information, 2 - Warning, 1- Error |
Content |
nvarchar(max) |
NO |
|
Log details |
AppName |
nvarchar(128) |
NO |
|
Application name in connection |
UserName |
nvarchar(255) |
NO |
|
User name in connection |
SPID |
int |
NO |
|
Session id in connection |
HostName |
nchar(128) |
NO |
|
Host name in connection |
Task Allocation Stored Procedure
The task server is running as a multiple threads process so that blocking from one task to another should be minimized by allowing multiple tasks to be handled. In doing so, we need to avoid race condition when same task be allocated to more than one thread of handler. I have written stored procedure TaskAllocate
which uses OUTPUT
clause in UPDATE
statement to grep the allocated task id. Using the allocated task id, it returns to Task Server program it's allocated task record for processing.
CREATE PROCEDURE [dbo].[TaskAllocate]
AS
BEGIN
SET NOCOUNT ON;
declare @cnt int ;
declare @logType_Info int = 4, @logType_Error int = 1 ;
declare @procName nvarchar(128) = ISNULL(OBJECT_NAME(@@PROCID), 'TaskAllocate');
declare @allocTask table (
[Id] [int] NOT NULL
) ;
update top (1) dbo.task set [Status] = 'Allocated' , [StatusDT] = GETDATE()
, ModifiedDateTime = GETDATE()
output Inserted.Id into @allocTask
where [Status] is null ;
insert into dbo.TaskLog ( TaskId, CheckingItem, EntryType, Content)
select Id, @procName , @logType_Info, 'Task allocated to job for processing' from @allocTask
select * from dbo.task where id in (select id from @allocTask ) ;
END
Task Server Program Configuration
As task server program needs to access the Task
and TaskLog
table and allocation stored procedure, we need to enter correct connection string in the SForce.TaskServer
application configuration file.
Also, as shown in below sample configure, noOfServices
application setting controls number of threads to be started for processing Task
table entries simultaneously.
<configuration>
<configSections>
<section name="entityFramework" type="System.Data.Entity.Internal.ConfigFile.EntityFrameworkSection, EntityFramework, Version=6.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" requirePermission="false" />
</configSections>
<connectionStrings>
<add name="TaskContext" connectionString="data source=MYSERVER;initial catalog=MYDATABASE;integrated security=False;User Id=MYUSER;Password=MYPASSWORD;multipleactiveresultsets=True;Application Name=SF TASK SERVER;" providerName="System.Data.SqlClient" />
</connectionStrings>
<appSettings>
<add key="noOfServices" value="10" />
<add key="commandTimeout" value="1800" />
</appSettings>
</configuration>
Microsoft Report Definition Language Client-side RDLC Report Handler
After you entered correct application Configurations mentioned in previous section, you can run the server program right away.
To generate RDLC reports, the handler needs three things listed below. You can find relevent configurations in file SForce.TaskImplementation.xml
- First is the parameters with values. Setting
commonReportParameters
refers to the common parameters retrived using SQL.
- Second is the data set to be populated. Setting
statementFile
refers to the path of the data loading SQL statements file.
- Third is the report template to use. Setting
reportTemples
refers to the folder path of the report templates.
<settings>
<setting name="commonReportParameters">
<![CDATA[]]>
</setting>
<setting name="statementFile" value="D:\VSProjects\Visual Studio 2013\Projects\ReportWeb\Dev\ReportWeb\bin\DataStatement.xml" />
<setting name="reportTemples" value="D:\VSProjects\Visual Studio 2013\Projects\ReportWeb\Dev\ReportWeb\reports" />
<setting name="outfileNamePrependTaskId" value="Y" />
</settings>
Parameters
The RDLC report handler divides parameters into common and report specific parameters group. The common parameters are those parameters shared by all reports, e.g. currency format, date format, company name, etc. They should have same parameter names in multiple reports as standard naming convension is good design practice to follow and save us a lot of trouble and time.
Technically, there are two type of parameters, one for RDLC report and and another for SQL data statement. For RDLC report usage, I just combine them into single set of parameters and the RDLC handler tries to parse the report template and matches parameter with same paramter name passed in.
Data statement will only uses report specific parameters as common parameters are usually loaded from database table which data statement can access them directly. The data retrival Processing is carry out by DataLoadTaskHandler
which class is RDLC handler inherited from. DataLoadTaskHandler
parses SQL statement and tries to match statement parameter with same paramter name passed in before executing against database.
As whether any parameter passed to the handlers will be used for data statement and report depending on matching name, parameter naming convension is very important in this regard or else we need to add parameter binding machanism into the logics.
Common Parameters used by RDLC Report
To pass in the common parameter values, you put them in commonReportParameters
setting in in file SForce.TaskImplementation.xml
. They are defined using standard SQL statement with result mandated in name
and value
columns. My example below shows how to get the standard parameters from a fictitious datbase table and you can write you own SQL statement that fit your project setup.
Report Specific Parameters used by RDLC Report and Data Statement
For the report specific parameters, they are defined as xml setting
node value in Settings
column of Task
table record. This should be the most sensible place to put the values as each task entry record stores information only revelant to the report going to be produced.
Data statement parameter specified using MS SQL @parameter
convension is used. In addition, data loader Handler tries to parse the SQL text retrived and replaces token {parameter}
with parameter value of same name. This makes the SQL statement more dynamic as any part of the text can be assembled without restriction on normal parameter only parts.
Data Loading SQL Statements XML File
Data statement file location is configured in above mentioned SForce.TaskImplementation.xml
file at statementFile
setting and below shows an example of standard named sql statement entry.
Each data retrival SQL statement is assigned a name using name
attribute with a select-statement
node. You can put raw SQL statement with parameters as the node value or like below example by calling stored procedure to return data.
="1.0" ="utf-8"
<statements>
<select-statement name="FranchiseMemberTypeReport">
<![CDATA[]]>
</select-statement>
</statements>
Task Table Record Entry
To issue task request, requester application simply puts entry into Task
table with format described in previous Database Objects - Task Table section but there is more information you required to put report settings in Settings
column. This is because only Requester and Handlers should know what the settings to pass and how to parse it but not the task server. Below example shows task id 1
with RDLC request.
Id |
Title |
SubmittedDT |
Status |
StatusDT |
SubmittedUser |
1 |
Sample Report |
10/21/16 17:51 PM |
NULL |
10/21/16 17:51 PM |
PCD |
2 |
My Dummy Task |
10/21/16 17:51 PM |
Allocated |
NULL |
PCD |
Handler |
SForce.TaskImplementation.RDLCTaskHandler,SForce.TaskImplementation |
SForce.TaskImplementation.DummyHandler,SForce.TaskImplementation |
Settings |
ActualOutput |
ModifiedDateTime |
CreatedDateTime |
<settings> <setting name="ReportId" value="157" /> <setting name="ReportName" value="ReportFranchiseMemberTypeReport" /> <setting name="QueryType" value="Name" /> <setting name="QueryText" value="FranchiseMemberTypeReport" /> <setting name="OutputFormat" value="EXCELOPENXML" /> <setting name="OutputPath" value="D:\temp\Report_.xlsx" /> <setting name="parameters"> <setting name="cutOffDate" value="2016-10-08" type="System.DateTime" /> <setting name="shop_list" value="5103,5046,5069" type="System.String" /> <setting name="y_or_m" value="M" type="System.String" /> </setting> </settings> |
NULL |
10/21/16 17:51 PM |
10/21/16 17:51 PM |
<settings /> |
NULL |
10/21/16 17:51 PM |
10/21/16 17:51 PM |
To create a RDLC request, first, enter the .NET fullname for RDLC handler SForce.TaskImplementation.RDLCTaskHandler,SForce.TaskImplementation
in Handler
column. Second, enter the RDLC task specific settings in Settings
column. The RDLC request settings is must be with root node settings
and each setting
is specified as setting
node with name
and value
attributes.
The report specific parameters setting needs to be specified as child nodes under a setting
node with name
attribute "parameters" as shown in below example. This xml layout helps to specify multiple parameters.
<settings>
<setting name="ReportId" value="157" />
<setting name="ReportName" value="ReportFranchiseMemberTypeReport" />
<setting name="QueryType" value="Name" />
<setting name="QueryText" value="FranchiseMemberTypeReport" />
<setting name="OutputFormat" value="EXCELOPENXML" />
<setting name="OutputPath" value="D:\VSProjects\Visual Studio 2013\Projects\ReportWeb\Dev\ReportWeb\temp\PCD\ReportFranchiseMemberTypeReport_20161009_112925.xlsx" />
<setting name="parameters">
<setting name="cutOffDate" value="2016-10-08" type="System.DateTime" />
<setting name="shop_list" value="5103,5046,5069" type="System.String" />
<setting name="y_or_m" value="M" type="System.String" />
</setting>
</settings>
RDLC Settings Description
Setting Name |
Comment |
ReportId |
This is ignored by the handler. You can just put number 0 here. I use it to refer to report in our ERP system only. |
ReportName |
The report template without the .rdlc file extension |
QueryType |
Always put Name here. It means to find the data loading SQL statement by name. |
QueryText |
Eneter the SQL statement name here used to lookup the actual SQL statement in statement file mentioned in previous Section - Data Loading SQL Statements XML File . |
OutputFormat |
The report outout format supported by RDLC specification. |
OutputPath |
The generated report output path. Actual output file may added with the task id and specified in ActualOutput column in the task table record. |
parameters |
The report specific parameters described above. Parameters are specified as child nodes. |
Dummy Handlers for Testing
First, start the Task Server console program and you will see there are 10 threads running in parallel waiting for new tasks entered to Task
table.
Basic Requester Testing
There is a DummyHandler
task handler provided to test the health of the setup when task server is in placed. The setup script includes below SQL to insert a basic dummy task request to the Table
. After inserted, you will see the dummy task just echo passed in information and will loop forever. To exit, you can press X
.
insert into dbo.Task ( Title, SubmittedDT, Status, StatusDT, SubmittedUser, Handler, Settings, ModifiedDateTime, CreatedDateTime)
values ('My Dummy Task', getdate() , null, null, 'PCD', 'SForce.TaskImplementation.DummyHandler,SForce.TaskImplementation', N'<settings />', getdate(), getdate() ) ;
RDLC Requester Testing
The setup script includes below SQL to insert a testing RDLC request to Task
table. If you run the Task Server and insert this sample request, you will see the Excel file exported from RDLC task handler inside C:\Temp folder.
insert into dbo.Task ( Title, SubmittedDT, Status, StatusDT, SubmittedUser, Handler, Settings, ModifiedDateTime, CreatedDateTime)
values ('My Dummy RDLC Task', getdate() , null, null, 'PCD', 'SForce.TaskImplementation.RDLCTaskHandler,SForce.TaskImplementation', N'<settings>
<setting name="ReportId" value="0" />
<setting name="ReportName" value="ReportSample" />
<setting name="QueryType" value="Name" />
<setting name="QueryText" value="Sample" />
<setting name="OutputFormat" value="EXCELOPENXML" />
<setting name="OutputPath" value="C:\temp\Sample.xlsx" />
<setting name="parameters">
<setting name="id" value="20" type="System.Int32" />
<setting name="ReportTitle" value="MY SAMPLE REPORT" type="System.String" />
</setting>
</settings>', getdate(), getdate() ) ;
Points of Interest
TaskManager Class
The main control point is in TaskManager
class which retrieves task record from Task
table and invokes handler(s) configured for the task in serial. Below diagram shows the overall flow of this process.
Microsoft Unity Dependency Injection
Using Microsoft Unity Dependency Injection library's parent and child containers feature, allocated task in each thread runs with its own child container. The following codes are taken from AllocateAndProcess
method of TaskManager
class. This helps to better manage resources with shared Objects running in Unity
parent container while child Container controls objects using ContainerControlledLifetimeManager
with objects live no longer than task live span.
task = this.taskStore.Allocate();
while (task != null)
{
this.taskManagerLogger.SetTask(task, dummayTask.Handler);
using (var childContainer = this.container.CreateChildContainer())
{
var taskContext = new TaskContext();
childContainer.RegisterInstance(typeof(ILogger), new TaskLogger(this.connectionFactory, task), new ContainerControlledLifetimeManager());
childContainer.RegisterInstance(typeof(TaskContext), taskContext, new ContainerControlledLifetimeManager());
}
}
TaskHandler Class
Class TaskHandler
is ancestor of all handlers. To add a new customized handler, you need to inherite from this base class.
Looking codes in TaskHandler
constructor, there are three object instances passed in using Unity
DI mentioned above. The instances connectionFactory
and logger
are both quite self explainatory that they give database connection and logging facilities to the Implementation. The context
instance will be used for handlers information passing among them. As I mentioned earlier each task can be configured to be handled by multiple handlers with .NET names speparated by |
in value of handler
column, handler processs task in serial and context is for each handler to pass data to its followed handler(s).
public TaskHandler(IConnectionFactory connectionFactory, ILogger logger, TaskContext context)
{
this.connectionFactory = connectionFactory;
this.logger = logger;
this.context = context;
if (this.context.Items.ContainsKey(GlobalSettings.Context.SignalExit))
this.signalExit = this.context.Items[GlobalSettings.Context.SignalExit] as WaitHandle;
}
TaskContext
class is very simple and it has the Items
propety for storing generic data so that each handler can add or remove data from it and pass to its followed handlers.
public class TaskContext
{
public IDictionary<string, object=""> Items { get; private set; }
public TaskContext()
{
this.Items = new Dictionary<string, object="">(StringComparer.InvariantCultureIgnoreCase);
}
}
The heart of TaskHandler
is Process
method and all processing for task handling should be putted there. That means each customized handler at least overrides this Process
method, calls base.Process(task)
for initialization and inserts codes after.
virtual public bool Process(Task task)
{
if (this.ReceivedSignalExit(0))
{
this.logger.WriteLog(string.Format("Task id {0} for {1} received cancel request and exit!", task.Id, task.Title), LogType.Warning);
return false; }
this.taskSettings = task != null ? this.ParseTaskSettings(task.Settings) : new Dictionary<string, string="">();
return true; }
Moreover, if you review above codes in details, the Process
method return True
or False
at the end. It helps to inform Task Server whether next handler should be continued be invoked to handle this particular task.
In addition, there is a WaitHandle
putted inside the context Items Dictionary
. This is will signal the processing handler to stop and the signal is sent by Task Server after received Cancellation request.
Task Allocation and Cancellation Monitoring
Task allocation
Task allocation and Cancellation monitoring are the core functions of Task Server and need to have detailed discussion here. As new tasks are insered to Task
table directly by requester system, we can find them either by polling or using SQLDependency
- the .NET class for Microsoft Query Notification Service
.
Reviewing the codes below extracted from TaskManager
, class dbListener
resolved by Unity DI
determinates actual Implementation used. Actually, I use polling supported by DbPoller
class with setup in 5 seconds interval.
public TaskManager(..., IListener dbListener,...)
{
this.taskStore = taskStore;
}
public void ProcessTasks()
{
this.dbListener.Monitor(this.monitorCommand);
this.dbListener.Changed += DbListener_Changed;
}
private void DbListener_Changed(object sender, EventArgs e)
{
AllocateAndProcess();
}
private void AllocateAndProcess()
{
task = this.taskStore.Allocate();
while (task != null)
{
task = null; task = this.taskStore.Allocate(); }
}
Cancellation Monitoring
After the task is allocated and processed by Handler(s), we need to prepare anytime requester will send Cancellation request for the task. Using polling is not appropriate here as cancellation must be responsed immediately.
Class DbListener
implemented with SQLDependency
monitors Task
table cancellation request, then it raises Changed
event after received task cancellation notification.
After allocated a new task, TaskManager
starts processing with method PrepareExitNotification
shown below before invoking first handler. This method prepares cancellation signal by subscribing Changed
event of DbListener cancelNotification
instance with monitoring SQL statement - select Status from dbo.Task where Status='ReqCancel'. That means requester can update the status to ReqCancel
- Request Cancel after the task was updated to Allocated
or Processing
. Note that handler relies on ManualResetEvent
included in TaskContext
for signaling cancellation.
public TaskManager(..., [Dependency("NotificationService")] IListener cancelNotification, ...)
{
this.cancelNotification = cancelNotification;
string monitorText = string.Format("select Status from dbo.Task where Status='{0}'", TaskStatus.ReqCancel);
this.cancelNotification.Monitor(monitorText);
}
private WaitHandle PrepareExitNotification(Task task, TaskContext taskContext)
{
ManualResetEvent exitEvent = new ManualResetEvent(false);
taskContext.Items[GlobalSettings.Context.SignalExit] = exitEvent;
this.cancelNotification.Changed += delegate (object sender, EventArgs e)
{
task = this.taskStore.GetTask(task.Id);
if (TaskStatus.ReqCancel.Equals(task.Status))
exitEvent.Set();
};
return exitEvent;
}
If the task can be canceled before completion, status will be updated to Canceled
and message will be displayed to console as shown similar to below example.
Host Process with Multiple TaskManagers in Separated Threads
As we expected batch task requests are often long running jobs and TaskManager
is running at single-threaded and processing with handlers sequentially, it is more appropriate to start multiple TaskManagers
in separated threads so that multiple tasks can be handled in parallel.
The host process is a console program which starts multiple threads and initializes main Unity
container to resolve shared instances for all threads. In fact, each thread with separated TaskManager
owns separated Unity
child container as shown in below codes.
As the TaskServer
implemented in separated project SForce.TaskServerImplementation
, I expected developing Windows Service host instead of console host is not too much effort involved.
class Program
{
static void Main(string[] args)
{
for (int i = 0; i < taskManagerServices.Length; ++i)
{
taskManagerServices[i] = new System.Threading.Tasks.Task(delegate () {
using (var container = mainContainer.CreateChildContainer())
{
var mgr = container.Resolve<taskmanager>();
mgr.ProcessTasks();
signal.WaitOne();
mgr.Terminate();
}
});
}
}
}
CleanupManager Class
The cleanup job is delegate to CleanupManager
. It removes historical records and related physical output file(s) if any.
Security Consideration
This application should be running in controled envirnoment that means we need to restrict access rights to database tables and no authorized access to configuration files. This is because hacker altered configurations like SQL statements in data statement file will make desaster consequce.
Database Access Restriction
One suggestion to restrict accessing the database is to create a database user who only have rights to update Task
and TaskLog
tables, executing TaskAllocate
stored procedure and read only rights to selected tables. Then use the created user in task server connection string.
Security Handler
You can choose to implement a customized security handler and put the type name in systemHandlers
setting inside configuration file SForce.TaskServerImplementation.xml
as shown in below example
Also, the download has included a sample handler SecurityHandler.cs
is for starting your customized security checking. Handlers listed in systemHandlers
setting run for each task allocated. So, that is the best place to put a security Handler for filtering any unauthorized access.
<settings>
<setting name="pollingInterval" value="5000" />
<setting name="daysToKeepTask" value="180" />
<setting name="systemHandlers" value="SForce.TaskImplementation.SecurityHandler,SForce.TaskImplementation" />
</settings>
public class SecurityHandler : TaskHandler
{
private ILogger hostLogger;
public SecurityHandler(IConnectionFactory connectionFactory, ILogger logger, TaskContext context, [Dependency("host")] ILogger hostLogger)
:base(connectionFactory, logger, context)
{
this.hostLogger = hostLogger;
}
public override bool Process(Task task)
{
this.hostLogger.WriteLog(string.Format("{0} {1} processes {2}.", DateTime.Now, this.GetType().Name, task), LogType.Information);
return base.Process(task);
}
}
History
- 2016-10-23 Version 1 - Project posted to CodeProject.