Introduction
This article describes how to use the Task Parallel Library (TPL) and Parallel LINQ (PLINQ) features available in the .NET Framework. These features were introduced in .NET Framework 4.0 and were extended in .NET Framework 4.5, most notably with the async and await keywords. This article does not cover those keywords. Rather, it shows how to take advantage of the Task class, which is a higher-level class for performing multithreaded operations in .NET.
Background
Executing long-running operations sequentially (synchronously) takes time and can hurt the performance of the whole application. That is why it is not advisable to run such time-consuming operations one after another on a single thread. It is better to make use of the multithreading capabilities of the processor and the operating system. Since almost every processor on the market today has multiple cores, code that performs long operations without harnessing those cores leaves much of the hardware idle.
Executing long operations in parallel on multiple cores of the CPU is called multithreading or parallel programming. The .NET Framework has had extensive support for multithreading since the beginning. The Thread class in the System.Threading namespace provides a very low-level API to create multiple threads and execute a different operation on each of them in parallel. While it gets the job done, its extensive API makes the learning curve rather steep for intermediate users. A higher-level class was introduced in .NET Framework 4.0 to make the job much easier: the Task class in the System.Threading.Tasks namespace provides a much more concise API for performing multithreaded operations. Additionally, a Parallel class was introduced to invoke methods in parallel.
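To make the difference between the two APIs concrete, here is a minimal sketch that runs the same small computation once with Thread and once with Task. The class ThreadVersusTask and its Square method are hypothetical names chosen for illustration only; they are not part of the article's source code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadVersusTask
{
    // Hypothetical workload used only for illustration.
    public static int Square(int n)
    {
        return n * n;
    }

    // With Thread, the result has to be handed back through shared state.
    public static int SquareWithThread(int n)
    {
        int result = 0;
        var worker = new Thread(() => { result = Square(n); });
        worker.Start();
        worker.Join();      // block until the thread has finished
        return result;
    }

    // With Task<TResult>, the result is carried by the task itself.
    public static int SquareWithTask(int n)
    {
        Task<int> task = Task.Factory.StartNew(() => Square(n));
        return task.Result; // blocks until the task has completed
    }

    public static void Main()
    {
        Console.WriteLine(SquareWithThread(7));
        Console.WriteLine(SquareWithTask(7));
    }
}
```

Both methods return the same value; the Task version simply needs less plumbing, which is the point made above about its more concise API.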
This article describes the use of these classes to perform time-consuming operations. The source code is attached to the article and can be downloaded separately.
Using the Code
We want to perform certain LINQ operations on collections. The queries are LINQ to Objects queries run in parallel through PLINQ; the collection is populated with hard-coded example data rather than from a real data source, but that is not the point of this article anyway.
The Plain Old CLR Object (POCO) class used as the data source is as follows:
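// Namespaces used by the listings in this article.
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Employee
{
    public int empID { get; set; }
    public string empName { get; set; }
    public double salary { get; set; }
    public int deptID { get; set; }
    public string deptName { get; set; }

    public Employee(int empID, string empName, double salary, int deptID, string deptName)
    {
        this.empID = empID;
        this.empName = empName;
        this.salary = salary;
        this.deptID = deptID;
        this.deptName = deptName;
    }
}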
Another class needs to be added for storing the result of a group-by LINQ operation:
public class GroupResult
{
    public int key { get; set; }
    public List<Employee> emps { get; set; }
}
Now, let's add the collection that acts as the data source for the LINQ operations.
class Program
{
    public List<Employee> employees = null;

    public Program()
    {
        employees = new List<Employee>() {
            new Employee(1000, "emp1", 3200, 10, "dept1"),
            new Employee(1001, "emp2", 4400, 10, "dept1"),
            new Employee(1002, "emp3", 2800, 10, "dept1"),
            new Employee(1003, "emp4", 4500, 20, "dept2"),
            new Employee(1004, "emp5", 5200, 20, "dept2"),
            new Employee(1005, "emp6", 3800, 20, "dept2"),
            new Employee(1006, "emp7", 2900, 30, "dept3"),
            new Employee(1007, "emp8", 4100, 30, "dept3"),
            new Employee(1008, "emp9", 4400, 30, "dept3"),
            new Employee(1009, "emp10", 2700, 40, "dept4"),
            new Employee(10010, "emp11", 3600, 40, "dept4"),
            new Employee(10011, "emp12", 5100, 40, "dept4")
        };
    }
}
The operations that need to be called asynchronously are listed below. Append them inside the Program class above.
// Returns all employees using a PLINQ query.
public List<Employee> getEmployees()
{
    Func<List<Employee>> getEmps = () =>
    {
        return (from emp in employees.AsParallel() select emp).ToList<Employee>();
    };
    var emps = getEmps();
    return emps;
}

// Returns the employee with the given ID, or null if there is none.
public Employee getEmployee(int empID)
{
    Func<int, Employee> getEmp = (int empid) =>
    {
        var foundEmp = from emp in employees.AsParallel()
                       where emp.empID == empid
                       select emp;
        return foundEmp.FirstOrDefault<Employee>();
    };
    var requestedEmp = getEmp(empID);
    return requestedEmp;
}

// Groups the employees by department ID.
public List<GroupResult> getEmployeeGroupDept()
{
    Func<List<GroupResult>> getEmpGroupDept = () =>
    {
        var grpResult = from emp in employees.AsParallel()
                        group emp by emp.deptID into groups
                        select new GroupResult
                        {
                            key = groups.Key,
                            emps = groups.ToList<Employee>()
                        };
        return grpResult.ToList<GroupResult>();
    };
    var groupResult = getEmpGroupDept();
    return groupResult;
}
// Simulates a 3-second operation that prints a message.
public void ParallelAction1()
{
    Action parallel_action1 = () =>
    {
        Thread.Sleep(3000);
        Console.WriteLine("Parallel Action 1 invoked");
    };
    parallel_action1();
}

// Simulates a 3-second operation that returns a message.
public string ParallelAction2()
{
    Func<string> parallel_action2 = () =>
    {
        Thread.Sleep(3000);
        return "Parallel Action 2 invoked";
    };
    var result = parallel_action2();
    return result;
}

// Computes the factorial of no, sleeping 2 seconds per iteration.
public int ParallelAction3(int no)
{
    Func<int> factorial = () =>
    {
        var fact = 1;
        for (int i = 1; i <= no; i++)
        {
            Thread.Sleep(2000);
            fact = fact * i;
        }
        return fact;
    };
    var factResult = factorial();
    return factResult;
}
// Each of the following delegates sleeps for 2 seconds to simulate work.
Func<string, string> dispStr = (string str) =>
{
    Thread.Sleep(2000);
    return string.Format("the entered string is: {0}", str);
};

Func<string, string> toUpperStr = (string str) =>
{
    Thread.Sleep(2000);
    return string.Format("the upper case of the string is: {0}", str.ToUpper());
};

Func<string, string, string> concatStr = (string str1, string str2) =>
{
    Thread.Sleep(2000);
    return string.Format("the concatenated string is: {0}", str1 + str2);
};

Func<object, bool> isString = (object str) =>
{
    Thread.Sleep(2000);
    return str is String;
};
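A note on the AsParallel() calls in the query methods above: PLINQ does not promise to return elements in source order unless the query asks for it with AsOrdered(). The following minimal sketch shows both variants on a plain list of integers. The class PlinqOrdering and its methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PlinqOrdering
{
    // Filters even numbers in parallel; AsOrdered() preserves source order.
    public static List<int> EvenNumbersInOrder(IEnumerable<int> source)
    {
        return source.AsParallel()
                     .AsOrdered()
                     .Where(n => n % 2 == 0)
                     .ToList();
    }

    // Same filter without AsOrdered(): same elements, order unspecified.
    public static List<int> EvenNumbersAnyOrder(IEnumerable<int> source)
    {
        return source.AsParallel()
                     .Where(n => n % 2 == 0)
                     .ToList();
    }

    public static void Main()
    {
        var numbers = Enumerable.Range(1, 20);
        Console.WriteLine(string.Join(", ", EvenNumbersInOrder(numbers)));
        // Sorted here only so that the printed line is stable between runs.
        Console.WriteLine(string.Join(", ", EvenNumbersAnyOrder(numbers).OrderBy(n => n)));
    }
}
```

For the small employee list in this article the difference is unlikely to be visible, but it matters as soon as the output order is significant.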
The Main function of the Program class contains two Action delegates: runSynchronous and runAsynchronous. runSynchronous calls the functions above synchronously, while runAsynchronous calls them asynchronously. Add the Main function inside the Program class as follows:
static void Main(string[] args)
{
    var watch = Stopwatch.StartNew();
    var p = new Program();

    // Calls every operation one after another on the main thread.
    Action runSynchronous = () =>
    {
        Console.WriteLine(p.dispStr("Lambda"));
        Console.WriteLine(p.toUpperStr("lower case"));
        Console.WriteLine(p.concatStr("Hello", "World"));
        Console.WriteLine(p.isString(Convert.ToInt32(10)));

        var employees = p.getEmployees();
        Console.WriteLine("All Employees" + Environment.NewLine);
        foreach (var emp in employees)
        {
            Console.WriteLine("EmpID: " + emp.empID + " EmpName: " +
                emp.empName + " Salary: " + emp.salary + " DeptID: " +
                emp.deptID + " DeptName: " + emp.deptName + Environment.NewLine);
        }

        var foundEmp = p.getEmployee(1002);
        Console.WriteLine("Found Employee" + Environment.NewLine);
        Console.WriteLine("EmpID: " + foundEmp.empID + " EmpName: " +
            foundEmp.empName + " Salary: " + foundEmp.salary + " DeptID: " +
            foundEmp.deptID + " DeptName: " + foundEmp.deptName + Environment.NewLine);

        var grpResult = p.getEmployeeGroupDept();
        Console.WriteLine("Group Employees By DeptID" + Environment.NewLine);
        foreach (var grp in grpResult)
        {
            Console.WriteLine("Key: " + grp.key + Environment.NewLine);
            foreach (var emp in grp.emps)
            {
                Console.WriteLine("EmpID: " + emp.empID + " EmpName: " +
                    emp.empName + " Salary: " + emp.salary + " DeptID: " +
                    emp.deptID + " DeptName: " + emp.deptName + Environment.NewLine);
            }
        }

        p.ParallelAction1();
        var result = p.ParallelAction2();
        Console.WriteLine(result);
        Console.WriteLine("The factorial is: " + p.ParallelAction3(Convert.ToInt32(4)));
    };
    // Wraps the same operations in tasks so that they can run concurrently.
    Action runAsynchronous = () =>
    {
        Task t1 = new Task(() =>
        {
            Console.WriteLine(p.dispStr("Lambda"));
        });
        Task t2 = new Task(() =>
        {
            Console.WriteLine(p.toUpperStr("lower case"));
        });
        Task t3 = new Task(() =>
        {
            Console.WriteLine(p.concatStr("Hello", "World"));
        });
        Task<bool> t4 = new Task<bool>(() =>
        {
            return p.isString(Convert.ToInt32(10));
        });
        Task t5 = new Task(() =>
        {
            var employees = p.getEmployees();
            Console.WriteLine("All Employees" + Environment.NewLine);
            foreach (var emp in employees)
            {
                Console.WriteLine("EmpID: " + emp.empID + " EmpName: " +
                    emp.empName + " Salary: " + emp.salary + " DeptID: " +
                    emp.deptID + " DeptName: " + emp.deptName + Environment.NewLine);
            }
        });
        Task t6 = new Task(() =>
        {
            var foundEmp = p.getEmployee(1002);
            Console.WriteLine("Found Employee" + Environment.NewLine);
            Console.WriteLine("EmpID: " + foundEmp.empID + " EmpName: " +
                foundEmp.empName + " Salary: " + foundEmp.salary + " DeptID: " +
                foundEmp.deptID + " DeptName: " + foundEmp.deptName + Environment.NewLine);
        });
        Task t7 = new Task(() =>
        {
            var grpResult = p.getEmployeeGroupDept();
            Console.WriteLine("Group Employees By DeptID" + Environment.NewLine);
            foreach (var grp in grpResult)
            {
                Console.WriteLine("Key: " + grp.key + Environment.NewLine);
                foreach (var emp in grp.emps)
                {
                    Console.WriteLine("EmpID: " + emp.empID + " EmpName: " +
                        emp.empName + " Salary: " + emp.salary + " DeptID: " +
                        emp.deptID + " DeptName: " + emp.deptName + Environment.NewLine);
                }
            }
        });

        // Tasks created with "new Task" do not run until Start is called.
        t1.Start();
        t2.Start();
        t3.Start();
        t4.Start();
        t5.Start();
        t6.Start();
        t7.Start();

        Task[] tArray = { t1, t2, t3, t4, t5, t6, t7 };
        try
        {
            // Block until all seven tasks have finished.
            Task.WaitAll(tArray);
            if (t4.IsCompleted)
            {
                var result = t4.Result.ToString();
                Console.WriteLine(result);
            }

            // Run the three remaining operations in parallel and wait for them.
            Parallel.Invoke(() =>
            {
                p.ParallelAction1();
            },
            () =>
            {
                var result = p.ParallelAction2();
                Console.WriteLine(result);
            },
            () =>
            {
                Console.WriteLine("The factorial is: " + p.ParallelAction3(Convert.ToInt32(4)));
            });
        }
        catch (AggregateException ae)
        {
            // Task.WaitAll and Parallel.Invoke report failures as an AggregateException.
            // Rethrow with the original failures attached so that the caller can inspect them.
            throw new InvalidOperationException("A parallel operation failed.", ae.Flatten());
        }
    };
    try
    {
        runAsynchronous();
    }
    catch (Exception e)
    {
        // InnerException can be null, so fall back to the exception itself.
        Console.WriteLine((e.InnerException ?? e).ToString().Trim());
    }
    finally
    {
        watch.Stop();
        Console.WriteLine("Execution time: " + watch.ElapsedMilliseconds);
    }
    Console.ReadKey();
}
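Execute the code using the runSynchronous action and note the execution time, then do the same with the runAsynchronous action. (Main as listed calls runAsynchronous(); replace that call with runSynchronous() for the synchronous run.) There is a substantial difference in execution time between the two approaches. Going by the Thread.Sleep calls alone, the synchronous run waits roughly 22 seconds (four 2-second string functions, two 3-second actions and an 8-second factorial), whereas the asynchronous run needs roughly 10 seconds, provided enough thread-pool threads are available: about 2 seconds for the tasks, which overlap, plus about 8 seconds for Parallel.Invoke, which is bounded by its slowest action.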
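The comparison can also be reproduced on a smaller scale. The sketch below is not the article's code: TimingSketch, SlowEcho and the 200 ms delay are assumptions made for illustration. It runs three slow calls first sequentially and then as tasks, and prints both timings.

```csharp
using System;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public static class TimingSketch
{
    // Hypothetical stand-in for a slow operation: sleeps, then returns its input.
    public static string SlowEcho(string text, int delayMs)
    {
        Thread.Sleep(delayMs);
        return text;
    }

    // Runs the three calls one after another on the calling thread.
    public static string[] RunSequentially(int delayMs)
    {
        return new[]
        {
            SlowEcho("a", delayMs),
            SlowEcho("b", delayMs),
            SlowEcho("c", delayMs)
        };
    }

    // Starts the three calls as tasks and waits for all of them.
    public static string[] RunWithTasks(int delayMs)
    {
        Task<string> a = Task.Factory.StartNew(() => SlowEcho("a", delayMs));
        Task<string> b = Task.Factory.StartNew(() => SlowEcho("b", delayMs));
        Task<string> c = Task.Factory.StartNew(() => SlowEcho("c", delayMs));
        Task.WaitAll(a, b, c);
        return new[] { a.Result, b.Result, c.Result };
    }

    public static void Main()
    {
        var watch = Stopwatch.StartNew();
        RunSequentially(200);
        Console.WriteLine("Sequential: " + watch.ElapsedMilliseconds + " ms");

        watch.Restart();
        RunWithTasks(200);
        Console.WriteLine("Tasks: " + watch.ElapsedMilliseconds + " ms");
    }
}
```

The exact numbers depend on the machine and on how many thread-pool threads are free, so the sketch prints the timings rather than assuming them.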
Performing operations asynchronously using the Task class is preferable in most scenarios, since it offers a high-level API for controlling the work done on threads. If lower-level, more precise control is required, the Thread class can be used. The Parallel class also provides methods for invoking operations in parallel, but it is a less suitable choice when the order of execution matters: with Parallel.Invoke above, the order in which the operations run is decided by the scheduler and is not guaranteed. It is therefore best avoided in scenarios where the operations depend on one another and could deadlock, because such dependencies then have to be handled manually. In any case, this article just scratches the surface of parallel programming in .NET; there are many more advanced scenarios and techniques for dealing with them.
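The point about ordering can be illustrated with a small sketch. It is an illustration under assumptions, not part of the article's source: InvokeOrdering and its methods are hypothetical names, and chaining with ContinueWith is shown as one possible way to enforce an order, not as the only one.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class InvokeOrdering
{
    // Runs three actions with Parallel.Invoke and records each one as it runs.
    // All three are recorded, but their relative order is not guaranteed.
    public static string[] RunUnordered()
    {
        var log = new ConcurrentQueue<string>();
        Parallel.Invoke(
            () => log.Enqueue("first"),
            () => log.Enqueue("second"),
            () => log.Enqueue("third"));
        return log.ToArray();
    }

    // Chains the steps with ContinueWith so that each one starts
    // only after the previous one has finished.
    public static string[] RunOrdered()
    {
        var log = new ConcurrentQueue<string>();
        Task chain = Task.Factory.StartNew(() => log.Enqueue("first"))
            .ContinueWith(_ => log.Enqueue("second"))
            .ContinueWith(_ => log.Enqueue("third"));
        chain.Wait();
        return log.ToArray();
    }

    public static void Main()
    {
        Console.WriteLine("Parallel.Invoke: " + string.Join(", ", RunUnordered()));
        Console.WriteLine("ContinueWith:    " + string.Join(", ", RunOrdered()));
    }
}
```

The chained version gives up parallelism between the three steps in exchange for a fixed order, so it only makes sense when the steps really do depend on one another.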
Points of Interest
When I started with parallel programming in .NET, it took me a while to get my head around the concept. But now I know it was worth the time, since it comes in very handy when performing long-running operations such as file handling, network connections and asynchronous I/O.