Introduction
For the sake of an example, suppose you have a database in which you track employees, departments, and the relationships between employees and departments. The schema might look something like this:
An employee can be assigned to multiple departments, and each department can have multiple employees assigned to it.
It is easy to think of scenarios in which you might want a query that returns a data set that looks like this:
Departments per Employee
Alice | Accounting |
Bill | Development, Operations, Research |
Dinah | Operations, Research |
... or a data set that looks like this:
Employees per Department
Accounting | Alice |
Development | Bill |
Operations | Bill, Dinah |
Quality Assurance | |
Research | Bill, Dinah |
The SELECT
statement to return these types of data sets should be simple and the syntax should be intuitive. For example:
SELECT
Department.Name AS Department
, toolbox.ConcatenateText(Employee.Name, ', ') AS Employees
FROM
Department
LEFT JOIN Assignment ON Department.ID = Assignment.DepartmentID
LEFT JOIN Employee ON Assignment.EmployeeID = Employee.ID
GROUP BY
Department.Name
ORDER BY
Department.Name
Microsoft SQL Server does not provide an aggregate function to concatenate string
values, so you need to create your own.
Background
What is an Aggregate Function?
An aggregate function performs a calculation on a set of values and returns a single value. Except for COUNT
, aggregate functions ignore null
values. Aggregate functions are frequently used with the GROUP BY
clause of the SELECT
statement.
For more information on build-in aggregate functions, refer to this article in the MSDN documentation.
You can create a database object inside SQL Server that is programmed in a CLR assembly. Database objects that can leverage the rich programming model provided by the CLR include triggers, stored procedures, functions, aggregate functions, and types. Like the built-in aggregate functions provided in Transact-SQL, user-defined aggregate functions perform a calculation on a set of values and return a single value.
For general information on user-defined aggregate functions, refer to this article in the MSDN documentation.
Using the Code
Download and unzip the attached source code. Open the SQL script named Install.sql and change the variables DatabaseName
and AssemblyPath
to match your environment. Make sure SQLCMD
mode is enabled and then execute the script. The script enables CLR modules in your database, creates a schema named "toolbox
", and adds an aggregate function named ConcatenateText
.
The aggregate function accepts two parameters:
- The name of the input column
- The
string
to be used as a value-separating delimiter
For example:
SELECT
Employee.Name AS EmployeeName
, toolbox.ConcatenateText(Department.Name, ', ') AS DepartmentNames
FROM
Employee
INNER JOIN Assignment ON Employee.ID = Assignment.EmployeeID
INNER JOIN Department ON Assignment.DepartmentID = Department.ID
GROUP BY
Employee.Name
Here the separator is a comma followed by a blank space, so the aggregate function for a given employee name returns a comma-separated list of department names that looks like this:
Accounting, Engineering, Operations
As an added bonus, the aggregate function ensures your comma-separated output value is sorted and contains no duplicates.
Points of Interest
The C# code for the custom aggregate function is in the Concatenate
class. The class is small and relatively simple, containing only about 120 lines of code.
The Accumulate
method is the key to how the code works. This method is invoked for each of the values in your data set; it appends the values to a string
builder and adds a separator when needed.
public void Accumulate(SqlString value, SqlString separator)
{
if (!separator.IsNull && separator.Value.Length > 0)
_separator = separator.Value;
if (_accumulator.Length > 0)
_accumulator.Append(_separator);
if (!value.IsNull && value.Value.Length > 0)
{
_accumulator.Append(value.Value);
IsNull = false;
}
}
SQL Server invokes the Terminate
method after the accumulation is complete. At this stage, the code sorts the array of values and then eliminates duplicates and empty entries.
public SqlString Terminate()
{
if (IsNull)
return SqlString.Null;
String result = _accumulator.ToString();
String[] array = result.Split(new[] {_separator}, StringSplitOptions.RemoveEmptyEntries);
Array.Sort(array);
StringBuilder joinedWithoutDuplicates = new StringBuilder();
for (Int32 i = 0; i < array.Length; i++)
{
String currentItem = array[i];
if (i == 0 || !String.Equals(array[i - 1], currentItem, StringComparison.CurrentCulture))
{
if (i != 0)
joinedWithoutDuplicates.Append(_separator);
joinedWithoutDuplicates.Append(currentItem);
}
}
return new SqlString(joinedWithoutDuplicates.ToString());
}
History
- July 27, 2015: Posted the first version