Introduccion
A new feature that has the new version of SQL Server 2008, is the new GROUPING SETS operator, which already existed in other database engines such as Oracle, which allows to combine different clustering queries in a single query.
It's good to clarify that the GROUPING SETS operator is an extension of the standard clause GROUP BY. When not required all possible groupings that are generated using
ROLLUP
or CUBE
operator (which existed in SQL Server 2005), use GROUPING SETS to specify only the groups you want. So, thanks to GROUPING SETS obtain the desired levels of grouping and also it returns the subtotal for each subset of grouping.
Grouping Sets
In short we could say that GROUPING SET is more comprehensive and generic
ROLLUP
or CUBE
. Generally queries that use this type of operators are related to data analysis, reporting and everything related to the world Intelegencias Business.
I remind you that this new operator is not doing magic and solve a problem that could not solve, but rather it simplifies and optimizes our inquiries, that's what matters it, but I want to clarify that without using this operator we could generate or find same results in SQL Server, but not the performance that brings.
Using the code
Let's try a query, as always will use the test database AdventureWorks SQL Server for this example:
select costytd, Name, CountryRegionCode from Sales.SalesTerritory
The first thing that comes to mind to solve the grouping is almost always make a query using
UNION ALL
with three different queries, one for each group, would be something like this:
select SUM(costytd) AS Total, Name, CountryRegionCode from Sales.SalesTerritory
group by Name, CountryRegionCode
union all
select SUM(costytd), null, CountryRegionCode from Sales.SalesTerritory
group by Name, CountryRegionCode
union all
select SUM(costytd), null, null from Sales.SalesTerritory
group by Name, CountryRegionCode
But we could fix it using the GROUPING SET operator, would be something like this:
select SUM(costytd) AS Total, Name, CountryRegionCode from Sales.SalesTerritory
groyp by
grouping sets
(
(name, CountryRegionCode),
(CountryRegionCode),
()
)
order by grouping(name), grouping(CountryRegionCode)
Now as you can see in this last example, in addition to using the GROUPING SETS operator, I use the GROUPING
function in the ORDER BY
, this is to sort the result by Name and CountryRegionCode.
This function (which already existed in SQL Server 2005),
indicates whether a column expression specified in a GROUP BY
list is added or not.
GROUPING
returns 1 to 0 for no added and added, in the result set.
Points of Interest
The advantage of using GROUPING SETS is only given to simplify queries syntactically, but also on issues of performance.
I did some tests to monitor the use of resources (SET STATISTICS IO ON) with 90,000 records in my table and these were the results:
Using UNION ALL: Table 'Sales.SalesTerritory'. Scan count 3, logical reads 1728, physical reads 6, read-ahead reads 590, lob logical reads 0,
lob physical reads 0, lob read-ahead reads 0.
Using GROUPING SETS: Table 'Sales.SalesTerritory'. Scan count 1, logical reads 576, physical reads 6,
read-ahead reads 590, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
As you can see, becomes less GROUPING SETS
resource use I / O. This is because using the new feature of SQL Server, the engine needs less data pages read because it makes the calculation of higher-level
aggregation of lower level aggregations.