Introduction
This topic gives you a new approach on how to organize your data into tables that will help you generate large "Crosstab" Reports using SQL Server 2000/2005. This is a custom solution and does not use and PIVOT/UNPIVOT options which are available in SQL Server 2005.
Background
Recently, I came cross a new problem where I needed to generate a crosstab report that had to pull data for 48 months back in time from the current month, where the report had to calculate and give various amounts like the "Year To Date," "Advances," "Paid Amount," "Recovered Amount," "Adjustment Amount" and "Balance" for each month. This information had to be provided for each "Service Provider." In short, I needed to create a report with data spanning across 250 columns as a crosstab report. Please do not look at the numbers and sum the totals. They will not match. The data tables are given to support the concept and approach. Given below is a sample report format for a period of three months:
Service Providers' Total | YTD ($) | Adv. Amt. | Paid Amt. | Rec. Amt. | Adj. Amt. | Bal. Amt. | Adv. Amt. | Paid Amt. | Rec. Amt. | Adj. Amt. | Bal. Amt. |
| Periods | 1-Jan-03 | 1-Jan-03 | 1-Jan-03 | 1-Jan-03 | 1-Jan-03 | 1-Dec-07 | 1-Dec-07 | 1-Dec-07 | 1-Dec-07 | 1-Dec-07 |
P1001
| 1312 | 800 | 150 | 12 | 300 | 362 | 1300 | 500 | 100 | 300 | 600 |
P1002
| 4080 | 800 | 200 | 0 | 20 | 580 | 4000 | 700 | 350 | 0 | 3650 |
P1003 | 8605 | 800 | 300 | 300 | 20 | 780 | 9000 | 1900 | 200 | 0 | 7300 |
P1004
| 3325 | 800 | 100 | 0 | 75 | 625 | 3000 | 2500 | 1000 | 0 | 1500 |
P1005
| 8105 | 800 | 125 | 0 | 20 | 655 | 7000 | 100 | 0 | 900 | 6000 |
This basic requirement makes the report a bit complicated to generate, as we are still using SQL 2000 and the PIVOT/UNPIVOT option is not available out of the box. The other things that make this report complex are:
- Unavailability of XML data types and XML data manipulation
- Unavailability of Object data types as Column types, as available in Oracle 8 and above
The Solution
If not the XML data types, the Object data types kind of violate the First Normal Form (1NF) where a relation must contain only atomic values in each row and column. This relationship of an Object data type column with the primary key is a one to many relationship. We can also see this as a nested table scenario where the primary key maps to an in-memory object which is a table by itself.
Keeping the basics in mind, I kind of looked for several options to implement a solution without writing 1000 of lines of T-SQL Code. Finally I arrived at this solution below.
Let us see this approach through an example before we start to look at the actual SQL Tables, Stored Procedures and Functions that I created to generate this report.
The most important concept that helped solve this issue is "Pointers." So, what is a Pointer? A Pointer is an address of a memory location where the "Data" is present. In a "C/C++ Language" perspective this data is an Instance of a "Structure," a "Class" or any other basic data type.
Other basic concepts of RDBMS that helped solve this issue are the "Unique Index" and the "Primary Keys."
By tying these two programming constructs and database concepts together, I have implemented this approach in getting the "Crosstab" report.
Source Data for the Report
Given below is a summary of the structured data that needs to be presented as a Crosstab report as mentioned above. In reality the different amount/values come from three different sources/tables and not from a single table. To keep it simple, I have listed all different amount/ $value columns along with their provider and process period (2007/12 to 2007/07) in one single table. This table is a single table representation of tblClaims
table that you will see in the example.
S. No | Service Provider | Process Period | Advance | Paid | Received | Adjustment |
1 | P1001 | 200712 | $48.39 | $138.61 | $187.00 | $30.00 |
2 | P1002 | 200712 | $142.97 | $107.03 | $250.00 | $30.00 |
3 | P1003 | 200712 | $40.46 | $76.54 | $117.00 | $128.00 |
4 | P1004 | 200712 | $11.83 | $100.17 | $112.00 | $9.05 |
5 | P1005 | 200712 | $9.88 | $145.12 | $155.00 | $9.05 |
6 | P1001 | 200711 | $3.00 | $23.00 | $26.00 | $89.35 |
7 | P1002 | 200711 | $1,210.05 | $689.95 | $1,900.00 | $8.15 |
8 | P1003 | 200711 | $9.05 | $20.95 | $30.00 | $113.60 |
9 | P1004 | 200711 | $9.05 | $20.95 | $30.00 | $0.13 |
10 | P1005 | 200711 | $9.05 | $20.95 | $30.00 | $51.18 |
11 | P1001 | 200710 | $89.35 | $38.65 | $128.00 | $39.84 |
12 | P1002 | 200710 | $8.15 | $20.45 | $28.60 | $19.27 |
13 | P1003 | 200710 | $18.88 | $39.12 | $58.00 | $519.00 |
14 | P1004 | 200710 | $28.73 | $19.27 | $48.00 | $69.00 |
15 | P1005 | 200710 | $9.44 | $19.56 | $29.00 | $163.00 |
16 | P1001 | 200709 | $28.73 | $19.27 | $48.00 | $36.40 |
17 | P1002 | 200709 | $126.18 | $223.82 | $350.00 | $113.60 |
18 | P1003 | 200709 | $82.74 | $287.26 | $370.00 | $0.13 |
19 | P1004 | 200709 | $192.06 | $265.44 | $457.50 | $51.18 |
20 | P1005 | 200709 | $960.32 | $869.68 | $1,830.00 | $39.84 |
21 | P1001 | 200708 | $19.09 | $135.91 | $155.00 | $126.18 |
22 | P1002 | 200708 | $3.00 | $23.00 | $26.00 | $82.74 |
23 | P1003 | 200708 | $57.67 | $461.33 | $519.00 | $192.06 |
24 | P1004 | 200708 | $48.39 | $20.61 | $69.00 | $20.61 |
25 | P1005 | 200708 | $49.40 | $113.60 | $163.00 | $113.60 |
26 | P1001 | 200707 | $36.27 | $0.13 | $36.40 | $0.13 |
27 | P1002 | 200707 | $22.82 | $51.18 | $74.00 | $56.66 |
28 | P1003 | 200707 | $35.16 | $39.84 | $75.00 | $22.82 |
29 | P1004 | 200707 | $33.91 | $26.09 | $60.00 | $5.49 |
30 | P1005 | 200707 | $98.17 | $73.83 | $172.00 | $5.49 |
Implementation Details to this Approach
As mentioned in the problem definition, this report needs to present 6 different amount/$ values for a period of 48 months. It is not too hard to build a table with so many columns (250+) and pretty much most of the RDBMS like SQL Server support this and bring the same back into reports. But the basic idea is to PIVOT/Transform the data of "Rows of Process Period" into Data of "Process Period Columns." The way I have implemented this solution is by using "Concepts and Language Constructs," as I have mentioned above in the "The Solution" section of this article.
As part of the solution I had to create two tables. The First table is called CrosstabXRef
. The structure of this table is given below:
Table Name: CrosstabXRef
S. No | Column Name | Data type | Primary Key | Is Null |
1 | SvcProvider | char (10) | YES (Unique Index) | NO |
2 | RecGeneratedDate | datetime | NO | NO |
3 | Period48 | uniqueidentifier | NO | YES |
4 | Period47 | uniqueidentifier | NO | YES |
5 | Period46 | uniqueidentifier | NO | YES |
6 | … | uniqueidentifier | NO | YES |
7 | … | uniqueidentifier | NO | YES |
8 | … | uniqueidentifier | NO | YES |
9 | Period02 | uniqueidentifier | NO | YES |
10 | Period01 | uniqueidentifier | NO | YES |
Note the data type of the columns (Period48, Period47, Period46, ..., Period02, Period01). All these columns are of type uniqueidentifier
that will hold a unique value which maps to a record in the table CrosstabLookup
given below.
Table Name: CrosstabLookup
S. No | Column Name | Data type | Primary Key | Is Null |
1 | PeriodIdentifier | uniqueidentifier | YES | NO |
2 | Advance | numeric | NO | YES |
3 | Paid | numeric | NO | YES |
4 | Received | numeric | NO | YES |
5 | Adjustment | numeric | NO | YES |
6 | Balance | numeric | NO | YES |
So, how are these two tables related? The answer is on the data type of the Period##
columns that are there in these two tables. If you have guessed it correctly, yes the GUID
value that is returned as part of the =NEWID ()
method is stored as data values of these uniqueidentifier
data typed columns.
When a record is present as part of the "Claims" payment and adjustment selection of a given "Period" and "Service Provider" an entry will be made into this CrosstabLookup
table. And this process will be continued for the selected "48" months window. If there is no activity for a particular "Svc Provider" for a specific "Period" then no record will be inserted into this lookup table. But that situation is unlikely to happen as Advance is paid to the "Svc Provider" every month.
So in a nutshell, the data in the database will look as depicted in the diagram below:
For achieving this, I have to create two DB Objects, one Stored procedure to construct the above mentioned two tables and one function to return the data stored in the CrosstabLookup
based on the uniqueidentifier
and an Index.
The SQL code block is the Stored Procedure that I have created which will load the data into CrosstabLookup
and CrosstabXRef
. The name of this stored procedure is spProcessClaimsCrosstabMaping
.
CREATE PROCEDURE dbo.spProcessClaimsCrosstabMaping
AS
SET CONCAT_NULL_YIELDS_NULL OFF
Declare @SvcProv varchar(10)
Declare @Advance numeric
Declare @PaidAmt numeric
Declare @RecAmt numeric
Declare @AdjAmt numeric
Declare @GUID uniqueidentifier
Declare @DynamicSQL varchar(3000)
Declare @ProcessPeriod numeric
Declare @YearPart Varchar(4)
Declare @MonthPart Varchar(2)
Declare @CharYear Varchar(6)
Declare @StartDate datetime
Declare @EndDate datetime
DECLARE SvcProvCursor CURSOR READ_ONLY
FOR
Select distinct ServiceProvider from tblClaims
OPEN SvcProvCursor
FETCH NEXT FROM SvcProvCursor INTO @SvcProv
WHILE @@FETCH_STATUS = 0
BEGIN
SET @EndDate = GETDATE()
SET @StartDate = DATEADD(mm,-12, @EndDate)
SET @DynamicSQL = ' INSERT INTO dbo.CrosstabXRef ' + ' values (''' + _
@SvcProv + ''',''' + CAST(@StartDate AS Varchar(20)) + ''','
WHILE (@StartDate <= @EndDate)
BEGIN
SET @YearPart = CAST(DATEPART(yyyy, @StartDate) AS VARCHAR(4))
SET @MonthPart = CAST(DATEPART(mm, @StartDate) AS VARCHAR(2))
IF LEN(@MonthPart) = 1
SET @MonthPart = '0' + @MonthPart
SET @CharYear = @YearPart + @MonthPart
SET @ProcessPeriod = CAST(@CharYear AS numeric)
SET @GUID = NULL
Select @Advance = Advance ,
@PaidAmt = Paid , @RecAmt = Received,
@AdjAmt = Adjustment
from tblClaims tc where tc.ProcessDate = @ProcessPeriod and
ServiceProvider=@SvcProv
if @@ROWCOUNT = 1
BEGIN
SET @GUID = NEWID()
INSERT INTO dbo.CrosstabLookup
VALUES
(@GUID, @Advance, @PaidAmt, @RecAmt, @AdjAmt)
SET @DynamicSQL = @DynamicSQL + '''' + _
CAST(@GUID AS varchar(64)) + ''''
END
ELSE
BEGIN
SET @DynamicSQL = @DynamicSQL + 'NULL'
END
if (@StartDate <> @EndDate)
BEGIN
IF (@GUID <> NULL)
SET @DynamicSQL = @DynamicSQL + ''','
ELSE
SET @DynamicSQL = @DynamicSQL + ','
END
SET @StartDate = DATEADD(mm,1, @StartDate)
END
IF (@GUID <> NULL)
SET @DynamicSQL = @DynamicSQL + ''')'
ELSE
SET @DynamicSQL = @DynamicSQL + ')'
PRINT 'WORKING....!!'
EXEC (@DynamicSQL)
FETCH NEXT FROM SvcProvCursor INTO @SvcProv
END
CLOSE SvcProvCursor
PRINT 'FINISHED PROCESSING...!!'
GO
The second function that I have created to present the data is called fnClaimsCrosstabHpr
. This will help us get the $ Values based on the PeriodIdentifier
, which is indirectly mapped to a Service Provider.
CREATE FUNCTION dbo.fnClaimsCrosstabHpr(@PeriodId uniqueidentifier, @ColumnIndes int)
RETURNS float
AS
BEGIN
Declare @Amount float
if (@ColumnIndes = 1) Select @Amount=_
Advance from CrosstabLookup where PeriodIdentifier = @PeriodId
if (@ColumnIndes = 2) Select @Amount=_
Paid from CrosstabLookup where PeriodIdentifier = @PeriodId
if (@ColumnIndes = 3) Select @Amount=_
Received from CrosstabLookup where PeriodIdentifier = @PeriodId
if (@ColumnIndes = 4) Select @Amount=_
Adjustment from CrosstabLookup where PeriodIdentifier = @PeriodId
RETURN(@Amount)
END
The next step is to execute the Stored Procedure to populate the data into CrosstabLookup
and CrosstabXRef
tables. Once this is complete, the next step is to generate the report itself. Given below is a sample query that will help generate a table of data for the Crosstab
report. This data will be directly exported to Excel for further analysis.
select SvcProvider, RecGeneratedDate,
dbo.fnClaimsCrosstabHpr(Period48, 1) AS Advance_48,
dbo.fnClaimsCrosstabHpr(Period48, 2) AS Paid_48,
dbo.fnClaimsCrosstabHpr(Period48, 3) AS Received_48,
dbo.fnClaimsCrosstabHpr(Period48, 4) AS Adjustment_48,
dbo.fnClaimsCrosstabHpr(Period47, 1) AS Advance_47,
dbo.fnClaimsCrosstabHpr(Period47, 2) AS Paid_47,
dbo.fnClaimsCrosstabHpr(Period47, 3) AS Received_47,
dbo.fnClaimsCrosstabHpr(Period47, 4) AS Adjustment_47,
dbo.fnClaimsCrosstabHpr(Period01, 1) AS Advance_01,
dbo.fnClaimsCrosstabHpr(Period01, 2) AS Paid_01,
dbo.fnClaimsCrosstabHpr(Period01, 3) AS Received_01,
dbo.fnClaimsCrosstabHpr(Period01, 4) AS Adjustment_01
from CrosstabXRef
Attachments
Attached is a copy of the SQL Script and SQL Objects that are listed as part of this article for download.
Conclusion
For writing this article, I had to make few changes to the actual tables and SQL DB objects. I hope the changes I have made have not caused differences making this document unclear to read. If you have any questions please write to me. I will try to address them.
Enjoy learning by reading.