Recently, I answered one of the SQL based questions on CodeProject. Thanks to the participation, that question inspired me to write this blog post.
Question
He has a table (tblLeave
) with data like below:
PAYCODE LV_TYPE FROM_DATE TO_DATE LVALUE
5023 SL 14/12/2012 0:00 14/12/2012 0:00 1
5023 SL 15/12/2012 0:00 15/12/2012 0:00 1
5023 COF 16/12/2012 0:00 16/12/2012 0:00 1
5023 SL 19/12/2012 0:00 19/12/2012 0:00 1
5023 SL 22/12/2012 0:00 22/12/2012 0:00 1
5023 SL 23/12/2012 0:00 23/12/2012 0:00 1
5023 SL 24/12/2012 0:00 24/12/2012 0:00 1
5023 PL 28/12/2012 0:00 28/12/2012 0:00 1
5023 PL 29/12/2012 0:00 29/12/2012 0:00 1
5023 PL 30/12/2012 0:00 30/12/2012 0:00 1
5023 PL 31/12/2012 0:00 31/12/2012 0:00 1
And he wants to output the data as below:
PAYCODE LV_TYPE FROM_DATE TO_DATE LVALUE
5023 SL 14/12/2012 0:00 15/12/2012 0:00 2
5023 COF 16/12/2012 0:00 16/12/2012 0:00 1
5023 SL 19/12/2012 0:00 19/12/2012 0:00 1
5023 SL 22/12/2012 0:00 24/12/2012 0:00 3
5023 PL 28/12/2012 0:00 31/12/2012 0:00 4
Condition: If same type of leave is taken continuously, it should be merged in one row mentioning from_date
to to_date
.
Answer
When I saw that question for the first time, I didn’t go through in detail and just thought that was an easy grouping query. And I just gave the following answer:
SELECT LV_TYPE,LV_TYPE,MIN(FROM_DATE) AS FROM_DATE ,MAX(TO_DATE) AS TO_DATE, COUNT(LVALUEP) AS LVALUE
FROM tblLeave
GROUP BY PAYCODE,LV_TYPE
But that’s wrong, he made comments saying it doesn’t make sense and highlighted the condition he wants. (Thanks to him; he didn’t down vote my answer). Again, I read the question… Oh… that was a tricky question. He needs to group the leave by consecutive date. Isn’t that tricky?
To answer that, I use the DATEDIFF SQL
function:
DATEDIFF ( datepart , startdate , enddate )
http://msdn.microsoft.com/en-us/library/ms189794.aspx
Following is my answer and the output:
SELECT PAYCODE,LV_TYPE, MIN(FROM_DATE) AS FROM_DATE,
MAX(FROM_DATE) AS TO_DATE, COUNT('A') AS LVALUE
FROM (
SELECT PAYCODE,LV_TYPE,FROM_DATE,
DATEDIFF(D, ROW_NUMBER() OVER(ORDER BY FROM_DATE), FROM_DATE) AS Diff
FROM tblLeave) AS dt
GROUP BY PAYCODE,LV_TYPE, Diff
ORDER BY FROM_DATE
PAYCODE LV_TYPE FROM_DATE TO_DATE LVALUE
----------- ------- ----------------------- ----------------------- -----------
5023 SL 2012-12-14 00:00:00.000 2012-12-15 00:00:00.000 2
5023 COF 2012-12-16 00:00:00.000 2012-12-16 00:00:00.000 1
5023 SL 2012-12-19 00:00:00.000 2012-12-19 00:00:00.000 1
5023 SL 2012-12-22 00:00:00.000 2012-12-24 00:00:00.000 3
5023 PL 2012-12-28 00:00:00.000 2012-12-31 00:00:00.000 4
Query Explanation
Before explaining the logic, take a look at the following query and the output.
SELECT PAYCODE,LV_TYPE,FROM_DATE,
ROW_NUMBER() OVER(ORDER BY FROM_DATE) AS ROW_NUMBER,
DATEDIFF(D, ROW_NUMBER() OVER(ORDER BY FROM_DATE), FROM_DATE) AS Diff
FROM tblLeave
PAYCODE LV_TYPE FROM_DATE ROW_NUMBER Diff
----------- ------- ----------------------- -------------------- -----------
5023 SL 2012-12-14 00:00:00.000 1 41254
5023 SL 2012-12-15 00:00:00.000 2 41254
5023 COF 2012-12-16 00:00:00.000 3 41254
5023 SL 2012-12-19 00:00:00.000 4 41256
5023 SL 2012-12-22 00:00:00.000 5 41258
5023 SL 2012-12-23 00:00:00.000 6 41258
5023 SL 2012-12-24 00:00:00.000 7 41258
5023 PL 2012-12-28 00:00:00.000 8 41261
5023 PL 2012-12-29 00:00:00.000 9 41261
5023 PL 2012-12-30 00:00:00.000 10 41261
5023 PL 2012-12-31 00:00:00.000 11 41261
By seeing this, you will realize that above query generates the same Diff
value for all the consecutive dates. Now, you can easily group this and get the counts as you like.