As I noted in my comment your input data is not "clean" in that there are two check in times for 2016-04-11
EnrollNumber DateInOut
15045 2016-04-11 22:00:00
15045 2016-04-11 23:00:00
15045 2016-04-12 06:00:00
15045 2016-04-12 22:00:00
15045 2016-04-13 06:00:00
Given your expected results it would appear that you want to ignore the row highlighted above.
First you need to clean your data...I've made an assumption that any time difference of less than a quarter of the overall shift time is anomalous data. This query will produce a "cleaned" table of information:
DECLARE @CheckTime int = (SELECT DATEDIFF(hh, CAST(ToTime AS DateTime), CAST(FromTime AS DateTime)) FROM [shift] ) / 4
IF OBJECT_ID('tempdb..#cleaned') IS NOT NULL DROP TABLE #cleaned
;WITH clean AS
(
SELECT EnrollNumber, DateInOut AS DateIn,
LAG(DateInOut) OVER (PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateInOut) DatePrev
FROM tblCheckInOut
)
SELECT DISTINCT EnrollNumber,
CASE WHEN DATEDIFF(hh, DatePrev, DateIn) < @CheckTime THEN DatePrev ELSE DateIn END AS DateInOut
INTO #cleaned
FROM clean T1
Table
#cleaned
now contains
EnrollNumber DateInOut
15045 2016-04-11 22:00:00
15045 2016-04-12 06:00:00
15045 2016-04-12 22:00:00
15045 2016-04-13 06:00:00
The following query will then present the data in the format you require:
;with q AS
(
SELECT EnrollNumber, DateInOut AS DateIn,
LEAD(DateInOut) OVER (PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateInOut) DateOut,
ROW_NUMBER() OVER (PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateInOut) rn
FROM #cleaned t
)
SELECT * FROM q
WHERE DateOut IS NOT NULL AND rn%2<>0
ORDER BY EnrollNumber, DateIn
Note that it will work for multiple
EnrollNumber
values in the source data, but it does assume that the first entry is a check-in-to-shift time.
If you do not have a version of SQL Server that handles LAG/LEAD (post SQL Server 2012, not the Express version, or 2014 Express) then you have to use row numbers and CTEs as follows:
To clean the data
DECLARE @CheckTime int = (SELECT DATEDIFF(hh, CAST(ToTime AS DateTime), CAST(FromTime AS DateTime)) FROM [shift] ) / 4
IF OBJECT_ID('tempdb..#cleaned') IS NOT NULL DROP TABLE #cleaned
;WITH clean AS
(
SELECT EnrollNumber, DateInOut AS DateIn,
ROW_NUMBER() OVER (PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateInOut) rn
FROM tblCheckInOut
), clean2 AS
(
SELECT DISTINCT clean.EnrollNumber, clean.DateIn, prev.DateIn as DatePrev, clean.rn
FROM clean
LEFT OUTER JOIN clean prev ON prev.rn = clean.rn - 1
)
SELECT DISTINCT EnrollNumber, CASE WHEN DATEDIFF(hh, DatePrev, DateIn) < @CheckTime THEN DatePrev ELSE DateIn END AS DateInOut
INTO #cleaned
FROM clean2 T1
To get the results:
;WITH q AS
(
SELECT EnrollNumber, DateInOut AS DateIn
,ROW_NUMBER()
OVER(PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateInOut) AS rn
FROM #cleaned
)
, q2 AS
(
SELECT DISTINCT q.EnrollNumber, q.DateIn, nxt.DateIn as DateOut, q.rn
FROM q
LEFT OUTER JOIN q nxt ON nxt.rn = q.rn + 1
)
SELECT *
FROM q2
WHERE DateOut IS NOT NULL AND rn%2<>0
[EDIT again .... more than 1 shift]
===============================================================
Some corrections to the above to handle the second set of shift data.
The
@CheckTime
"calculation" needs to change slightly
DECLARE @CheckTime int = (SELECT TOP 1 DATEDIFF(hh, CAST(ToTime AS DateTime), CAST(FromTime AS DateTime)) FROM [shift] ) / 4
Alternatively this value could be hard-coded.
The real problem is trying to tidy up the raw data
EnrollNumber DateInOut
15045 2016-04-11 22:00:00 - Clocks IN night shift
15045 2016-04-11 23:00:00 - Clocks OUT? Anomalous entry
15045 2016-04-12 06:00:00 - Clocks OUT night shift
15045 2016-04-12 22:00:00 - Clocks IN night shift
15045 2016-04-13 06:00:00 - Clocks OUT night shift
15045 2016-04-14 06:00:00 - Clocks IN day shift
15045 2016-04-14 22:00:00 - Clocks OUT day shift
15045 2016-04-15 06:00:00 - Clocks IN day shift
15045 2016-04-18 09:10:00 - Clocks OUT? Anomalous entry
15045 2016-04-18 18:00:00 - CLOCKS OUT? Anomalous entry
15045 2016-04-18 20:00:00 - CLOCKS OUT
We have no identifier as to what shift the employee is on, and we can't expect them to always clock in or out "on time" exactly, so in this method for cleaning the data (very similar to the one above) I've assumed that the first entry for any shift is the correct one - i.e. on 18-Apr-2016 they clocked out before the end of their shift.
IF OBJECT_ID('tempdb..#cleaned') IS NOT NULL DROP TABLE #cleaned
;WITH clean AS
(
SELECT EnrollNumber, DateInOut AS DateIn,
ROW_NUMBER() OVER (PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateInOut) rn
FROM tblCheckInOut
)
select c1.EnrollNumber, c1.DateIn, c1.rn, c2.DateIn as prevDate, DATEDIFF(hh,C2.DateIn,C1.DateIn) AS diff
INTO #cleaned
from clean c1
LEFT OUTER JOIN clean c2 ON c2.rn = c1.rn - 1 AND c1.EnrollNumber=c2.EnrollNumber
WHERE ISNULL(DATEDIFF(hh, C2.DateIn, C1.DateIn),24) > @CheckTime
Then the query to get the data out is a little neater than my earlier version...
;WITH q AS
(
SELECT EnrollNumber, DateIn AS DateIn
,ROW_NUMBER()
OVER(PARTITION BY EnrollNumber ORDER BY EnrollNumber, DateIn) AS rn
FROM #cleaned
)
SELECT DISTINCT q.EnrollNumber, q.DateIn, nxt.DateIn as DateOut
FROM q
LEFT OUTER JOIN q nxt ON nxt.rn = q.rn + 1
WHERE nxt.DateIn IS NOT NULL AND q.rn % 2 = 1
ORDER BY EnrollNumber, DateIn
If you can find some way of not having anomalous entries in the raw data then just substitute your original table for #cleaned in the final query - it is the (lack of) quality data that is making it more complicated