Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database / SQL-Server

Spilt DateTime Ranges in SQL Server

1.00/5 (1 vote)
1 Jul 2023CPOL2 min read 9.6K   21  
Split date time ranges into days or possibly other ranges using CTE in SQL Server
This code snippet demonstrates how to convert a date range into a set of rows in SQL, including handling overlapping date ranges and splitting ranges into day-wise records. It provides two examples: one for splitting date ranges into individual days and another for splitting ranges into overlapping or related ranges. The code includes recursive CTEs and joins to achieve the desired results efficiently.

Background

I had a case where I had to convert a date range into a set of rows. For example, I had a record saying a product discount from 1st June till 10th June. This single row needs to be transformed into 10 rows, one for each day of sales. How could I do this in a fast and scalable manner, since I had thousands and thousands of records which might result in millions? That's not all.

  • There could be different types of products
  • Overlapping date ranges, leading to more discounts (sum)

Split Ranges to Day Wise Range

Here, we are going to convert the date range to possible day records.

Data

SQL
DECLARE @tblDateRange TABLE (
    Id INT IDENTITY(1, 1),
    TypeId INT,
    FromDateTime DATETIME,
    ToDateTime DATETIME
)
INSERT INTO @tblDateRange (TypeId, FromDateTime, ToDateTime)
VALUES
(1, '2023-01-01 10:00:00.000', '2023-01-01 10:00:00.000'),   --start and end is same
(2, '2023-02-02 00:00:00.000', '2023-02-04 23:59:59.000'),   --start and end is start 
                                                             --and end of a day
(3, '2023-03-05 10:00:00.000', '2023-03-06 23:59:59.000'),   --start in middle of 
                                                             --another date
(4, '2023-04-07 00:00:00.000', '2023-04-08 21:00:00.000'),   --end in middle of 
                                                             --another date
(5, '2023-05-09 11:00:00.000', '2023-05-11 11:00:00.000'),   --start and end in 
                                                             --middle of another date
(6, '2023-06-01 10:00:00.000', '2023-06-01 22:00:00.000');   --start and end in 
                                                             --middle of same date

Split Query

In this split process, we are using recursion to populate new rows inside a CTE.

SQL
WITH 
DateRanges(Id, LevelNo, [Date], DateWiseStartDateTime, DateWiseEndDateTime)
AS
(
    SELECT 
        p.Id,
        1,
        CAST(p.FromDateTime AS DATE),
        p.FromDateTime,
        IIF(DATEADD(SECOND, -1, DATEADD(DAY, DATEDIFF(DAY, 0, p.FromDateTime) + _
        1, 0)) < p.ToDateTime, DATEADD(SECOND, -1, _
        DATEADD(DAY, DATEDIFF(DAY, 0, p.FromDateTime) + 1, 0)),  p.ToDateTime)    
    FROM @tblDateRange p
    UNION ALL
    SELECT 
        c.Id,
        p.LevelNo + 1,
        CAST(DATEADD(DAY, DATEDIFF(DAY, 0, _
             DATEADD(DAY, 1, p.DateWiseStartDateTime)), 0) AS DATE),
        DATEADD(DAY, DATEDIFF(DAY, 0, DATEADD(DAY, 1, p.DateWiseStartDateTime)), 0),
        IIF(c.ToDateTime < DATEADD(DAY, 1, p.DateWiseEndDateTime), _
            c.ToDateTime, DATEADD(DAY, 1, p.DateWiseEndDateTime))
    FROM @tblDateRange c
    JOIN DateRanges p ON c.Id = p.Id
    WHERE DATEADD(DAY, DATEDIFF(DAY, 0, _
          DATEADD(DAY, 1, p.DateWiseStartDateTime)), 0) < _
                  c.ToDateTime    --date wise start datetime < range to datetime
)
SELECT 
    *
FROM DateRanges
ORDER BY Id, LevelNo
OPTION (MAXRECURSION 30000) -- default 100, max 32767, no limit 0

By default, it is going to support 100 recursive calls, and we don't need to use OPTION (MAXRECURSION 30000). If it requires more than that, we need to use that OPTION clause where the max value can be 32767. Use 0 for no limit.

Join Split Rows to Actual Rows

Let's join the split rows with actual data. Here is the new select query in the CTE:

SQL
SELECT 
    d.*,
    r.LevelNo,
    r.Date,
    r.DateWiseStartDateTime,
    r.DateWiseEndDateTime
FROM @tblDateRange d
JOIN DateRanges r ON d.Id = r.Id
ORDER BY d.Id, r.LevelNo
OPTION (MAXRECURSION 30000) -- default 100, max 32767, no limit 0

Limitations

Imagine we have ranges for 10 or 20 years. For each year, it's going to generate about 356 rows each year. The end query will be a bit slow. Another way is splitting them into small ranges based on overlapping and nonoverlapping ranges.

Split Ranges to Overlapping/Related Ranges

Here, we are going to convert the date range to all possible small ranges. Here is an idea:

1                                                                     12
|-------|-----------------------------|---------|-----------|--------|
        2                              6
        |-----------------------------|
                                                8            10
                                                |-----------|
Input
Range       Part
1-12        1
2-6         1
8-10        1


Expected Result
NewRange    SUM of Parts
1-2            1
3-6            2
7-8            1
9-10           2
11-12          1

As we can see, here overlapping and nonoverlapping ranges are available.

Data

SQL
DECLARE @tableDataRanges TABLE(
    Id INT IDENTITY(1, 1),
    TypeId INT,
    FromDateTime DATETIME,
    ToDateTime   DATETIME,
    Points    INTEGER
);
INSERT INTO @tableDataRanges (TypeId, FromDateTime, ToDateTime, Points)
VALUES      
(1, '2023-01-13', '2023-01-20 23:59:59', 20),            --no overlapping, no split
(1, '2023-02-10', '2023-02-20 23:59:59', 10),            --15-20 overlapping, 
                                                         --should be splited in two
(1, '2023-02-15', '2023-02-25 23:59:59', 10),            --15-20 overlapping, 
                                                         --should be splited in two

(2, '2023-02-10 12:00:00', '2023-02-20 23:00:00', 20),   --15-20 overlapping, 
                                                         --should be splited in two
(2, '2023-02-15 06:00:00', '2023-02-25 23:59:59', 20),   --15-20 overlapping, 
                                                         --should be splited in two

(3, '2023-02-10', '2023-02-20 23:59:59', 30),            --has overlapping, but no split
(3, '2023-02-05', '2023-02-25 23:59:59', 20);            --10-20 overlapping, 
                                                         --should be splited in three

Split Query

SQL
WITH 
PosibleStartDateTimes
AS 
(
    SELECT FromDateTime AS StartDateTime, TypeId, 1 AS Cover
    FROM   @tableDataRanges
    UNION ALL
    SELECT DATEADD(SECOND, 1, ToDateTime) AS StartDateTime, TypeId, -1 AS Cover
    FROM  @tableDataRanges
)
,AggregatedStartDateTimes
AS 
(
    SELECT StartDateTime, TypeId, SUM(Cover) AS Cover
    FROM PosibleStartDateTimes
    GROUP BY StartDateTime, TypeId
)
,StartDateTimeToRanges
AS 
(
    SELECT 
        StartDateTime AS FromDateTime,
        TypeId,
        LEAD(StartDateTime) OVER (PARTITION BY TypeId ORDER BY StartDateTime) _
                                  AS ToDateTime,
        SUM(Cover) OVER (PARTITION BY TypeId ORDER BY StartDateTime) AS NumberOfParts
    FROM AggregatedStartDateTimes
)
,PossibleRanges
AS
(
    SELECT 
        FromDateTime,
        DATEADD(SECOND, -1, ToDateTime) AS ToDateTime,
        TypeId,
        NumberOfParts
    FROM StartDateTimeToRanges
    WHERE NumberOfParts > 0
)
SELECT 
    *    
FROM PossibleRanges
--WHERE FromDateTime < ToDateTime --optional
ORDER BY TypeId, FromDateTime;

We have different TypeId data in the table. That's why in the OVER clause, we are using PARTITION BY TypeId partitioning by row type.

Join Split Rows to Actual Rows

Let's join the split range rows with actual data. Here is the new select query in the CTE:

SQL
SELECT 
    d.*,
    r.NumberOfParts,
    r.FromDateTime AS RangeFromDateTime,
    r.ToDateTime AS RangeToDateTime
FROM @tableDataRanges d
JOIN PossibleRanges r ON d.TypeId = r.TypeId _
     AND (d.FromDateTime <= r.FromDateTime AND d.ToDateTime >= r.ToDateTime)
ORDER BY d.Id, r.FromDateTime;

Others

If there is no type difference, simply we can do a query without using PARTITION BY:

SQL
DECLARE @tableRanges TABLE(
    Id INT IDENTITY(1, 1),
    FromDateTime DATETIME,
    ToDateTime   DATETIME,
    Points    INTEGER
);
INSERT INTO @tableRanges (FromDateTime, ToDateTime, Points)
VALUES      
('2018-01-01', '2018-01-31 23:59:59', 80),
('2018-01-07', '2018-01-10 23:59:59', 10),
('2018-01-07', '2018-01-31 23:59:59', 10),
('2018-01-11', '2018-01-31 23:59:59', 5),
('2018-01-25', '2018-01-27 23:59:59', 5),
('2018-02-02', '2018-02-23 23:59:59', 100);

WITH 
PosibleStartDateTimes
AS 
(
    SELECT FromDateTime AS StartDateTime, Points, 1 AS Cover
    FROM   @tableRanges
    UNION ALL
    SELECT DATEADD(SECOND, 1, ToDateTime) AS StartDateTime, -1 * Points, -1 AS Cover
    FROM  @tableRanges
)
,AggregatedStartDateTimes
AS 
(
    SELECT StartDateTime, SUM(Points) AS Points, SUM(Cover) AS Cover
    FROM PosibleStartDateTimes
    GROUP BY StartDateTime
)
,StartDateTimeToRanges
AS 
(
    SELECT 
        StartDateTime AS FromDateTime,
        LEAD(StartDateTime) OVER (ORDER BY StartDateTime) AS ToDateTime,
        SUM(Points) OVER (ORDER BY StartDateTime) AS Points,
        SUM(Cover) OVER (ORDER BY StartDateTime) AS NumberOfParts
    FROM AggregatedStartDateTimes
)
,PossibleRanges
AS
(
    SELECT 
        FromDateTime,
        DATEADD(SECOND, -1, ToDateTime) AS ToDateTime,
        Points,
        NumberOfParts
    FROM StartDateTimeToRanges
    WHERE NumberOfParts > 0
)
SELECT 
    *    
FROM PossibleRanges
--WHERE FromDateTime < ToDateTime --optional
ORDER BY FromDateTime;

References

Conclusion

Nonrecursive PARTITION BY query was the fastest of them all

Code/Environment

Find the same code .sql file as the zip, if things not working as expected let me know. Tested code in Microsoft SQL Server 2019 (RTM) - 15.0.2000.5 (X64).

History

  • 1st July, 2023: Initial version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)