This article describes a workaround for AT TIME ZONE performance issues in SQL Server.
Introduction
In SQL Server 2016, a new AT TIME ZONE
operator was introduced. It allows to convert datetime
to the corresponding value in the target time zone. Queries with AT TIME ZONE
run smoothly on small tables. But once you apply AT TIME ZONE
on table column (not a scalar expression) on a huge table with millions of records, you'll get performance issues like those described in Jonathan Kehayias's article.
In my case, the problem was with long-running SQL Server views, not a dynamic SQL.
Custom views filtered millions of records by one or more datetime criteria - datetime
column value must be within a floating time range. For example, Before_Now
, After_Now
, Today
, Tomorrow
, Yesterday
, CurrentWeek
, CurrentMonth
, CurrentYear
, Next_X_Hours
, Previous_X_Hours
, etc.
Those custom views could be divided into two types:
- time zone is specified as a scalar expression. For example,
AT TIME ZONE 'Pacific Standard Time'
; - time zone is stored in a separate table. For example,
AT TIME ZONE (SELECT store_tz FROM [Stores] WHERE store_id = 123)
.
A typical CTE query used for pagination for the second type took 10x times more than for the first type. Some queries took even a few minutes.
I managed to decrease second type query time from minutes to hundreds of milliseconds by implementing a workaround described below.
Workaround Tips
First, let's define a Time Zone Period as period of time in a particular time zone during which offset to UTC is unchanged:
- Time zone that observes Daylight Saving Time usually will have three periods per year (non-DST⇒DST⇒non-DST or DST⇒non-DST⇒DST).
- It's better to merge neighboring periods with same offset into one long period.
- Time zone that does not observe Daylight Saving Time will have one period that covers a few or more years.
We need time zone period information for correct filtering across date ranges especially when Daylight Saving Time starts or ends.
Tips/Tricks
- Cache time zone period information in a local table. Populate information not only for one current period but for the whole current year, previous and next years. The number of years depends on:
- how often do you want to update the cache table (each year/month/release, etc.)
- what
datetime
criteria do you use? For example, criteria 'datetime column value must be within next X years' requires to have cached time zone periods for current and X following years
- The cache table should have at least the following columns:
- Time zone name (
timezone_id
) - Period Start Date in UTC (
period_start_date_utc
) - Period End Date in UTC (
period_end_date_utc
) - UTC Offset of specified period (
offset_in_minutes
). For example, -420 minutes for 'Pacific Standard Time' time zone during Daylight Saving Time period - UTC Offset difference between specified period and preceding period (
delta_previous_offset_in_minutes
). For example, -60/0/+60 minutes - UTC Offset difference between following period and specified period (
delta_next_offset_in_minutes
). For example, +60/0/-60 minutes - ...
Two or more preceding/following periods may be stored in JSON datatype column.
- Additionally, store
CHECKSUM(timezone_id)
in separate [timezone_id_checksum] [int] NOT NULL
column.
Create CLUSTERED INDEX
on this timezone_id_checksum
column.
This gives the best performance when filtering records by timezone -
[timezone_id_checksum] = CHECKSUM(store_tz)
is faster than [timezone_id] = [store_tz]
because integer
datatype has low space requirement (Integer
vs Varchar(50)
) -
In WHERE
statement combine criteria for neighboring periods of time. For example, criteria "processing_date
must be within Today
or Tomorrow
".
WHERE
(Today 12:00AM <= processing_date) AND
(processing_date <= Today 23:59:59PM)
OR
(Tomorrow 12:00AM <= processing_date) AND
(processing_date <= Tomorrow 23:59:59PM)
may be rewritten as:
WHERE (Today 12:00AM <= processing_date) AND
(processing_date <= Tomorrow 23:59:59PM)
- If
_timezone_period_cache
table has hundreds of records, it's better to place the time zone information subquery before table join
s.
SELECT O.*
FROM
(
...
) STORE_TZ_INFO
INNER JOIN [Orders] O (nolock) ON O.store_fk = STORE_TZ_INFO.store_pk
Test Tables
I had the following tables in my test database:
- _timezone_period_cache with records for one 'Pacific Standard Time' time zone
- Orders table with 2 mln records
- Stores table with one record for 'Pacific Standard Time' time zone
- Other tables:
Clients
, Products
and OrderItems
, but they were not used in my SQL Server Views
CREATE TABLE [dbo].[_timezone_period_cache](
[timezone_id] [varchar](50) NOT NULL,
[offset_in_minutes] [int] NOT NULL,
[timezone_id_checksum] [int] NOT NULL,
[period_start_date_utc] [datetime2](7) NOT NULL,
[period_end_date_utc] [datetime2](7) NOT NULL,
[delta_previous_offset_in_minutes] [int] NOT NULL,
[delta_next_offset_in_minutes] [int] NOT NULL
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX idx_timezone_id_checksum
ON [_timezone_period_cache] (timezone_id_checksum);
Populate _timezone_period_cache table
When evaluating Time Zone Periods, it's important not to take the exact datetimeoffset
when clocks turn backward (usually Daylight Saving Time ends), but a microsecond before and add later that microsecond to datetimeoffset
in UTC. On the screenshot below, SQL Server returned two datetimeoffset
s and their difference - it is 1 hour and 1 microsecond, not just 1 microsecond.
For my test, I chose Pacific Standard Time time zone. Since I wanted to run tests for dates when Daylight Saving Time ends & starts, I populated records for 2022/2023 years in a script.
INSERT INTO [dbo].[_timezone_period_cache]
([timezone_id]
,[offset_in_minutes]
,[timezone_id_checksum]
,[period_start_date_utc]
,[period_end_date_utc]
,[delta_previous_offset_in_minutes]
,[delta_next_offset_in_minutes]
)
VALUES
('Pacific Standard Time'
, -7*60
, CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
,
DATEADD(
HOUR,
3,
CONVERT(datetime2, DATEFROMPARTS(2022, 03, 13))
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
,
DATEADD(
MICROSECOND,
1,
DATEADD(
MICROSECOND,
-1,
DATEADD(HOUR,
2,
CONVERT(datetime2, DATEFROMPARTS(2022, 11, 6))
)
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
)
, 60
, -60
)
GO
INSERT INTO [dbo].[_timezone_period_cache]
([timezone_id]
,[offset_in_minutes]
,[timezone_id_checksum]
,[period_start_date_utc]
,[period_end_date_utc]
,[delta_previous_offset_in_minutes]
,[delta_next_offset_in_minutes]
)
VALUES
('Pacific Standard Time'
, -8*60
, CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
, DATEADD(
MICROSECOND,
1,
DATEADD(
MICROSECOND,
-1,
DATEADD(HOUR,
2,
CONVERT(datetime2, DATEFROMPARTS(2022, 11, 6))
)
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
)
, DATEADD(
HOUR,
3,
CONVERT(datetime2, DATEFROMPARTS(2023, 03, 12))
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
, -60
, +60
)
GO
INSERT INTO [dbo].[_timezone_period_cache]
([timezone_id]
,[offset_in_minutes]
,[timezone_id_checksum]
,[period_start_date_utc]
,[period_end_date_utc]
,[delta_previous_offset_in_minutes]
,[delta_next_offset_in_minutes]
)
VALUES
('Pacific Standard Time'
, -7*60
, CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
,
DATEADD(
HOUR,
3,
CONVERT(datetime2, DATEFROMPARTS(2023, 03, 12))
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
,
DATEADD(
MICROSECOND,
1,
DATEADD(
MICROSECOND,
-1,
DATEADD(HOUR,
2,
CONVERT(datetime2, DATEFROMPARTS(2023, 11, 5))
)
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
)
, +60
, -60
)
GO
INSERT INTO [dbo].[_timezone_period_cache]
([timezone_id]
,[offset_in_minutes]
,[timezone_id_checksum]
,[period_start_date_utc]
,[period_end_date_utc]
,[delta_previous_offset_in_minutes]
,[delta_next_offset_in_minutes]
)
VALUES
('Pacific Standard Time'
, -8*60
, CHECKSUM(CONVERT(varchar(50), 'Pacific Standard Time'))
, DATEADD(
MICROSECOND,
1,
DATEADD(
MICROSECOND,
-1,
DATEADD(HOUR,
2,
CONVERT(datetime2, DATEFROMPARTS(2023, 11, 5))
)
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
)
, DATEADD(
HOUR,
3,
CONVERT(datetime2, DATEFROMPARTS(2024, 03, 10))
) AT TIME ZONE 'Pacific Standard Time' AT TIME ZONE 'UTC'
, -60
, +60
)
GO
CREATE TABLE [dbo].[Orders](
[order_pk] [int] IDENTITY(1,1) NOT NULL,
[client_fk] [int] NOT NULL,
[store_fk] [int] NOT NULL,
[created_date_store_local] [datetime2](7) NOT NULL,
[shipping_date_store_local] [datetime2](7) NULL,
[delivered_date_client_local] [datetime2](7) NULL,
CONSTRAINT [PK_Orders] PRIMARY KEY CLUSTERED
(
[order_pk] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON, _
OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
CREATE TABLE [dbo].[Stores](
[store_pk] [int] IDENTITY(1,1) NOT NULL,
[store_tz] [varchar](50) NOT NULL,
[store_name] [nvarchar](100) NOT NULL,
CONSTRAINT [PK_Stores] PRIMARY KEY CLUSTERED
(
[store_pk] ASC
) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON,
OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
Original View
For store located in San-Francisco ('Pacific Standard Time' time zone) v_Original
view filters orders by DateTime
criteria - column value should be in a time range from 24h before now to 24 hours after now (to get different values of GETUTCDATE()
, I changed local SQL SERVER system time).
CREATE VIEW v_Original AS
SELECT O.*
FROM [Orders] O
INNER JOIN Stores S ON S.store_pk = O.store_fk
WHERE
O.store_fk = 1
AND
[created_date_store_local]
AT TIME ZONE S.store_tz < DATEADD(day, +1, GETUTCDATE())
AND
[created_date_store_local]
AT TIME ZONE S.store_tz > DATEADD(day, -1, GETUTCDATE())
Get Total Record Number Query
SELECT COUNT(*) FROM v_Original
CTE with Pagination
DECLARE @PageNumber int = 10
DECLARE @PageSize int = 100
;WITH v_Original AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY order_pk) AS RowNumber
FROM v_Original
)
SELECT * FROM v_Original
WHERE RowNumber BETWEEN ((@PageNumber - 1) * @PageSize) + 1
AND (@PageNumber * @PageSize);
Optimized View
CREATE VIEW v_Optimized AS
SELECT O.*
FROM
(
SELECT tz_name
, utc_plus_tzOffset
, DATEADD(minute
, IIF(
utc_plus_tzOffset_minus_24h <
period_start_date_utc
, -1 * delta_previous_offset_in_minutes
, 0
)
, utc_plus_tzOffset_minus_24h
) AS utc_plus_tzOffset_minus_24h_adjusted
, DATEADD(minute
, IIF(
utc_plus_tzOffset_plus_24h >
period_end_date_utc
, delta_next_offset_in_minutes
, 0
)
, utc_plus_tzOffset_plus_24h
) AS utc_plus_tzOffset_plus_24h_adjusted
, store_pk
FROM
(
SELECT
tz_name
, utc_plus_tzOffset
, store_pk
, DATEADD(day, -1, utc_plus_tzOffset)
AS utc_plus_tzOffset_minus_24h
, DATEADD(day, +1, utc_plus_tzOffset)
AS utc_plus_tzOffset_plus_24h
, period_start_date_utc, delta_previous_offset_in_minutes
, period_end_date_utc, delta_next_offset_in_minutes
FROM
(
SELECT timezone_id AS tz_name
, DATEADD(MINUTE, [offset_in_minutes],
GETUTCDATE()) AS utc_plus_tzOffset
, store_pk
, period_start_date_utc, delta_previous_offset_in_minutes
, period_end_date_utc, delta_next_offset_in_minutes
FROM
(
SELECT timezone_id, [offset_in_minutes], store_pk
, TZP.period_start_date_utc, _
TZP.delta_previous_offset_in_minutes
, TZP.period_end_date_utc, _
TZP.delta_next_offset_in_minutes
FROM [_timezone_period_cache] TZP (nolock)
INNER JOIN Stores (nolock) ON (1=1)
WHERE
-> CHECKSUM(CONVERT(varchar(50), store_tz_unicode))
TZP.[timezone_id_checksum] = CHECKSUM(store_tz)
AND [period_start_date_utc] < GETUTCDATE()
AND [period_end_date_utc] > GETUTCDATE()
) TZI
) TZI2
) TZI3
) STORE_TZ_INFO
INNER JOIN [Orders] O (nolock) ON O.store_fk = STORE_TZ_INFO.store_pk
WHERE
store_fk = 1
AND
[created_date_store_local] >= utc_plus_tzOffset_minus_24h_adjusted
AND
[created_date_store_local] <= utc_plus_tzOffset_plus_24h_adjusted
Get Total Record Number Query
SELECT COUNT(*) FROM v_Optimized
CTE with Pagination
DECLARE @PageNumber int = 10
DECLARE @PageSize int = 100
;WITH cte_optimized AS
(
SELECT *,
ROW_NUMBER() OVER (ORDER BY order_pk) AS RowNumber
FROM v_Optimized
)
SELECT * FROM cte_optimized
WHERE RowNumber BETWEEN ((@PageNumber - 1) * @PageSize) + 1
AND (@PageNumber * @PageSize);
Test Results
For the test, I used a desktop PC with 4 core CPU, 16 GB of RAM, and HDD. I changed the system time to November 5 and 6, 2022 respectively (days before and after DST change in 'Pacific Standard Time' time zone). As expected, all queries returned the same resultsets
- it confirms the workaround is a completely working solution across date ranges.
First, I run queries when Orders
table had primary key (PK) only. Later, I added a nonclustered
index:
CREATE NONCLUSTERED INDEX [idx_Orders_created_date]
ON [dbo].[Orders] ([store_fk],[created_date_store_local])
Here are the results:
SQL Server View \ query time | count(*) | CTE with pagination |
PK only | PK + nonclustered index | PK only | PK + nonclustered index |
Original (uses AT TIME ZONE operator) | 9082 ms | 8358 ms | 318 ms | 324 ms |
Optimized (uses _timezone_period_cache table) | 1514 ms | 23 ms | 52 ms | 53 ms |
Conclusion
By using Time Zone Period cache table, SQL query performance increased from 6 to 360 times. With the proposed workaround, we can have the exact same resultset
as by using AT TIME ZONE
operator. This approach may be used not only in SQL Server views but in dynamic SQL also. Each set of fixed or floating time range criteria will have its own tricky solution with complex subqueries.
History
- 6th March, 2023: Initial version