Time Zones are a Drag ... Seriously

Time Zones
Time Zones
Time Zones were definitely being a drag today. I got an email from one of the developers at work asking about the performance difference between 2 queries. The only difference between the 2 queries is that one of them uses the AT TIME ZONE clause that was added in SQL Server 2016. I have not played around with this particular clause, but we do store quite a bit of data in the datetimeoffset data type. In the table in the developer’s queries, the data is all stored in the Eastern time zone, but they are considering storing it in additional time zones and will want to be able to display it in the Eastern time zone even if not stored that way. Thus, AT TIME ZONE.

When the developer was testing the conversion function, he noticed that the query slowed waaaayyyyy down when he added AT TIME ZONE. Before adding AT TIME ZONE to the query, STATISTICS TIME for the query was: CPU time: 145549 ms, elapsed time: 21693 ms.. It returned 8,996 rows, but if I removed the DISTINCT, it returned over 72M rows. That’s a lot of clams … er, data.

Query 1:

After adding AT TIME ZONE, STATISTICS TIME for the query was: CPU time: 3596391 ms, elapsed time: 470998 ms.. STATISTICS IO was the same for both queries, but the time went from less than 22 seconds to almost 8 minutes.

And the plans look almost identical. No scans instead of seeks, no extra operators. Nothing that would lead you to expect the performance to be that different.

Estimated Plans
Estimated Plans (click to enlarge)

So what could make it run so much slower? Maybe there’s a clue on Books Online the docs page for AT TIME ZONE. Right near the top is a note that says:

AT TIME ZONE implementation relies on a Windows mechanism to convert datetime values across time zones.

SQL Server is not performing the time zone conversion, it is shelling out to the OS and having it perform the conversion. And if you know anything about how functions in the SELECT clause are handled, the conversion is done once per row. It is shelling out to the OS 72M+ times to convert the data. YOWZA!

I queried sys.dm_exec_session_wait_stats after running the query expecting to see some PREEMPTIVE_xxxx waits, but I did not. It shows that most of the rows were waiting on CXPACKET waits:

Session Wait Stats
Session Wait Stats (click to enlarge)

I thought if I could get it to perform the DISTINCT before it performed the conversion, it would be much faster. Testing that theory proved better than I anticipated. I moved the entire CONVERT() and AT TIME ZONE outside of the DISTINCT, and the whole query was actually 2 seconds faster than the original query even with the AT TIME ZONE clause.

Normally, as a best practice, I would say not to use functions in the SELECT list if you can avoid it, but quite often you cannot. And quite often the recordset returned is small enough that it doesn’t matter. If you have to have functions in the SELECT list (especially with a function as bad as AT TIME ZONE), then do what you can to reduce the number of rows on which it has to operate.

That of course led me to this final question for the developer: do you really need to return this much data? He answered that he does not, but he wanted a simple query with which to test the clause.

EDIT: Submitted a Connect item for this after talking to Adam Machanic (blog | @AdamMachanic): https://connect.microsoft.com/SQLServer/feedback/details/3144414.

Reposted with permissions from SQLSoldier.come.

54321
(0 votes. Average 0 of 5)