I'm currently running an aggregate query, summing amounts sold (say) on a given date.
select convert(date, datetimesold), sum(amountsold) from tblSold
group by convert(date, datetimesold)
where datetimesold is a datetime value.
The convert(date,...)
gets rid of the time value, so the group by
can group by whole day.
Already this isn't efficient, as it requires a table scan for the convert per row - a better approach would be to add a 'datesold' column containing just the date value, indexed, and include this value on insert each time. But this would lose precision on that column, and that is important, because...
datetimesold
is a UTC datetime. So my question is: say I wanted to group by day, but on Eastern US time. I would have to add an offset in hours to datetimesold before doing the convert in the group by - group by convert(date, dateadd(hours, -5, datetimesold))
- but even then this wouldn't be always accurate due to daylight savings - EDT -4 hours, EST -5 hours.
Do I have any efficient options to do this in SQL? Are there any timezone-aware functions that I can use here?
EDIT: To further clarify, I'm operating on Azure SQL databases.
You already noticed that it is difficult to convert from UTC to local time zone correctly. In fact, it is very difficult, because rules for daylight savings change. You need to maintain a historical database of timezones to do it properly.
I store two timestamps - in UTC and in local time zone. In some reports we need UTC, in some local.
Usually it is easy to convert between UTC and local time zone when the row is inserted and the OS of the client computer, which generates the data is in correct local time zone. At that moment OS knows both local and UTC time. But, if you have historical data from previous years it becomes much more difficult to perform such conversion.
SQL Server 2016 promises to add somewhat better support for time zones, see: AT TIME ZONE.
As for your concern about table scan - you'll always have to scan the whole table to calculate the SUM
, so extra CONVERT
to date
doesn't really matter.
On the other hand,
If you have a separate column that stores just date
, not datetime
, the query will be a bit more efficient, because date
takes less bytes than datetime
, so less bytes to read from disk.
If you add an index on (datesold, amountsold)
, then the GROUP BY
would not have to do an extra sort, which also makes a query more efficient.
So, in the current version of SQL Server I'd add an indexed date
column which would contain a date in the time zone that you need for your reports. If there is a need for reports in UTC and Eastern US time zones, I'd add two separate date
columns.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With