I've just spent an hour in despair with the discrepancy in these results of these two expressions:
db=# SELECT '2012-01-18 1:0 CET'::timestamptz AT TIME ZONE 'UTC'
,'2012-01-18 1:0 Europe/Vienna'::timestamptz AT TIME ZONE 'UTC';
timezone | timezone
---------------------+---------------------
2012-08-18 00:00:00 | 2012-08-17 23:00:00
Obviously, the second expression deducts two hours according to DST rules, where the first one only uses the standard offset.
I checked the catalogs for these two time zone names. They are both there and look just the same:
db=# SELECT * FROM pg_timezone_names WHERE name IN ('CET', 'Europe/Vienna');
name | abbrev | utc_offset | is_dst
---------------+--------+------------+--------
Europe/Vienna | CEST | 02:00:00 | t
CET | CEST | 02:00:00 | t
I consulted the PostgreSQL manual about time zones:
PostgreSQL allows you to specify time zones in three different forms:
A full time zone name, for example America/New_York. The recognized time zone names are listed in the pg_timezone_names view (see Section 45.67). PostgreSQL uses the widely-used zoneinfo time zone data for this purpose, so the same names are also recognized by much other software.
A time zone abbreviation, for example PST. Such a specification merely defines a particular offset from UTC, in contrast to full time zone names which can imply a set of daylight savings transition-date rules as well. The recognized abbreviations are listed in the pg_timezone_abbrevs view (see Section 45.66). You cannot set the configuration parameters timezone or log_timezone to a time zone abbreviation, but you can use abbreviations in date/time input values and with the AT TIME ZONE operator.
Bold Emphasis mine.
So why the difference?
PostgreSQL 9.1.4 on Debian Squeeze (standard squeeze-backports from http://backports.debian.org/debian-backports)
Local timezone
setting defaults to the system locale de_AT.UTF-8
, but should be irrelevant for the example.
SELECT version();
version
-------------------------------------------------------------------------------------------------------
PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit
SHOW timezone_abbreviations;
timezone_abbreviations
------------------------
Default
.. which (I assume) loads abbreviations from this file: /usr/share/postgresql/9.1/timezonesets/Default
I am at a loss where the time zone name CET
comes from. But obviously it is there in my installations. A quick test on sqlfiddle shows the same result.
I tested on two different servers with similar setup. Also with PostgreSQL 8.4. Found 'CET' as time zone name in pg_timezone_names
in all of them.
Right after I posted this, I ran another query to check on a suspicion:
SELECT * FROM pg_timezone_abbrevs
WHERE abbrev IN ('CEST', 'CET');
abbrev | utc_offset | is_dst
--------+------------+--------
CEST | 02:00:00 | t
CET | 01:00:00 | f
As it turns out, there is also a time zone abbreviation named CET
(which makes sense, "CET" being an abbreviation). And it seems that PostgreSQL picks the abbreviation over the full name. So, even though I found CET
in the time zone names, the expression '2012-01-18 1:0 CET'::timestamptz is interpreted according to the subtly different rules for time zone abbreviations.
SELECT '2012-01-18 1:0 CEST'::timestamptz(0)
,'2012-01-18 1:0 CET'::timestamptz(0)
,'2012-01-18 1:0 Europe/Vienna'::timestamptz(0);
timestamptz | timestamptz | timestamptz
------------------------+------------------------+------------------------
2012-01-18 00:00:00+01 | 2012-01-18 01:00:00+01 | 2012-01-18 01:00:00+01
SELECT '2012-08-18 1:0 CEST'::timestamptz(0)
,'2012-08-18 1:0 CET'::timestamptz(0)
,'2012-08-18 1:0 Europe/Vienna'::timestamptz(0);
timestamptz | timestamptz | timestamptz
------------------------+------------------------+------------------------
2012-08-18 01:00:00+02 | 2012-08-18 02:00:00+02 | 2012-08-18 01:00:00+02
I find 10 cases of time zone abbreviations in the time zone names and fail to understand why those are there. What's the purpose?
Among those, the time offset (utc_offset
) disagrees in four cases due to the DST setting:
SELECT n.*, a.*
FROM pg_timezone_names n
JOIN pg_timezone_abbrevs a ON a.abbrev = n.name
WHERE n.utc_offset <> a.utc_offset;
name | abbrev | utc_offset | is_dst | abbrev | utc_offset | is_dst
------+--------+------------+--------+--------+------------+--------
CET | CEST | 02:00:00 | t | CET | 01:00:00 | f
EET | EEST | 03:00:00 | t | EET | 02:00:00 | f
MET | MEST | 02:00:00 | t | MET | 01:00:00 | f
WET | WEST | 01:00:00 | t | WET | 00:00:00 | f
In these cases, people may be fooled (like I was), looking up the tz name and finding a time offset that is not actually applied. That is an unfortunate design - if not a bug, at least a documentation bug.
I fail to find anything in the manual about how ambiguities between time zone names and abbreviations are resolved. Obviously abbreviations take precedence.
Appendix B.1. Date/Time Input Interpretation mentions the lookup for time zone abbreviations, but it remains unclear how time zone names are identified and which of them has priority in case of an ambiguous token.
If the token is a text string, match up with possible strings:
Do a binary-search table lookup for the token as a time zone abbreviation.
Well, there is a slight hint in this sentence that abbreviations come first, but nothing definitive. Also, there is a column abbrev
in both tables, pg_timezone_names
and pg_timezone_abbrevs
...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With