Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Time zone names with identical properties yield different result when applied to timestamp

I've just spent an hour in despair with the discrepancy in these results of these two expressions:


db=# SELECT '2012-01-18 1:0 CET'::timestamptz AT TIME ZONE 'UTC'
           ,'2012-01-18 1:0 Europe/Vienna'::timestamptz AT TIME ZONE 'UTC';
      timezone       |      timezone
---------------------+---------------------
 2012-08-18 00:00:00 | 2012-08-17 23:00:00

Obviously, the second expression deducts two hours according to DST rules, where the first one only uses the standard offset.

I checked the catalogs for these two time zone names. They are both there and look just the same:

db=# SELECT * FROM pg_timezone_names WHERE name IN ('CET', 'Europe/Vienna');
     name      | abbrev | utc_offset | is_dst
---------------+--------+------------+--------
 Europe/Vienna | CEST   | 02:00:00   | t
 CET           | CEST   | 02:00:00   | t

I consulted the PostgreSQL manual about time zones:

PostgreSQL allows you to specify time zones in three different forms:

A full time zone name, for example America/New_York. The recognized time zone names are listed in the pg_timezone_names view (see Section 45.67). PostgreSQL uses the widely-used zoneinfo time zone data for this purpose, so the same names are also recognized by much other software.

A time zone abbreviation, for example PST. Such a specification merely defines a particular offset from UTC, in contrast to full time zone names which can imply a set of daylight savings transition-date rules as well. The recognized abbreviations are listed in the pg_timezone_abbrevs view (see Section 45.66). You cannot set the configuration parameters timezone or log_timezone to a time zone abbreviation, but you can use abbreviations in date/time input values and with the AT TIME ZONE operator.

Bold Emphasis mine.

So why the difference?

My setup (more details added)

  • PostgreSQL 9.1.4 on Debian Squeeze (standard squeeze-backports from http://backports.debian.org/debian-backports)

  • Local timezone setting defaults to the system locale de_AT.UTF-8, but should be irrelevant for the example.

SELECT version();
                                                version
-------------------------------------------------------------------------------------------------------
 PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 64-bit

SHOW timezone_abbreviations;

 timezone_abbreviations
------------------------
 Default

.. which (I assume) loads abbreviations from this file: /usr/share/postgresql/9.1/timezonesets/Default

I am at a loss where the time zone name CET comes from. But obviously it is there in my installations. A quick test on sqlfiddle shows the same result.

I tested on two different servers with similar setup. Also with PostgreSQL 8.4. Found 'CET' as time zone name in pg_timezone_names in all of them.

like image 200
Erwin Brandstetter Avatar asked Aug 20 '12 17:08

Erwin Brandstetter


1 Answers

Right after I posted this, I ran another query to check on a suspicion:

SELECT * FROM pg_timezone_abbrevs
WHERE  abbrev IN ('CEST', 'CET');

 abbrev | utc_offset | is_dst
--------+------------+--------
 CEST   | 02:00:00   | t
 CET    | 01:00:00   | f

As it turns out, there is also a time zone abbreviation named CET (which makes sense, "CET" being an abbreviation). And it seems that PostgreSQL picks the abbreviation over the full name. So, even though I found CET in the time zone names, the expression '2012-01-18 1:0 CET'::timestamptz is interpreted according to the subtly different rules for time zone abbreviations.

SELECT '2012-01-18 1:0 CEST'::timestamptz(0)
      ,'2012-01-18 1:0 CET'::timestamptz(0)
      ,'2012-01-18 1:0 Europe/Vienna'::timestamptz(0);

      timestamptz       |      timestamptz       |      timestamptz
------------------------+------------------------+------------------------
 2012-01-18 00:00:00+01 | 2012-01-18 01:00:00+01 | 2012-01-18 01:00:00+01


SELECT '2012-08-18 1:0 CEST'::timestamptz(0)
      ,'2012-08-18 1:0 CET'::timestamptz(0)
      ,'2012-08-18 1:0 Europe/Vienna'::timestamptz(0);

      timestamptz       |      timestamptz       |      timestamptz
------------------------+------------------------+------------------------
 2012-08-18 01:00:00+02 | 2012-08-18 02:00:00+02 | 2012-08-18 01:00:00+02

I find 10 cases of time zone abbreviations in the time zone names and fail to understand why those are there. What's the purpose?

Among those, the time offset (utc_offset) disagrees in four cases due to the DST setting:

SELECT n.*, a.*
FROM   pg_timezone_names n 
JOIN   pg_timezone_abbrevs a ON  a.abbrev = n.name
WHERE  n.utc_offset <> a.utc_offset;

 name | abbrev | utc_offset | is_dst | abbrev | utc_offset | is_dst
------+--------+------------+--------+--------+------------+--------
 CET  | CEST   | 02:00:00   | t      | CET    | 01:00:00   | f
 EET  | EEST   | 03:00:00   | t      | EET    | 02:00:00   | f
 MET  | MEST   | 02:00:00   | t      | MET    | 01:00:00   | f
 WET  | WEST   | 01:00:00   | t      | WET    | 00:00:00   | f

In these cases, people may be fooled (like I was), looking up the tz name and finding a time offset that is not actually applied. That is an unfortunate design - if not a bug, at least a documentation bug.

I fail to find anything in the manual about how ambiguities between time zone names and abbreviations are resolved. Obviously abbreviations take precedence.

Appendix B.1. Date/Time Input Interpretation mentions the lookup for time zone abbreviations, but it remains unclear how time zone names are identified and which of them has priority in case of an ambiguous token.

If the token is a text string, match up with possible strings:

Do a binary-search table lookup for the token as a time zone abbreviation.

Well, there is a slight hint in this sentence that abbreviations come first, but nothing definitive. Also, there is a column abbrev in both tables, pg_timezone_namesand pg_timezone_abbrevs ...

like image 121
Erwin Brandstetter Avatar answered Oct 08 '22 17:10

Erwin Brandstetter