Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Timestamps and time zone conversions in Java and MySQL

I'm developing a Java application with a MySQL database on a server located in a different time zone from mine, and I am trying to decide between using DATETIME or TIMESTAMP on my database.

After reading questions like Should I use field 'datetime' or 'timestamp'?, and the MySQL documentation, I decided TIMESTAMP was better for me as it converts values to UTC for storage, and back to the current time zone for retrieval.

Also, as user Jesper explains in this thread, java.util.Date objects are internally only a UTC timestamp (i.e. number of milliseconds since the Epoch), and when you do a toString() it is displayed according to your current time zone.

For me, that looks like a good practice: storing datetimes as UTC timestamps, and then displaying them according to the current time zone.

I was about to do it like that, but then I found this from the Java documentation for Prepared Statements and got very confused:

void setTimestamp(int parameterIndex, Timestamp x, Calendar cal) throws SQLException

Sets the designated parameter to the given java.sql.Timestamp value, using the given Calendar object. The driver uses the Calendar object to construct an SQL TIMESTAMP value, which the driver then sends to the database. With a Calendar object, the driver can calculate the timestamp taking into account a custom timezone. If no Calendar object is specified, the driver uses the default timezone, which is that of the virtual machine running the application.

Before this, I thought timestamps were by convention always in UTC. Why on earth would anyone want a localized timestamp instead of a localized representation of it? Wouldn't that be very confusing for everyone?

How do these conversions work? If Java takes an UTC timestamp and converts it to an arbitrary time zone, how can it tell MySQL in which timezone it is?

Won't MySQL assume that this timestamp is in UTC and then retrieve an incorrect localized value?

like image 607
e18r Avatar asked Sep 26 '14 18:09

e18r


2 Answers

Date-Time Handling Is A Mess

The first paragraph in the answer by Teo is quite insightful and correct: Date-time handling in Java is a mess. Ditto for all other languages & development environments that I know of. Date-time work is difficult and tricky, especially error-prone and frustrating because we think it of date-time intuitively. But "intuitively" does not cut it when it comes to data types, databases, serialization, localization, adjusting across time zones, and all the other formalities that come with computer programming.

Unfortunately, the computer industry basically chose to ignore this problem of date-time work. Just as Unicode took too long to be invented given the obvious need, so too has the industry kicked the can down the road on solving the problem of date-time handling.

Do Not Rely On Count-Since-Epoch

But I must disagree with its conclusion. Working with a count-since-epoch is not the best solution. Using count-since-epoch is inherently confusing and error-prone and incompatible.

  • Humans cannot read a long number and decipher that as a date-time. So verifying data and debugging becomes complicated, to say the least.
  • What "count" would you use? The milliseconds used by java.util.Date and by Joda-Time? The microseconds used by Postgres, MySQL, and other databases? The nanoseconds used by the new java.time package in Java 8?
  • Which epoch would you use? The Unix epoch of the beginning of 1970 in UTC is common, but far from singular. Almost two dozen epochs have been used by various computer systems.

We create numeric data types for doing math rather than using bits. We create string classes to handle the nitty-gritty details of handling text rather than bare octets. So too we should create data-types and classes to handle date-time values.

The early Java teams (and IBM & Taligent before them) made an attempt with the java.util.Date and java.util.Calendar and related classes. Unfortunately, the attempt was inadequate. While date-time is inherently confusing, these classes have added even more confusion.

Joda-Time

As far as I know, the Joda-Time project was the first project to take on date-time in a thorough, competent, and successful manner. Even so, the creators of Joda-Time were not entirely satisfied. They went on to create the java.time package in Java 8, and extend that work with the threeten-extra project. Joda-Time and java.time share similar concepts but are distinct, each having some advantages.

Database Problems

Specifically, the java.util.Date & .Calendar classes lack date-only values without time-of-day and time zone. And they lack time-only values without date and time zone. Before Java 8, the Java team added the hacks known as the java.sql.Date and java.sql.Time classes which is a date-time value masquerading as a date-only. Both Joda-Time and java.time rectify that by offering LocalDate and LocalTime classes.

Another specific problem is that java.util.Date has a resolution of milliseconds, but databases frequently use microseconds or nanoseconds. In an ill-advised attempt to bridge this disparity, the early Java team created another hack, the java.sql.Timestamp class. While technically a java.util.Date subclass, it also tracks the fractional seconds to nanosecond resolution. So when converting in and out of this type you may losing or gaining the finer fractional seconds granularity without being conscious of that fact. So that might mean that values you expect to be equal are not.

Another source of confusion is the SQL data type, TIMESTAMP WITH TIME ZONE. That name is a misnomer as the time zone info is not stored. Think of the name as TIMESTAMP WITH RESPECT FOR TIME ZONE as any passed time zone offset info is used in converting the date-time value to UTC.

The java.time package with its nanosecond resolution has some specific features to better communicate date-time data with a database.

I could write much more, but such information can be gleaned from searching StackOverflow for words such as joda, java.time, sql timestamp, and JDBC.

Example using Joda-Time with JDBC with Postgres. Joda-Time uses immutable objects for thread-safety. So rather than alter an instance ("mutate"), we create a fresh instance based on the values of the original.

String sql = "SELECT now();";
…
java.sql.Timestamp now = myResultSet.getTimestamp( 1 );
DateTime dateTimeUtc = new DateTime( now , DateTimeZone.UTC );
DateTime dateTimeMontréal = dateTimeUtc.withZone( DateTimeZone.forID( "America/Montreal" ) );

Focus On UTC

Before this, I thought timestamps were by convention always in UTC. Why on earth would anyone want a localized timestamp instead of a localized representation of it? Wouldn't that be very confusing for everyone?

Indeed. The SQL standard defines a TIMESTAMP WITHOUT TIME ZONE which ignores and strips away any included time zone data. I cannot imagine the usefulness of that. This Postgres expert, David E. Wheeler, says as much in recommending always using TIMESTAMP WITH TIME ZONE. Wheeler cites one narrow technical exception (partitioning) and even then says to convert all the values to UTC yourself before saving to the database.

The best practice is to work and store data in UTC while adjusting to localized time zones for presentation to the user. There may be times when you want to remember the original date-time data in its localized time zone; if so, save that value in addition to converting to UTC.

Guidelines

The first steps to better date-time handling are avoiding java.util.Date & .Calendar, using Joda-Time and/or java.time, focusing on UTC, and learning the behavior of your specific JDBC driver and your specific database (databases vary widely in their date-time handling despite the SQL standard).

MySQL

Caveat: I don’t use MySQL (I'm a Postgres kind of guy).

According to the version 8 documentation, the two types DATETIME and TIMESTAMP differ in that the first one lacks any concept of time zone or offset-from-UTC. The second one uses any indication of time zone or offset-from-UTC accompanying an input to adjust that value to UTC, then stores it, and discards the zone/offset info.

So these two types seem to be akin to the standard SQL types:

  • MySQL DATETIME ≈ SQL-standard TIMESTAMP WITHOUT TIME ZONE
  • MySQL TIMESTAMP ≈ SQL-standard TIMESTAMP WITH TIME ZONE

For MySQL DATETIME, use the Java class LocalDateTime. That class, like that data type, purposely lacks any concept of time zone or offset-from-UTC. Use this type and class for either:

  • When you mean any zone or all zones, such as “Christmas starts on first moment of December 25, 2018”. That translates to different moments in different places as a new day dawns earlier in the east than in the west.
  • When scheduling appointments or events far enough out in the future that politicians may change the offset of the time zone, for which politicians around the world have shown a proclivity. In this usage, you must at runtime apply a time zone to dynamically calculate, but not store, a moment for display on a calendar. That way, a 15:00 dental appointment in 8 months remains at 15:00 even if politicians redefine the clock to be minutes/hours ahead or behind.

For MySQL TIMESTAMP, use the Java class Instant, as shown above. Use this type and class for moments, specific point on the timeline.

JDBC 4.2

As of JDBC 4.2 and later, we can directly exchange java.time objects with the database. Use getObject & setObject methods.

myPreparedStatement.setObject( … , Instant.now() ) ;

Retrieval.

Instant instant = myResultSet.getObject( … , Instant.class ) ;

The JDBC 4.2 specification requires a driver to support OffsetDateTime but strangely does not require support for the more common types Instant and ZonedDateTime. But converting between types is quite easy.

OffsetDateTime odt = myResultSet.getObject( … , OffsetDateTime.class ) ;
Instant instant = odt.toInstant() ;

You can then adjust that UTC value in Instant to a specific time zone for presentation to a user.

ZoneId z = ZoneId.of( "Pacific/Auckland" ) ;
ZonedDateTime zdt = instant.atZone( z ) ;

About java.time

The java.time framework is built into Java 8 and later. These classes supplant the troublesome old legacy date-time classes such as java.util.Date, Calendar, & SimpleDateFormat.

The Joda-Time project, now in maintenance mode, advises migration to the java.time classes.

To learn more, see the Oracle Tutorial. And search Stack Overflow for many examples and explanations. Specification is JSR 310.

You may exchange java.time objects directly with your database. Use a JDBC driver compliant with JDBC 4.2 or later. No need for strings, no need for java.sql.* classes.

Where to obtain the java.time classes?

  • Java SE 8, Java SE 9, Java SE 10, and later
  • Built-in.
  • Part of the standard Java API with a bundled implementation.
  • Java 9 adds some minor features and fixes.
  • Java SE 6 and Java SE 7
  • Much of the java.time functionality is back-ported to Java 6 & 7 in ThreeTen-Backport.
  • Android
  • Later versions of Android bundle implementations of the java.time classes.
  • For earlier Android (<26), the ThreeTenABP project adapts ThreeTen-Backport (mentioned above). See How to use ThreeTenABP….

The ThreeTen-Extra project extends java.time with additional classes. This project is a proving ground for possible future additions to java.time. You may find some useful classes here such as Interval, YearWeek, YearQuarter, and more.

like image 129
Basil Bourque Avatar answered Sep 20 '22 19:09

Basil Bourque


Your question is spot on a problem which i think is huge these days. Both DB (via SQL) and server side itself (via programming languages such as Java) offer a compendium of ways of dealing with date and time. I would qualify the status-quo as highly non-standardized and a bit chaotic (personal opinion :)

My answer is partial but i'll explain why.

You're correct, Java's Date (and Calendar) store time as milliseconds since the Unix Epoch (which is great). It happens not only in Java but in other programming languages as well. In my opinion the perfect time-keeping architecture emerges naturally from this: the Unix Epoch is January 1st, 1970, midnight, UTC. Therefore if you choose to store time as milliseconds since the Unix Epoch you have a lot of benefits:

  • architecture clarity: server side works with UTC, client side shows the time through its local timezone
  • database simplicity: you store a number (milliseconds) rather than complex data structures like DateTimes
  • programming efficiency: in most programming languages you have date/time objects capable of taking milliseconds since Epoch when constructed (which as you said, allows for automatic conversion to client-side timezone)

I find code and architecture is much simpler and more flexible when using this approach. I stopped trying to understand things like DateTime (or Timestamp) and only deal with them when i have to fix legacy code.

like image 31
Sandman Avatar answered Sep 17 '22 19:09

Sandman