Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In the database, why can't we just use "Long" integers for dates (millis since epoch)

I would like to use a Long datatype in the database to represent dates (as millis since epoch). The reason why, is that storing dates is so complex with the jdbc driver and Oracle engine. If you submit the wrong datatype in preparedStatement it casts a timestamp to a date (or vice versa) blowing your index, resulting in full table scans in the worst case scenario. I can't remember the details, but I know that there are details to remember. I don't want to have to remember details. It seems like just storing dates as long (millis since epoch) would work here just fine and I have nothing to remember.

Note, I feel that a time zone is merely presentational. It should never be stored in the first place. Most companies have a policy of only using UTC, but once again, that is just more information to know. Let's all just store the number of milliseconds since epoch, and upon display show the user the millis formatted to THEIR particular time zone.

EDIT: It just seems like millions of dollars in bugs and wasted productivity/confusion revolve around time zones and date formatting, along with crazy jdbc driver date conversion/casting. Let's do away with it once and for all.

I realize now that another reason against doing what I suggest, is that we may want to save space in the db. In that case we can have decaseconds or even simply "seconds" since epoch. To whichever degree you would like to save space.

like image 886
JChristopher Avatar asked Jul 17 '10 23:07

JChristopher


1 Answers

You can use integers or some other appropriate type to represent date/time values in your database. However it does introduce a couple of problems:

  • All of your current and future (!!!) applications (Java or otherwise) that use the database tables need to convert between database integers and date/time values.

  • If your SQL queries, SQL stored procedures or SQL triggers involve the relevant columns, they may need to do the conversions. That makes them more complicated / hard to get right. Furthermore, the extra complexity could cause the query engine to fall back to a slower way of executing your query ... because the query optimizer doesn't understand what you are doing.

  • Using integers to represent dates will make it harder for people to do ad-hoc SQL queries against the table that involve the date columns.

EDIT

Whether you save space by representing dates as integers is database specific. IIRC, Oracle stores date/time values as a primitive integers. And I expect any decent RDBMS would do the same ... both for storage space and query efficiency.

Storing date/time values as integers does not address timezone issue. This is a fundamentally complicated problem.

  • In some cases you want a date or date/time to represent an absolute point in time.
  • In other cases, you want it to represent a point in time relative to the geographical location / timezone in which some event occurred.
  • In other cases, you want to represent the time relative to the location of the system, or of the user.

Then there is the question of granularity. (Using ISO date syntax for clarity) does "1999" mean the same thing as "1999-01-01" or "1999-01-01T00:00:00"? And what about sub-second precision?

The point is that no simple SQL date/time (or integer) representation can deal with all of this. The root cause of the bugs/issues/difficulties we see is programmers taking shortcuts in their implementation and (more importantly) being sloppy in their thinking.

EDIT 2

There is still the issue that many problems still arise due to differences in the way that various databases and their respective JDBC drivers handle date literals and assumptions about timezones. I'm not an expert on JDBC and the SQL standard(s) but the root problems seem to be:

  • Database vendors have not fully implemented the datetime types as specified in (for example) SQL-92.
  • The SQL standards don't specify a single syntax for datetime literals, but allow database vendor specific syntaxes.
  • The SQL standards allow the timezone to be defaulted, without any standard way to specify (or even find out) what it defaults to. More vendor-specific solutions.
  • The JDBC specification doesn't provide a way for an application to get or set the default timezone, or select date-time literal syntaxes.

All of this adds up to a mess of portability and configuration issues for the Java developer. However, I don't think it is significantly worse than the portability and configuration problems we get with other aspects of Java / RDBMS applications. And by ditching datetime types entirely, you'd be buying into a bunch of other problems; see above.

like image 199
Stephen C Avatar answered Sep 18 '22 08:09

Stephen C