I searched a lot on the Internet but couldn't find the answer. Here is my question:
I'm writing some queries in Hive. I have a Unix timestamp (seconds since the epoch) and would like to convert it to a UTC time string. For example, given the timestamp 1349049600, I would like to get the UTC time 2012-10-01 00:00:00. However, if I use the built-in function from_unixtime(1349049600) in Hive, I get the local PDT time 2012-09-30 17:00:00.
I realized there is a built-in function called from_utc_timestamp(timestamp, string timezone). Then I tried it like from_utc_timestamp(1349049600, "GMT"), and the output is 1970-01-16 06:44:09.6, which is totally incorrect.
I don't want to change the time zone of Hive permanently because there are other users. So is there any way I can convert 1349049600 to the UTC timestamp string "2012-10-01 00:00:00"? Thanks a lot!!
Unix_timestamp in Hive
If we want to convert epoch seconds to a readable timestamp format, we can use the from_unixtime() function. from_unixtime() converts the number of seconds since the Unix epoch (1970-01-01 00:00:00 UTC) to a timestamp string.
Hive uses the default timezone of the JVM. Currently the only way to change the timezone used by Hive is to change the default timezone of the JVM.
(GMT-5:00) Eastern Time (US & Canada)
Add the local time offset to the UTC time. For example, if your local time offset is -5:00 and the UTC time is shown as 11:00, add -5 to 11. The time adjusted for the offset is 06:00 (6:00 A.M.).
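The same offset arithmetic can be sketched in Hive with from_utc_timestamp(). This is only an illustration: the date and the 'America/New_York' zone name are assumptions chosen so that the -5:00 (standard time) offset applies; on a summer date the same zone would be at -4:00.

```sql
-- Treat the string as a UTC time and shift it to US Eastern time.
-- On a January date the offset is -5:00, so 11:00 UTC should become 06:00.
SELECT from_utc_timestamp('2014-01-15 11:00:00', 'America/New_York') AS eastern_time;
```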
Hive from_unixtime() is used to get Date and Timestamp in a default format yyyy-MM-dd HH:mm:ss from Unix epoch seconds. Specify the second argument in pattern format to return date and timestamp in a custom format.
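As a small sketch of the two forms described above: the pattern strings below are just illustrative choices (they follow Java SimpleDateFormat conventions), and the result is rendered in whatever time zone Hive is using, so the exact values depend on the server.

```sql
-- Default format: yyyy-MM-dd HH:mm:ss, rendered in Hive's time zone.
SELECT from_unixtime(1349049600);

-- Custom formats via the optional second argument.
SELECT from_unixtime(1349049600, 'yyyy-MM-dd');
SELECT from_unixtime(1349049600, 'yyyy/MM/dd HH:mm');
```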
As far as I can tell, from_utc_timestamp() needs a date string argument, like "2014-01-15 11:21:15", not a Unix seconds-since-epoch value. That might be why it gives odd results when you pass an integer.
The only Hive function that deals with epoch seconds seems to be from_unixtime(), which gives you a timestamp string in the server timezone. I found the server timezone in /etc/sysconfig/clock: "America/Montreal" in my case.
So you can get a UTC timestamp string via to_utc_timestamp(from_unixtime(1389802875),'America/Montreal'), and then convert to your target timezone with from_utc_timestamp().
It all seems very torturous, particularly having to wire your server TZ into your SQL. Life would be easier if there were a from_unixtime_utc() function or something.
Update: from_utc_timestamp() does accept a milliseconds argument as well as a string, but then gets the conversion wrong.
When I try from_utc_timestamp(1389802875000, 'America/Los_Angeles'), it gives "2014-01-15 03:21:15", which is wrong.
The correct answer is "2014-01-15 08:21:15", which you can get (for a server in Montreal) via from_utc_timestamp(to_utc_timestamp(from_unixtime(1389802875),'America/Montreal'), 'America/Los_Angeles').
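Applied to the value from the original question, the same idea might look like the sketch below. It assumes that from_unixtime() renders in the server's time zone as described above, and that this zone is 'America/Los_Angeles'; the zone name is a guess based on the PDT output in the question, so substitute your server's actual zone.

```sql
-- from_unixtime() yields the server-local string (2012-09-30 17:00:00 in the question);
-- to_utc_timestamp() then interprets that string as Los Angeles time and shifts it to UTC.
SELECT to_utc_timestamp(from_unixtime(1349049600), 'America/Los_Angeles') AS utc_time;
```

Under those assumptions this should give the 2012-10-01 00:00:00 the question asks for, without changing Hive's time zone for other users.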
Hey, just wanted to add a little here: I'd suggest trying to "automate" the system timezone. So instead of a static declaration:
-- static TZ declaration
to_utc_timestamp(from_unixtime(1389802875),'America/Montreal')
give this a shot:
-- dynamic TZ
select to_utc_timestamp(from_unixtime(1389802875), from_unixtime(unix_timestamp(), "z"));
This just uses the string output format of from_unixtime() to return the timezone string (lowercase z).