I'm using SQL in Impala to write this query. I'm trying to convert a date string, stored in YYYYMMDD format, into a date format for the purposes of running a query like this:
SELECT datadate,
session_info
FROM database
WHERE datadate >= NOW() - INTERVAL 5 DAY
ORDER BY datadate DESC;
Since the >= NOW() - INTERVAL 5 DAY
code won't work with the YYYYMMDD string, I'd like to find a way to convert that into a date format that will work with this type of query. My thought is that it should look something like this (based on similar questions about other SQL query editors), but it's not working in Impala:
SELECT datadate,
session_info,
convert(datetime, '20141008', 102) AS session_date
FROM database
WHERE session_date >= NOW() - INTERVAL 5 DAY
ORDER BY session_date DESC;
Anyone know how to do this in Impala?
EDIT:
I finally found a working solution to the problem. None of the attempts using configurations of CAST
or CONVERT
would work in Impala, but the below query solves the problem and is fully operational, allowing date math to be performed on a column containing string values:
SELECT datadate,
session_info
FROM database
WHERE datadate >= from_unixtime(unix_timestamp(now() - interval 5 days), 'yyyyMMdd')
GROUP BY datadate
ORDER BY datadate DESC;
For string to date/time conversions, any separator character in the pattern string would match any separator character in the input expression . For example, CAST(“20191010” AS DATE FORMAT “YYYY-MM-DD”) returns an error, but CAST("2019-.;10 10" AS DATE FORMAT "YYYY-MM-DD") succeeds.
The DATE type is available in Impala 3.3 and higher.
Use a combination of unix_timestamp and from_unixtime and cast the resultant to timestamp to use the date_add function.
See Timestamp Literals on [Link Updated 2020-08-24]:
https://docs.cloudera.com/cdp-private-cloud-base/7.1.3/impala-sql-reference/topics/impala-literals.html
You need to add the dashes to your string so Impala will be able to convert it into a date/timestamp. You can do that with something like:
concat_ws('-', substr(datadate,1,4), substr(datadate,5,2), substr(datadate,7) )
which you can use instead of datadate
in your expression.
Native way:
to_timestamp(cast(date_number AS STRING), 'yyyyMMdd')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With