Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Athena date_parse for date with optional millisecond field

I have date in S3 using which I created an Athena table. I have some date entries in S3 in json format which Athena is not accepting as either Date or timestamp when am running the queries.

Using AWS Athena which uses Prestodb as query engine

Example json :


    {"creationdate":"2018-09-12T15:49:07.269Z", "otherfield":"value1"}
    {"creationdate":"2018-09-12T15:49:07Z", "otherfield":"value2"}

AWS Glue is taking both the fields as string and when am changing them to timestamp and date respectively the queries around timestamp are not working giving ValidationError on the timestamp field.

Anyway, I found a way to use prestodb date_parse function but its not working either since some fields have milliseconds while other not.


    parse_datetime(creationdate, '%Y-%m-%dT%H:%i:%s.%fZ')
    parse_datetime(creationdate, '%Y-%m-%dT%H:%i:%sZ')

Both are failing because of different entries present i.e. one with millisecond %f and one without Is there a way to provide a parser, regex so that am able to convert these strings into Date during sql query execution?

like image 393
Scorpion Avatar asked Feb 03 '23 21:02

Scorpion


1 Answers

Instead of providing the timestamp format, you can use the from_iso8601_timestamp function.

This way, all timestamps get parsed.

select from_iso8601_timestamp(creationdate) from table1;
like image 74
jens walter Avatar answered Feb 20 '23 22:02

jens walter