I am trying to load the DynamoDb export file which is taken from Amazon Dynamodb Web Console with "Import/Export" tool into Hive. But I couldn't map the fields properly because DynamoDB Web Console "Export" tool is using "ETX" "STX".
Below is an example row ended with [LF]
Elapsed[ETX]{"n":"1477"}[STX]Device[ETX]{"n":"3"}[STX]Date[ETX]{"s":"2014-03-05T12:13:00.852Z"}[STX]Duration[ETX]{"n":"8075"}[LF]
What should be the query for this?
CREATE EXTERNAL TABLE IF NOT EXISTS TableNameHere (creationDate string, device bigint, duration bigint, elapsed bigint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ???This is where i got stuck???
LOCATION 's3://abcdefg/ino/2015-05-28_12.22';
UPDATE
I have updated query but it didn't work again.
'\002' for STX
'\012' for LF
CREATE EXTERNAL TABLE IF NOT EXISTS TableNameHere (creationDate string, device bigint, duration bigint, elapsed bigint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\002'
LINES TERMINATED BY '\012'
LOCATION 's3://abcdefg/ino/2015-05-28_12.22';
Result for the query:
Elapsed{"n":"0"} Device{"n":"3"} Duration{"n":"1073876"} Date{"s":"2014-01-27T00:52:25.491Z"}
So, now how can i parse this data? I need to map fields. Should i use a custom SerDe?
I think there's already a SerDe for that: https://github.com/lyft/dynamodb-hive-serde
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With