Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to map fields in Hive for DynamoDb Amazon Console export?

I am trying to load the DynamoDb export file which is taken from Amazon Dynamodb Web Console with "Import/Export" tool into Hive. But I couldn't map the fields properly because DynamoDB Web Console "Export" tool is using "ETX" "STX".

Below is an example row ended with [LF]

Elapsed[ETX]{"n":"1477"}[STX]Device[ETX]{"n":"3"}[STX]Date[ETX]{"s":"2014-03-05T12:13:00.852Z"}[STX]Duration[ETX]{"n":"8075"}[LF]

What should be the query for this?

CREATE EXTERNAL TABLE IF NOT EXISTS TableNameHere (creationDate string, device bigint, duration bigint, elapsed bigint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' ???This is where i got stuck??? 
LOCATION 's3://abcdefg/ino/2015-05-28_12.22';

UPDATE

I have updated query but it didn't work again.

'\002' for STX
'\012' for LF

CREATE EXTERNAL TABLE IF NOT EXISTS TableNameHere  (creationDate string, device bigint, duration bigint, elapsed bigint)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\002' 
                     LINES TERMINATED BY '\012'
LOCATION 's3://abcdefg/ino/2015-05-28_12.22';

Result for the query:

Elapsed{"n":"0"}   Device{"n":"3"}   Duration{"n":"1073876"}   Date{"s":"2014-01-27T00:52:25.491Z"}     

So, now how can i parse this data? I need to map fields. Should i use a custom SerDe?

like image 265
Barbaros Alp Avatar asked Oct 31 '22 03:10

Barbaros Alp


1 Answers

I think there's already a SerDe for that: https://github.com/lyft/dynamodb-hive-serde

like image 136
Roberto Congiu Avatar answered Nov 08 '22 09:11

Roberto Congiu