I have a CSV file (with a header row and a pipe delimiter) that contains a JSON column holding a collection of objects, like this:
ProductId|IngestTime|ProductOrders
9180|20171025145034|[{"OrderId":"299","Location":"NY"},{"OrderId":"499","Location":"LA"}]
8251|20171026114034|[{"OrderId":"1799","Location":"London"}]
What I need is a Hive SELECT query that returns:
ProductId IngestTime OrderId OrderLocation
9180 20171025145034 299 NY
9180 20171025145034 499 LA
8251 20171026114034 1799 London
So far I have tried many combinations of 'explode', 'get_json_object' and so on, but I still haven't found the right query.
Do you have a solution?
Thanks a lot for your help :-)
In Hive, explode() is a table-generating function (UDTF): it takes an array or a map and returns one row per element.
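As a small illustration of explode() on its own, here is a minimal sketch that turns a literal two-element array into two rows. The array values are made up for the example, and the query assumes a Hive version that allows SELECT without a FROM clause (0.13 or later).

```sql
-- Illustrative only: the array literal stands in for a real array column.
-- Assumes Hive 0.13+, where a FROM clause is optional.
SELECT explode(array('NY', 'LA')) AS location;
-- Produces one row per array element: NY, then LA.
```

To keep other columns next to the exploded values, explode() is normally used through LATERAL VIEW rather than directly in the SELECT list.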
There is another, less conventional way to handle JSON data in Hive: json_tuple combined with LATERAL VIEW. It applies when a column holds the JSON data as a single string. json_tuple() is a user-defined table function (UDTF) introduced in Hive 0.7.
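Applied to the question's data, json_tuple and LATERAL VIEW could be combined roughly as follows. This is only a sketch under assumptions, not a tested solution: it assumes a table named DB_TABLE whose three columns are all strings, that every object in ProductOrders is flat (no nested objects or arrays), and that objects are separated by exactly "},{" with no whitespace. The aliases t, o, j and order_json are names chosen for the example.

```sql
-- Sketch: one output row per order in the ProductOrders JSON array.
-- Assumes flat objects separated by "},{" with no whitespace in between.
SELECT t.ProductId,
       t.IngestTime,
       j.OrderId,
       j.OrderLocation
FROM DB_TABLE t
-- Strip the outer [ and ], then split on the commas that sit between } and {,
-- so each array element stays a complete JSON object string.
LATERAL VIEW explode(
    split(regexp_replace(t.ProductOrders, '^\\[|\\]$', ''), '(?<=\\}),(?=\\{)')
) o AS order_json
-- Pull the two keys out of each object in a single pass.
LATERAL VIEW json_tuple(o.order_json, 'OrderId', 'Location') j AS OrderId, OrderLocation;
```

Because json_tuple parses each element as JSON, values that contain commas or colons should survive intact, which is the main reason to prefer it over plain string splitting.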
I had a similar requirement, and the solution from this link helped me solve it. Below is a query for your case, assuming all the columns in your DB_TABLE are of type STRING.
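-- Note: this relies on plain string splitting, so it assumes the values
-- themselves contain no commas, colons or double quotes.
SELECT ProductId,
       IngestTime,
       split(split(results, ",")[0], ':')[1] AS OrderId,
       regexp_replace(split(split(results, ",")[1], ':')[1], "[\\]|}]", "") AS OrderLocation
FROM
  -- Remove quotes, backslashes, brackets and pipes, then split into one string per order.
  (SELECT ProductId,
          IngestTime,
          split(translate(ProductOrders, '"\\[|]|\""', ''), "},") AS r
   FROM DB_TABLE) t1
LATERAL VIEW explode(r) rr AS results;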