I am trying to upload the data from this json:
JSON-A
[
{"name": "james", "id": 41},
{"name": "scott", "id": 62},
{"name": "abhi", "id": 16},
{"name": "kevin", "id": 53},
{"name": "beau", "id": 12},
{"name": "shally", "id": 35},
{"name": "jude", "id": 53},
{"name": "jason", "id": 77},
{"name": "hongjian", "id": 35},
{"name": "madhur", "id": 6}
]
If it had the parent "data" key like
JSON-B
["data":{"name": "james","id": 41"},{.....}]
then I know I can do like :
CREATE EXTERNAL TABLE IF NOT EXISTS test.test (
`data` array<struct<`name`:string,`id`:bigint>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = '1'
) LOCATION 's3://test-bucket/';
But what should be the way to do for JSON-A?
I do not think this will work, because of the Hive/SerDe assumption that your text file contain one record per line. From the org.openx.data.jsonserde.JsonSerDe documentation:
IMPORTANT!!! READ THIS BELOW!! Json records must be one per line, that is, the serde WILL NOT WORK with multiline Json.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With