I am getting the below error while querying the table created in Amazon Athena.
Error
HIVE_CURSOR_ERROR: Row is not a valid JSON Object - JSONException: A JSONObject text must end with '}' at 2
The sample file which I am using and the query to create table is below. The table gets successfully created by the below query but when I am fetching the result from table I am getting the error. Please provide your valuable suggestion.
Note Sample Data Create table
AWS Athena does not support multi-line JSON.
Athena knowledge center
Make sure your JSON record is on a single line
Athena doesn't currently support multi-line JSON records.
I'm abusing the answer field to have more space and to have my thoughts on this a little bit structured. I hope this is useful input to anyone using Athena.
I'm using Athena to create two tables. Single-line-json-based and multi-line-json-based reports in two separate bucket folders and two according tables.
Single-line reports in JSON are straightforward. For table creation I use this driver:
ROW FORMAT serde 'org.apache.hive.hcatalog.data.JsonSerDe'
However if you use this exact one to consume multi-line you will get an error PLUS there are three more things (I, II and III) to be aware of. First of all the different driver:
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
then
I the (invalid) JSON format. My multi-line files look like this:
{"key": "value"}
{"key": "value"}
Notice there is one JSON object per line and objects are NOT comma separated! You can read about this here:
Store multiple elements in json files in AWS Athena
II Create Statement
Contrary to your screenshot for table creation @abc I used it without nested structs:
CREATE EXTERNAL TABLE IF NOT EXISTS internal_reports (
`Date` Date,
impressions INT,
reach INT,
follower INT,
...
)
III Date property
I came across this whole driver-experimenting thing because the Date property caused issues in my single-line table because my values for Date had time information not equal to midnight. For the other multi-line reports my Dates were ok (the time was exactly midnight). You can read about this here:
https://developer.ibm.com/answers/questions/177238/timestamp-format-must-be-yyyy-mm-dd-hhmmssffffffff/ Athena - DATE column correct values from JSON
Cheers
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With