Amazon athena can't read S3 JSON Object files and Athena select query returns empty result sets for JSON key columns

Question

I create a table in Athena with below structure

CREATE EXTERNAL TABLE s3_json_objects (
    devId string,
    type string,
    status string
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
WITH SERDEPROPERTIES ( 'ignore.malformed.json' = 'true')
LOCATION 's3://mybucket/folder1/data/athena_test/';

S3 bucket objects contains JSON structure like this
{
    "devId": "00abcdef1122334401",
    "type": "lora",
    "status": "huihuhukiyg"
}

However below SQL working correctly and return the correct result for only count

SELECT count(*) as total_s3_objects FROM "athena_db"."s3_json_objects"

BUT whenever I query below SQL select statement to fetch the JSON values from S3, It's returns result sets with empty values for columns
SELECT devid FROM "athena_db"."s3_json_objects"
SELECT json_extract(devid , '$.devid') as Id FROM "athena_db"."s3_json_objects"
SELECT * FROM "athena_db"."s3_json_objects"

Also, I review these links before post this question on StackOverflow and AWS Athena doc

Can't read json file via Amazon Athena

AWS Athena json_extract query from string field returns empty values

Any comments or suggestions would be much appreciated.

Tom · Accepted Answer

The JSON must be in a single line, as mentioned in this page of the AWS Athena documentation. You can have multiple JSON objects on separate lines, but each complete object must only span one line.

Example (this could all be in one S3 object):

{"devId": "a1", "type": "b1", "status": "c1"}
{"devId": "a2", "type": "b2", "status": "c2"}

Tanveer Uddin · Answer

Glue can read multi-line json objects because it has spark engine under the hood. One workaround is, transform those json objects to parquet using glue if you can't easily make those json objects on line.

Amazon athena can't read S3 JSON Object files and Athena select query returns empty result sets for JSON key columns

Tags:

json

amazon-web-services

amazon-athena

presto

SamDev

2 Answers

Tom

Tanveer Uddin

Recent Activity

Donate For Us

Amazon athena can't read S3 JSON Object files and Athena select query returns empty result sets for JSON key columns

Tags:

json

amazon-web-services

amazon-athena

presto

SamDev

2 Answers

Tom

Tanveer Uddin

Related questions

Recent Activity

Donate For Us