
Find the source file of an Athena query result

We have thousands of files stored in S3. These files are exposed to Athena so that we can query them. While debugging, I found that Athena returns multiple blank lines when I query for a specific id. Given that there are thousands of files, I am not sure where that data is coming from.

Is there a way to see the source file for each row in an Athena result?

Em Ae asked Jan 01 '23 13:01



1 Answer

The Presto Hive connector exposes a hidden column, "$path", which holds the path of the file that each row was read from.

Note: the column name is actually $path, but you need to double-quote it in SQL (write "$path"), because $ is otherwise illegal in an identifier.
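To make the quoting concrete, here is a minimal sketch that builds such a query as a string. The table name `my_table`, the column `id`, the value `abc-123`, and the helper `build_path_query` are all hypothetical and only serve as an illustration; adjust them to your own schema.

```python
def build_path_query(table: str, id_column: str, id_value: str) -> str:
    """Build a query that returns each matching row with its source file.

    `table` and `id_column` are assumed to be trusted identifiers;
    they are interpolated as-is and not validated.
    """
    # Escape single quotes in the string literal by doubling them.
    literal = id_value.replace("'", "''")
    # "$path" must be double-quoted because $ is not allowed
    # in an unquoted SQL identifier.
    return (
        f'SELECT "$path" AS source_file, * '
        f"FROM {table} "
        f"WHERE {id_column} = '{literal}'"
    )


# Hypothetical table, column, and id value.
query = build_path_query("my_table", "id", "abc-123")
print(query)
```

The resulting string can be pasted into the Athena console. Each row of the result should then carry the S3 path of the file it was read from in the `source_file` column, which should help narrow down which files contain the blank rows.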

Piotr Findeisen answered Jan 13 '23 14:01
