I have a Firehose delivery stream that stores data in S3 using the default directory structure, "YYYY/MM/DD/HH", and a table in Athena with these columns defined as partitions:
year: string, month: string, day: string, hour: string
After running
msck repair table clicks
I only receive:
Partitions not in metastore: clicks:2017/08/26/10
I can add these partitions manually and everything works. However, I was wondering why MSCK REPAIR TABLE does not add these partitions automatically and update the metastore.
For future reference, aside from the two tips mentioned in this article: https://aws.amazon.com/premiumsupport/knowledge-center/athena-aws-glue-msck-repair-table/ you also need to set the TableType attribute to a non-null value. In my case, it was EXTERNAL_TABLE.
To use Athena's MSCK REPAIR TABLE with S3, the path prefix must use key-value pairs, because the command only recognizes Hive-style partition directories:
clicks/year=2017/month=08/day=26/hour=10/
instead of:
clicks/2017/08/26/10/
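To make the difference between the two layouts concrete, here is a minimal Python sketch that maps a Firehose-style key to the key=value form. The function name `to_hive_style` and the file name `part-0001.gz` are made up for illustration. The code only computes the new key; actually moving objects in S3 (a copy followed by a delete) is not shown.

```python
import re

# Four time components of a Firehose-style key: YYYY/MM/DD/HH/
# The lookbehind makes sure the year starts at a path-segment boundary.
_TIME_PREFIX = re.compile(
    r"(?<![^/])(?P<year>\d{4})/(?P<month>\d{2})/(?P<day>\d{2})/(?P<hour>\d{2})/"
)


def to_hive_style(key: str) -> str:
    """Rewrite '.../2017/08/26/10/...' as '.../year=2017/month=08/day=26/hour=10/...'."""
    def repl(match: re.Match) -> str:
        return "year={year}/month={month}/day={day}/hour={hour}/".format(
            **match.groupdict()
        )

    # Only the first time prefix is rewritten; keys without one come back unchanged.
    return _TIME_PREFIX.sub(repl, key, count=1)


print(to_hive_style("clicks/2017/08/26/10/part-0001.gz"))
# prints: clicks/year=2017/month=08/day=26/hour=10/part-0001.gz
```

Once the objects live under keys of this shape, MSCK REPAIR TABLE should be able to discover the partitions on its own.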
Alternatively, update the partitions directly in Glue (manually or with a crawler).
Found this here: https://forums.aws.amazon.com/message.jspa?messageID=789078
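If you keep the default Firehose layout and add partitions by hand, as described in the question, the statements can be generated rather than typed. The following is a rough Python sketch, not a definitive implementation: the helper name `add_partition_sql` and the bucket `example-bucket` are assumptions, and the resulting string still has to be run in Athena yourself.

```python
def add_partition_sql(table: str, location_root: str,
                      year: str, month: str, day: str, hour: str) -> str:
    """Build an ALTER TABLE ... ADD PARTITION statement for one Firehose hour folder."""
    root = location_root.rstrip("/")
    # Point the partition at the existing, non-Hive-style folder.
    location = f"{root}/{year}/{month}/{day}/{hour}/"
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (year='{year}', month='{month}', day='{day}', hour='{hour}') "
        f"LOCATION '{location}'"
    )


# 's3://example-bucket/clicks' is a placeholder for the table's data location.
print(add_partition_sql("clicks", "s3://example-bucket/clicks",
                        "2017", "08", "26", "10"))
```

Because each partition carries an explicit LOCATION, the folder names do not need to follow the key=value convention for this route to work.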