I have an external table whose data is partitioned by date. A new set of files arrives every day. This is how I execute the job in Airflow.
Is there a way to make the above command operate only on the files added for the current day? For example, if I get a file for dt=2018-06-21, I would like to update only that partition.
Thanks!
You can add partitions manually. Here is an example from the Athena manual:
ALTER TABLE orders ADD
PARTITION (dt = '2016-05-14', country = 'IN') LOCATION 's3://mystorage/path/to/INDIA_14_May_2016'
PARTITION (dt = '2016-05-15', country = 'IN') LOCATION 's3://mystorage/path/to/INDIA_15_May_2016';
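To register only the current day's partition, the statement can be built from the date. The following is a minimal sketch, not a definitive implementation: the helper name build_add_partition_sql, the table name orders, the base path s3://mystorage/path/to/orders, and the dt=YYYY-MM-DD folder layout are all assumptions for illustration, so adjust them to match how your files are actually laid out.

```python
from datetime import date


def build_add_partition_sql(table: str, base_location: str, day: date) -> str:
    """Return an ALTER TABLE statement that registers a single dt partition.

    Hypothetical helper: assumes each day's files live under
    <base_location>/dt=YYYY-MM-DD/.
    """
    dt = day.isoformat()  # date formatted as YYYY-MM-DD
    location = f"{base_location.rstrip('/')}/dt={dt}/"
    # IF NOT EXISTS keeps the statement safe to re-run for the same day.
    return (
        f"ALTER TABLE {table} ADD IF NOT EXISTS "
        f"PARTITION (dt = '{dt}') LOCATION '{location}'"
    )


# Example with assumed table name and bucket path.
print(build_add_partition_sql("orders", "s3://mystorage/path/to/orders", date(2018, 6, 21)))
```

In Airflow, the date would typically come from the run's execution date (for example the ds template value) rather than being hard-coded, and the resulting string would be submitted through whatever operator or client you already use to run the query.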