I'm trying to fetch last modified timestamp of a table in Hive.
Here there is already an answer for how to see last modified date for a hive table. I am just sharing how to check last modified date for a hive table partition.
Connect to hive cluster to run hive queries. In most of the cases, you can simply connect by running hive command : hive
DESCRIBE FORMATTED <database>.<table_name> PARTITION(<partition_column>=<partition_value>);
In the response you will see something like this : transient_lastDdlTime 1631640957
SELECT CAST(from_unixtime(1631640957) AS timestamp);
With the help of above answers I have created a simple solution for the forthcoming developers.
time_column=`beeline --hivevar db=hiveDatabase --hivevar tab=hiveTable --silent=true --showHeader=false --outputformat=tsv2 -e 'show create table ${db}.${tab}' | egrep 'transient_lastDdlTime'`
time_value=`echo $time_column | sed 's/[|,)]//g' | awk -F '=' '{print $2}' | sed "s/'//g"`
tran_date=`date -d @$time_value +'%Y-%m-%d %H:%M:%S'`
echo $tran_date
I used beeline alias. Make sure you setup alias properly and invoke the above script. If there are no alias used then use the complete beeline command(with jdbc connection) by replacing beeline above. Leave a question in the comment if any.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With