I am trying to get the last modification time of each file in Azure Data Lake.
files = dbutils.fs.ls('/mnt/blob')
for fi in files: print(fi)
Output:-FileInfo(path='dbfs:/mnt/blob/rule_sheet_recon.xlsx', name='rule_sheet_recon.xlsx', size=10843)
The FileInfo objects returned here do not include the last modification time. Is there any way to get that property?
I tried the shell command below to see the properties, but I am unable to store its output in a Python object.
%sh ls -ls /dbfs/mnt/blob/
output:- total 0
0 -rw-r--r-- 1 root root 13577 Sep 20 10:50 a.txt
0 -rw-r--r-- 1 root root 10843 Sep 20 10:50 b.txt
We can use the os package to get this information. For example, in PySpark:
import os

def get_filemtime(filename):
    # Return the last modification time as seconds since the epoch
    return os.path.getmtime(filename)
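Pass the absolute path of the file through the local /dbfs mount (the same path form used by the %sh ls command in the question), for example /dbfs/mnt/adls/logs/ehub/app/0/2021/07/21/15/05/40.avro. The os module works on local file paths and does not understand the dbfs:/ URI scheme.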
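To collect the modification time of every file in a directory into a Python object, a minimal sketch along the same lines might look like the following. The helper name list_mtimes is hypothetical, and the sketch assumes the storage is reachable as a local path (on Databricks, the /dbfs/... form seen in the %sh ls command), not as a dbfs:/ URI.

```python
import os
from datetime import datetime, timezone


def list_mtimes(directory):
    """Map each regular file name in `directory` to its last-modified time (UTC)."""
    result = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories; only regular files are reported
            if entry.is_file():
                mtime = entry.stat().st_mtime  # seconds since the epoch
                result[entry.name] = datetime.fromtimestamp(mtime, tz=timezone.utc)
    return result
```

For the directory in the question, this would be called as list_mtimes('/dbfs/mnt/blob'), assuming that mount exists on the cluster. The result is an ordinary dict, so it can be sorted, filtered, or turned into a DataFrame as needed.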