 

Add a file modification timestamp column while using glob

I have multiple files in a folder that users modify at different times. Every week I consolidate them all into one master file, but I need to keep track of when each file was last modified. This is a manual process that I am trying to automate.

I wrote the glob code, but I can't seem to add a column to the master file that holds the modification time of each source file:

all_data = pd.DataFrame()
for f in glob.glob("..\Python_Practice\Book*.xlsx"):
    df = pd.read_excel(f)
    all_data = all_data.append(df, ignore_index=True)
all_data.head()


all_data[time] = time.strftime('%m%d%H%M', os.path.gmtime('file')

It doesn't really work, and I can't find anything on the forums that does something similar.

Arc89 asked Sep 16 '25 13:09



1 Answer

You are close, but you need to loop through your files and collect the os.path.getmtime value of each one in a list. You can then pass that list to pd.concat as keys, which places the timestamps in the index.

The following will:

  • Find all .xlsx files
  • Read them into a list of DataFrames
  • Get each file's last-modified Unix time
  • Convert the Unix time into a formatted datetime string
  • Concatenate the DataFrames into a single one and pass the datetime strings into the index

        import glob
        import os
        from datetime import datetime
        import pandas as pd

        allFiles = glob.glob('*.xlsx')
        dfs = [pd.read_excel(f) for f in allFiles]
        # One formatted last-modified timestamp per file, in the same order as dfs
        keys = [datetime.fromtimestamp(os.path.getmtime(f)).strftime('%Y-%m-%d %H:%M:%S') for f in allFiles]
        frame = pd.concat(dfs, keys=keys)
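
If you would rather have the modification time as a regular column (which is what the question asks for) instead of an index level, a minimal sketch along the same lines might look like the following. The function name combine_with_mtime, the reader parameter, and the column names modified and source_file are made up for illustration and are not part of the original code. Note also that DataFrame.append was removed in pandas 2.0, so the sketch collects the frames in a list and calls pd.concat once.

```python
import glob
import os
from datetime import datetime

import pandas as pd


def combine_with_mtime(pattern, reader=pd.read_excel):
    """Read every file matching `pattern` and tag its rows with the file's mtime.

    All names here (combine_with_mtime, reader, "modified", "source_file")
    are hypothetical and chosen only for this example.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        df = reader(path)
        # os.path.getmtime returns seconds since the epoch;
        # fromtimestamp converts that to a local-time datetime.
        df["modified"] = datetime.fromtimestamp(os.path.getmtime(path))
        # Keep the file name so each row can be traced back to its source.
        df["source_file"] = os.path.basename(path)
        frames.append(df)
    if not frames:
        # No file matched the pattern: return an empty frame instead of failing.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

With the folder layout from the question, this could be called as all_data = combine_with_mtime(r"..\Python_Practice\Book*.xlsx"). The raw string keeps the backslashes in the Windows path from being read as escape sequences.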
    
    Umar.H answered Sep 19 '25 04:09
