My Python code below works correctly. It combines a directory of CSV files and matches the headers. However, I want to take it a step further: how do I add a column that records the filename of the CSV each row came from?
import pandas as pd
import glob
globbed_files = glob.glob("*.csv")  # creates a list of all CSV files
data = []  # pd.concat takes a list of DataFrames as an argument
for csv in globbed_files:
    frame = pd.read_csv(csv)
    data.append(frame)
bigframe = pd.concat(data, ignore_index=True)  # don't want pandas to try to align row indexes
bigframe.to_csv("Pandas_output2.csv")
Use DictReader and DictWriter to add a column to an existing CSV file. Besides the list-based reader and writer, Python's csv module provides two other classes for reading and writing CSV content, DictReader and DictWriter, which perform all operations using dictionaries instead of lists.
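As a rough sketch of that approach, assuming each CSV has a header row: the helper name add_filename_column and the default column name "filename" are made up for illustration and are not part of the csv module.

```python
import csv
import os


def add_filename_column(src_path, dst_path, column="filename"):
    """Copy src_path to dst_path, appending a column with the source file's name.

    Hypothetical helper for illustration; assumes src_path has a header row.
    """
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        # reader.fieldnames is read from the header row; add the new column to it
        fieldnames = list(reader.fieldnames or []) + [column]
        writer = csv.DictWriter(dst, fieldnames=fieldnames)
        writer.writeheader()
        for row in reader:
            # each row is a dict keyed by header name, so adding a column is one assignment
            row[column] = os.path.basename(src_path)
            writer.writerow(row)
```

This handles one file at a time and does not align headers across files, so for the combine step in the question the pandas route is probably less work.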
This should work:
import os
for csv in globbed_files:
    frame = pd.read_csv(csv)
    frame['filename'] = os.path.basename(csv)
    data.append(frame)
frame['filename'] creates a new column named filename, and os.path.basename() turns a path like /a/d/c.txt into the filename c.txt.
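Putting the pieces together, a self-contained version might look like the sketch below. The function name combine_csvs and its parameters are my own, not something from the question, and it assumes at least one file matches the pattern.

```python
import glob
import os

import pandas as pd


def combine_csvs(pattern="*.csv", column="filename"):
    """Read every CSV matching pattern and stack them, tagging rows with their source file.

    Hypothetical helper; pd.concat raises ValueError if no files match.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        frame = pd.read_csv(path)
        # a scalar assignment fills the whole column with the same value
        frame[column] = os.path.basename(path)
        frames.append(frame)
    # columns are matched by header name; ignore_index renumbers the rows from 0
    return pd.concat(frames, ignore_index=True)
```

One thing to watch for: if the combined output is written into the same directory with a .csv extension, a second run of the "*.csv" glob will pick it up as input too.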