
Python Pandas add Filename Column CSV

My Python code below works correctly: it combines a directory of CSV files and matches the headers. However, I want to take it a step further: how do I add a column that records the filename of the CSV each row came from?

import pandas as pd
import glob

globbed_files = glob.glob("*.csv")  # creates a list of all CSV files

data = []  # pd.concat takes a list of DataFrames as an argument
for csv in globbed_files:
    frame = pd.read_csv(csv)
    data.append(frame)

bigframe = pd.concat(data, ignore_index=True)  # don't want pandas to try to align row indexes
bigframe.to_csv("Pandas_output2.csv")
specmer asked Jan 25 '17 17:01



1 Answer

This should work:

import os

for csv in globbed_files:
    frame = pd.read_csv(csv)
    frame['filename'] = os.path.basename(csv)
    data.append(frame)

Assigning to frame['filename'] creates a new column named filename with the same value in every row of that frame, and os.path.basename() turns a path like /a/d/c.txt into the filename c.txt.
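For reference, here is the whole thing put together as a minimal, self-contained sketch. The function name combine_csvs and the sorted() call are my own additions for illustration, not part of the original code; sorting just makes the row order predictable, since glob does not guarantee one.

```python
import glob
import os

import pandas as pd


def combine_csvs(pattern):
    """Concatenate every CSV matching `pattern`, tagging rows with their source file.

    `combine_csvs` is a hypothetical helper name used only for this sketch.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):  # sorted for a predictable row order
        frame = pd.read_csv(path)
        # Assigning a scalar fills the whole column with that value.
        frame["filename"] = os.path.basename(path)
        frames.append(frame)
    # pd.concat raises ValueError on an empty list, so at least one file must match.
    return pd.concat(frames, ignore_index=True)
```

You would then call something like combine_csvs("*.csv").to_csv("Pandas_output2.csv"). One thing to watch: if the output file is written into the same directory, it will itself match *.csv the next time the script runs.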

Mike Müller answered Sep 17 '22 18:09