
Import multiple csv files into pandas and concatenate into one DataFrame

I would like to read several csv files from a directory into pandas and concatenate them into one big DataFrame. I have not been able to figure it out though. Here is what I have so far:

    import glob
    import pandas as pd

    # get data file names
    path = r'C:\DRO\DCL_rawdata_files'
    filenames = glob.glob(path + "/*.csv")

    dfs = []
    for filename in filenames:
        dfs.append(pd.read_csv(filename))

    # Concatenate all data into one DataFrame
    big_frame = pd.concat(dfs, ignore_index=True)

I guess I need some help within the for loop???

asked Jan 03 '14 15:01

jonas


People also ask

How do I concatenate pandas DataFrame?

Use pandas.concat() to concatenate two or more pandas DataFrames along rows or columns. Concatenating along rows returns a new DataFrame containing all the rows of the inputs, one DataFrame appended after the other.
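
As a small illustration of the row-wise and column-wise cases described above, here is a minimal sketch. The frames `left` and `right` and their values are made up for the example.

```python
import pandas as pd

# Two tiny, made-up frames with identical columns
left = pd.DataFrame({"id": [1, 2], "value": [10, 20]})
right = pd.DataFrame({"id": [3, 4], "value": [30, 40]})

# axis=0 (the default) stacks rows; ignore_index=True renumbers them 0..n-1
stacked = pd.concat([left, right], axis=0, ignore_index=True)

# axis=1 places the frames side by side, aligning them on the index
side_by_side = pd.concat([left, right], axis=1)

print(stacked.shape)       # (4, 2)
print(side_by_side.shape)  # (2, 4)
```

Without `ignore_index=True`, the stacked result would keep each input's original index labels (0, 1, 0, 1), which is usually not what you want when combining files.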


2 Answers

If all your CSV files have the same columns, you can try the code below. I have added header=0 so that the first row of each file is used for the column names.

    import pandas as pd
    import glob

    path = r'C:\DRO\DCL_rawdata_files'  # use your path
    all_files = glob.glob(path + "/*.csv")

    li = []

    for filename in all_files:
        df = pd.read_csv(filename, index_col=None, header=0)
        li.append(df)

    frame = pd.concat(li, axis=0, ignore_index=True)
answered Oct 01 '22 21:10

Gaurav Singh
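
To try the loop-and-concat approach without touching real data, the sketch below wraps it in a function and runs it on two throwaway files in a temporary directory. The function name `read_csv_folder`, the file names, and the sample values are all hypothetical, chosen only for illustration.

```python
import glob
import os
import tempfile

import pandas as pd


def read_csv_folder(folder):
    """Read every *.csv file in `folder` and stack the rows into one DataFrame."""
    # sorted() makes the row order reproducible; glob gives no ordering guarantee
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))
    frames = [pd.read_csv(p, index_col=None, header=0) for p in paths]
    return pd.concat(frames, axis=0, ignore_index=True)


# Demo on two throwaway files that share the same columns (made-up data)
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_csv(os.path.join(tmp, "part1.csv"), index=False)
    pd.DataFrame({"a": [5], "b": [6]}).to_csv(os.path.join(tmp, "part2.csv"), index=False)
    frame = read_csv_folder(tmp)

print(frame)
```

The result has three rows with a fresh index 0 to 2. Sorting the paths is a design choice for this sketch, so that rows always appear in file-name order.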


An alternative to darindaCoder's answer:

    import glob
    import os

    import pandas as pd

    path = r'C:\DRO\DCL_rawdata_files'                  # use your path
    all_files = glob.glob(os.path.join(path, "*.csv"))  # os.path.join keeps the path handling OS independent

    # A generator expression: no list is created or appended to
    df_from_each_file = (pd.read_csv(f) for f in all_files)
    concatenated_df = pd.concat(df_from_each_file, ignore_index=True)
answered Oct 01 '22 20:10

Sid
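
One thing to watch with either approach: if the glob pattern matches no files, pd.concat raises a ValueError because it has nothing to concatenate. The sketch below combines the generator style with a guard for that case. The helper name `concat_csvs`, the decision to return an empty DataFrame, and the sample files are assumptions made for the example.

```python
import glob
import os
import tempfile

import pandas as pd


def concat_csvs(folder):
    """Concatenate all CSVs in `folder`; return an empty DataFrame if none match."""
    all_files = sorted(glob.glob(os.path.join(folder, "*.csv")))
    if not all_files:
        # pd.concat raises ValueError on empty input, so handle that case here
        return pd.DataFrame()
    # Generator expression: each file is read only when pd.concat consumes it
    return pd.concat((pd.read_csv(f) for f in all_files), ignore_index=True)


with tempfile.TemporaryDirectory() as tmp:
    empty_result = concat_csvs(tmp)  # no CSV files in the folder yet
    pd.DataFrame({"x": [1, 2]}).to_csv(os.path.join(tmp, "a.csv"), index=False)
    pd.DataFrame({"x": [3]}).to_csv(os.path.join(tmp, "b.csv"), index=False)
    combined = concat_csvs(tmp)

print(empty_result.empty)      # True
print(combined["x"].tolist())  # [1, 2, 3]
```

Note that pd.concat still needs every frame in memory before it can combine them, so the generator mainly saves the explicit list and the append calls rather than reducing peak memory.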