
Creating a pandas DataFrame from multiple files

Tags:

python

pandas

I am trying to create a pandas DataFrame, and it works fine for a single file. Now I need to build it from multiple files that all share the same data structure, so instead of a single file name I have a list of file names from which I would like to create the DataFrame.

I'm not sure how to append to an existing DataFrame in pandas, or whether pandas can read a list of files into a single DataFrame directly.

Abhi asked May 11 '12 05:05




1 Answer

The pandas concat function is your friend here. Let's say you have all your files in a directory, targetdir. You can:

  1. make a list of the files
  2. load them as pandas dataframes
  3. and concatenate them together

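```python
import os
import pandas as pd

# targetdir is the path to the directory that holds the files
# list the files; os.listdir returns bare names, so join each with targetdir
filelist = [os.path.join(targetdir, name) for name in os.listdir(targetdir)]
# read them into pandas (read_table expects tab-delimited text by default)
df_list = [pd.read_table(path) for path in filelist]
# concatenate them together, stacking the rows
big_df = pd.concat(df_list)
```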
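
Since the question starts from a list of file names rather than a directory, here is a minimal sketch of the same read-then-concat idea applied to an explicit list. It assumes the files are comma-delimited and share the same columns. The helper name `load_files`, the extra `source_file` column, and the two throwaway sample files are all made up for illustration; for tab-delimited data, `pd.read_table` could be swapped in for `pd.read_csv`.

```python
import os
import tempfile

import pandas as pd


def load_files(paths, **read_kwargs):
    """Read each path with pd.read_csv and stack the rows into one DataFrame."""
    frames = []
    for path in paths:
        frame = pd.read_csv(path, **read_kwargs)
        # Hypothetical extra column recording which file each row came from.
        frame["source_file"] = os.path.basename(path)
        frames.append(frame)
    # ignore_index=True renumbers the rows instead of repeating each file's index.
    return pd.concat(frames, ignore_index=True)


# Demo with two small temporary files that share the same columns.
with tempfile.TemporaryDirectory() as tmpdir:
    paths = []
    for name, rows in [("a.csv", "x,y\n1,2\n3,4\n"), ("b.csv", "x,y\n5,6\n")]:
        path = os.path.join(tmpdir, name)
        with open(path, "w") as handle:
            handle.write(rows)
        paths.append(path)

    big_df = load_files(paths)

print(big_df)
```

Passing `ignore_index=True` is a choice rather than a requirement: without it, each file's original row labels are kept, so the combined index will contain repeated values.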
zach answered Sep 19 '22 03:09