
Append Multiple Excel Files(xlsx) together in python

import pandas as pd
import os
import glob


all_data = pd.DataFrame()
for f in glob.glob("output/test*.xlsx")
    df = pd.read_excel(f)
    all_data = all_data.append(df, ignore_index=True)

I want to combine multiple xlsx files into one xlsx file. The Excel files are in the output/test folder. The columns are the same in all of them, and I want to concatenate the rows. The above code doesn't seem to work.

user3821872 asked Oct 25 '17 10:10


2 Answers

Let all_data be a list.

all_data = []
for f in glob.glob("output/test/*.xlsx"):
    all_data.append(pd.read_excel(f))

Now, call pd.concat:

df = pd.concat(all_data, ignore_index=True)

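Make sure all column names are the same. pd.concat aligns columns by name, so any column whose name differs between files ends up as a separate column padded with NaN instead of being stacked.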
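To illustrate the point about column names, here is a minimal sketch that uses two small in-memory frames in place of real workbooks. The helper check_same_columns is a hypothetical name introduced for this example, not part of pandas; it simply fails early when the column names differ.

```python
import pandas as pd


def check_same_columns(frames):
    """Raise ValueError if any frame's column names differ from the first frame's.

    Hypothetical helper for illustration; order is ignored because
    pd.concat aligns columns by name, not by position.
    """
    expected = set(frames[0].columns)
    for i, frame in enumerate(frames[1:], start=1):
        if set(frame.columns) != expected:
            raise ValueError(
                f"frame {i} has columns {list(frame.columns)}, "
                f"expected {sorted(expected)}"
            )


# Two small in-memory frames stand in for sheets read with pd.read_excel.
a = pd.DataFrame({"id": [1, 2], "name": ["x", "y"]})
b = pd.DataFrame({"id": [3], "Name": ["z"]})  # note the capital N

# concat does not raise here: it keeps both "name" and "Name" and pads with NaN.
mismatched = pd.concat([a, b], ignore_index=True)
print(mismatched)
```

Calling check_same_columns(all_data) just before pd.concat would turn a silent NaN-padded result into an explicit error.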


You could also use a map version of the for loop above:

g = map(pd.read_excel, glob.glob("output/test/*.xlsx"))
df = pd.concat(list(g), ignore_index=True)

Or use the list comprehension method shown in the other answer.
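Note that the snippets above use the pattern output/test/*.xlsx, while the question uses output/test*.xlsx. The two match different files. The sketch below shows the difference on a throwaway directory; the file names (a.xlsx, b.xlsx, test_old.xlsx) are made up for the example. It also wraps the result in sorted(), since glob.glob returns paths in arbitrary order.

```python
import glob
import os
import tempfile

# Build a throwaway layout with empty placeholder files (names are assumptions):
#   output/test/a.xlsx, output/test/b.xlsx, output/test_old.xlsx
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "output", "test"))
for rel in ("output/test/b.xlsx", "output/test/a.xlsx", "output/test_old.xlsx"):
    open(os.path.join(root, *rel.split("/")), "w").close()


def names(pattern):
    """Return the sorted matches for pattern, relative to root, with / separators."""
    matches = glob.glob(os.path.join(root, *pattern.split("/")))
    return sorted(os.path.relpath(m, root).replace(os.sep, "/") for m in matches)


in_folder = names("output/test/*.xlsx")  # files inside the output/test folder
by_prefix = names("output/test*.xlsx")   # files in output/ whose name starts with "test"

print(in_folder)  # ['output/test/a.xlsx', 'output/test/b.xlsx']
print(by_prefix)  # ['output/test_old.xlsx']
```

If the files really live in output/test, the question's pattern matches none of them, so the loop body never runs.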

cs95 answered Sep 18 '22 16:09


Use list comprehension + concat:

all_data = [pd.read_excel(f) for f in glob.glob("output/test/*.xlsx")]
df = pd.concat(all_data, ignore_index=True)
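
Since the question asks for a single xlsx file as the result, here is a sketch of the full round trip, assuming the first sheet of each workbook holds the data (pd.read_excel reads the first sheet by default). The function name combine_excel_files and the output path are hypothetical, and writing .xlsx requires an Excel engine such as openpyxl to be installed.

```python
import glob

import pandas as pd


def combine_excel_files(pattern, out_path):
    """Read every workbook matching pattern, stack the rows, and write one file."""
    paths = sorted(glob.glob(pattern))  # sorted for a reproducible row order
    if not paths:
        # pd.concat raises on an empty list, so fail early with a clearer message.
        raise FileNotFoundError(f"no files match {pattern!r}")
    frames = [pd.read_excel(path) for path in paths]
    combined = pd.concat(frames, ignore_index=True)
    combined.to_excel(out_path, index=False)  # index=False omits the row-number column
    return combined


# Example call (hypothetical output name, kept outside output/test/ so that a
# second run does not pick up the combined file as an input):
# combine_excel_files("output/test/*.xlsx", "output/combined.xlsx")
```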

jezrael answered Sep 19 '22 16:09