pandas Combine Excel Spreadsheets

Tags: python, excel

I have an Excel workbook with many tabs. Each tab has the same set of headers as all others. I want to combine all of the data from each tab into one data frame (without repeating the headers for each tab).

So far, I've tried:

import pandas as pd
xl = pd.ExcelFile('file.xlsx')
df = xl.parse()

Can I use something for the parse argument that will mean "all spreadsheets"? Or is this the wrong approach?

Thanks in advance!

Update: I tried:

a=xl.sheet_names
b = pd.DataFrame()
for i in a:
    b.append(xl.parse(i))
b

But it's not "working".

522

asked Mar 11 '16 21:03

Dance Party2

1 Answer

This is one way to do it -- load all sheets into a dictionary of dataframes and then concatenate all the values in the dictionary into one dataframe.

import pandas as pd

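# Set sheet_name to None to load every sheet into a dict of DataFrames keyed by sheet name.
# index_col=None keeps the default integer index on each sheet (see comment by @bunji).
df = pd.read_excel('tmp.xlsx', sheet_name=None, index_col=None)

# Then concatenate all DataFrames. ignore_index=True renumbers the rows so the
# per-sheet indices do not overlap in the result.
cdf = pd.concat(df.values(), ignore_index=True)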

print(cdf)
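
As for the loop in the question's update: DataFrame.append returned a new DataFrame instead of modifying b in place, so each result was thrown away (and the method was removed entirely in pandas 2.0). A minimal sketch of a loop-based version follows. The helper names combine_sheets and combine_workbook, the optional "sheet" column, and the sample data are all made up for illustration and are not part of pandas or the original answer.

```python
import pandas as pd


def combine_sheets(sheets, add_sheet_column=False):
    """Stack a {sheet_name: DataFrame} mapping into one DataFrame.

    `sheets` is the kind of dict that pd.read_excel(..., sheet_name=None) returns.
    """
    frames = []
    for name, frame in sheets.items():
        if add_sheet_column:
            # Hypothetical extra column recording which tab each row came from.
            frame = frame.assign(sheet=name)
        frames.append(frame)
    # Collect the pieces first, then concatenate once.
    # ignore_index=True gives the result a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)


def combine_workbook(xl):
    """Loop version for an already-open pd.ExcelFile, as in the question."""
    return combine_sheets({name: xl.parse(name) for name in xl.sheet_names})


# In-memory stand-ins for two tabs with identical headers (made-up data).
sheets = {
    "Jan": pd.DataFrame({"item": ["a", "b"], "qty": [1, 2]}),
    "Feb": pd.DataFrame({"item": ["c"], "qty": [3]}),
}
combined = combine_sheets(sheets, add_sheet_column=True)
print(combined)
```

With a real workbook, combine_workbook(pd.ExcelFile('file.xlsx')) should give the same stacked result as the read_excel approach above. Building a list and calling pd.concat once also avoids copying the growing frame on every iteration.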

101

answered Oct 04 '22 05:10

daedalus
