
Putting many python pandas dataframes to one excel worksheet

It is quite easy to add many pandas dataframes to an Excel workbook as long as each one goes to a different worksheet. It is somewhat trickier to get many dataframes into one worksheet if you want to use pandas' built-in df.to_excel functionality.

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
workbook = writer.book
worksheet = workbook.add_worksheet('Validation')
df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
another_df.to_excel(writer, sheet_name='Validation', startrow=20, startcol=0)

The above code won't work. You will get this error:

 Sheetname 'Validation', with case ignored, is already in use. 

After some experimenting, I found a way to make it work.

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
workbook = writer.book
df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
another_df.to_excel(writer, sheet_name='Validation', startrow=20, startcol=0)

This works. My purpose in posting this question on Stack Overflow is twofold. First, I hope it helps anyone who is trying to put many dataframes into a single Excel worksheet.

Second, can someone help me understand the difference between those two blocks of code? They look much the same to me, except that the first block creates the worksheet called "Validation" in advance while the second does not. I get that part.

What I don't understand is why that should make any difference. Even if I don't create the worksheet in advance, the second-to-last line,

df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)

will create the worksheet anyway. So in the second block, too, the worksheet "Validation" already exists by the time the last line runs. My question is basically: why does the second block of code work while the first doesn't?

Please also share any other way to put many dataframes into Excel using the built-in df.to_excel functionality!

nyan314sn asked Oct 05 '15 20:10




2 Answers

To create the Worksheet in advance, you need to add the created sheet to the sheets dict:

writer.sheets['Validation'] = worksheet

Using your original code:

# Creating Excel Writer Object from Pandas
writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')
workbook = writer.book
worksheet = workbook.add_worksheet('Validation')
writer.sheets['Validation'] = worksheet
df.to_excel(writer, sheet_name='Validation', startrow=0, startcol=0)
another_df.to_excel(writer, sheet_name='Validation', startrow=20, startcol=0)

Explanation

If we look at the pandas function to_excel, it uses the writer's write_cells function:

excel_writer.write_cells(formatted_cells, sheet_name, startrow=startrow, startcol=startcol) 

So looking at the write_cells function for xlsxwriter:

def write_cells(self, cells, sheet_name=None, startrow=0, startcol=0):
    # Write the frame cells using xlsxwriter.
    sheet_name = self._get_sheet_name(sheet_name)
    if sheet_name in self.sheets:
        wks = self.sheets[sheet_name]
    else:
        wks = self.book.add_worksheet(sheet_name)
        self.sheets[sheet_name] = wks

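Here we can see that it checks for sheet_name in self.sheets. A worksheet created directly with workbook.add_worksheet is not in that dict, so write_cells calls add_worksheet a second time with the same name, which raises the "already in use" error. That is why the sheet needs to be added to writer.sheets as well. In your second block, the first to_excel call creates the sheet and registers it, so the second call finds it and reuses it.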
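To make the lookup concrete, here is a small toy model of the logic quoted above. It is only a sketch: ToyWorkbook, ToyWriter and try_write are hypothetical stand-ins written for this illustration, not pandas or xlsxwriter classes, and they copy just the sheets-dict check from write_cells.

```python
class ToyWorkbook:
    """Hypothetical stand-in for the workbook: refuses duplicate sheet names."""

    def __init__(self):
        self._names = set()

    def add_worksheet(self, name):
        key = name.lower()  # names are compared with case ignored
        if key in self._names:
            raise ValueError(
                f"Sheetname '{name}', with case ignored, is already in use."
            )
        self._names.add(key)
        return {"name": name, "cells": {}}  # a worksheet is just a dict here


class ToyWriter:
    """Hypothetical stand-in for the writer: keeps known sheets in a dict."""

    def __init__(self):
        self.book = ToyWorkbook()
        self.sheets = {}

    def write_cells(self, cells, sheet_name, startrow=0, startcol=0):
        # Same branch as in the quoted write_cells: reuse or create.
        if sheet_name in self.sheets:
            wks = self.sheets[sheet_name]
        else:
            wks = self.book.add_worksheet(sheet_name)
            self.sheets[sheet_name] = wks
        for (row, col), value in cells.items():
            wks["cells"][(startrow + row, startcol + col)] = value


def try_write(register_sheet):
    """Create the sheet in advance, optionally register it, then write twice."""
    writer = ToyWriter()
    worksheet = writer.book.add_worksheet("Validation")
    if register_sheet:
        writer.sheets["Validation"] = worksheet
    try:
        writer.write_cells({(0, 0): "a"}, "Validation", startrow=0)
        writer.write_cells({(0, 0): "b"}, "Validation", startrow=20)
    except ValueError as exc:
        return f"failed: {exc}"
    return f"ok: {len(worksheet['cells'])} cells written"


print(try_write(register_sheet=False))  # fails on the first write
print(try_write(register_sheet=True))   # ok: 2 cells written
```

Without the registration step the writer does not know the sheet exists, so its first write tries to create it again and fails. With the registration step both writes land on the same sheet.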

Adrian answered Oct 06 '22 07:10


user3817518: "Please also share if there is another way to put many dataframes into excel using the built-in df.to_excel functionality !!"

Here's my attempt:

Here is an easy way to put many dataframes on a single sheet or across multiple tabs. Let me know if this works!

To test, run the sample dataframes block followed by the second and third blocks of code.

Sample dataframes

import pandas as pd
import numpy as np

# Sample dataframes
randn = np.random.randn
df = pd.DataFrame(randn(15, 20))
df1 = pd.DataFrame(randn(10, 5))
df2 = pd.DataFrame(randn(5, 10))

Put multiple dataframes into one xlsx sheet

# function
def multiple_dfs(df_list, sheets, file_name, spaces):
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    row = 0
    for dataframe in df_list:
        dataframe.to_excel(writer, sheet_name=sheets, startrow=row, startcol=0)
        # advance past this frame's rows, its header row, and the spacer rows
        row = row + len(dataframe.index) + spaces + 1
    writer.close()  # writes the file; writer.save() was removed in pandas 2.0

# list of dataframes
dfs = [df, df1, df2]

# run function
multiple_dfs(dfs, 'Validation', 'test1.xlsx', 1)
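The row arithmetic in that loop can be checked on its own, without writing a file. The sketch below is a hypothetical helper (start_rows is not part of pandas). It assumes each dataframe is written with a single header row, which is what the "+ 1" in the loop accounts for.

```python
def start_rows(row_counts, spaces=1, header_rows=1):
    """Return the startrow for each dataframe stacked on one sheet.

    row_counts  -- number of data rows in each dataframe, in order
    spaces      -- blank rows to leave between consecutive dataframes
    header_rows -- rows used by the column header (assumed to be 1)
    """
    rows = []
    row = 0
    for count in row_counts:
        rows.append(row)
        # skip this frame's header, its data rows, and the spacer rows
        row += header_rows + count + spaces
    return rows


# Row counts of the sample dataframes df, df1, df2 (15, 10 and 5 rows).
print(start_rows([15, 10, 5], spaces=1))  # [0, 17, 29]
```

With the sample dataframes, df starts at row 0, df1 at row 17 and df2 at row 29. Dataframes with multi-level column headers would need a larger header_rows value.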

Put multiple dataframes across separate tabs/sheets

# function
def dfs_tabs(df_list, sheet_list, file_name):
    writer = pd.ExcelWriter(file_name, engine='xlsxwriter')
    for dataframe, sheet in zip(df_list, sheet_list):
        dataframe.to_excel(writer, sheet_name=sheet, startrow=0, startcol=0)
    writer.close()  # writes the file; writer.save() was removed in pandas 2.0

# list of dataframes and sheet names
dfs = [df, df1, df2]
sheets = ['df', 'df1', 'df2']

# run function
dfs_tabs(dfs, sheets, 'multi-test.xlsx')
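When writing across tabs, the error message from the question ("with case ignored, is already in use") suggests that sheet names have to be unique regardless of case. A small check run before calling dfs_tabs could catch that early. This is only a sketch, and case_insensitive_duplicates is a hypothetical helper, not a pandas or xlsxwriter function.

```python
def case_insensitive_duplicates(sheet_names):
    """Return (first_name, clashing_name) pairs that differ only by case."""
    seen = {}      # lowercased name -> first spelling encountered
    clashes = []
    for name in sheet_names:
        key = name.lower()
        if key in seen:
            clashes.append((seen[key], name))
        else:
            seen[key] = name
    return clashes


print(case_insensitive_duplicates(['df', 'df1', 'df2']))  # []
print(case_insensitive_duplicates(['df', 'df1', 'DF']))   # [('df', 'DF')]
```

An empty list means the names are safe to pass as sheet_list. Any pair in the result needs one of its names changed first.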
TomDobbs answered Oct 06 '22 06:10