
Python: write a DataFrame to an existing Excel sheet without overwriting contents on the same sheet or on other sheets

I've been struggling with this for hours, so I decided to ask the experts here:

I want to modify an existing Excel sheet without overwriting its content. The file also has other sheets, and I don't want to affect them.

I've created some sample code, but I'm not sure how to preserve the second sheet that I want to keep.

import pandas as pd

t=pd.date_range('2004-01-31', freq='M', periods=4)
first=pd.DataFrame({'A':[1,1,1,1],
             'B':[2,2,2,2]}, index=t)
first.index=first.index.strftime('%Y-%m-%d')
writer=pd.ExcelWriter('test.xlsx')
first.to_excel(writer, sheet_name='Here')
first.to_excel(writer, sheet_name='Keep')
writer.close()  #write the workbook to disk

#How do I update sheet 'Here', cells A5:C6, with the following without overwriting the rest?
#I want to keep the sheet 'Keep'.
update=pd.DataFrame({'A':[3,4],
                     'B':[4,5]}, index=pd.date_range('2004-04-30', 
                                                     periods=2,
                                                     freq='M'))

I've searched SO, but I'm still not sure how to write a DataFrame into the sheet.

An example I've tried:

import openpyxl
xfile = openpyxl.load_workbook('test.xlsx')
sheet = xfile['Here']  #look up the sheet by name ('test' is the file name, not a sheet)
sheet['B5']='wrote!!'
xfile.save('test2.xlsx')
Lisa asked Aug 19 '16 23:08


2 Answers

I figured it out myself:

import pandas as pd
from openpyxl import load_workbook

#Prepare the excel we want to write to
t=pd.date_range('2004-01-31', freq='M', periods=4)
first=pd.DataFrame({'A':[1,1,1,1],
             'B':[2,2,2,2]}, index=t)
first.index=first.index.strftime('%Y-%m-%d')
writer=pd.ExcelWriter('test.xlsx')
first.to_excel(writer, sheet_name='Here')
first.to_excel(writer, sheet_name='Keep')
writer.save()  #the file must exist on disk before it can be loaded below

#read the existing sheets so that openpyxl won't create a new one later
book = load_workbook('test.xlsx')
writer = pd.ExcelWriter('test.xlsx', engine='openpyxl') 
writer.book = book
writer.sheets = dict((ws.title, ws) for ws in book.worksheets)

#update without overwriting the rest of the workbook
update=pd.DataFrame({'A':[3,4],
                     'B':[4,5]}, index=(pd.date_range('2004-04-30', 
                                                     periods=2,
                                                     freq='M').strftime('%Y-%m-%d')))

#startrow and startcol are zero-based, so the block's top-left corner is cell C2
update.to_excel(writer, sheet_name="Here", startrow=1, startcol=2)

writer.save()
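
For anyone on a recent pandas: as far as I know, newer releases no longer let you assign to writer.book, so the snippet above may fail there. Below is a minimal sketch of the same idea, assuming pandas 1.4 or later (where ExcelWriter accepts mode="a" together with if_sheet_exists="overlay") and openpyxl installed. The temporary file path and the placement (startrow=4 with header=False, to land on A5:C6 as asked in the question) are my own assumptions, not part of the original answer.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook

# Hypothetical scratch file standing in for test.xlsx.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")

# Build the starting workbook with the two sheets from the question.
first = pd.DataFrame(
    {"A": [1, 1, 1, 1], "B": [2, 2, 2, 2]},
    index=["2004-01-31", "2004-02-29", "2004-03-31", "2004-04-30"],
)
with pd.ExcelWriter(path, engine="openpyxl") as writer:
    first.to_excel(writer, sheet_name="Here")
    first.to_excel(writer, sheet_name="Keep")

update = pd.DataFrame(
    {"A": [3, 4], "B": [4, 5]},
    index=["2004-04-30", "2004-05-31"],
)

# mode="a" opens the existing file instead of recreating it;
# if_sheet_exists="overlay" writes into the existing sheet rather than
# replacing it or raising an error.
with pd.ExcelWriter(
    path, engine="openpyxl", mode="a", if_sheet_exists="overlay"
) as writer:
    # startrow is zero-based, so 4 targets Excel row 5.
    # header=False skips the column labels and writes only index + values.
    update.to_excel(writer, sheet_name="Here", startrow=4, header=False)

# Reload to inspect the result.
book = load_workbook(path)
here = book["Here"]
```

Cells outside A5:C6 on 'Here' are left alone, and 'Keep' is not touched at all.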
Lisa answered Oct 12 '22 13:10


I'd suggest you update to version 2.4 of openpyxl (either the beta or a checkout) and use the built-in support for DataFrames. openpyxl can now easily convert these into rows that you can do what you want with.

See http://openpyxl.readthedocs.io/en/latest/pandas.html for details.
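
A rough sketch of what that could look like, assuming openpyxl 2.4 or later and its dataframe_to_rows helper from openpyxl.utils.dataframe. The scratch file, the way the starting workbook is built, and the choice to write into A5:C6 are assumptions taken from the question, not something the linked page prescribes.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.utils.dataframe import dataframe_to_rows

# Hypothetical scratch file standing in for the question's test.xlsx.
path = os.path.join(tempfile.mkdtemp(), "test.xlsx")

# Create a small workbook with the two sheets from the question.
wb = Workbook()
ws = wb.active
ws.title = "Here"
keep = wb.create_sheet("Keep")
for sheet in (ws, keep):
    sheet.append([None, "A", "B"])  # header row
    for date in ["2004-01-31", "2004-02-29", "2004-03-31", "2004-04-30"]:
        sheet.append([date, 1, 2])
wb.save(path)

update = pd.DataFrame(
    {"A": [3, 4], "B": [4, 5]},
    index=["2004-04-30", "2004-05-31"],
)

# Reopen the file and copy the DataFrame cell by cell into A5:C6.
wb = load_workbook(path)
ws = wb["Here"]
# reset_index() turns the index into an ordinary first column, so every
# yielded row is [date, A, B] and no header row is produced.
rows = dataframe_to_rows(update.reset_index(), index=False, header=False)
for r, row in enumerate(rows, start=5):
    for c, value in enumerate(row, start=1):
        ws.cell(row=r, column=c, value=value)
wb.save(path)

# Reload to inspect the result.
result = load_workbook(path)
```

Since only the targeted cells are assigned, everything else in the workbook, including the 'Keep' sheet, is saved back unchanged.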

Charlie Clark answered Oct 12 '22 11:10