I have a list of almost 30 xlsx files of the same format. I have a working data cleaning code for and I want to clean them all and save them on individual sheets in a single book. I figured a loop would be best to do the job, but something is missing. I have seen some functions that save multiple sheets to a work book, but I am hoping to read_excel, clean data frame, save to sheet and delete dataframe. What is happening is that it only returns the final sheet on the list in the new excel document.
BOX = [
"aa1",
"aa2",
"aa3"]
for B in BOX:
filename = B+".xls"
#create data frame
BDF = pd.read_excel(r'C:\Projects\BOXES\\' + filename)
#clean data frame
BDF = BDF.dropna(how="all")
BDF['Total Cost'] = BDF['Total Cost'].str.replace('.', '')
BDF.columns = ['LVL', 'PN', 'Leadtime', 'Description', 'Ext QTY']
BDF.PN = BDF.PN.str.strip()
sheetname=B
#save to sheet
with pd.ExcelWriter(r'C:\Projects\BOXES\BOXED.xlsx') as writer:
BDF.to_excel(writer, sheet_name=B, index=False)
#delete data frame before repeating
del(BDF)
del(B)
You should put the with outside the loop, since it automatically opens and closes the file in each for loop. The following should work:
BOX = [
"aa1",
"aa2",
"aa3"]
with pd.ExcelWriter(r'C:\Projects\BOXES\BOXED.xlsx') as writer:
for B in BOX:
filename = B+".xls"
#create data frame
BDF = pd.read_excel(r'C:\Projects\BOXES\\' + filename)
#clean data frame
BDF = BDF.dropna(how="all")
BDF['Total Cost'] = BDF['Total Cost'].str.replace('.', '')
BDF.columns = ['LVL', 'PN', 'Leadtime', 'Description', 'Ext QTY']
BDF.PN = BDF.PN.str.strip()
sheetname=B
#save to sheet
BDF.to_excel(writer, sheet_name=B, index=False)
#delete data frame before repeating
del(BDF)
del(B)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With