File corruption while writing using Pandas

I am reading data from a perfectly valid xlsx file and processing it with Pandas in Python 3.5. At the end I write the final dataframe to an Excel file using:

writer = pd.ExcelWriter(os.path.join(DATA_DIR, 'Data.xlsx'),
                        engine='xlsxwriter', options={'strings_to_urls': False})
manual_labelling_data.to_excel(writer, 'Sheet_A', index=False)
writer.save()

When I try to open Data.xlsx, I get the error: We found a problem with some content in 'Data.xlsx'... If I proceed, the file loads into Excel with the message: Removed Records: Formula from /xl/worksheets/sheet1.xml part

I cannot find out what the problem is.

Aroonalok asked Jan 08 '19 14:01



1 Answer

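Thanks a lot to @jmcnamara for the help in the comments. The issue was that some strings in the data were wrongly being interpreted as formulas: by default XlsxWriter writes any string that starts with = as a formula, and Excel then rejects the ones that are not valid formulas. Setting strings_to_formulas to False turns this conversion off. The corrected code is: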

# Stop XlsxWriter from converting strings into formulas or URLs.
options = {'strings_to_formulas': False, 'strings_to_urls': False}
writer = pd.ExcelWriter(os.path.join(DATA_DIR, 'Data.xlsx'),
                        engine='xlsxwriter', options=options)
manual_labelling_data.to_excel(writer, 'Sheet_A', index=False)
writer.save()
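
For anyone on a newer pandas, here is a rough sketch of the same fix plus a small check for the offending values. It assumes pandas 1.3 or later, where the workbook options are passed through engine_kwargs and the writer is used as a context manager instead of calling save(). The function names formula_like_strings and write_plain_strings are my own, not part of pandas or XlsxWriter.

```python
def formula_like_strings(values):
    """Return the strings that start with '=', which XlsxWriter
    writes as formulas unless strings_to_formulas is False."""
    return [v for v in values if isinstance(v, str) and v.startswith("=")]


def write_plain_strings(df, path, sheet_name="Sheet_A"):
    """Write df to path with formula and URL conversion switched off.

    Assumption: pandas >= 1.3 with the xlsxwriter package installed.
    """
    import pandas as pd  # imported here so the check above works without pandas

    workbook_options = {"strings_to_formulas": False, "strings_to_urls": False}
    # engine_kwargs is forwarded to xlsxwriter.Workbook, so the options
    # dict goes under the "options" key.
    with pd.ExcelWriter(path, engine="xlsxwriter",
                        engine_kwargs={"options": workbook_options}) as writer:
        df.to_excel(writer, sheet_name=sheet_name, index=False)
```

Running formula_like_strings over a text column of the dataframe should show which cells Excel was complaining about, and write_plain_strings(manual_labelling_data, os.path.join(DATA_DIR, 'Data.xlsx')) would replace the writer code above.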
Aroonalok answered Oct 06 '22 01:10
