
How to remove illegal characters so a dataframe can write to Excel

I am trying to write a dataframe to an Excel spreadsheet using ExcelWriter, but it keeps returning an error:

openpyxl.utils.exceptions.IllegalCharacterError

I'm guessing there's some character in the dataframe that ExcelWriter doesn't like. It seems odd, because the dataframe is formed from three Excel spreadsheets, so I can't see how there could be a character that Excel doesn't like!

Is there any way to iterate through a dataframe and replace characters that ExcelWriter doesn't like? I don't even mind if it simply deletes them.

What's the best way of removing or replacing illegal characters in a dataframe?

user4896331 Avatar asked Feb 17 '17 20:02



3 Answers

Based on Haipeng Su's answer, I added a function that does this:

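dataframe = dataframe.applymap(
    lambda x: x.encode('unicode_escape').decode('utf-8') if isinstance(x, str) else x)

This rewrites every string cell so that control and non-ASCII characters become backslash escape sequences (a vertical tab, for example, becomes the literal text \x0b), which means openpyxl no longer sees the characters it rejects. Note that the cell text itself changes as a result. In pandas 2.1 and later, applymap is deprecated in favour of DataFrame.map. It worked, and I can now write to Excel spreadsheets again!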
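
If the goal is simply to delete the offending characters rather than escape them, a minimal sketch along the following lines may be enough. It assumes the illegal characters are the ASCII control characters other than tab, newline and carriage return, which to my understanding mirrors openpyxl's own ILLEGAL_CHARACTERS_RE pattern. The helper names strip_illegal and clean_dataframe are hypothetical, not part of pandas or openpyxl.

```python
import re

# Assumption: this pattern mirrors openpyxl's ILLEGAL_CHARACTERS_RE, i.e. ASCII
# control characters except tab (\x09), newline (\x0a) and carriage return (\x0d).
ILLEGAL_CHARACTERS_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_illegal(value):
    """Remove illegal control characters from strings; return other values unchanged."""
    if isinstance(value, str):
        return ILLEGAL_CHARACTERS_RE.sub("", value)
    return value


def clean_dataframe(dataframe):
    """Return a copy of the dataframe with every string cell cleaned.

    Hypothetical helper: it goes column by column with Series.map, which
    avoids depending on DataFrame.applymap versus DataFrame.map.
    """
    return dataframe.apply(lambda column: column.map(strip_illegal))
```

Calling dataframe = clean_dataframe(dataframe) before to_excel should leave ordinary text, including accented characters, untouched and drop only the control characters.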

user4896331 Avatar answered Oct 16 '22 17:10


The same problem happened to me. I solved it as follows:

  1. Install the Python package xlsxwriter:
pip install xlsxwriter
  2. Replace the default engine 'openpyxl' with 'xlsxwriter':
dataframe.to_excel("file.xlsx", engine='xlsxwriter')
mathsyouth Avatar answered Oct 16 '22 16:10


Trying a different Excel writer engine solved my problem:

writer = pd.ExcelWriter('file.xlsx', engine='xlsxwriter')
Jialin Zou Avatar answered Oct 16 '22 15:10