Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to preserve Excel text formatting when reading/writing Excel files with Pandas?

Tags:

python

pandas

I have a script that basically reads in a template Excel file to a Pandas dataframe, adds extra columns and then writes it to an output Excel file.

One of the columns in the Excel file contains text with special formatting (subscript, superscript, strikethrough, ...). When writing that column back to the output file, all this formatting is lost and converted to plain text.

Is there a way to preserve text formatting between input & output?

like image 695
wwwald Avatar asked Feb 09 '16 09:02

wwwald


People also ask

How do you write to an Excel file using pandas?

You can write any data (lists, strings, numbers etc) to Excel, by first converting it into a Pandas DataFrame and then writing the DataFrame to Excel. To export a Pandas DataFrame as an Excel file (extension: . xlsx, . xls), use the to_excel() method.

What is the method of reading Excel files in pandas?

We can use the pandas module read_excel() function to read the excel file data into a DataFrame object. If you look at an excel sheet, it's a two-dimensional table. The DataFrame object also represents a two-dimensional tabular data structure.

How do I get pandas to read xlsx files?

Use pandas. read_excel() function to read excel sheet into pandas DataFrame, by default it loads the first sheet from the excel file and parses the first row as a DataFrame column name. Excel file has an extension . xlsx.


1 Answers

The best direct control over Excel formatting is from the win32 library. Here is how you install it:

https://anaconda.org/anaconda/pywin32

Here is an example of how to format cells in Excel:

https://pythonexcels.com/python/2009/10/05/python-excel-mini-cookbook

The most powerful part of pywin32 (which includes the win32com.client) is that you have access to all the underlying VBA commands. So, when looking for documentation on win32com.client interactions with Excel, you need to look at the object model for Excel, which is in the drop down menu here:

https://docs.microsoft.com/en-us/office/vba/api/overview/excel

To address the issue of subscripts, you can look here:

https://docs.microsoft.com/en-us/office/vba/api/excel.font.subscript

You should be able to write back to the Excel document with win32com.client without changing the formatting, but it is a heavier lift than a function as simple as pandas.to_excel()

The code below writes "apples2" to a workbook that is already open that is called "Sample Workbook". It also makes the last character a superscript.

import win32com.client
# app = win32com.client.dynamic.Dispatch("Excel.Application")
app = win32com.client.gencache.EnsureDispatch("Excel.Application")
app.Interactive = True
app.Visible = True
wb = app.Workbooks("Sample Workbook.xlsx")
ws = wb.Worksheets("Sheet1")

ws.Cells(1,1).Value="Apples2"

ws.Range("A1").Font.Bold = True
ws.Range("A1").GetCharacters(7,1).Font.Superscript = True
like image 79
Foggy Avatar answered Nov 15 '22 01:11

Foggy