Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Conditionally formatting text strings from Excel using python

I would like to format a spreadsheet (xls or xlsx) so that any cells containing a word or ending with a certain string are formatted by filling the background with a specific color.

For example, if the cell contains the word 'deleted', fill it black and paint the text white. if the cell ends with '.pf', paint the cell red.

I found a similar question from several years ago that suggested the following:

import xlrd 
import xlutils.copy 

inBook = xlrd.open_workbook('input.xls', formatting_info=True) 
outBook = xlutils.copy.copy(inBook) 

def _getOutCell(outSheet, colIndex, rowIndex): 
    """ HACK: Extract the internal xlwt cell representation. """ 
    row = outSheet._Worksheet__rows.get(rowIndex) 
    if not row: return None 
    cell = row._Row__cells.get(colIndex) 
    return cell 

def setOutCell(outSheet, col, row, value): 
    """ Change cell value without changing formatting. """ 
    # HACK to retain cell style. 
    previousCell = _getOutCell(outSheet, col, row) 
    # END HACK, PART I 
    outSheet.write(row, col, value) 
    # HACK, PART II 

    if previousCell: 
        newCell = _getOutCell(outSheet, col, row) 
    if newCell:
        newCell.xf_idx = previousCell.xf_idx 
    # END HACK 


outSheet = outBook.get_sheet(0) 
setOutCell(outSheet, 5, 5, 'Test') 
outBook.save('output.xls') 

While this does copy the values from input.xls to output.xls, this does not seem to transfer the formatting (the test values from input.xls are no longer formatted when opening output.xls, nor are the conditional formatting rules present under "manage rules" in excel.

"if" statements for number values seem to work, but again, I am looking for a way to format cells containing certain strings. thanks!

like image 941
user2363458 Avatar asked May 08 '13 19:05

user2363458


1 Answers

Preserve the original input.xls formatting when you open it:

from xlrd import open_workbook

input_wb = open_workbook('input.xls', formatting_info=True)

Create a new workbook based on this template:

from xlutils.copy import copy as copy_workbook

output_wb = copy_workbook(input_wb)

Define some new cell styles:

from xlwt import easyxf

red_background = easyxf("pattern: pattern solid, fore_color red;")
black_with_white_font = easyxf('pattern: pattern solid, fore_color black; font: color-index white, bold on;")

Evaluate and modify your cells:

input_ws = input_wb.sheet_by_name('StackOverflow')
output_ws = output_wb.get_sheet(0)

for rindex in range(0, input_ws.nrows):
   for cindex in range(0, input_ws.ncols):
       input_cell = input_ws.cell(rindex, cindex)
       if input_cell.value[ input_cell.value.rfind('.'): ] == 'pf':
           output_ws.write(rindex, cindex, input_cell.value, red_background)
       elif input_cell.value.find('deleted') >= 0:
           output_ws.write(rindex, cindex, input_cell.value, black_with_white_font)
       else:
           pass  # we don't need to modify it

Save your new workbook

output_wb.save('output.xls')

Using the above example, unmodified cells should have their original formatting intact.

Should you need to alter the cell content AND would like to preserve the original formatting (i.e. NOT use your custom easyxf instance), you may use this snippet:

def changeCell(worksheet, row, col, text):
    """ Changes a worksheet cell text while preserving formatting """
    # Adapted from https://stackoverflow.com/a/7686555/1545769
    previousCell = worksheet._Worksheet__rows.get(row)._Row__cells.get(col)
    worksheet.write(row, col, text)
    newCell = worksheet._Worksheet__rows.get(row)._Row__cells.get(col)
    newCell.xf_idx = previousCell.xf_idx

# ...

changeCell(worksheet_instance, 155, 2, "New Value")

For the comparisons, you can use the string methods find and rfind (which searches from the right). They return the index of the position of the substring within the string. They return -1 if the substring is not found. Ergo, you see above input_cell.value.find('deleted') >= 0 to evaluate whether or not the substring 'deleted' exists. For the .pf comparison, I used rfind as well as something in Python called slicing.

like image 152
pztrick Avatar answered Sep 28 '22 22:09

pztrick