Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

openpyxl python3 -- formatting whole rows ellicits strange behavior

I'm working with pivot-tabled XLSX files and writing a script to parse them out in to a new file per tab.

Since openpyxl doesn't support pivot tables by default I need to do some work to reinsert the pivot 'style' that is lost during copy.

To do this I'm iterating through each row and col, looking for the value Total in column 0. Once found, the row should change to be all bold=True.

Instead, I receive erratic behavior that sometimes bolds ALL cells after the first None. My print('bolding totals') message shows that it is evaluating each row/cell correctly. Am I bone-headed and screwing up my indents on the loops, perhaps?

from openpyxl import Workbook
from openpyxl import load_workbook
from copy import deepcopy

wb = load_workbook(filename=r'input.xlsx')

# Print 1
sheetlist = wb.get_sheet_names()
print(sheetlist)

for i in range(len(sheetlist)-1):
    dest_filename = r''+sheetlist[i]+'.xlsx'
    new_wb = Workbook()
    ws = wb.get_sheet_by_name(sheetlist[i])
    new_wb.add_sheet(ws, 0)

    for k in range(0, new_wb.worksheets[0].get_highest_row()):
        print(new_wb.worksheets[0].cell(row=k, column=0).value)
        # ignore empty cells
        if new_wb.worksheets[0].cell(row=k, column=0).value is not None:
            if 'Total' in new_wb.worksheets[0].cell(row=k, column=0).value:
                for j in range(0, new_wb.worksheets[0].get_highest_column()):
                    print('bolding totals, '+str(k), str(j))
                    new_wb.worksheets[0].cell(
                        row=k, column=j).style.font.bold = True
            elif 'Total' not in new_wb.worksheets[0].cell(row=k, column=0).value:
                for j in range(0, new_wb.worksheets[0].get_highest_column()):
                    print('not bolding anything')
                    new_wb.worksheets[0].cell(
                        row=k, column=j).style.font.bold = False

    # remove the blank sheet created in new_wb by openpyxl
    new_wb.remove_sheet(new_wb.get_sheet_by_name('Sheet'))
    print(new_wb.get_sheet_names())
    new_wb.save(dest_filename)
    break  # set to break after one sheet for testing

print('finished')

At this point I suspect this is a bug in the openpyxl handling of styles. I ran another very simple edit and got strange behavior.

If we have a simple layout with bold/nonbold cells. Then we run this simple command to change one cell:

>>> new_wb.worksheets[0].cell(row=10,column=0).style.font.bold = False

The output changes for the entire column, not the cell alone.

like image 873
Matt D Avatar asked Jul 30 '13 19:07

Matt D


1 Answers

Prior to openpyxl 2.0 cell styles were shared between cells: this was a holdover from the implementation using pointers in the source XML: two (or more) cells would both use style "1". Changing this style for one cell would mean changing it for all cells, which sounds like the behaviour being observed here.

Since then, while there have been various changes in implementation, there are no longer any side-effects when changing the style for one cell. An important change is that formatting objects, such as Font are directly available and do not have to be wrapped in a style.

There have been some other changes: worksheets cannot be copied between workbooks because they are dependent upon data stored in the parent workbook.

Without the original file it's difficult to be sure but the following code should work with openpyxl >= 2.2

from openpyxl import Workbook
from openpyxl import load_workbook

wb = load_workbook(filename='input.xlsx', read_only=True)

for sheet in wb.sheetnames:

    dest_filename = '{0}.xlsx'.format(sheet)
    new_wb = Workbook()
    del new_wb["Sheet"]

    ws1 = wb[sheet]
    ws2 = new_wb.create_sheet(sheet)

    for row in ws1:
        ws2.append([c.value for c in row])
        first = row[0]
        if first.data_type == "s" and "Total" in first.value:
            for idx in range(len(row)):
                cell = ws2.cell(row=ws2.max_row, column=idx+1)
                bolded = cell.font.copy(bold=True)
                cell.font = bolded

    new_wb.save(dest_filename)
    print("saving {0}".format(dest_filename))

print('finished')
like image 158
Charlie Clark Avatar answered Oct 01 '22 13:10

Charlie Clark