Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas xlsxwriter to write dataframe to excel and implementing column-width and border related formatting

Background:

I am using Pandas and have a dataframe 'df' which I intend to write into an Excel sheet. I use the code below and get the output Excel sheet as shown in attached snapshot 'Present.JPG':Present.JPG

import pandas as pd
import xlsxwriter

writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
writer.save()

Problem description: I would like to write the dataframe to Excel and incorporate the following changes.

1) Get rid of the first column indicating the index
2) Implement text wrapping on all columns (to auto-size each column width)
3) Sketch thick border A1 to C4, D1 to F4 and column G

Eventually, I would like the Excel sheet to look like as shown in snapshot 'Desired.JPG':Desired.JPG

Tried till now: I tried the following commands but they over-write the border on to the content of the cells. Furthermore, I am not able to figure out how to extend the border (and text wrapping) beyond a single cell.

writer = pd.ExcelWriter("output.xlsx", engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook=writer.book
worksheet= writer.sheets['Sheet1']

full_border = workbook.add_format({"border":1,"border_color": "#000000"})
link_format = workbook.add_format({'text_wrap': True})

worksheet.write("D3", None, full_border)
worksheet.write("E1", None, link_format)

writer.save()
like image 350
Sinha Avatar asked Jan 01 '19 23:01

Sinha


People also ask

How do you autofit column width in Excel using Python?

The easiest way to auto-size the width and height of a column is to call the Worksheet class' autoFitColumn method. The autoFitColumn method takes the column index (of the column about to be resized) as a parameter. Copy def autofit_column(self): \# Instantiating a Workbook object by excel file path workbook = self.

Does pandas use XlsxWriter?

Practical ApplicationYou can use XlsxWriter as an engine for Pandas ExcelWriter class to output a DataFrame to Excel. Below we have included a simple script which will create a Pandas DataFrame with three columns and output it to an Excel file.

What is the method of exporting data frames to Excel files in pandas?

The pandas DataFrame class has an instance method to_excel() that exports the contents of a DataFrame object into an excel file. Using the to_excel() method either an Excel workbook with single sheet or multiple sheets can be created.


1 Answers

I'm a little late to the party but here is what you were looking for:

import xlsxwriter
import pandas as pd

df = pd.DataFrame({
    'Class': ['A', 'A', 'A'],
    'Type': ['Mary', 'John', 'Michael'],
    'JoinDate YYYY-MM-DD': ['2018-12-12', '2018-12-12', '2018-12-15'],
    'Weight': [150, 139, 162],
    'Height': [166.4, 160, 143],
    'Marks': [96, 89, 71],
    'LastDate YYYY-MM-DD': ['2020-01-17', '2020-01-17', '2020-01-17']
})

with pd.ExcelWriter('output.xlsx', engine='xlsxwriter') as writer:
    # remove the index by setting the kwarg 'index' to False
    df.to_excel(excel_writer=writer, sheet_name='Sheet1', index=False)

    workbook = writer.book
    worksheet = writer.sheets['Sheet1']

    # dynamically set column width
    for i, col in enumerate(df.columns):
        column_len = max(df[col].astype(str).str.len().max(), len(col) + 2)
        worksheet.set_column(i, i, column_len)

    # wrap the text in all cells
    wrap_format = workbook.add_format({'text_wrap': True, 'align': 'center'})
    worksheet.set_column(0, len(df.columns) - 1, cell_format=wrap_format)

    # mimic the default pandas header format for use later
    hdr_fmt = workbook.add_format({
        'bold': True,
        'border': 1,
        'text_wrap': True,
        'align': 'center'
    })

    def update_format(curr_frmt, new_prprty, wrkbk):
        """
        Update a cell's existing format with new properties
        """
        new_frmt = curr_frmt.__dict__.copy()

        for k, v in new_prprty.items():
            new_frmt[k] = v

        new_frmt = {
            k: v
            for k, v in new_frmt.items()
            if (v != 0) and (v is not None) and (v != {}) and (k != 'escapes')
        }

        return wrkbk.add_format(new_frmt)

    # create new border formats
    header_right_thick = update_format(hdr_fmt, {'right': 2}, workbook)
    normal_right_thick = update_format(wrap_format, {'right': 2}, workbook)
    normal_bottom_thick = update_format(wrap_format, {'bottom': 2}, workbook)
    normal_corner_thick = update_format(wrap_format, {
        'right': 2,
        'bottom': 2
    }, workbook)

    # list the 0-based indices where you want bold vertical border lines
    vert_indices = [2, 5, 6]

    # create vertical bold border lines
    for i in vert_indices:
        # header vertical bold line
        worksheet.conditional_format(0, i, 0, i, {
            'type': 'formula',
            'criteria': 'True',
            'format': header_right_thick
        })
        # body vertical bold line
        worksheet.conditional_format(1, i,
                                     len(df.index) - 1, i, {
                                         'type': 'formula',
                                         'criteria': 'True',
                                         'format': normal_right_thick
                                     })
        # bottom corner bold lines
        worksheet.conditional_format(len(df.index), i, len(df.index), i, {
            'type': 'formula',
            'criteria': 'True',
            'format': normal_corner_thick
        })
    # create bottom bold border line
    for i in [i for i in range(len(df.columns) - 1) if i not in vert_indices]:
        worksheet.conditional_format(len(df.index), i, len(df.index), i, {
            'type': 'formula',
            'criteria': 'True',
            'format': normal_bottom_thick
        })
like image 155
Clade Avatar answered Sep 26 '22 03:09

Clade