Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python - Using pandas to format excel cell

I have a pandas dataframe, which is something like shown below.

dataframee

I would like to format the column "Pass/Fail" as if Fail --> red background, else green background, like:

format

I have tried to use Pandas to do the formatting, but it fails to add color to the excel. Following is the code:

writer = pandas.ExcelWriter(destination,engine = 'xlsxwriter')
color = Answer.style.applymap(lambda x: 'color: red' if x == "Fail" else 'color: green',subset= pandas.IndexSlice[:,['Pass/Fail']])
color.to_excel(writer,'sheet1')

I tried StyleFrame which failed to install. Seems that StyleFrame does not comply with my python version 3.6.

How can I format the excel as I want?

like image 389
user3843433 Avatar asked May 24 '17 05:05

user3843433


People also ask

Can pandas format Excel?

Pandas makes it very easy to output a DataFrame to Excel. However, there are limited options for customizing the output and using Excel's features to make your output as useful as it could be.

Does pandas work with Xlsx?

Read an Excel file into a pandas DataFrame. Supports xls , xlsx , xlsm , xlsb , odf , ods and odt file extensions read from a local filesystem or URL. Supports an option to read a single sheet or a list of sheets. Any valid string path is acceptable.


3 Answers

You can use conditional_format:

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':[1,2,3]})
print (df)
  Pass/Fail  expect
0      Pass       1
1      Fail       2
2      Fail       3

writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook  = writer.book
worksheet = writer.sheets['Sheet1']
red_format = workbook.add_format({'bg_color':'red'})
green_format = workbook.add_format({'bg_color':'green'})

worksheet.conditional_format('B2:B4', {'type': 'text',
                                      'criteria': 'containing',
                                       'value':     'Fail',
                                       'format': red_format})

worksheet.conditional_format('B2:B4', {'type': 'text',
                                      'criteria': 'containing',
                                       'value':   'Pass',
                                       'format':  green_format})
writer.save()

More dynamic solution with get_loc for position of column and mapping with dictionary:

import string

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':[1,2,3]})
print (df)
  Pass/Fail  expect
0      Pass       1
1      Fail       2
2      Fail       3

writer = pd.ExcelWriter('pandas_conditional.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook  = writer.book
worksheet = writer.sheets['Sheet1']
red_format = workbook.add_format({'bg_color':'red'})
green_format = workbook.add_format({'bg_color':'green'})

#dict for map excel header, first A is index, so omit it
d = dict(zip(range(25), list(string.ascii_uppercase)[1:]))
print (d)
{0: 'B', 1: 'C', 2: 'D', 3: 'E', 4: 'F', 5: 'G', 6: 'H', 7: 'I', 8: 'J',
 9: 'K', 10: 'L', 11: 'M', 12: 'N', 13: 'O', 14: 'P', 15: 'Q', 16: 'R', 
 17: 'S', 18: 'T', 19: 'U', 20: 'V', 21: 'W', 22: 'X', 23: 'Y', 24: 'Z'}

#set column for formatting
col = 'Pass/Fail'
excel_header = str(d[df.columns.get_loc(col)])
#get length of df
len_df = str(len(df.index) + 1)
rng = excel_header + '2:' + excel_header + len_df
print (rng)
B2:B4

worksheet.conditional_format(rng, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':     'Fail',
                                       'format': red_format})

worksheet.conditional_format(rng, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':   'Pass',
                                       'format':  green_format})
writer.save()

EDIT1:

Thank you jmcnamara for comment and for XlsxWriter

col = 'Pass/Fail'
loc = df.columns.get_loc(col) + 1
len_df = len(df.index) + 1

worksheet.conditional_format(1,loc,len_df,loc, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':     'Fail',
                                       'format': red_format})

worksheet.conditional_format(1,loc,len_df,loc, {'type': 'text',
                                      'criteria': 'containing',
                                       'value':   'Pass',
                                       'format':  green_format})
writer.save()

EDIT:

Another solution with last version of pandas (0.20.1) and styles:

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':['d','f','g']})
print (df)
  Pass/Fail expect
0      Pass      d
1      Fail      f
2      Fail      g

def f(x):
    col = 'Pass/Fail'
    r = 'background-color: red'
    g = 'background-color: green'
    c = np.where(x[col] == 'Pass', g, r)
    y = pd.DataFrame('', index=x.index, columns=x.columns)
    y[col] = c
    return y

styled = df.style.apply(f, axis=None)
styled.to_excel('styled.xlsx', engine='openpyxl')
like image 63
jezrael Avatar answered Oct 19 '22 21:10

jezrael


If have one or more columns and more than two values to format, and want to apply multiple format rules at once then you can do the following:

def fmt(data, fmt_dict):
    return data.replace(fmt_dict)

styled = df.style.apply(fmt, fmt_dict=fmt_dict, subset=['Test_1', 'Test_2' ])
styled.to_excel('styled.xlsx', engine='openpyxl')

more than two values, multiple columns, multiple format

Above, fm_dict is a dictionary with the values mapped to the corresponding format:

fmt_dict = {
    'Pass': 'background-color: green',
    'Fail': 'background-color: red',
    'Pending': 'background-color: yellow; border-style: solid; border-color: blue'; color: red,
}

Notice that for the 'Pending' value, you can also specify multiple format rules (e.g. border, background color, foreground color)

(Requires: openpyxl and jinja2)


Here is a full running example:

import pandas as pd

df = pd.DataFrame({'Test_1':['Pass','Fail', 'Pending', 'Fail'],
                   'expect':['d','f','g', 'h'],
                   'Test_2':['Pass','Pending', 'Pass', 'Fail'],
                  })

fmt_dict = {
    'Pass': 'background-color: green',
    'Fail': 'background-color: red',
    'Pending': 'background-color: yellow; border-style: solid; border-color: blue; color:red',
}

def fmt(data, fmt_dict):
    return data.replace(fmt_dict)

styled = df.style.apply(fmt, fmt_dict=fmt_dict, subset=['Test_1', 'Test_2' ])
styled.to_excel('styled.xlsx', engine='openpyxl')
like image 30
toto_tico Avatar answered Oct 19 '22 20:10

toto_tico


Disclaimer: I wrote the following library

I'd like to suggest using StyleFrame:

import pandas as pd
from StyleFrame import StyleFrame, Styler

df = pd.DataFrame({'Pass/Fail':['Pass','Fail','Fail'],
                   'expect':[1,2,3]})

sf = StyleFrame(df)

sf.apply_style_by_indexes(sf[sf['Pass/Fail'] == 'Pass'], cols_to_style='Pass/Fail',
                          styler_obj=Styler(bg_color='green'))
sf.apply_style_by_indexes(sf[sf['Pass/Fail'] == 'Fail'], cols_to_style='Pass/Fail',
                          styler_obj=Styler(bg_color='red'))

sf.to_excel('test.xlsx').save()

Since it bridges the gap between pandas and openpyxl, the styling is done on the dataframe level instead of the worksheet level (so for example you don't need to know the relevant cell range is B2:B4 or mess with indexes.

The code above outputs the following:

enter image description here

EDIT: Just saw you mentioned you've tried to install but got an error. Can you edit your question and include the error?

like image 4
DeepSpace Avatar answered Oct 19 '22 20:10

DeepSpace