Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to use Pandas stylers for coloring an entire row based on a given column?

I've been trying to print out a Pandas dataframe to html and have specific entire rows highlighted if the value of one specific column's value for that row is over a threshold. I've looked through the Pandas Styler Slicing and tried to vary the highlight_max function for such a use, but seem to be failing miserably; if I try, say, to replace the is_max with a check for whether a given row's value is above said threshold (e.g., something like

is_x = df['column_name'] >= threshold

), it isn't apparent how to properly pass such a thing or what to return.

I've also tried to simply define it elsewhere using df.loc, but that hasn't worked too well either.

Another concern also came up: If I drop that column (currently the criterion) afterwards, will the styling still hold? I am wondering if a df.loc would prevent such a thing from being a problem.

like image 987
Dread Pirate Roberts Avatar asked Apr 24 '17 19:04

Dread Pirate Roberts


People also ask

How do I highlight specific cells in pandas?

To highlight a particular cell of a DataFrame, use the DataFrame's style. apply(~) method.

How do I fill a row in pandas?

Fill Data in an Empty Pandas DataFrame by Appending Rows First, create an empty DataFrame with column names and then append rows one by one. The append() method can also append rows. When creating an empty DataFrame with column names and row indices, we can fill data in rows using the loc() method.


3 Answers

This solution allows for you to pass a column label or a list of column labels to highlight the entire row if that value in the column(s) exceeds the threshold.

import pandas as pd
import numpy as np

np.random.seed(24)
df = pd.DataFrame({'A': np.linspace(1, 10, 10)})

df = pd.concat([df, pd.DataFrame(np.random.randn(10, 4), columns=list('BCDE'))],
               axis=1)
df.iloc[0, 2] = np.nan

def highlight_greaterthan(s, threshold, column):
    is_max = pd.Series(data=False, index=s.index)
    is_max[column] = s.loc[column] >= threshold
    return ['background-color: yellow' if is_max.any() else '' for v in is_max]


df.style.apply(highlight_greaterthan, threshold=1.0, column=['C', 'B'], axis=1)

Output:

enter image description here

Or for one column

df.style.apply(highlight_greaterthan, threshold=1.0, column='E', axis=1)

enter image description here

like image 119
Scott Boston Avatar answered Oct 24 '22 20:10

Scott Boston


Here is a simpler approach:

  1. Assume you have a 100 x 10 dataframe, df. Also assume you want to highlight all the rows corresponding to a column, say "duration", greater than 5.

  2. You first need to define a function that highlights the cells. The real trick is that you need to return a row, not a single cell. For example:

    def highlight(s):
        if s.duration > 5:
            return ['background-color: yellow'] * len(s)
        else:
            return ['background-color: white'] * len(s)
    

**Note that the return part should be a list of 10 (corresponding to the number of columns). This is the key part.

  1. Now you can apply this to the dataframe style as:

    df.style.apply(highlight, axis=1)
    
like image 38
Steven Avatar answered Oct 24 '22 21:10

Steven


Assume you have the following dataframe and you want to highlight the rows where id is greater than 3 to red

   id char       date
0   0    s 2022-01-01
1   1    t 2022-02-01
2   2    y 2022-03-01
3   3    l 2022-04-01
4   4    e 2022-05-01
5   5    r 2022-06-01

You can try Styler.set_properties with pandas.IndexSlice

# Subset your original dataframe with condition
df_ = df[df['id'].gt(3)]

# Pass the subset dataframe index and column to pd.IndexSlice
slice_ = pd.IndexSlice[df_.index, df_.columns]

s = df.style.set_properties(**{'background-color': 'red'}, subset=slice_)
s.to_html('test.html')

enter image description here


You can also try Styler.apply with axis=None which passes the whole dataframe.

def styler(df):
    color = 'background-color: {}'.format
    mask = pd.concat([df['id'].gt(3)] * df.shape[1], axis=1)
    style = np.where(mask, color('red'), color('green'))
    return style

s = df.style.apply(styler, axis=None)

enter image description here

like image 20
Ynjxsjmh Avatar answered Oct 24 '22 19:10

Ynjxsjmh