I have a function like this:
def highlight_otls(df):
    return ['background-color: yellow']
And a DataFrame like this:
price outlier
1.99 F,C
1.49 L,C
1.99 F
1.39 N
What I want to do is highlight a certain column in my df based off of this condition of another column:
data['outlier'].str.split(',').str.len() >= 2
So if the value in df['outlier'] splits into 2 or more comma-separated codes, I want to highlight the corresponding cell in df['price']. (So the first 2 prices should be highlighted in my DataFrame above.)
I attempted to do this by doing the following which gives me an error:
data['price'].apply(lambda x: highlight_otls(x) if (x['outlier'].str.split(',').str.len()) >= 2, axis=1)
Any idea on how to do this the proper way?
You can check whether a pandas DataFrame column contains a particular value (string or int), or any of a list of values, by using pd.Series(), the in operator, pandas.Series.isin(), str.
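A minimal sketch of those membership checks, reusing the question's price/outlier data. The variable names are illustrative only. Note that the in operator applied directly to a Series tests the index labels, so this sketch applies it to the underlying values instead.

```python
import pandas as pd

# Illustrative frame mirroring the question's data.
prices = pd.DataFrame({
    "price": [1.99, 1.49, 1.99, 1.39],
    "outlier": ["F,C", "L,C", "F", "N"],
})

# `in` on a Series checks index labels, so test against the values instead.
has_f = "F" in prices["outlier"].to_numpy()

# isin() gives a boolean mask of rows whose value is in the given list.
in_list = prices["outlier"].isin(["F", "N"])

# str.contains() gives a boolean mask of rows containing a substring.
contains_c = prices["outlier"].str.contains("C")
```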
To find the positions where two columns match, we first create a pandas DataFrame with two columns of city names. Then we use NumPy's where() to compare the values of the two columns. Called with only a condition, it returns a tuple of arrays holding the indices where the two columns have the same value.
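The comparison described above could look roughly like this. The city names and column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two columns of city names.
cities = pd.DataFrame({
    "city_a": ["Paris", "Lima", "Oslo", "Cairo"],
    "city_b": ["Paris", "Rome", "Oslo", "Quito"],
})

# np.where(condition) returns a tuple with one array per dimension;
# for a one-dimensional comparison, element [0] holds the matching positions.
matches = np.where(cities["city_a"] == cities["city_b"])[0]
```

Here the rows at positions 0 and 2 hold the same city in both columns, so those are the positions collected in matches.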
The iloc indexer is defined in pandas and selects rows or columns from a data set by integer position. Using iloc, we can retrieve any particular value from a row or column by its positional index.
You can use the loc and iloc indexers to access columns in a pandas DataFrame. If we wanted to access a certain column, for example the Grades column, we could use loc with the name of the column, as in df.loc[:, 'Grades'].
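A short sketch of both indexers on a made-up frame. The students data and the Name column are assumptions for illustration; only the Grades column name comes from the text above.

```python
import pandas as pd

# Hypothetical data with a Grades column.
students = pd.DataFrame({
    "Name": ["Ann", "Ben", "Cy"],
    "Grades": [90, 85, 77],
})

# loc selects by label: all rows, the column named "Grades".
grades_by_label = students.loc[:, "Grades"]

# iloc selects by integer position: all rows, the second column.
grades_by_position = students.iloc[:, 1]

# A single value: first row, second column.
first_grade = students.iloc[0, 1]
```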
Use Styler.apply. (To output to xlsx format, use the to_excel function.)
Suppose one's dataset is
other price outlier
0 X 1.99 F,C
1 X 1.49 L,C
2 X 1.99 F
3 X 1.39 N
def hightlight_price(row):
    ret = ["" for _ in row.index]
    if len(row.outlier.split(",")) >= 2:
        ret[row.index.get_loc("price")] = "background-color: yellow"
    return ret
df.style.\
    apply(hightlight_price, axis=1).\
    to_excel('styled.xlsx', engine='openpyxl')
From the documentation, "DataFrame.style attribute is a property that returns a Styler object."
We pass our styling function, hightlight_price, into Styler.apply and request row-wise application with axis=1. (Recall that we want to color the price cell in each row based on the outlier information in the same row.)
Our function hightlight_price generates the visual styling for each row. For each row row, we first generate empty styling for the other, price, and outlier columns as ["", "", ""]. We can then obtain the right index to modify only the price part of the list with row.index.get_loc("price"), as in
ret[row.index.get_loc("price")] = "background-color: yellow"
# ret becomes ["", "background-color: yellow", ""]
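As an alternative sketch (not the row-wise method above), the condition from the question can be applied to the whole frame at once by passing axis=None to Styler.apply, in which case the function receives the full DataFrame and must return a same-shaped DataFrame of CSS strings. The names highlight_price_table and make_styler are hypothetical, and the frame is rebuilt here only so the example is self-contained.

```python
import pandas as pd

# The dataset from the answer, rebuilt so the sketch runs on its own.
df = pd.DataFrame({
    "other": ["X", "X", "X", "X"],
    "price": [1.99, 1.49, 1.99, 1.39],
    "outlier": ["F,C", "L,C", "F", "N"],
})

def highlight_price_table(data):
    """Return a same-shaped table of CSS strings (hypothetical helper)."""
    # Start with no styling anywhere.
    styles = pd.DataFrame("", index=data.index, columns=data.columns)
    # The question's condition: two or more comma-separated outlier codes.
    mask = data["outlier"].str.split(",").str.len() >= 2
    # Style only the price cell of the matching rows.
    styles.loc[mask, "price"] = "background-color: yellow"
    return styles

def make_styler(data):
    """Build the Styler (hypothetical helper; Styler needs jinja2 installed)."""
    return data.style.apply(highlight_price_table, axis=None)
```

Calling make_styler(df).to_excel('styled.xlsx', engine='openpyxl') would then write the styled sheet in the same way as the row-wise version.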
Results