I need some help here. I have something like this:
import pandas as pd
path = '/Users/arronteb/Desktop/excel/ejemplo.xlsx'
xlsx = pd.ExcelFile(path)
df = pd.read_excel(xlsx,'Sheet1')
df['is_duplicated'] = df.duplicated('#CSR')
df_nodup = df.loc[df['is_duplicated'] == False]
df_nodup.to_excel('ejemplo.xlsx', encoding='utf-8')
So basically, this program loads ejemplo.xlsx (ejemplo is "example" in Spanish, just the name of the file) into df (a DataFrame), then checks for duplicate values in a specific column. It deletes the duplicates and saves the file again. That part works correctly. The problem is that instead of removing the duplicates, I need to highlight the cells containing them with a different color, like yellow.
To highlight a particular cell of a DataFrame, use the DataFrame's style.apply(~) method.
With your one line of code, can you apply it to several columns, with a different color for each column?
@sqllearner you can apply the same color to several columns just by adding them to the subset, like df.style.set_properties(**{'background-color': 'red'}, subset=['A', 'C']).
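For a different color per column, one possible approach is to build a table of CSS strings and hand it to Styler.apply with axis=None. This is only a sketch: the helper names column_backgrounds and color_columns, and the column names and colors used with them, are made up for illustration and do not come from the thread.

```python
import pandas as pd


def column_backgrounds(df, column_colors):
    """Return a DataFrame of CSS strings with the same index and columns as df.

    column_colors is a hypothetical mapping such as {"A": "red", "C": "yellow"}.
    Columns missing from the mapping get an empty string, i.e. no styling.
    """
    css = pd.DataFrame("", index=df.index, columns=df.columns)
    for column, color in column_colors.items():
        css[column] = f"background-color: {color}"
    return css


def color_columns(df, column_colors):
    """Wrap the helper in a Styler (df.style needs jinja2 to be installed)."""
    # axis=None makes Styler.apply pass the whole DataFrame to the function;
    # extra keyword arguments are forwarded to it.
    return df.style.apply(column_backgrounds, axis=None, column_colors=column_colors)
```

Calling color_columns(df, {'A': 'red', 'C': 'yellow'}) should then color column A red and column C yellow. Chaining one set_properties call per column, each with its own subset, should give the same result.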
You can create a function to do the highlighting...
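def highlight_cells(column):
    # Styler.apply calls this once per column and expects one CSS string per cell
    # provide your criteria for highlighting the cells here
    return ['background-color: yellow' for _ in column]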
And then apply your highlighting function to your dataframe...
df.style.apply(highlight_cells)
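Applied to the question, the criteria could be "this value appears more than once in the column". The following is a minimal sketch under that assumption. The function names highlight_duplicates and style_duplicates are invented here, and keep=False is a choice made for this example: it marks every occurrence of a repeated value, whereas the question's df.duplicated('#CSR') leaves the first occurrence unmarked.

```python
import pandas as pd


def highlight_duplicates(column):
    """Return one CSS string per cell: yellow for repeated values, empty otherwise."""
    # keep=False flags all occurrences of a repeated value, including the first one.
    is_duplicated = column.duplicated(keep=False)
    return ["background-color: yellow" if dup else "" for dup in is_duplicated]


def style_duplicates(df, column_name):
    """Highlight duplicates in a single column (df.style needs jinja2 installed)."""
    # subset restricts the styling function to the named column only.
    return df.style.apply(highlight_duplicates, subset=[column_name])
```

The styled result can be written back out with something like style_duplicates(df, '#CSR').to_excel('ejemplo_highlighted.xlsx', engine='openpyxl'), where the output filename is just a placeholder and openpyxl is assumed to be installed. Use column.duplicated() with the default keep='first' instead if the first occurrence should stay uncolored.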