I am trying to write a pandas/python script the do the following in jupyter notebookssee excel data for example
I need to search column C for each row of data, and look at what number corresponds in that row in col E. I then want it to look for the same number in col G and put the corresponding value it got from E into col I.
If there are multiple instances of a value in col C with different corresponding values in Col E, flag those col C values so I can take a look.
Given col C contains 111 and has code “a” in col E, code “a” would be placed any spot in col I where col G had number 111.
If they do not have same number, Highlight in red those values in col C.
I am having trouble figuring out how to code this up. If anyone can show me that would be greatly appreciated. Thanks
Here's what you want..
dct = {'C':[111,222,333,111,444],'E':['a','b','c','d','e'],'G':[111,123,333,111,444]}
df = pd.DataFrame(dct)
highlight = []
vals = []
for i in range(len(df)):
if df['C'][i] == df['G'][i]:
highlight.append(False)
vals.append(df['E'][i])
else:
highlight.append(True)
vals.append(None)
df['I'] = vals
def highlight_cells(x):
c1 = 'background-color: red'
c2 = ''
df1 = pd.DataFrame(c2, index=df.index, columns=df.columns)
#modify values of df1 column by boolean highlight
df1.loc[highlight, 'C'] = c1 #new styled dataframe
return df1
df.style.apply(highlight_cells, axis=None).to_excel('styled.xlsx', engine='openpyxl')
Initially prepare highlight list(boolean) i.e which we are marking which rows of colC need to be highlighted. Now we use this highlight list in function highlight_cells, which creates new masked dataframe and it is applied to dataframe df using df.style.apply().
Output:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With