I have an xlsx file whose columns have various background colors.
I want to read only the white columns of this Excel file in Python using pandas, but I have no clue how to do this.
I am able to read the full Excel file into a dataframe, but then I lose the information about the coloring of the columns, so I don't know which columns to remove and which to keep.
@sqllearner you can apply the same color to several columns just by adding them to the subset, like df.style.set_properties(**{'background-color': 'red'}, subset=['A', 'C']).
One way to conditionally format a pandas DataFrame is to highlight the cells that meet certain conditions. To do so, we can write a simple function and pass it to the Styler object using .apply() or .
The pandas read_excel() function reads an Excel sheet (for example from an .xlsx file) into a pandas DataFrame. Reading a single sheet returns a DataFrame object; requesting several sheets (a list of sheet names, or sheet_name=None for all of them) returns a dict of DataFrames keyed by sheet. It can load Excel files stored in a local filesystem or from a URL.
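A small sketch of the difference in return types, assuming openpyxl is installed as the Excel engine. The workbook, sheet names, and values are invented so the example is self-contained:

```python
import os
import tempfile

import pandas as pd

# Build a throwaway two-sheet workbook (hypothetical data)
path = os.path.join(tempfile.mkdtemp(), "demo.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"a": [1, 2]}).to_excel(writer, sheet_name="first", index=False)
    pd.DataFrame({"b": [3, 4]}).to_excel(writer, sheet_name="second", index=False)

# One sheet name: a single DataFrame
single = pd.read_excel(path, sheet_name="first")

# A list of sheet names: a dict mapping each name to its DataFrame
several = pd.read_excel(path, sheet_name=["first", "second"])

# sheet_name=None: a dict containing every sheet in the workbook
everything = pd.read_excel(path, sheet_name=None)
```

In all three cases only cell values are loaded; fill colors and other formatting are not part of the result.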
(Disclosure: I'm one of the authors of the library I'm going to suggest)
With StyleFrame (which wraps pandas) you can read an Excel file into a dataframe without losing the style data.
Consider the following sheet:
And the following code:
from styleframe import StyleFrame, utils
# from StyleFrame import StyleFrame, utils (if using version < 3.X)
sf = StyleFrame.read_excel('test.xlsx', read_style=True)
print(sf)
# b p y
# 0 nan 3 1000.0
# 1 3.0 4 2.0
# 2 4.0 5 42902.72396767148
# keep only the columns whose header cell has a white fill
sf = sf[[col for col in sf.columns
         if col.style.fill.fgColor.rgb in ('FFFFFFFF', utils.colors.white)]]
# "white" can be represented as 'FFFFFFFF' or
# '00FFFFFF' (which is what utils.colors.white is set to)
print(sf)
# b
# 0 nan
# 1 3.0
# 2 4.0
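If adding StyleFrame is not an option, a similar filter can be sketched with openpyxl plus plain pandas. This is not the method from the answer above, only an alternative under some assumptions: the color is set as a cell fill on the header cell of each column, the header is in row 1 of the active sheet, and "white" means either no fill at all or an explicit FFFFFFFF / 00FFFFFF fill (theme-based colors would not match). The helper name white_columns and the demo workbook are hypothetical.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook, load_workbook
from openpyxl.styles import PatternFill

# Explicit ARGB codes treated as white (assumption, see the text above)
WHITE = ("FFFFFFFF", "00FFFFFF")


def white_columns(path, header_row=1):
    """Return the header names whose header cell is unfilled or filled white."""
    ws = load_workbook(path).active
    names = []
    for cell in ws[header_row]:
        fill = cell.fill
        # fill_type is None for cells that never had a fill applied
        if fill.fill_type is None or fill.fgColor.rgb in WHITE:
            names.append(cell.value)
    return names


# Build a small demo workbook: column b is unfilled, p and y are colored
wb = Workbook()
ws = wb.active
ws.append(["b", "p", "y"])
ws.append([3, 4, 2])
ws["B1"].fill = PatternFill(fill_type="solid", fgColor="FFFF0000")  # red
ws["C1"].fill = PatternFill(fill_type="solid", fgColor="FFFFFF00")  # yellow
path = os.path.join(tempfile.mkdtemp(), "colors.xlsx")
wb.save(path)

keep = white_columns(path)
df = pd.read_excel(path, usecols=keep)  # load only the white columns
```

Passing the names through usecols means the colored columns are never loaded, rather than being dropped afterwards.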