My question is similar to "How to check if a column exists in Pandas", but for the MultiIndex column case.
I'm trying to process values in a DataFrame with MultiIndex columns, using column names that originate in another file, hence the need to check whether a column exists. A representative example is below:
import pandas as pd
from numpy.random import randint, randn

df = pd.DataFrame({
    'A': [randint(0, 3) for p in range(0, 12)],
    'B': [0.1 * randint(0, 3) for p in range(0, 12)],
    'C': [0.1 * randint(0, 3) for p in range(0, 12)],
    'D': randn(12),
})
df1 = df.groupby(['A','B','C']).D.sum().unstack(-1)
df1 = df1.T
df1
A           0                   1                             2
B         0.0       0.2       0.0       0.1       0.2       0.0       0.1
C
0.0       NaN       NaN       NaN  0.845316       NaN  0.555513       NaN
0.1       NaN  0.139371       NaN       NaN       NaN       NaN -0.260868
0.2  5.002509       NaN  0.637353  0.438863  0.943098       NaN       NaN
df1[1][0.1]
C
0.0 0.845316
0.1 NaN
0.2 0.438863
Accessing df1[0][0.1] in the above example will result in a KeyError. How do I check whether a MultiIndex column exists, so that non-existent columns can be skipped during processing?
Thanks!
To check whether several columns all exist in a pandas DataFrame, use set.issubset. For example, if set(['Courses','Duration']).issubset(df.
A MultiIndex, also referred to as a hierarchical or multi-level index, lets you build an index from multiple levels and store data with an arbitrary number of dimensions.
You can think of a MultiIndex as an array of tuples, so you can access a column like:
df1[(0, 0.1)]
and test whether it exists like:
(0, 0.1) in df1.columns
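To make the membership test concrete, here is a minimal sketch of skipping missing columns during processing. It assumes a small hand-built frame with the same column layout as the question's df1 (the values are placeholders, not the question's random data), and the names wanted, present, missing and totals are hypothetical, standing in for column keys read from another file.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for df1: same (A, B) column pairs as in the question,
# filled with placeholder values 0..20 instead of the random sums.
columns = pd.MultiIndex.from_tuples(
    [(0, 0.0), (0, 0.2), (1, 0.0), (1, 0.1), (1, 0.2), (2, 0.0), (2, 0.1)],
    names=["A", "B"],
)
df1 = pd.DataFrame(
    np.arange(21, dtype=float).reshape(3, 7),
    index=pd.Index([0.0, 0.1, 0.2], name="C"),
    columns=columns,
)

# Hypothetical list of column keys, e.g. loaded from another file.
wanted = [(1, 0.1), (0, 0.1), (2, 0.0)]

# A full tuple key is "in" the columns only if that exact column exists.
present = [key for key in wanted if key in df1.columns]
missing = [key for key in wanted if key not in df1.columns]

# Process only the columns that exist; the missing ones are skipped.
totals = {key: df1[key].sum() for key in present}

print("present:", present)
print("missing:", missing)
print("totals:", totals)
```

Checking membership up front avoids wrapping every access in a try/except KeyError, though either approach should work for this case.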