
Pandas: How to drop multiple columns with nan as col name?

As per the title, here's a reproducible example:

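import numpy as np
import pandas as pd

# Note: the repeated np.nan key collapses to a single dict entry;
# the duplicate NaN columns come from the `columns` argument below.
raw_data = {'x': ['this', 'that', 'this', 'that', 'this'], 
            np.nan: [np.nan, np.nan, np.nan, np.nan, np.nan], 
            'y': [np.nan, np.nan, np.nan, np.nan, np.nan],
            np.nan: [np.nan, np.nan, np.nan, np.nan, np.nan]}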

df = pd.DataFrame(raw_data, columns = ['x', np.nan, 'y', np.nan])
df

   x     NaN  y    NaN
0  this  NaN  NaN  NaN
1  that  NaN  NaN  NaN
2  this  NaN  NaN  NaN
3  that  NaN  NaN  NaN
4  this  NaN  NaN  NaN

The aim is to drop only the columns whose name is nan (so column y is kept). dropna() doesn't work because it conditions on the nan values inside a column, not on nan as the column name.
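To make the dropna() problem concrete, here is a minimal sketch. It uses a two-row frame built only for illustration (not the original data), and the name by_values is hypothetical. Because dropna() inspects cell values, the all-NaN column y is removed together with the NaN-named columns:

```python
import numpy as np
import pandas as pd

# Illustrative two-row frame with the same column layout as in the question.
df = pd.DataFrame([["this", np.nan, np.nan, np.nan],
                   ["that", np.nan, np.nan, np.nan]])
df.columns = ["x", np.nan, "y", np.nan]

# dropna() decides by cell values, so every all-NaN column goes, including 'y'.
by_values = df.dropna(axis=1, how="all")
print(list(by_values.columns))
```

Only x survives here, which is not the desired result.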

df.drop(np.nan, axis=1, inplace=True) works if there's a single column with nan as its name, but not when several columns are named nan, as in my data.

So how do I drop multiple columns where the column name is nan?

RDJ asked Sep 07 '17 17:09



1 Answer

In [218]: df = df.loc[:, df.columns.notna()]

In [219]: df
Out[219]:
      x   y
0  this NaN
1  that NaN
2  this NaN
3  that NaN
4  this NaN
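
For readers who want to see why this works, here is a minimal sketch on a two-row frame built only for illustration (the names mask and kept are hypothetical). df.columns.notna() yields one boolean per column label, and .loc with that mask on the column axis keeps only the columns whose label is not NaN, regardless of what the cells contain:

```python
import numpy as np
import pandas as pd

# Illustrative two-row frame with the same column layout as in the question.
df = pd.DataFrame([["this", np.nan, np.nan, np.nan],
                   ["that", np.nan, np.nan, np.nan]])
df.columns = ["x", np.nan, "y", np.nan]

# One boolean per column label: False wherever the label itself is NaN.
mask = df.columns.notna()
print(mask)

# Boolean selection on the column axis; cell values play no role here.
kept = df.loc[:, mask]
print(list(kept.columns))
```

Since the mask is computed from the labels rather than looked up by name, duplicate NaN labels are all dropped in one step and the all-NaN column y is kept.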
MaxU - stop WAR against UA answered Oct 05 '22 06:10
