
pandas dataframe remove constant column

I have a dataframe that may or may not contain columns in which every value is the same. For example:

    row    A    B
     1      9    0
     2      7    0
     3      5    0
     4      2    0

I'd like to return just

    row    A
     1      9
     2      7
     3      5
     4      2

Is there a simple way to identify if any of these columns exist and then remove them?

user1802143 asked Nov 26 '13 05:11




2 Answers

I believe this option will be faster than the other answers here, as it traverses the data frame only once for the comparison and can short-circuit as soon as a value that differs from the first row is found.

    >>> df
       0  1  2
    0  1  9  0
    1  2  7  0
    2  3  7  0

    >>> df.loc[:, (df != df.iloc[0]).any()]
       0  1
    0  1  9
    1  2  7
    2  3  7
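For a version that can be pasted and run as is, here is a minimal sketch that rebuilds the frame from the question and wraps the mask in a helper. The name `drop_constant_columns` is hypothetical (it is not part of pandas), and the guard for an empty frame is my own addition. One caveat: because NaN compares unequal to itself, a column made up entirely of NaN counts as non-constant under this comparison and is kept.

```python
import pandas as pd


def drop_constant_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: keep only columns that differ from the first row somewhere."""
    if df.empty:
        # df.iloc[0] would raise on a frame with no rows, so return it unchanged.
        return df
    # Compare every row with the first row; the Series aligns on column labels.
    differs = df != df.iloc[0]
    # Keep a column if at least one of its values differs from the first row.
    return df.loc[:, differs.any()]


# The frame from the question (assumed layout: "row" is an ordinary column).
df = pd.DataFrame({"row": [1, 2, 3, 4], "A": [9, 7, 5, 2], "B": [0, 0, 0, 0]})
result = drop_constant_columns(df)
print(result.columns.tolist())
```

The constant column `B` is dropped, while `row` and `A` are kept.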
chthonicdaemon answered Sep 18 '22 11:09



Ignoring NaNs as usual, a column is constant if nunique() == 1. So:

    >>> df
       A  B  row
    0  9  0    1
    1  7  0    2
    2  5  0    3
    3  2  0    4
    >>> df = df.loc[:, df.apply(pd.Series.nunique) != 1]
    >>> df
       A  row
    0  9    1
    1  7    2
    2  5    3
    3  2    4
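On pandas releases that provide `DataFrame.nunique()`, the per-column counts can be computed directly, without `apply`. The sketch below assumes such a release, and its variable names are purely illustrative. It also shows how the `dropna` parameter changes which columns count as constant when NaN is present.

```python
import pandas as pd

df = pd.DataFrame({"A": [9, 7, 5, 2], "B": [0, 0, 0, 0], "row": [1, 2, 3, 4]})

# Number of distinct non-NaN values in each column.
counts = df.nunique()

# Keep every column that does not have exactly one distinct value.
trimmed = df.loc[:, counts != 1]
print(trimmed.columns.tolist())

# nunique() ignores NaN by default, so "x" below has one distinct value.
with_nan = pd.DataFrame({"x": [1.0, None, 1.0], "y": [1.0, 2.0, 3.0]})
default_kept = with_nan.loc[:, with_nan.nunique() != 1].columns.tolist()

# With dropna=False, NaN is counted as its own value, so "x" is kept.
strict_kept = with_nan.loc[:, with_nan.nunique(dropna=False) != 1].columns.tolist()
print(default_kept, strict_kept)
```

Which behaviour is appropriate depends on whether a column holding a single value plus missing entries should be treated as constant.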
DSM answered Sep 20 '22 11:09
