Pandas: Get all columns that have constant value

Tags: python, pandas

I want to get the names of the columns that hold the same value in every row.

My data:

   A   B  C  D
0  1  hi  2  a
1  3  hi  2  b
2  4  hi  2  c

Desired output:

['B', 'C']

Code:

import pandas as pd

d = {'A': [1,3,4], 'B': ['hi','hi','hi'], 'C': [2,2,2], 'D': ['a','b','c']}
df = pd.DataFrame(data=d)

I've been playing around with df.columns and .any(), but can't figure out how to do this.

800

asked May 29 '18 10:05

tbienias

2 Answers

Use the pandas not-so-well-known builtin nunique():

df.columns[df.nunique() <= 1]
Index(['B', 'C'], dtype='object')

Notes:

Use nunique(dropna=False) if you want missing values (NaN) counted as a separate value.
It's the cleanest code, but not the fastest. (In general, though, code should prioritize clarity and readability.)
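
To see what the dropna option changes, here is a small sketch. Columns E and F are hypothetical additions to the question's data, made up only to show how missing values are counted.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, 3, 4],
    "B": ["hi", "hi", "hi"],
    "E": [np.nan, np.nan, np.nan],  # hypothetical: nothing but missing values
    "F": [2.0, np.nan, 2.0],        # hypothetical: one real value plus a gap
})

# Default dropna=True: NaN is ignored, so E has 0 distinct values and F has 1.
ignoring_nan = df.columns[df.nunique() <= 1].tolist()

# dropna=False: NaN counts as a value, so E has 1 distinct value and F has 2.
counting_nan = df.columns[df.nunique(dropna=False) <= 1].tolist()

print(ignoring_nan)  # ['B', 'E', 'F']
print(counting_nan)  # ['B', 'E']
```

Which behaviour you want depends on whether a column with gaps should still count as constant.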

152

answered Oct 09 '22 12:10

smci

Solution 1:

cols = [c for c in df.columns if len(set(df[c])) == 1]
print(cols)

['B', 'C']

Solution 2:

cols = df.columns[df.eq(df.iloc[0]).all()].tolist()
print(cols)
['B', 'C']

Explanation for Solution 2:

First compare all rows to the first row with DataFrame.eq...

print(df.eq(df.iloc[0]))
       A     B     C      D
0   True  True  True   True
1  False  True  True  False
2  False  True  True  False

... then check whether each column is all True with DataFrame.all...

print(df.eq(df.iloc[0]).all())
A    False
B     True
C     True
D    False
dtype: bool

... and finally keep the column names for which the result is True:

print(df.columns[df.eq(df.iloc[0]).all()])
Index(['B', 'C'], dtype='object')
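
The steps above could be wrapped in a small helper. This is only a sketch: the name constant_columns is made up for illustration, and the empty-frame guard is an addition of mine, since df.iloc[0] raises an IndexError when there are no rows.

```python
import pandas as pd


def constant_columns(df: pd.DataFrame) -> list:
    """Return the labels of columns whose values all equal the first row's value."""
    # Hypothetical guard: with no rows (or no columns) there is nothing to compare.
    if df.empty:
        return []
    # Compare every row to the first row, then keep columns that are True throughout.
    return df.columns[df.eq(df.iloc[0]).all()].tolist()


df = pd.DataFrame({'A': [1, 3, 4], 'B': ['hi', 'hi', 'hi'],
                   'C': [2, 2, 2], 'D': ['a', 'b', 'c']})
result = constant_columns(df)
print(result)  # ['B', 'C']
```

Note that NaN never compares equal to itself, so a column consisting only of NaN is not reported by this approach.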

Timings:

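import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(1000,100)))

# overwrite up to 20 randomly chosen columns with the constant value 100
df[np.random.randint(100, size=20)] = 100
print(df)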

# Solution 1 (second-fastest):
In [243]: %timeit ([c for c in df.columns if len(set(df[c])) == 1])
3.59 ms ± 43.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# Solution 2 (fastest):
In [244]: %timeit df.columns[df.eq(df.iloc[0]).all()].tolist()
1.62 ms ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

#Mohamed Thasin ah solution
In [245]: %timeit ([col for col in df.columns if len(df[col].unique())==1])
6.8 ms ± 352 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#jpp solution
In [246]: %%timeit
     ...: vals = df.apply(set, axis=0)
     ...: res = vals[vals.map(len) == 1].index
     ...: 
5.59 ms ± 64.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#smci solution 1
In [275]: %timeit df.columns[ df.nunique()==1 ]
11 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#smci solution 2 (timing only: `not is_unique` matches any column with a repeated value, not just constant columns)
In [276]: %timeit [col for col in df.columns if not df[col].is_unique]
9.25 ms ± 80 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

#smci solution 3 (timing only: same `not is_unique` caveat as above)
In [277]: %timeit df.columns[ df.apply(lambda col: not col.is_unique) ]
11.1 ms ± 511 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
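
Outside IPython, the comparison can be approximated with the standard timeit module. This is a rough sketch, not a rerun of the benchmark above: the function names are invented here, and the absolute numbers will vary with the machine and the pandas version.

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(100)
df = pd.DataFrame(np.random.randint(10, size=(1000, 100)))
# overwrite up to 20 randomly chosen columns with the constant value 100
df[np.random.randint(100, size=20)] = 100


def via_set(frame):
    # Solution 1: build a Python set per column
    return [c for c in frame.columns if len(set(frame[c])) == 1]


def via_eq(frame):
    # Solution 2: compare every row to the first row
    return frame.columns[frame.eq(frame.iloc[0]).all()].tolist()


def via_nunique(frame):
    # nunique-based variant
    return frame.columns[frame.nunique() == 1].tolist()


candidates = (via_set, via_eq, via_nunique)

# Sanity check first: all three should pick the same columns on this data.
results = {f.__name__: f(df) for f in candidates}
assert results["via_set"] == results["via_eq"] == results["via_nunique"]

for f in candidates:
    seconds = timeit.timeit(lambda: f(df), number=20) / 20
    print(f"{f.__name__}: {seconds * 1e3:.2f} ms per call")
```

Checking that the candidates agree before timing them avoids comparing the speed of functions that answer different questions.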

answered Oct 09 '22 13:10

jezrael
