I have a pandas data frame with a column uniqueid. I would like to remove all duplicates from the data frame based on this column, such that all remaining observations are unique.

Selecting unique observations in a pandas data frame

2 Answers

There is also the drop_duplicates() method, available on any data frame. Pass the subset argument to restrict the duplicate check to specific columns; by default, the first occurrence of each value is kept.

df.drop_duplicates(subset='uniqueid', inplace=True)
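As a minimal sketch of how the keep argument changes the result, assuming a small made-up frame (the value column and its contents are hypothetical, only uniqueid comes from the question). This version returns new frames instead of modifying df in place:

```python
import pandas as pd

# Hypothetical sample data; only the column name 'uniqueid' comes from the question.
df = pd.DataFrame({'uniqueid': ['a', 'b', 'b', 'c'], 'value': [1, 2, 3, 4]})

# Without inplace=True, a new frame is returned and df is left untouched.
# keep='first' is the default: the first row of each uniqueid survives.
first = df.drop_duplicates(subset='uniqueid')

# keep='last' retains the final occurrence of each uniqueid instead.
last = df.drop_duplicates(subset='uniqueid', keep='last')

# keep=False drops every row whose uniqueid appears more than once.
strict = df.drop_duplicates(subset='uniqueid', keep=False)

print(first)
print(last)
print(strict)
```

Which variant fits depends on whether any one row per uniqueid is acceptable, or whether ambiguous ids should be discarded entirely.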

answered Sep 20 '22 21:09

cwharland

Use the duplicated method

Since we only care whether uniqueid (A in my example) is duplicated, select that column and call duplicated on the resulting series. Then use ~ to invert the boolean mask, so that first occurrences are marked True and kept.

In [90]: df = pd.DataFrame({'A': ['a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4]})

In [91]: df
Out[91]: 
   A  B
0  a  1
1  b  2
2  b  3
3  c  4

In [92]: df['A'].duplicated()
Out[92]: 
0    False
1    False
2     True
3    False
Name: A, dtype: bool

In [93]: df.loc[~df['A'].duplicated()]
Out[93]: 
   A  B
0  a  1
1  b  2
3  c  4
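For comparison, here is a short sketch (not part of the original session) that reuses the same example frame to check that the mask approach selects the same rows as drop_duplicates with its default settings. The variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'b', 'b', 'c'], 'B': [1, 2, 3, 4]})

# Mask approach: True marks the rows to keep (first occurrence of each A).
mask = ~df['A'].duplicated()
by_mask = df.loc[mask]

# drop_duplicates with the default keep='first' selects the same rows.
by_drop = df.drop_duplicates(subset='A')
print(by_mask.equals(by_drop))

# keep=False flags every member of a duplicated group, which can help
# when inspecting the duplicates before deciding which ones to remove.
dupes = df.loc[df['A'].duplicated(keep=False)]
print(dupes)
```

The mask form is useful when the boolean series is needed for something else as well, such as counting or inspecting the duplicated rows.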

answered Sep 21 '22 21:09

TomAugspurger


Tags:

python

pandas

Michael
