
How to find duplicate names using pandas?

I have a pandas.DataFrame with a column called name containing strings. I would like to get a list of the names which occur more than once in the column. How do I do that?

I tried:

    funcs_groups = funcs.groupby(funcs.name)
    funcs_groups[(funcs_groups.count().name>1)]

But it doesn't filter out the singleton names.

Yariv asked Mar 06 '13 12:03



2 Answers

If you want to find the rows with a duplicated name (every occurrence except the first), you can try this:

    In [16]: import pandas as pd
    In [17]: p1 = {'name': 'willy', 'age': 10}
    In [18]: p2 = {'name': 'willy', 'age': 11}
    In [19]: p3 = {'name': 'zoe', 'age': 10}
    In [20]: df = pd.DataFrame([p1, p2, p3])

    In [21]: df
    Out[21]:
       age   name
    0   10  willy
    1   11  willy
    2   10    zoe

    In [22]: df.duplicated('name')
    Out[22]:
    0    False
    1     True
    2    False
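The boolean mask above marks rows rather than listing names. As a minimal sketch of how it could be turned into the list the question asks for (reusing the same example data; the variable names `mask` and `dup_names` are only illustrative):

```python
import pandas as pd

# Same example data as in the session above
df = pd.DataFrame([
    {'name': 'willy', 'age': 10},
    {'name': 'willy', 'age': 11},
    {'name': 'zoe', 'age': 10},
])

# True for each row whose name already appeared in an earlier row
mask = df.duplicated('name')

# Keep the names on the flagged rows, then collapse repeats into a plain list
dup_names = df.loc[mask, 'name'].unique().tolist()
print(dup_names)  # ['willy']
```

Passing `keep=False` to `duplicated` would instead flag every occurrence of a repeated name, which is useful if you want to inspect all the affected rows.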
waitingkuo answered Oct 21 '22 11:10


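A one-liner can be:

    x.set_index('name').index.get_duplicates()

The index has a method for finding duplicates; columns do not seem to have a similar one. Note that Index.get_duplicates() was deprecated in pandas 0.23 and removed in pandas 1.0, so this only works on older versions.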
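On pandas versions where `get_duplicates()` is not available, a similar result can be sketched with `Index.duplicated()`, or by counting values in the column directly. The frame `x` below is made-up example data, and the names `dups_from_index` and `dups_from_counts` are only illustrative:

```python
import pandas as pd

# Hypothetical example frame standing in for x
x = pd.DataFrame({'name': ['willy', 'willy', 'zoe'], 'age': [10, 11, 10]})

# Index-based variant: flag repeated labels, then keep each one once
idx = x.set_index('name').index
dups_from_index = idx[idx.duplicated()].unique().tolist()

# Column-based variant: count occurrences and keep names seen more than once
counts = x['name'].value_counts()
dups_from_counts = counts[counts > 1].index.tolist()

print(dups_from_index)   # ['willy']
print(dups_from_counts)  # ['willy']
```

The `value_counts` variant works on the column itself, so no `set_index` step is needed.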

idoda answered Oct 21 '22 13:10