
Find rows that have same values in another column - Python

I have a basic Python question.

I have a pandas dataframe like this:

ID | Name | User_id
---+------+--------
 1   John     10
 2   Tom      11  
 3   Sam      12
 4   Ben      13
 5   Jen      10
 6   Tim      11
 7   Sean     14
 8   Ana      15
 9   Sam      12
 10  Ben      13

I want to get the names and user ids that share the same value for User_id, without returning names that appear twice. So I would like the output to look something like this:

John Jen 10
Tom Tim 11
Elena Forres asked Feb 22 '16 15:02




1 Answer

IIUC, you could do it this way: group by 'User_id', take the unique names in each group, and then keep only the groups that contain more than one name:

In [54]:
group = df.groupby('User_id')['Name'].unique()

In [55]:
group[group.apply(lambda x: len(x)>1)]

Out[55]:
User_id
10    [John, Jen]
11     [Tom, Tim]
Name: Name, dtype: object
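
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same groupby-then-filter idea. It rebuilds the sample table from the question, and the variable names (`names_per_user`, `shared`) are just illustrative choices, not anything from the original session. It uses `map(len)` instead of `apply` with a lambda, which should behave the same here.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "ID": range(1, 11),
    "Name": ["John", "Tom", "Sam", "Ben", "Jen", "Tim", "Sean", "Ana", "Sam", "Ben"],
    "User_id": [10, 11, 12, 13, 10, 11, 14, 15, 12, 13],
})

# Collect the distinct names seen for each User_id.
names_per_user = df.groupby("User_id")["Name"].unique()

# Keep only the ids that map to more than one distinct name.
# Ids 12 and 13 drop out because "Sam" and "Ben" are just repeated.
shared = names_per_user[names_per_user.map(len) > 1]

# Print one line per id in the layout the question asks for.
for user_id, names in shared.items():
    print(" ".join(names), user_id)
```

With the sample data this prints `John Jen 10` and `Tom Tim 11`, matching the output requested in the question.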
EdChum answered Sep 21 '22 10:09