
Get Rows based on distinct values from Column 2

Tags: python, pandas

How can I get the rows by distinct values in COL2?

For example, I have the dataframe below:

COL1   COL2
a.com  22
b.com  45
c.com  34
e.com  45
f.com  56
g.com  22
h.com  45

I want to get the rows based on unique values in COL2:

COL1   COL2
a.com  22
b.com  45
c.com  34
f.com  56

So, how can I get that? I would appreciate it very much if anyone can provide any help.

import.zee asked Apr 29 '17 11:04



1 Answer

Use drop_duplicates and pass COL2 as the column to check for duplicates:

    df = df.drop_duplicates('COL2')
    # same as
    # df = df.drop_duplicates('COL2', keep='first')
    print(df)

        COL1  COL2
    0  a.com    22
    1  b.com    45
    2  c.com    34
    4  f.com    56
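To try this end to end, here is a minimal sketch that rebuilds the frame from the question. The `pd.DataFrame` construction and the names `first_rows` and `mask_rows` are assumptions for illustration, since the post only shows the printed table. It also shows the equivalent boolean-mask form using `duplicated`.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is assumed).
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "e.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

# Keep the first row seen for each COL2 value (keep='first' is the default).
first_rows = df.drop_duplicates(subset="COL2")

# Equivalent mask form: duplicated() flags every repeat after the first,
# so negating it selects the same rows.
mask_rows = df[~df.duplicated(subset="COL2")]

print(first_rows)
```

The original index labels (0, 1, 2, 4) are preserved; calling `.reset_index(drop=True)` on the result renumbers the rows if that is preferred.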

You can also keep only the last occurrence of each value:

    df = df.drop_duplicates('COL2', keep='last')
    print(df)

        COL1  COL2
    2  c.com    34
    4  f.com    56
    5  g.com    22
    6  h.com    45

Or drop every row whose COL2 value occurs more than once:

    df = df.drop_duplicates('COL2', keep=False)
    print(df)

        COL1  COL2
    2  c.com    34
    4  f.com    56
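Note that each snippet above reassigns df, so the printed outputs correspond to starting from the original seven-row frame every time. Run back to back, the later calls would operate on an already-deduplicated frame. A small sketch that compares the three `keep` options side by side without overwriting df (the frame construction and the `results` dictionary are assumptions for illustration):

```python
import pandas as pd

# Same seven-row frame as in the question (construction is assumed).
df = pd.DataFrame({
    "COL1": ["a.com", "b.com", "c.com", "e.com", "f.com", "g.com", "h.com"],
    "COL2": [22, 45, 34, 45, 56, 22, 45],
})

# Store each result separately so df itself is never modified.
results = {
    keep: df.drop_duplicates(subset="COL2", keep=keep)
    for keep in ("first", "last", False)
}

for keep, rows in results.items():
    print(f"keep={keep!r}: index {rows.index.tolist()}")
# keep='first': index [0, 1, 2, 4]
# keep='last': index [2, 4, 5, 6]
# keep=False: index [2, 4]
```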

jezrael answered Oct 17 '22 23:10