I want to display all rows where the value in the "website" column occurs more than once. For example, if the website "xyz.com" occurs more than once, I want to display all of those rows. I am using the code below:
df[df.website.isin(df.groupby('website').website.count() > 1)]
The above code returns zero rows, but I can see that many websites occur more than once by running the code below:
df.website.value_counts()
How should I modify my first line of code to display all such rows?
Use duplicated with subset='website' and keep=False:
df[df.duplicated(subset='website', keep=False)]
Sample Input:
col1 website
0 A abc.com
1 B abc.com
2 C abc.com
3 D abc.net
4 E xyz.com
5 F foo.bar
6 G xyz.com
7 H foo.baz
Sample Output:
col1 website
0 A abc.com
1 B abc.com
2 C abc.com
4 E xyz.com
6 G xyz.com
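For reference, here is a minimal sketch that rebuilds the sample input above and runs the suggested line against it. It also shows a likely reason the original attempt returns nothing: `isin` is handed a Series of True/False values, so it compares each website string against booleans and nothing matches. The variable names (`dupes`, `counts_mask`, `repeated_sites`, and so on) are illustrative, not from the question, and the two groupby-based variants are alternatives assumed to be equivalent for this data.

```python
import pandas as pd

# Rebuild the sample input from the answer
df = pd.DataFrame({
    "col1": list("ABCDEFGH"),
    "website": ["abc.com", "abc.com", "abc.com", "abc.net",
                "xyz.com", "foo.bar", "xyz.com", "foo.baz"],
})

# Suggested approach: keep=False marks every occurrence of a repeated value
dupes = df[df.duplicated(subset="website", keep=False)]

# Original attempt: the comparison yields a boolean Series indexed by website,
# so isin() tests the website strings against True/False and finds no match
counts_mask = df.groupby("website").website.count() > 1
original = df[df.website.isin(counts_mask)]

# One way to repair the groupby idea: pass the website labels whose count > 1
repeated_sites = counts_mask[counts_mask].index
via_isin = df[df.website.isin(repeated_sites)]

# Another variant: broadcast each group's size back onto the rows
via_transform = df[df.groupby("website")["website"].transform("size") > 1]

print(dupes)
```

All three working variants should select the same rows here; `duplicated` is the shortest, while the `transform` form is easier to adapt if the threshold is something other than "more than once".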