I have this simple dataframe <code>df</code>: <pre class="prettyprint"><code>a,b 1,2 1,3 1,4 1,2 2,1 2,2 2,3 2,5 2,5 </code></pre> I would like to check whether there are duplicates in <code>b</code> with respect to each group in <code>a</code>. So far I did the following: <pre class="prettyprint"><code>g = df.groupby('a')['b'].unique() </code></pre> which returns: <pre class="prettyprint"><code>a 1 [2, 3, 4] 2 [1, 2, 3, 5] </code></pre> But what I would like to have is a list, for each group in <code>a</code>, with multiple occurrences in <code>b</code>. The expected output in this case would be: <pre class="prettyprint"><code>a 1 [2] 2 [5] </code></pre>

<pre class="prettyprint"><code>g=df.groupby('a')['b'].value_counts() g.where(g>1).dropna() </code></pre>

We can use <code>duplicated</code> <pre class="prettyprint"><code>print(df[df.duplicated()].drop_duplicates()) </code></pre>

pandas - check for non unique values in dataframe groupby

Tags:

python

pandas

I have this simple dataframe df:

a,b
1,2
1,3
1,4
1,2
2,1
2,2
2,3
2,5
2,5

I would like to check whether there are duplicates in b with respect to each group in a. So far I did the following:

g = df.groupby('a')['b'].unique()

which returns:

a
1       [2, 3, 4]
2    [1, 2, 3, 5]

But what I would like to have is a list, for each group in a, with multiple occurrences in b. The expected output in this case would be:

a
1    [2]
2    [5]

829

asked Nov 16 '15 09:11

Fabio Lamanna

2 Answers

g=df.groupby('a')['b'].value_counts()
g.where(g>1).dropna()

138

answered Sep 21 '22 15:09

atomh33ls

We can use duplicated

print(df[df.duplicated()].drop_duplicates())

answered Sep 22 '22 15:09

akrun

Related questions
                            
                                Filter list using Boolean index arrays
                            
                                Get and Post methods in Python (Flask)
                            
                                Python : generating random numbers from a power law distribution [duplicate]
                            
                                Lambda with nested if else is not working
                            
                                Using Selenium in Python to click through all elements with the same class name
                            
                                how to select specific json element in python [duplicate]
                            
                                How can I load an image using Python Pillow?
                            
                                Install psycopg2 for Anaconda Python
                            
                                SQLite 3 Database with Django
                            
                                Get consistent Key error: \n [duplicate]
                            
                                Load JPEG from URL to skimage without temporary file
                            
                                Is it possible to overload logical and in Python?
                            
                                Changing a class attribute within __init__
                            
                                Sorting and auto filtering Excel with openpyxl
                            
                                Building a Bootstrap table with dynamic elements in Flask
                            
                                NumPy sum along disjoint indices
                            
                                Plots are not visible using matplotlib plt.show()
                            
                                Embed "Bokeh created html file" into Flask "template.html" file
                            
                                How do I list contents of a gz file without extracting it in python?
                            
                                Matplotlib: how to plot with a specific hex color and a specific marker?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With