Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find paired records after groupby Python

Tags:

python

pandas

I have a dataframe like this:

df = pd.DataFrame(
    [['101', 'a', 'in', '10'],
     ['101', 'a', 'out', '10'], 
     ['102', 'b', 'in', '20'], 
     ['103', 'c', 'in', '30'],
     ['103', 'c', 'out', '40']], 
    columns=['col1', 'col2', 'col3', 'col4']
)

I want to group by col1 and find paired records that have the same value in col2 and col4, but one has 'in' in col3 one has 'out' in col3. The expected outcome is:

df_out = pd.DataFrame(
    [['101', 'a', 'in', '10'],
     ['101', 'a', 'out', '10']],
    columns=['col1', 'col2', 'col3', 'col4']
)

Thank you for the help.

like image 987
J_jane Avatar asked Sep 07 '20 14:09

J_jane


1 Answers

Let us try transform with nunique

out = df[df.groupby(['col1','col2','col4'])['col3'].transform('nunique')==2]
Out[187]: 
  col1 col2 col3 col4
0  101    a   in   10
1  101    a  out   10
like image 72
BENY Avatar answered Oct 17 '22 11:10

BENY