Python - Pandas - GroupBy conditional string addition

I'm having trouble combining list-building with filtering when grouping a DataFrame.

Let's say we have a DataFrame of the form:

      A       B    C
0    x2   a32cd    1
1    x1   a11aa    0
2    x1     NaN    1 
3    x1   d75dd    0
4    x1   a11aa    1
5    x2   a32cd    1
6    x2   w22xz    0
...

What I'm looking for is to group on column A (strings) and then build a list of the non-duplicate, non-null values of B (strings). Column C (integers) can be dropped. The final form I am looking for is something like:

      A           B 
0    x1   [a11aa, d75dd, ...]
1    x2   [a32cd, w22xz, ...]

I was thinking of setting it up somehow with the form of:

df_x.groupby('A')['B'].apply(list)

and then applying some conditions to it, but I can't work out how. Should I write a function for it? I come from a MATLAB background, so I am inclined to just loop through the entire DataFrame row by row. But I have been told that once you are thinking about doing that in pandas, there is probably a smarter way to do it.

SirGianmarcoD asked Feb 13 '26 08:02

SirGianmarcoD


1 Answer

>>> df.dropna().groupby("A")["B"].unique()
A
x1    [a11aa, d75dd]
x2    [a32cd, w22xz]
dtype: object
w-m answered Feb 15 '26 22:02
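
For anyone who wants to reproduce this, here is a minimal sketch built around the one-liner above. The sample frame contains only the rows shown in the question (the rows elided by "..." are not filled in). Two details are my own assumptions rather than part of the answer: `dropna` is limited to column B with `subset=["B"]`, so a missing value in another column would not discard an otherwise valid row, and the NumPy arrays returned by `unique()` are converted to plain lists before `reset_index()` turns the result into a two-column DataFrame like the one asked for.

```python
import numpy as np
import pandas as pd

# Sample data: only the rows shown in the question.
df = pd.DataFrame({
    "A": ["x2", "x1", "x1", "x1", "x1", "x2", "x2"],
    "B": ["a32cd", "a11aa", np.nan, "d75dd", "a11aa", "a32cd", "w22xz"],
    "C": [1, 0, 1, 0, 1, 1, 0],
})

result = (
    df.dropna(subset=["B"])   # assumption: drop only rows where B is missing
      .groupby("A")["B"]      # selecting B alone leaves column C out
      .unique()               # distinct values per group, in order of first appearance
      .apply(list)            # NumPy arrays -> plain Python lists
      .reset_index()          # move the group key A back into a regular column
)

print(result)
```

On the sample rows the plain `df.dropna()` from the answer gives the same groups, since B is the only column containing a NaN there.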