
Keep values of a dataframe that are contained in another dataframe

I have 2 dataframes whose cells contain comma-separated lists, and I want to keep, row by row, the elements of the first dataframe that are also contained in the second dataframe. Is this possible, or must I try some other data structure?

Example of input:

df1:

elem1
a,c,v,b,n
b
c,x,a

df2:

elem2
j,k,a,i,v
o,b
g,f,w

Expected output:

elem
a,v
b
NaN
mnmbs asked Nov 27 '25 21:11



1 Answer

First, build a regular expression from the letters you want to match by turning each comma in elem2 into the alternation operator |:

In [77]:
chars = df2.elem2.str.replace(',' , '|')
chars
Out[77]:
0    j|k|a|i|v
1          o|b
2        g|f|w
Name: elem2, dtype: object
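The plain replace is fine for the single-letter elements in the question. As a hedged variation, assuming the real data might contain regex metacharacters such as . or +, each element can be escaped before joining. The helper name to_pattern is made up for this sketch and is not part of the original answer.

```python
import re
import pandas as pd

# elem2 rebuilt from the question's example
df2 = pd.DataFrame({"elem2": ["j,k,a,i,v", "o,b", "g,f,w"]})

def to_pattern(cell):
    # Hypothetical helper: split on commas, escape each element,
    # then join with | so the result is a safe alternation pattern.
    return "|".join(re.escape(tok) for tok in cell.split(","))

chars = df2["elem2"].map(to_pattern)
print(chars.tolist())  # ['j|k|a|i|v', 'o|b', 'g|f|w']
```

For plain letters the output is identical to the replace version; the escaping only matters once an element contains a character that the regex engine treats specially.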

Then concatenate both into a single data frame so that a custom function can be applied row by row later:

In [24]:
to_compare = pd.concat([df1 , chars] , axis = 1)
to_compare
Out[24]:
       elem1    elem2
0   a,c,v,b,n   j|k|a|i|v
1   b           o|b
2   c,x,a       g|f|w

Finally, use the regular expression in elem2 to pull the matching data out of elem1:

In [76]:
import re
to_compare.apply(lambda x: ','.join(re.findall(x['elem2'], x['elem1'])), axis=1)
Out[76]:
0    a,v
1      b
2       
dtype: object
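Put together, the three steps can be run as one script. This is a minimal sketch that rebuilds df1 and df2 from the question's example; the function name common_items is only illustrative.

```python
import re
import pandas as pd

# Sample frames rebuilt from the question's example
df1 = pd.DataFrame({"elem1": ["a,c,v,b,n", "b", "c,x,a"]})
df2 = pd.DataFrame({"elem2": ["j,k,a,i,v", "o,b", "g,f,w"]})

# Step 1: turn each comma-separated cell of elem2 into an alternation pattern
chars = df2["elem2"].str.replace(",", "|", regex=False)

# Step 2: put elem1 and the patterns side by side
to_compare = pd.concat([df1, chars], axis=1)

def common_items(row):
    # Step 3: keep the pieces of elem1 that match the row's pattern,
    # in the order they appear in elem1
    return ",".join(re.findall(row["elem2"], row["elem1"]))

result = to_compare.apply(common_items, axis=1)
print(result.tolist())  # ['a,v', 'b', '']
```

One caveat worth keeping in mind: the alternation matches substrings, so with multi-character elements a pattern such as a|b would also match inside an element like ab. For single letters, as in the question, that does not come up.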

If you want to convert the empty strings in the final result to NaN, I'll leave you to figure that out on your own :-)
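For readers who would rather not work it out, one possible way (a sketch, not the answerer's code) is Series.mask, which swaps entries for NaN wherever a condition holds. The same block also shows a set-based variant that skips the regex entirely, assuming whole elements should be compared rather than substrings; keep_common is a made-up helper name.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"elem1": ["a,c,v,b,n", "b", "c,x,a"]})
df2 = pd.DataFrame({"elem2": ["j,k,a,i,v", "o,b", "g,f,w"]})

# Result of the regex step, copied by hand from the output shown earlier
result = pd.Series(["a,v", "b", ""])

# mask() replaces entries where the condition is True; the default filler is NaN
elem = result.mask(result == "")

def keep_common(left, right):
    # Hypothetical helper: keep the elements of `left` that also occur in
    # `right`, preserving the order of `left`; NaN when nothing is shared.
    allowed = set(right.split(","))
    kept = [tok for tok in left.split(",") if tok in allowed]
    return ",".join(kept) if kept else np.nan

by_sets = pd.Series(
    [keep_common(a, b) for a, b in zip(df1["elem1"], df2["elem2"])],
    name="elem",
)
print(elem)
print(by_sets)
```

Both series hold a,v and b in the first two rows and NaN in the third. The set version compares complete elements, so it stays correct if the lists ever hold multi-character values.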

Nader Hisham answered Nov 29 '25 10:11



