I have a dataframe like this:
RecID| A  |B
----------------
1    |a   | abc 
2    |b   | cba 
3    |c   | bca
4    |d   | bac 
5    |e   | abc 
I want to create another column, C, from A and B such that, for each row, C = True if the string in column A is contained in the string in column B, and C = False otherwise.
The example output I am looking for is this:
RecID| A  |B    |C 
--------------------
1    |a   | abc |True
2    |b   | cba |True
3    |c   | bca |True
4    |d   | bac |False
5    |e   | abc |False
Is there a way to do this in pandas quickly and without using a loop? Thanks
You need apply with the in operator, applied row by row (axis=1):
df['C'] = df.apply(lambda x: x.A in x.B, axis=1)
print(df)
   RecID  A    B      C
0      1  a  abc   True
1      2  b  cba   True
2      3  c  bca   True
3      4  d  bac  False
4      5  e  abc  False
Another solution, using a list comprehension, is faster, but it only works if neither column contains NaNs:
df['C'] = [a in b for a, b in zip(df['A'], df['B'])]
print(df)
   RecID  A    B      C
0      1  a  abc   True
1      2  b  cba   True
2      3  c  bca   True
3      4  d  bac  False
4      5  e  abc  False
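For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the question's frame and runs both approaches side by side. The construction of df is my own assumption about how the data was created (the question only shows the printed table), and C_apply / C_listcomp are illustrative column names, not part of the original answer.

```python
import pandas as pd

# Assumed reconstruction of the frame shown in the question.
df = pd.DataFrame({
    "RecID": [1, 2, 3, 4, 5],
    "A": ["a", "b", "c", "d", "e"],
    "B": ["abc", "cba", "bca", "bac", "abc"],
})

# Row-wise apply: evaluates `A in B` once per row.
df["C_apply"] = df.apply(lambda row: row["A"] in row["B"], axis=1)

# List comprehension over the two columns: same test, no per-row Series.
df["C_listcomp"] = [a in b for a, b in zip(df["A"], df["B"])]

print(df)
```

Both columns should come out identical on this data; the list comprehension just skips building a Series object for every row.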
I could not get either of the answers @jezreal provided to handle None values in the first column. A slight alteration to the list comprehension handles them:
df['C'] = [x[0] in x[1] if x[0] is not None else False for x in zip(df['A'], df['B'])]
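The check above only guards against None in column A. If missing values can also turn up in column B, or arrive as NaN rather than None, one possible variation is to test that both values are actually strings before using in. This is only a sketch: is_contained is a hypothetical helper name, and the sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up sample data with a None in A and a NaN in B.
df = pd.DataFrame({
    "A": ["a", None, "c", "d"],
    "B": ["abc", "cba", np.nan, "bac"],
})

def is_contained(a, b):
    # Hypothetical helper: anything that is not a str (None, NaN, numbers)
    # yields False instead of raising a TypeError.
    return isinstance(a, str) and isinstance(b, str) and a in b

df["C"] = [is_contained(a, b) for a, b in zip(df["A"], df["B"])]
print(df)
```

Rows with a missing value on either side end up as False, which matches the behaviour of the None check above.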
If you expect to be comparing strings but still get a TypeError, you can convert both values to str first:
df['C'] = df.apply(lambda x: str(x.A) in str(x.B), axis=1)
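One caveat with the str() conversion, shown as a small sketch on made-up data: str() turns a missing value into the text "nan" (or "None"), so a short string such as "n" can match a missing entry by accident. The sample frame below is an assumption for illustration only.

```python
import numpy as np
import pandas as pd

# Made-up data: row 1 has a missing value in B, row 2 has a non-string in A.
df = pd.DataFrame({
    "A": ["a", "n", 1],
    "B": ["abc", np.nan, "123"],
})

# str() avoids the TypeError for the integer in row 2 ...
df["C"] = df.apply(lambda x: str(x.A) in str(x.B), axis=1)

# ... but row 1 is also reported as a match, because str(np.nan) == "nan"
# and "n" is a substring of "nan".
print(df)
```

If that kind of false positive matters for your data, dropping or filtering the missing rows before the comparison is probably safer than relying on str() alone.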