I am trying to search for values and portions of values from one column to another and return a third value.
Essentially, I have two dataframes: df and df2. The first has a part number in 'col1'. The second has the part number, or portion of it, in 'col1' and the value I want to put in df['col2'] in 'col2'.
import pandas as pd
df = pd.DataFrame({'col1': ['1-1-1', '1-1-2', '1-1-3',
'2-1-1', '2-1-2', '2-1-3']})
df2 = pd.DataFrame({'col1': ['1-1-1', '1-1-2', '1-1-3', '2-1'],
'col2': ['A', 'B', 'C', 'D']})
Of course this:
df['col1'].isin(df2['col1'])
Only covers everything that matches, not the portions:
df['col1'].isin(df2['col1'])
Out[27]:
0 True
1 True
2 True
3 False
4 False
5 False
Name: col1, dtype: bool
I tried:
df[df['col1'].str.contains(df2['col1'])]
but get:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
I also tried use a dictionary made from df2; using the same approaches as above and also mapping it--with no luck
The results for df I need would look like this:
col1 col2
'1-1-1' 'A'
'1-1-2' 'B'
'1-1-3' 'C'
'2-1-1' 'D'
'2-1-2' 'D'
'2-1-3' 'D'
I can't figure out how to get the 'D' value into 'col2' because df2['col1'] contains '2-1'--only a portion of the part number.
Any help would be greatly appreciated. Thank you in advance.
We can do str.findall
s=df.col1.str.findall('|'.join(df2.col1.tolist())).str[0].map(df2.set_index('col1').col2)
df['New']=s
df
col1 New
0 1-1-1 A
1 1-1-2 B
2 1-1-3 C
3 2-1-1 D
4 2-1-2 D
5 2-1-3 D
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With