I am trying to match values in one column against full or partial values in another column and return a corresponding third value.
Essentially, I have two dataframes: df and df2. The first has a part number in 'col1'. The second has the part number, or portion of it, in 'col1' and the value I want to put in df['col2'] in 'col2'.
import pandas as pd
df = pd.DataFrame({'col1': ['1-1-1', '1-1-2', '1-1-3',
    '2-1-1', '2-1-2', '2-1-3']})
df2 = pd.DataFrame({'col1': ['1-1-1', '1-1-2', '1-1-3', '2-1'],
    'col2': ['A', 'B', 'C', 'D']})
Of course, isin only covers exact matches, not the partial ones:
df['col1'].isin(df2['col1'])
Out[27]: 
0     True
1     True
2     True
3    False
4    False
5    False
Name: col1, dtype: bool
I tried:
df[df['col1'].str.contains(df2['col1'])] 
but get:
TypeError: 'Series' objects are mutable, thus they cannot be hashed
I also tried using a dictionary built from df2, with the same approaches as above and with map, but had no luck.
The results for df I need would look like this:
 col1     col2
'1-1-1'    'A'
'1-1-2'    'B'
'1-1-3'    'C'
'2-1-1'    'D'  
'2-1-2'    'D'  
'2-1-3'    'D'  
I can't figure out how to get the 'D' value into 'col2' because df2['col1'] contains '2-1'--only a portion of the part number.
Any help would be greatly appreciated. Thank you in advance.
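We can use str.findall: join the values of df2['col1'] into one regex alternation, take the first match found in each part number, and map that match to df2['col2'].
# Alternation of every full or partial part number in df2
pat = '|'.join(df2.col1.tolist())
# First match per row, then look up its col2 value
s = df.col1.str.findall(pat).str[0].map(df2.set_index('col1').col2)
df['New'] = s
df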
    col1 New
0  1-1-1   A
1  1-1-2   B
2  1-1-3   C
3  2-1-1   D
4  2-1-2   D
5  2-1-3   D
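The approach above relies on the part numbers containing no regex metacharacters and on no key in df2 being a shorter prefix of another key. A slightly more defensive sketch follows, assuming the partial values in df2['col1'] are always prefixes of the full part number. It escapes each key, tries longer keys first, and anchors the match at the start of the string. The names keys, pattern, and lookup are illustrative and not from the original answer, and str.extract is swapped in for str.findall.

```python
import re
import pandas as pd

df = pd.DataFrame({'col1': ['1-1-1', '1-1-2', '1-1-3',
                            '2-1-1', '2-1-2', '2-1-3']})
df2 = pd.DataFrame({'col1': ['1-1-1', '1-1-2', '1-1-3', '2-1'],
                    'col2': ['A', 'B', 'C', 'D']})

# Longest keys first, so a full part number wins over a shorter prefix.
keys = sorted(df2['col1'], key=len, reverse=True)

# Escape each key and anchor at the start (assumption: keys are prefixes).
pattern = '^(' + '|'.join(re.escape(k) for k in keys) + ')'

# Series mapping each key to the value wanted in df['col2'].
lookup = df2.set_index('col1')['col2']

# Extract the matched key per row, then translate it through the lookup.
df['col2'] = df['col1'].str.extract(pattern, expand=False).map(lookup)
print(df)
```

Rows that match no key end up as NaN in col2. Note that a plain prefix match would also let '2-1' match a part number such as '2-10-1'; if keys should only match whole hyphen-separated segments, appending a lookahead such as (?=-|$) to the pattern is one option.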