I have a dataframe:
import pandas as pd

df = pd.DataFrame({
    'id': ['abarth 1.4 a', 'abarth 1 a', 'land rover 1.3 r', 'land rover 2',
           'land rover 5 g', 'mazda 4.55 bl'],
    'series': ['a', 'a', 'r', '', 'g', 'bl'],
})
I would like to remove the 'series' string from the corresponding id, so the end result should be:
'id': ['abarth 1.4', 'abarth 1', 'land rover 1.3', 'land rover 2', 'land rover 5', 'mazda 4.55']
Currently I am using df.apply:
df.id = df.apply(lambda x: x['id'].replace(x['series'], ''), axis =1)
But this removes every occurrence of the string, even inside other words, like so:
'id': ['brth 1.4','brth 1','land ove 1.3','land rover 2','land rover 5', 'mazda 4.55']
Should I somehow mix and match regex with the variable inside df.apply, like so?
df.id = df.apply(lambda x: x['id'].replace(r'\b' + x['series'], ''), axis =1)
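Note that Python's built-in str.replace treats its argument as literal text, so the r'\b' above would not act as a word boundary. One possible approach, sketched below, is to use re.sub inside df.apply and anchor the pattern to the end of the id. The helper name strip_series is hypothetical, and the sketch assumes the series value always appears as the last whole word of the id, as it does in the sample data.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'id': ['abarth 1.4 a', 'abarth 1 a', 'land rover 1.3 r', 'land rover 2',
           'land rover 5 g', 'mazda 4.55 bl'],
    'series': ['a', 'a', 'r', '', 'g', 'bl'],
})

def strip_series(row):
    # Hypothetical helper: drop `series` only when it is the trailing whole word of `id`.
    series = row['series']
    if not series:
        # Nothing to remove for rows with an empty series.
        return row['id']
    # re.escape guards against regex metacharacters in the series value;
    # \b ... \b matches it as a whole word, and $ pins it to the end of the string.
    pattern = r'\s*\b' + re.escape(series) + r'\b\s*$'
    return re.sub(pattern, '', row['id'])

df['id'] = df.apply(strip_series, axis=1)
print(df['id'].tolist())
```

Anchoring with $ is a design choice for this data: it leaves the 'a' in 'abarth' and the 'r' in 'rover' untouched and also removes the space before the series, so no trailing whitespace is left behind.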
Method 1: Get the first word in a string using split(). The easiest way to get the first word of a string in Python is to take the first element of the list returned by the string's split() method, which splits the string into a list of words.
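As a minimal sketch of the split() approach in plain Python (the helper name first_word is hypothetical, and returning an empty string for blank input is an assumption, not part of the original snippet):

```python
def first_word(text):
    # Hypothetical helper: split on whitespace and take the first piece.
    words = text.split()
    # split() returns an empty list for empty or all-whitespace input,
    # so guard against an IndexError in that case.
    return words[0] if words else ''

print(first_word('land rover 1.3 r'))
print(repr(first_word('   ')))
```

Calling split() with no arguments splits on any run of whitespace and ignores leading and trailing spaces, which is why no extra stripping is needed here.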
Pandas DataFrame first() method: The first() method returns the initial rows of a DataFrame that fall within the specified date offset (for example '3D'). The index has to consist of dates for this method to work as expected.
The first element of a Series is at index position 0, so it can be accessed by giving that index value. We can use either 0 or the custom index label to fetch the value.
Use str.split and str.get, and assign using loc only where df.make == '':
df.loc[df.make == '', 'make'] = df.id.str.split().str.get(0)
print(df)
               id    make
0      abarth 1.4  abarth
1        abarth 1  abarth
2  land rover 1.3   rover
3    land rover 2   rover
4    land rover 5   rover
5      mazda 4.55   mazda
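The input frame for this answer is not shown, so here is a self-contained sketch of the same idea. The starting values in the make column are an assumption chosen for illustration (empty strings mark the rows to fill); only the str.split / str.get / loc pattern comes from the answer itself.

```python
import pandas as pd

# Assumed sample data: 'make' is blank for some rows and already set for others.
df = pd.DataFrame({
    'id': ['abarth 1.4', 'abarth 1', 'land rover 1.3',
           'land rover 2', 'land rover 5', 'mazda 4.55'],
    'make': ['', 'abarth', 'rover', 'rover', 'rover', ''],
})

# Boolean mask of the rows whose make is missing.
mask = df['make'] == ''

# Fill only those rows with the first whitespace-separated word of id;
# rows that already have a make (e.g. 'rover') are left untouched.
df.loc[mask, 'make'] = df.loc[mask, 'id'].str.split().str.get(0)

print(df)
```

Because the assignment is restricted by the mask, the 'land rover' rows keep 'rover' rather than being overwritten with 'land'.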