pandas dataframe return first word in string for column

Tags: python, pandas, dataframe

I have a dataframe:

df = pd.DataFrame({'id' : ['abarth 1.4 a','abarth 1 a','land rover 1.3 r','land rover 2',
                           'land rover 5 g','mazda 4.55 bl'], 
                   'series': ['a','a','r','','g', 'bl'] })

I would like to remove the 'series' string from the corresponding id, so the final result should be:

'id': ['abarth 1.4','abarth 1','land rover 1.3','land rover 2','land rover 5', 'mazda 4.55']

Currently I am using df.apply:

df.id = df.apply(lambda x: x['id'].replace(x['series'], ''), axis =1)

But this removes all instances of the strings, even in other words, like so: 'id': ['brth 1.4','brth 1','land ove 1.3','land rover 2','land rover 5', 'mazda 4.55']

Should I somehow mix and match regex with the variable inside df.apply, like so?

df.id = df.apply(lambda x: x['id'].replace(r'\b' + x['series'], ''), axis =1)


asked May 28 '16 23:05

Testy8

1 Answer

Use str.split and str.get to take the first word of id, and assign it using loc only where df.make == ''.

df.loc[df.make == '', 'make'] = df.id.str.split().str.get(0)

print(df)

               id    make
0      abarth 1.4  abarth
1        abarth 1  abarth
2  land rover 1.3   rover
3    land rover 2   rover
4    land rover 5   rover
5      mazda 4.55   mazda
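
The one-liner above can be tried end to end with a small frame. This is only a sketch: the answer never shows the frame before the assignment, so the empty `make` cells below are an assumption chosen to reproduce the printed output.

```python
import pandas as pd

# Hypothetical input: which `make` cells start out empty is an assumption,
# picked so the result matches the output printed in the answer.
df = pd.DataFrame({
    'id': ['abarth 1.4', 'abarth 1', 'land rover 1.3',
           'land rover 2', 'land rover 5', 'mazda 4.55'],
    'make': ['', '', 'rover', 'rover', 'rover', ''],
})

# First whitespace-separated token of every id.
first_word = df['id'].str.split().str.get(0)

# Overwrite only the rows whose make is empty; loc aligns on the index,
# so rows that already have a make keep their value.
mask = df['make'] == ''
df.loc[mask, 'make'] = first_word

print(df)
```

Rows that already had a make ('rover') are left alone, which is why they do not become 'land'.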


answered Oct 14 '22 10:10

piRSquared
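
On the regex idea from the question: Python's built-in `str.replace` treats its argument as a literal string, so prepending `r'\b'` makes it search for a backslash and a `b` rather than a word boundary. One possible sketch uses `re.sub` instead, assuming the series always appears as the last whitespace-separated token of the id (true for the sample data, but an assumption in general). The helper name `strip_series` is made up for this example.

```python
import re
import pandas as pd

df = pd.DataFrame({
    'id': ['abarth 1.4 a', 'abarth 1 a', 'land rover 1.3 r', 'land rover 2',
           'land rover 5 g', 'mazda 4.55 bl'],
    'series': ['a', 'a', 'r', '', 'g', 'bl'],
})

def strip_series(row):
    """Remove `series` only when it is the trailing token of `id`."""
    if not row['series']:
        # Nothing to strip for rows with an empty series.
        return row['id']
    # re.escape guards against regex metacharacters in the series value;
    # \s+ ... $ anchors the match to the end, after whitespace.
    pattern = r'\s+' + re.escape(row['series']) + r'$'
    return re.sub(pattern, '', row['id'])

df['id'] = df.apply(strip_series, axis=1)
print(df['id'].tolist())
```

Anchoring at the end of the string is what keeps the 'a' in 'abarth' and the 'r' in 'rover' intact. If the series could appear mid-string, the pattern would need word boundaries on both sides instead.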
