Replacing part of string in python pandas dataframe

Tags: python, pandas, csv

I have a similar problem to the one posted here:

Pandas DataFrame: remove unwanted parts from strings in a column

I need to remove newline characters from within a string in a DataFrame. I've accessed an API using Python's json module and that's all OK. Creating the DataFrame works fine, too. However, when I finally write the end result to a CSV, I get stuck, because the newlines create false 'new rows' in the CSV file.

So basically I'm trying to turn this:

'...this is a paragraph.

And this is another paragraph...'

into this:

'...this is a paragraph. And this is another paragraph...'

I don't care about preserving any kind of '\n' or any special symbols for the paragraph break. So it can be stripped right out.

I've tried a few variations:

misc['product_desc'] = misc['product_desc'].strip('\n')
AttributeError: 'Series' object has no attribute 'strip'

Here's another:

misc['product_desc'] = misc['product_desc'].str.strip('\n')
TypeError: wrapper() takes exactly 1 argument (2 given)

misc['product_desc'] = misc['product_desc'].map(lambda x: x.strip('\n'))
misc['product_desc'] = misc['product_desc'].map(lambda x: x.strip('\n\t'))

There is no error message, but the newline characters don't go away, either. Same thing with this:

misc = misc.replace('\n', '')

The write to csv line is this:

misc_id.to_csv('C:\Users\jlalonde\Desktop\misc_w_id.csv', sep=' ', na_rep='', index=False, encoding='utf-8')

Version of Pandas is 0.9.1

Thanks! :)

270

asked Jan 15 '13 19:01

joseph_pindi

2 Answers

strip only removes the specified characters at the beginning and end of the string. If you want to remove all \n, you need to use replace.

misc['product_desc'] = misc['product_desc'].str.replace('\n', '')
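To make the difference between strip and replace concrete, here is a minimal sketch. The sample DataFrame is made up for illustration (it is not the asker's data), and the regex=False keyword assumes a reasonably recent pandas rather than the 0.9.1 mentioned in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's `misc` DataFrame.
misc = pd.DataFrame({
    "product_desc": [
        "\nthis is a paragraph.\n\nAnd this is another paragraph.\n",
        "no newline here",
    ]
})

# str.strip only trims leading and trailing newlines; inner ones remain.
stripped = misc["product_desc"].str.strip("\n")

# str.replace removes every occurrence, wherever it appears in the string.
# regex=False treats "\n" as a literal substring.
replaced = misc["product_desc"].str.replace("\n", "", regex=False)

print(repr(stripped[0]))
print(repr(replaced[0]))
```

Note that replacing with an empty string joins the paragraphs with nothing in between; passing " " as the replacement is one way to keep a space between them.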

139

answered Sep 29 '22 19:09

BrenBarn

You could use the regex parameter of the replace method to achieve that:

misc['product_desc'] = misc['product_desc'].replace(to_replace='\n', value='', regex=True)
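This also suggests why misc.replace('\n', '') in the question appeared to do nothing: without regex=True, replace compares whole cell values, so only a cell that is exactly '\n' would change. The sketch below shows both behaviours on made-up data. The column names and the pattern r"\s*\n\s*" (which swaps each newline run for a single space instead of deleting it) are assumptions chosen for illustration, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
misc = pd.DataFrame({
    "product_desc": ["first paragraph.\n\nsecond paragraph.", "\n"],
    "price": [1.5, 2.0],
})

# Without regex=True, replace matches whole cell values only,
# so just the cell that is exactly "\n" changes.
whole_cell = misc.replace("\n", "")

# With regex=True, the pattern is applied inside each string cell.
# Here each run of newlines (plus surrounding whitespace) becomes one space.
cleaned = misc.replace(to_replace=r"\s*\n\s*", value=" ", regex=True)

print(repr(whole_cell.loc[0, "product_desc"]))
print(repr(cleaned.loc[0, "product_desc"]))
```

Non-string columns such as price are left alone by the regex replacement, so it can be applied to the whole DataFrame before calling to_csv.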

answered Sep 29 '22 17:09

Anton Protopopov
