
How to remove this \xa0 from a string in python?

Tags:

python

unicode

I have the following string:

 word = u'Buffalo,\xa0IL\xa060625'

I don't want the "\xa0" in there. How can I get rid of it? The string I want is:

word = 'Buffalo, IL 60625'
slopeofhope asked Sep 26 '14 21:09


People also ask

How do I remove a symbol from a string in Python?

You can remove a character from a Python string with replace() or translate(). replace() substitutes every occurrence of a substring with a given value; if that value is an empty string, the substring is simply removed. translate() works on single characters through a mapping table, and mapping a character to None deletes it.
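
As a small sketch of those two methods, applied to the string from the question (Python 3 syntax is assumed here, and the variable names are only illustrative):

```python
# The string from the question; it contains two no-break spaces (U+00A0).
word = 'Buffalo,\xa0IL\xa060625'

# replace(): swap every no-break space for a regular space.
replaced = word.replace('\xa0', ' ')

# replace() with an empty string removes the character entirely.
removed = word.replace('\xa0', '')

# translate(): the table maps code points to replacements; None deletes.
translated = word.translate({0xA0: ' '})
deleted = word.translate({0xA0: None})

print(replaced)  # Buffalo, IL 60625
print(removed)   # Buffalo,IL60625
```

translate() tends to pay off when several different characters need handling in one pass; for a single character, replace() is usually the simpler choice.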

How do I remove Unicode characters from a string in Python?

The u prefix in output such as u'Python is easy.' is not part of the string. It is how Python 2 marks a unicode literal, and it disappears when the string is printed. To strip the non-ASCII characters themselves, you can encode the string to ASCII with the 'ignore' error handler, or use replace() to target one specific character.

What does \xa0 mean in Python?

\xa0 is the escape sequence for U+00A0, the no-break space (also called a hard space). In HTML it is written as &nbsp;.
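
One way to confirm what the character is, sketched with the standard library only (Python 3 assumed; `nbsp` is just an illustrative name):

```python
import html
import unicodedata

nbsp = '\xa0'

print(ord(nbsp))               # 160
print(unicodedata.name(nbsp))  # NO-BREAK SPACE
print(nbsp.isspace())          # True: Python treats it as whitespace
print(nbsp == ' ')             # False: it is not the ASCII space

# The HTML entity decodes to the same character.
from_entity = html.unescape('&nbsp;')
print(from_entity == nbsp)     # True
```

This is why text scraped from web pages so often carries \xa0: every &nbsp; in the markup becomes U+00A0 once the HTML is decoded.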


2 Answers

The most robust way would be to use the unidecode module to convert all non-ASCII characters to their closest ASCII equivalent automatically.

The character \xa0 (not \xa as you stated) is a NO-BREAK SPACE, and the closest ASCII equivalent would of course be a regular space.

import unidecode
word = unidecode.unidecode(word)
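
Note that unidecode is a third-party package, so it has to be installed first. If adding a dependency is not an option, a narrower alternative (a different technique, not the one this answer proposes) is NFKC normalization from the standard library's unicodedata module. It folds compatibility characters such as the no-break space into their plain equivalents but, unlike unidecode, leaves other non-ASCII characters in place. A minimal sketch, assuming Python 3:

```python
import unicodedata

word = 'Buffalo,\xa0IL\xa060625'

# NFKC replaces U+00A0 with its compatibility equivalent, a regular space.
normalized = unicodedata.normalize('NFKC', word)
print(normalized)  # Buffalo, IL 60625

# Accented letters are not transliterated; only the no-break space changes.
accented = unicodedata.normalize('NFKC', 'Caf\xe9\xa0Noir')
print(accented == 'Caf\xe9 Noir')  # True
```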
Mark Ransom answered Sep 23 '22 17:09


If you know for sure that is the only character you don't want, you can .replace it:

>>> word.replace(u'\xa0', ' ')
u'Buffalo, IL 60625'

If you need to handle all non-ASCII characters, encoding to ASCII and replacing the characters that cannot be encoded might be a good start:

>>> word.encode('ascii', 'replace')
'Buffalo,?IL?60625'
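
The sessions above are from Python 2. Under Python 3 (an assumption about the reader's setup), encode() returns bytes rather than text, so a decode() step is needed to get a str back. A rough sketch of the same idea with the two relevant error handlers:

```python
word = 'Buffalo,\xa0IL\xa060625'

# 'replace' substitutes a question mark for each unencodable character.
encoded = word.encode('ascii', 'replace')
print(encoded)  # b'Buffalo,?IL?60625'

# Decode to get text again.
as_text = encoded.decode('ascii')
print(as_text)  # Buffalo,?IL?60625

# 'ignore' drops unencodable characters instead of marking them.
dropped = word.encode('ascii', 'ignore').decode('ascii')
print(dropped)  # Buffalo,IL60625
```

Neither handler restores the space between the words, so for this particular string the targeted replace() is the better fit.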
mgilson answered Sep 22 '22 17:09