I have the following string:
word = u'Buffalo,\xa0IL\xa060625'
I don't want the "\xa0" in there. How can I get rid of it? The string I want is:
word = 'Buffalo, IL 60625'
You can remove a character from a Python string using replace() or translate(). replace() substitutes a character or substring with a given value, while translate() maps individual characters to replacements. If the replacement is an empty string (or None in a translate() table), the selected character is removed from the string without a replacement.
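As a minimal sketch of both methods (Python 3 assumed; the variable names are only illustrative), applied to the string from the question:

```python
# Sample string from the question, containing two no-break spaces (U+00A0).
word = 'Buffalo,\xa0IL\xa060625'

# replace(): swap every no-break space for a regular space.
replaced = word.replace('\xa0', ' ')

# translate(): build a table that maps single characters to replacements.
spaces_table = str.maketrans({'\xa0': ' '})
translated = word.translate(spaces_table)

# Removing the character entirely instead of replacing it:
removed = word.replace('\xa0', '')       # empty string as the replacement
deleted = word.translate({0xA0: None})   # None in the table deletes the character

print(replaced)
print(removed)
```

replace() is usually enough for a single character; translate() may be more convenient when several characters need to be mapped in one pass.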
In Python, a stray Unicode "u" character can also be removed from a string with the replace() method. For example, if the cleaned result is stored in a variable named string_unicode, printing string_unicode shows the output "Python is easy.".
The \xa0 character is the Unicode hard space, also called a no-break space. It is represented as &nbsp; in HTML.
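A quick way to check what \xa0 is, using only the standard library (a small sketch, Python 3 assumed):

```python
import html
import unicodedata

nbsp = '\xa0'

# Official Unicode name of the character.
name = unicodedata.name(nbsp)

# '\xa0' and '\u00a0' are two spellings of the same code point.
same_code_point = (nbsp == '\u00a0')

# The HTML entity &nbsp; decodes to the same character.
from_html = html.unescape('&nbsp;')

print(name)
print(same_code_point)
print(from_html == nbsp)
print(nbsp.isspace())  # Python treats it as whitespace
```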
The most robust way would be to use the unidecode module to convert all non-ASCII characters to their closest ASCII equivalent automatically.
The character \xa0 (not \xa as you stated) is a NO-BREAK SPACE, and the closest ASCII equivalent would of course be a regular space.
import unidecode
word = unidecode.unidecode(word)
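If installing a third-party module is not an option, one possible alternative (a different technique from unidecode, not a drop-in replacement) is compatibility normalization from the standard library's unicodedata module. This sketch assumes Python 3; note that NFKC also rewrites other compatibility characters, such as ligatures, which may or may not be wanted:

```python
import unicodedata

word = 'Buffalo,\xa0IL\xa060625'

# NFKC applies compatibility mappings; the no-break space (U+00A0)
# has a compatibility mapping to the regular space (U+0020).
normalized = unicodedata.normalize('NFKC', word)

print(normalized)
```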
If you know for sure that is the only character you don't want, you can .replace it:
>>> word.replace(u'\xa0', ' ')
u'Buffalo, IL 60625'
If you need to handle all non-ASCII characters, encoding and replacing bad characters might be a good start (each character that cannot be encoded becomes a "?"):
>>> word.encode('ascii', 'replace')
'Buffalo,?IL?60625'
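The session above is Python 2. In Python 3, encode() returns bytes, so a decode step is needed to get a str back. A minimal sketch, also showing the 'ignore' error handler, which drops the characters instead of substituting "?":

```python
word = 'Buffalo,\xa0IL\xa060625'

# 'replace' substitutes '?' for every character that ASCII cannot represent.
as_bytes = word.encode('ascii', 'replace')
replaced = as_bytes.decode('ascii')

# 'ignore' silently drops those characters instead.
ignored = word.encode('ascii', 'ignore').decode('ascii')

print(as_bytes)
print(replaced)
print(ignored)
```

Neither handler turns the no-break space into a regular space, so for this particular string .replace is still the closer fit.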