
Replace or delete specific Unicode characters in Python

There seem to be a lot of posts about doing this in other languages, but I can't seem to figure out how in Python (I'm using 2.7).

To be clear, I would ideally like to keep the string in unicode, just be able to replace certain specific characters.

For instance:

thisToken = u'tandh\u2013bm'
print(thisToken)

prints the word with the en dash (U+2013) in the middle. I would just like to delete the dash (but not by indexing, because I want to be able to do this wherever these specific characters appear).

I tried using replace, as you would with any other character:

newToke = thisToken.replace('\u2013','')
print(newToke)

but it just doesn't work. Any help is much appreciated. Seth

seth127 asked Mar 10 '23 15:03


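1 Answer

The string you're searching for must also be a Unicode string. In Python 2.7, '\u2013' without the u prefix is a byte string, where \u is not an escape sequence, so it is the six literal characters \, u, 2, 0, 1, 3 and never matches the dash. Try: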

newToke = thisToken.replace(u'\u2013','')
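
If you need to delete several specific characters at once, one possible approach is unicode.translate with a table that maps each unwanted code point to None. This is only a sketch: the names DASHES, DELETE_TABLE and strip_chars are made up for illustration, and the choice of en dash plus em dash is an assumption about which characters you want gone. It assumes the input is already a Unicode string, as in the question.

```python
# -*- coding: utf-8 -*-
# Hypothetical set of characters to delete; adjust as needed.
DASHES = u'\u2013\u2014'  # en dash, em dash

# Map each code point to None so translate() drops that character.
DELETE_TABLE = {ord(ch): None for ch in DASHES}


def strip_chars(text):
    """Return text with every character listed in DASHES removed.

    Assumes text is a Unicode string (unicode in Python 2, str in Python 3).
    """
    return text.translate(DELETE_TABLE)


thisToken = u'tandh\u2013bm'
newToke = strip_chars(thisToken)
print(newToke)  # tandhbm
```

For a single character, the replace call above is simpler; the table only pays off once the list of characters to remove grows.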
Kevin answered Mar 19 '23 16:03