
remove special characters from string

Tags:

python

I have a string "Mikael Håfström" which contains some special characters. How do I remove them using Python?

shaan asked Mar 10 '11 10:03



2 Answers

You can use the unicodedata module to normalize unicode strings and encode them in their ASCII form like so:

>>> import unicodedata
>>> source = u'Mikael Håfström'
>>> unicodedata.normalize('NFKD', source).encode('ascii', 'ignore')
'Mikael Hafstrom'

One notable exception is that the letters 'đ' and 'Đ' have no Unicode decomposition, so normalization leaves them unchanged. They are not converted to 'd' and are simply omitted from the result when encoding with 'ignore'. The letter represents a voiced alveolo-palatal affricate in the Latin alphabets of some South-East European (SEE) languages, so it may or may not concern you, depending on your audience and on whether or not you're providing full support for extended Latin characters. I currently have Python 2.6.5 (Mar 19 2010) running locally and the behavior is present there, though newer releases may behave differently.
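For Python 3, where `encode` returns `bytes`, the same idea might look like the sketch below. It also handles 'đ' and 'Đ' through a small hand-written table applied before normalization. The names `to_ascii` and `FALLBACK` are hypothetical helpers for illustration, not part of `unicodedata`, and the table only covers the two letters mentioned above.

```python
import unicodedata

# Hypothetical fallback table for letters that have no Unicode decomposition.
# Extend it with whatever other letters matter for your data.
FALLBACK = str.maketrans({"đ": "d", "Đ": "D"})


def to_ascii(text):
    """Return an ASCII-only approximation of text (illustrative helper)."""
    # Replace letters that NFKD cannot split into base letter + mark.
    text = text.translate(FALLBACK)
    # Decompose accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop every remaining non-ASCII code point (including the combining
    # marks) and turn the resulting bytes back into a str.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii("Mikael Håfström"))  # Mikael Hafstrom
print(to_ascii("Đorđe"))            # Dorde
```

Applying the table first keeps the normalization step unchanged, so any character not listed in `FALLBACK` is handled exactly as in the answer above.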

Filip Dupanović answered Oct 13 '22 00:10


For example, using the encode method (this drops the non-ASCII characters rather than replacing them with their base letters):

u"Mikael Håfström".encode("ascii", "ignore")
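To see how this differs from normalizing first, here is a short Python 3 comparison. It is only a sketch: the variable names are illustrative, and the input is written with escape sequences so that it is certain to contain the precomposed forms of 'å' and 'ö'.

```python
import unicodedata

# "Mikael Håfström" with precomposed å (U+00E5) and ö (U+00F6).
source = "Mikael H\u00e5fstr\u00f6m"

# Encoding alone: each non-ASCII character is dropped entirely.
encoded_only = source.encode("ascii", "ignore").decode("ascii")

# Normalizing first: the base letters survive, only the accents are dropped.
normalized_first = (
    unicodedata.normalize("NFKD", source)
    .encode("ascii", "ignore")
    .decode("ascii")
)

print(encoded_only)      # Mikael Hfstrm
print(normalized_first)  # Mikael Hafstrom
```

Which result you want depends on whether "remove" means deleting the characters outright or reducing them to their closest ASCII letters.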

filmor answered Oct 13 '22 00:10