
How can I remove all non-letter (all languages) and non-numeric characters from a string?

I've been searching for quite some time now, yet I cannot find any explanation of this.

Say I have a string such as u'àaeëß35+{}"´'. I want all non-alphanumeric characters removed, but letters such as à, ë, and ß kept.

I'm fairly new to Python and I could not figure out a regex to perform this task. The only other solution I can think of is keeping a list of the characters I want to remove and iterating through the string, replacing them.

What is the correct Pythonic solution here?

Thank you.

Phil asked Dec 21 '22 10:12


2 Answers

In [63]: s = u'àaeëß35+{}"´'

In [64]: print(''.join(c for c in s if c.isalnum()))
àaeëß35
root answered Feb 01 '23 23:02


What about:

def strip_non_alnum(s):
    # isalnum() keeps letters and digits in any script; isalpha() would also drop "35".
    return "".join(c for c in s if c.isalnum())
rodrigo answered Feb 01 '23 22:02
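
Since the question asks about a regex, here is a rough sketch of one possible variant for Python 3, where `\w` is Unicode-aware for `str` patterns. The helper names `strip_non_alnum_re` and `strip_non_alnum_nfc` are made up for this illustration. The NFC normalization step is an assumption about the input: it only matters if accented letters might arrive as a base letter followed by a combining mark.

```python
import re
import unicodedata

# \W matches anything that is not a Unicode word character. The underscore
# counts as a word character, so it is listed separately to remove it too.
_NON_ALNUM = re.compile(r"[\W_]+")


def strip_non_alnum_re(s):
    """Regex variant: drop everything except letters and digits."""
    return _NON_ALNUM.sub("", s)


def strip_non_alnum_nfc(s):
    """str.isalnum() variant that composes accented letters first."""
    # "e" + U+0301 (combining acute accent) becomes the single letter "é"
    # under NFC; without this step the combining mark would be filtered out
    # on its own and the accent would be lost.
    s = unicodedata.normalize("NFC", s)
    return "".join(c for c in s if c.isalnum())


sample = 'àaeëß35+{}"´'
print(strip_non_alnum_re(sample))
print(strip_non_alnum_nfc(sample))
```

Both helpers should agree on the sample string from the question. The regex is handy when the cleanup is one step in a larger pattern-based pipeline; the generator with `isalnum()` is arguably easier to read when it is the only step.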