
How to check Unicode char in python

Tags: python, unicode

def order_check_uni(body):  
    ccnt=0  
    for x in body:    
        if x.isUpper():  
            ccnt+=1  
        if ccnt>2:  
            print 'success'   

I am trying to find non-ASCII characters (special characters, Unicode characters, Cyrillic characters such as абвгдеёжзийклмнопрстуфхцчшщъыьэюя ®©™) in the string body with the script above. I tried replacing isUpper() with isascii(), with len(x) == len(x.encode), with unichr() and with other functions, but I still get errors. Can somebody help me?

Asked by nameless, Oct 13 '25 08:10


1 Answer

for x in body:
    if ord(x) > 127:
        # character is *not* ASCII
        pass  # handle the non-ASCII character here

This works if you have a Unicode string. If you just want to detect whether the string contains a non-ASCII character, it also works on a UTF-8 encoded byte string, because every byte of a multi-byte UTF-8 sequence has a value above 127.
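As a minimal sketch of the same ord check in Python 3, the loop can be wrapped in a function that collects the offending characters. The function name non_ascii_chars and the sample string are hypothetical, chosen only for illustration.

```python
def non_ascii_chars(body):
    """Return the characters of a text string whose code point is above 127.

    Hypothetical helper: assumes body is a str (Unicode text), not bytes.
    """
    return [ch for ch in body if ord(ch) > 127]


sample = "abc абв ®©™"
found = non_ascii_chars(sample)
# Spaces and the Latin letters are ASCII, so only the Cyrillic letters
# and the three symbols are collected.
print(found)
```

Returning the characters instead of printing inside the loop makes it easy to count them with len(found) or to test for their presence with bool(found).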

Update for Python 3: the above still works on Unicode strings, but ord is no longer needed for byte strings, and calling it on their elements raises a TypeError. That's OK, because indexing into or iterating over a byte string already returns an integer, so no conversion is necessary. The code becomes even simpler, especially if you combine it with the any function:

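# body is a byte string here, so each x is already an integer
if any(x > 127 for x in body):
    # string is *not* ASCII
    pass  # handle the non-ASCII input here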
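The two cases can be combined into one helper. This is only a sketch under the assumption that body is either a str or a bytes object in Python 3; the name has_non_ascii is hypothetical.

```python
def has_non_ascii(body):
    """Return True if body (str or bytes) contains anything outside ASCII."""
    if isinstance(body, bytes):
        # Iterating over bytes yields integers in Python 3.
        return any(b > 127 for b in body)
    # Iterating over str yields one-character strings, so ord() is needed.
    return any(ord(ch) > 127 for ch in body)


print(has_non_ascii("hello"))
print(has_non_ascii("привет"))
print(has_non_ascii("привет".encode("utf-8")))
```

On Python 3.7 and later, str and bytes also provide an isascii() method, so not body.isascii() should give the same answer without an explicit loop.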
Answered by Mark Ransom, Oct 14 '25 20:10