
Python isdigit() is True for ⑦ but int() fails [duplicate]

I came across a strange problem with Python's isdigit() function.

For example:

>>> a = u'\u2466'
>>> a.isdigit()
Out[1]: True
>>> a.isnumeric()
Out[2]: True

Why is this character considered a digit?

Is there any way to make this return False instead? Thanks.


Edit: If I don't want to treat it as a digit, how can I filter it out?

For example, when I try to convert it to an int:

>>> int(u'\u2466')

This raises a UnicodeEncodeError.

lxyu asked Nov 23 '22 11:11



2 Answers

U+2466 is the CIRCLED DIGIT SEVEN (⑦), so yes, it's a digit.
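
To see how Unicode classifies the character, the standard unicodedata module can be queried directly. The following is a small sketch, assuming Python 3 (where every str is Unicode); the variable names are only illustrative. It suggests why isdigit() is True while int() still refuses the character: U+2466 carries a digit value but no decimal value.

```python
import unicodedata

ch = "\u2466"  # the character from the question

name = unicodedata.name(ch)                    # official Unicode character name
category = unicodedata.category(ch)            # general category ("No" = Number, other)
digit_value = unicodedata.digit(ch)            # value under the "digit" property
decimal_value = unicodedata.decimal(ch, None)  # None here: no "decimal" property

print(name, category, digit_value, decimal_value)
print(ch.isdigit(), ch.isdecimal())
```

int() only accepts characters that have a decimal value, which is why a character can pass isdigit() and still fail conversion.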

If your definition of what is a digit differs from that of the Unicode Consortium, you might have to write your own isdigit() method.

Edit: If I don't want to treat it as a digit, how can I filter it out?

If you are just interested in the ASCII digits 0...9, you could do something like:

In [4]: s = u'abc 12434 \u2466 5 def'

In [5]: u''.join(c for c in s if '0' <= c <= '9')
Out[5]: u'124345'
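
If a yes/no check is wanted rather than a filter, the same comparison can be wrapped in a predicate. This is only a sketch of the "write your own isdigit()" idea, written for Python 3; the name is_ascii_digits is hypothetical and not part of the standard library.

```python
def is_ascii_digits(s):
    """Return True only if s is non-empty and every character is in '0'..'9'."""
    # Hypothetical helper: rejects anything outside the ten ASCII digits,
    # including characters that str.isdigit() would accept.
    return len(s) > 0 and all("0" <= c <= "9" for c in s)


print(is_ascii_digits("12434"))   # ASCII digits only
print(is_ascii_digits("\u2466"))  # CIRCLED DIGIT SEVEN
```

The empty-string check mirrors str.isdigit(), which also returns False for an empty string.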
NPE answered Nov 25 '22 00:11



If you're going to convert something to an int, you need isdecimal() rather than isdigit().

Note that "decimal" is not just 0, 1, 2, ... 9; there are a number of characters that can be interpreted as decimal digits and converted to an integer. For example:

#coding=utf8

s = u"1٢٣٤5"
print s.isdecimal() # True
print int(s) # 12345
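
Putting the check and the conversion together, a guarded conversion might look like the sketch below. It assumes Python 3, and to_int_or_none is a hypothetical name chosen for illustration. It deliberately does not handle signs or surrounding whitespace, which int() itself would accept.

```python
def to_int_or_none(s):
    """Return int(s) if s consists only of decimal digits, else None."""
    # Hypothetical helper: isdecimal() is False for the empty string and for
    # characters such as U+2466 that are digits but not decimal digits.
    return int(s) if s.isdecimal() else None


print(to_int_or_none("1\u0662\u0663\u06645"))  # mixed ASCII and Arabic-Indic digits
print(to_int_or_none("\u2466"))                # CIRCLED DIGIT SEVEN
```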
georg answered Nov 24 '22 23:11
