I came across a strange problem with Python's isdigit function.
For example:
>>> a = u'\u2466'
>>> a.isdigit()
Out[1]: True
>>> a.isnumeric()
Out[2]: True
Why is this character a digit?
Is there any way to make this return False instead? Thanks.
Edit: If I don't want to treat it as a digit, how do I filter it out?
For example, when I try to convert it to an int:
>>> int(u'\u2466')
a UnicodeEncodeError is raised.
The isdigit() method returns True if all characters in the string are digits, including Unicode digit characters, and False otherwise. Superscript digits such as ² also count as digits.
U+2466 is the CIRCLED DIGIT SEVEN (⑦), so yes, it's a digit.
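To see how Unicode classifies the character, you can ask the standard unicodedata module. This is only a small sketch; the variable names are mine, not from the question:

```python
import unicodedata

ch = u'\u2466'

# Official Unicode name of the character.
name = unicodedata.name(ch)

# Digit value that the Unicode database assigns to the character.
digit_value = unicodedata.digit(ch)

# isdecimal() is the strictest check, isnumeric() the loosest.
flags = (ch.isdecimal(), ch.isdigit(), ch.isnumeric())

print(name, digit_value, flags)
```

On my reading of the Unicode data, isdecimal() is False for this character while isdigit() and isnumeric() are True, so isdecimal() is already a stricter test. It still accepts decimal digits from other scripts, though, not just 0 to 9.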
If your definition of what is a digit differs from that of the Unicode Consortium, you might have to write your own isdigit() method.
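One possible way to write such a check, assuming "digit" should mean only the ASCII characters 0 to 9, is an explicit character class. The helper name is_ascii_digit_string is made up for this example:

```python
import re

# Explicit [0-9] class: \d would also match decimal digits from other scripts
# when used on unicode strings. \Z anchors the match at the very end.
_ASCII_DIGITS = re.compile(r'[0-9]+\Z')


def is_ascii_digit_string(text):
    """Return True if text is non-empty and contains only the characters 0-9."""
    return _ASCII_DIGITS.match(text) is not None
```

With this, is_ascii_digit_string(u'\u2466') is False, and so is the empty string, which matches how the built-in isdigit() treats empty input.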
Edit: If I don't want to treat it as a digit, how do I filter it out?
If you are just interested in the ASCII digits 0...9, you could do something like:
In [4]: s = u'abc 12434 \u2466 5 def'
In [5]: u''.join(c for c in s if '0' <= c <= '9')
Out[5]: u'124345'
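As for the int() call in the question, a rough sketch is to catch the failure instead of pre-checking. UnicodeEncodeError is a subclass of ValueError, so catching ValueError should cover both the error shown in the question and the plain ValueError that newer Python versions raise. The function name to_int_or_none is hypothetical:

```python
def to_int_or_none(text):
    """Return int(text), or None if Python cannot parse text as an integer."""
    try:
        return int(text)
    except ValueError:
        # Covers UnicodeEncodeError too, since it derives from ValueError.
        return None


print(to_int_or_none(u'\u2466'))
print(to_int_or_none(u'124345'))
```

Note that int() itself is more permissive than an ASCII-only check (it tolerates surrounding whitespace and a sign, for example), so combine this with an explicit 0...9 test if you need to be strict.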