
Difference between isdecimal and isdigit [duplicate]

Tags: python, unicode

The Python 3 documentation for isdigit says

Return true if all characters in the string are digits and there is at least one character, false otherwise. Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits. Formally, a digit is a character that has the property value Numeric_Type=Digit or Numeric_Type=Decimal.

So it sounds like isdigit should be a superset of isdecimal. But then the docs for isdecimal say

Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters are those from general category “Nd”. This category includes digit characters, and all characters that can be used to form decimal-radix numbers, e.g. U+0660, ARABIC-INDIC DIGIT ZERO.

That sounds like isdecimal should be a superset of isdigit.

How are these methods related? Does one of them match a strict superset of what the other matches? Does the Numeric_Type property even have anything to do with the Nd category? (And is this contradictory documentation a documentation bug?)

user2357112 supports Monica asked Mar 10 '14 10:03




3 Answers

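As I found out, the correspondence between the string predicates that check for a numeric value and the Unicode general categories is roughly the following. (The methods are actually defined in terms of Numeric_Type, so the table is an approximation: for example, ½ is in category No but is numeric without being a digit.)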

isdecimal: Nd,
isdigit:   No, Nd,
isnumeric: No, Nd, Nl,
isalnum:   No, Nd, Nl, Lu, Lt, Lo, Lm, Ll,

E.g., ᛰ (RUNIC BELGTHOR SYMBOL, U+16F0) belongs to Nl, therefore:

'ᛰ'.isdecimal() # False
'ᛰ'.isdigit()   # False
'ᛰ'.isnumeric() # True
'ᛰ'.isalnum()   # True
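
The same check can be repeated for other characters. This is only a minimal sketch for Python 3: the sample characters and the helper name `classify` are my own choices, not part of the answer above. It prints the general category next to the result of each predicate.

```python
import unicodedata

# Sample characters (chosen for illustration): an ASCII digit,
# ARABIC-INDIC DIGIT ZERO, SUPERSCRIPT THREE, VULGAR FRACTION ONE HALF,
# and RUNIC BELGTHOR SYMBOL.
samples = ["7", "\u0660", "\u00b3", "\u00bd", "\u16f0"]


def classify(ch):
    """Return (general category, isdecimal, isdigit, isnumeric, isalnum)."""
    return (
        unicodedata.category(ch),
        ch.isdecimal(),
        ch.isdigit(),
        ch.isnumeric(),
        ch.isalnum(),
    )


for ch in samples:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), classify(ch))
```

Note that '³' and '½' are both in category No, yet only '³' passes isdigit(), so the category table should be read as a rule of thumb rather than an exact definition.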
Mirzhan Irkegulov answered Oct 30 '22 00:10


The way I read section 4.6 of the Unicode 6.0 standard, the digit category is a superset of the decimal digit category.

Decimal digits, as commonly understood, are digits used to form decimal-radix numbers. They include script-specific digits, but exclude characters such as Roman numerals and Greek acrophonic numerals, which do not form decimal-radix expressions. (Note that <1, 5> = 15 = fifteen, but <I, V> = IV = four.)

The Numeric_Type=decimal property value (which is correlated with the General_Category=Nd property value) is limited to those numeric characters that are used in decimal-radix numbers and for which a full set of digits has been encoded in a contiguous range, with ascending order of Numeric_Value, and with the digit zero as the first code point in the range.

So the decimal category would exclude other numeric characters such as Roman numerals, fractions, etc.
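
One way to see the three levels in Python 3 is through the standard `unicodedata` module, which exposes a decimal, digit, and numeric value per character. A rough sketch follows; the helper name `numeric_values` and the sample characters are assumptions of mine, not something from the standard.

```python
import unicodedata


def numeric_values(ch):
    """Return the (decimal, digit, numeric) values of ch, or None where undefined."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


# '5' has all three values; SUPERSCRIPT THREE has no decimal value;
# ROMAN NUMERAL FOUR and VULGAR FRACTION ONE HALF have only a numeric value.
for ch in ["5", "\u00b3", "\u2163", "\u00bd"]:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), numeric_values(ch))
```

A character with a decimal value also has a digit value and a numeric value, but not the other way round, which is the nesting described in the quoted section.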

tripleee answered Oct 30 '22 00:10


Python 3

The Python 3 documentation for str.isdecimal appears to have been corrected so it no longer says that decimals include digits:

str.isdecimal

Return true if all characters in the string are decimal characters and there is at least one character, false otherwise. Decimal characters are those that can be used to form numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT ZERO. Formally a decimal character is a character in the Unicode General Category “Nd”.
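
In practice, "can be used to form numbers in base 10" lines up with what int() accepts in Python 3. The snippet below is just an illustration with variable names I made up. It suggests that isdecimal(), not isdigit(), is the check to pair with int().

```python
# ARABIC-INDIC DIGITS ONE, TWO, THREE: decimal characters, so int() parses them.
arabic_indic = "\u0661\u0662\u0663"
value = int(arabic_indic)

# SUPERSCRIPT THREE is a digit but not a decimal character, so int() rejects it.
superscript = "\u00b3"
try:
    int(superscript)
    superscript_converts = True
except ValueError:
    superscript_converts = False

print(arabic_indic.isdecimal(), value)
print(superscript.isdigit(), superscript.isdecimal(), superscript_converts)
```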

Python 2

The Python 2 documentation still appears to be wrong: it doesn't match the 2.7.14 implementation, and it still states that decimal characters include digit characters:

str.isdigit

Return true if all characters in the string are digits and there is at least one character, false otherwise. For 8-bit strings, this method is locale-dependent.

unicode.isdecimal

Return True if there are only decimal characters in S, False otherwise. Decimal characters include digit characters, and all characters that can be used to form decimal-radix numbers, e.g. U+0660, ARABIC-INDIC DIGIT ZERO.

A quick test of the character '³' in Python 2.7.14 shows that decimals do not include digits:

>>> u'\u00b3'.isdecimal()
False
>>> u'\u00b3'.isdigit()
True

Summary

Python 2 and 3 behave the same way: every decimal character is also a digit, but not every digit is a decimal character. This matches the Python 3 documentation, whereas the Python 2 documentation is wrong.
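
The superset relationship can also be checked by brute force over every code point. This is a quick Python 3 sketch, not an authoritative test, and the set names are mine. It takes a moment to run because it visits the whole Unicode range.

```python
import sys

decimal_chars = set()
digit_chars = set()
numeric_chars = set()

# Visit every code point and record which predicates accept it.
for code_point in range(sys.maxunicode + 1):
    ch = chr(code_point)
    if ch.isdecimal():
        decimal_chars.add(ch)
    if ch.isdigit():
        digit_chars.add(ch)
    if ch.isnumeric():
        numeric_chars.add(ch)

# "<" on sets means "is a proper subset of".
print(decimal_chars < digit_chars)
print(digit_chars < numeric_chars)
print(len(decimal_chars), len(digit_chars), len(numeric_chars))
```

The exact counts depend on the Unicode version bundled with the interpreter, so only the subset relations are meaningful across versions.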

Ovaflo answered Oct 29 '22 23:10