
How and why is string.isdigit() locale dependent?

Python strings have a method that tests whether all characters are digits: string.isdigit().

The manual says:

For 8-bit strings, this method is locale-dependent

How is this method locale-dependent? In which locales are there digits outside the 0-9 range?

Also, if it is locale-dependent, does Python have a way to run the check for a specific locale (i.e. accepting only the digits 0-9)?

Peter Smit asked Jul 02 '12 13:07



2 Answers

CPython uses the C function "isdigit" to implement the isdigit method on 8-bit strings (see stringobject.c). See this related thread: Can isdigit legitimately be locale dependent in C

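Apparently, it has to do with superscript digits: in ISO-8859-1 (Latin-1), the bytes 0xB2 ('²'), 0xB3 ('³') and 0xB9 ('¹') are superscript digits, and a C library may report them as digits depending on the locale.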
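To see how a superscript digit is classified without depending on the C locale, here is a minimal sketch for Python 3. It does not reproduce the Python 2 behaviour the manual describes: in Python 3, str.isdigit() follows Unicode properties and bytes.isdigit() accepts only ASCII 0-9. The helper name digit_report is hypothetical and used only for illustration.

```python
def digit_report(ch):
    """Return how Python 3 classifies a single Latin-1 character."""
    return {
        # Unicode-aware check: True for characters with the Digit property.
        "str.isdigit": ch.isdigit(),
        # Stricter check: True only for decimal digits usable in base 10.
        "str.isdecimal": ch.isdecimal(),
        # Byte-level check on the Latin-1 encoding: ASCII 0-9 only.
        "bytes.isdigit": ch.encode("latin-1").isdigit(),
    }


# '\u00b2' is the superscript two, byte 0xB2 in ISO-8859-1.
print(digit_report("\u00b2"))
print(digit_report("7"))
```

The superscript two counts as a digit for str.isdigit() but not for str.isdecimal() or bytes.isdigit(), while '7' passes all three checks.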

HTH

Jordan Dimov answered Sep 28 '22 03:09


does python have a method for checking it with a specific locale (i.e. only 0-9 digits).

The simplest way is to test a single character for membership in the ASCII digits:

>>> '1' in '1234567890'
True
>>> 'a' in '1234567890'
False

You can also compare ord values; it looks like it might be faster, but it isn't:

>>> ord('0') <= ord('a') <= ord('9')
False
>>> ord('0') <= ord('5') <= ord('9')
True
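
The checks above work on one character at a time. To apply the same idea to a whole string, a minimal sketch could look like this. The helper name is_ascii_digits is hypothetical, and treating the empty string as "not digits" is an assumption chosen to match what isdigit() does.

```python
import string

# The ten ASCII digits, '0' through '9'.
ASCII_DIGITS = frozenset(string.digits)


def is_ascii_digits(text):
    """Return True if text is non-empty and contains only the characters 0-9."""
    return bool(text) and all(ch in ASCII_DIGITS for ch in text)


print(is_ascii_digits("12345"))
print(is_ascii_digits("12a45"))
print(is_ascii_digits("\u00b2"))  # superscript two is rejected
```

This does not depend on the locale or on Unicode digit properties, so superscripts and non-Latin digits are rejected. On Python 3.7 and later, text.isascii() and text.isdigit() should give the same result.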

Kos answered Sep 28 '22 04:09