
\d only matches 0-9 digits?

As far as I know, \d should match non-English digits, e.g. ۱۲۳۴۵۶۷۸۹۰, but it doesn't in JavaScript.

See this jsFiddle: http://jsfiddle.net/xZpam/

Is this normal behavior?

Afshin Mehrabani asked May 21 '13 05:05



7 Answers

It seems that JavaScript does not support this (one of several weaknesses of its RegExp implementation). However, there is a library called XRegExp with a Unicode addon, which enables Unicode support through \p{} category definitions. For example, if you use \p{Nd} instead of \d, it will match these digits:

<script src="xregexp-all.js" type="text/javascript"></script>
<script type="text/javascript">
    var englishDigits = '123123';
    var nonEnglishDigits = '۱۲۳۱۲۳';

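    // \p{Nd} is the Unicode "decimal digit" category; the backslash is
    // doubled because the pattern is passed to XRegExp as a string.
    var digitsPattern = XRegExp('\\p{Nd}+');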
    if (digitsPattern.test(nonEnglishDigits)) {
        alert('Non-english using xregexp');
    }

    if (digitsPattern.test(englishDigits)) {
        alert('English using xregexp');
    }
</script>

EDIT:

Used \p{Nd} instead of \p{N}, as it seems that \d is equivalent to \p{Nd} in non-ECMAScript regex engines. Thanks go to Shervin for pointing it out. See also this fiddle by Shervin.
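For readers on a newer runtime, the same \p{Nd} idea can be sketched without a library. This assumes an engine that implements ES2018 Unicode property escapes, which did not exist when this answer was written; the variable names are illustrative only.

```javascript
// Assumption: ES2018+ engine. The `u` flag is required for \p{...} to be
// treated as a Unicode property escape.
const unicodeDigits = /^\p{Nd}+$/u;

const persianDigits = '۱۲۳۱۲۳';
const asciiDigits = '123123';

console.log(unicodeDigits.test(persianDigits)); // true
console.log(unicodeDigits.test(asciiDigits));   // true

// \d itself is unchanged by the `u` flag and still means [0-9].
console.log(/^\d+$/u.test(persianDigits));      // false
```

On engines without property escapes, the XRegExp approach above remains the option to reach for.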

Sina Iravanian answered Oct 03 '22 06:10


JavaScript does not support Unicode regex matching (and it is far from the only language for which that is true).

http://www.regular-expressions.info/unicode.html
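Without Unicode categories, one possible workaround is to list the wanted digit ranges explicitly. The sketch below is an illustration, not part of the original answer: it covers only ASCII, Arabic-Indic and Extended Arabic-Indic (Persian) digits, and other scripts would need their own ranges.

```javascript
// ASCII digits plus Arabic-Indic (U+0660-U+0669) and
// Extended Arabic-Indic / Persian (U+06F0-U+06F9) digits.
const digitRun = /^[0-9\u0660-\u0669\u06F0-\u06F9]+$/;

const persianSample = '\u06F1\u06F2\u06F3'; // "۱۲۳"

console.log(digitRun.test(persianSample)); // true
console.log(digitRun.test('123'));         // true
console.log(digitRun.test('12a'));         // false
```

This uses only plain character classes, so it does not depend on any Unicode-specific regex feature.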

Amber answered Oct 03 '22 04:10


In Mozilla's documentation (https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/RegExp) you will find:

\d

Matches a digit character in the basic Latin alphabet. Equivalent to [0-9].

kaljak answered Oct 03 '22 05:10


\d is equivalent to [0-9], according to MDN.

Arjan answered Oct 03 '22 05:10


From MDN. RegEx Test

Matches a digit character in the basic Latin alphabet. Equivalent to [0-9].

Ravi Gadag answered Oct 03 '22 06:10


Matches a digit character. Equivalent to [0-9].

For example, /\d/ or /[0-9]/ matches '2' in "B2 is the suite number."

From MDN
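The quoted MDN example can be checked directly. This is a small sketch with illustrative variable names; it shows that both patterns find the same first match.

```javascript
const sentence = 'B2 is the suite number.';

// String.prototype.match without the `g` flag returns the first match,
// with its position available as `.index`.
const viaEscape = sentence.match(/\d/);
const viaClass = sentence.match(/[0-9]/);

console.log(viaEscape[0], viaEscape.index); // 2 1
console.log(viaClass[0], viaClass.index);   // 2 1
```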

Dinever answered Oct 03 '22 06:10


Yes, it is normal and correct that \d matches the ASCII digits 0 to 9 only. The authoritative reference is the ECMAScript standard. It is not particularly easy reading, but clause 15.10.2.12 (CharacterClassEscape) specifies that \d denotes “the ten-element set of characters containing the characters 0 through 9 inclusive”.
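A quick way to observe the behavior the standard describes is to run \d and [0-9] over a string that mixes ASCII and Persian digits. The sample string below is made up for illustration.

```javascript
// ASCII 1 and 2 interleaved with Persian ۱ (U+06F1) and ۲ (U+06F2).
const mixed = 'a1\u06F1b2\u06F2';

const escapeMatches = mixed.match(/\d/g);
const classMatches = mixed.match(/[0-9]/g);

console.log(escapeMatches.join(','));  // 1,2
console.log(classMatches.join(','));   // 1,2

// A Persian digit on its own is not matched by \d.
console.log(/\d/.test('\u06F1'));      // false
```

Both patterns pick out only the two ASCII digits, consistent with \d being the ten-element set 0 through 9.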

Jukka K. Korpela answered Oct 03 '22 05:10