National (non-Arabic) digits in Unicode?

Tags: unicode

I know Unicode contains characters from most of the world's alphabets, but what about digits? Are they part of Unicode or not? I was not able to find a straight answer. Thanks.

Petr asked Sep 17 '10 08:09




2 Answers

As already stated, Indo-Arabic numerals (0, 1, ..., 9) are included in Unicode, inherited from ASCII. If you're asking about the digits used in other scripts, the answer is still yes: they are also part of Unicode.

// digits (0-9) in the Malayalam script (Malayalam is spoken in Kerala, India)
൦ ൧ ൨ ൩ ൪ ൫ ൬ ൭ ൮ ൯
// digits (0-9) in the Devanagari script (used to write Hindi, among other languages)
० १ २ ३ ४ ५ ६ ७ ८ ९
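To see that these really are ordinary Unicode characters with their own code points and numeric values, here is a small Python sketch using the standard `unicodedata` module. The helper name `describe` is just something I made up for illustration.

```python
import unicodedata

# The same digit strings as above, without the spaces.
malayalam = "൦൧൨൩൪൫൬൭൮൯"
devanagari = "०१२३४५६७८९"

def describe(digits):
    """Return (code point, Unicode name, decimal value) for each character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.decimal(ch))
        for ch in digits
    ]

for row in describe(malayalam) + describe(devanagari):
    print(row)

# Python's int() accepts decimal digits from any script.
print(int("൧൨൩"), int("१२३"))
```

The last line works because `int()` treats any character in the "decimal digit" category as a digit, regardless of script.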

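In regex engines that support Unicode properties, you can use \p{N} or \p{Number} to match any kind of numeric character in any script. If you only want decimal digits (and not, say, fractions or Roman numerals), use \p{Nd} instead.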
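Not every engine has \p{...}, though. As far as I know, Python's built-in `re` module does not (the third-party `regex` package does). A rough sketch of the same idea using only the standard library, where `is_number_char` and `find_number_chars` are hypothetical helper names:

```python
import re
import unicodedata

def is_number_char(ch):
    # Approximates \p{N}: general categories Nd, Nl and No all start with "N".
    return unicodedata.category(ch).startswith("N")

def find_number_chars(text):
    # Collect every character that a \p{N} match would pick up.
    return [ch for ch in text if is_number_char(ch)]

text = "Room ൪൨, floor Ⅲ, ½ full, 7 days"

print(find_number_chars(text))

# For str patterns, \d in `re` matches Unicode decimal digits (category Nd)
# only, so the Roman numeral and the fraction are skipped here.
print(re.findall(r"\d+", text))
```

So `\d` behaves roughly like \p{Nd}, while the category check behaves roughly like \p{N}.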

This document (page 3) describes the Unicode code points for Malayalam digits (U+0D66 to U+0D6F).

Amarghosh answered Oct 22 '22 13:10


In short: yes, of course. Unicode has three general categories containing various representations of digits and numbers:

  • Number, Decimal Digit (Nd) – e.g. Arabic, Thai, Devanagari digits;
  • Number, Letter (Nl) – e.g. Roman numerals;
  • Number, Other (No) – e.g. fractions.
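You can check which of the three categories a character falls into yourself. A minimal Python sketch, assuming the standard `unicodedata` module; the `classify` helper and the sample characters are my own picks for illustration:

```python
import unicodedata

def classify(ch):
    """Return (general category, numeric value) for a numeric character."""
    return unicodedata.category(ch), unicodedata.numeric(ch)

samples = [
    "\u0663",  # ARABIC-INDIC DIGIT THREE
    "\u0e53",  # THAI DIGIT THREE
    "३",       # DEVANAGARI DIGIT THREE
    "Ⅷ",       # ROMAN NUMERAL EIGHT
    "½",       # VULGAR FRACTION ONE HALF
]

for ch in samples:
    category, value = classify(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {category}, value {value}")
```

The first three come out as Nd, the Roman numeral as Nl, and the fraction as No.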

Bolo answered Oct 22 '22 13:10