
Differences between IsDigit and IsNumber in unicode in Go

Tags: unicode, go

IsDigit and IsNumber in the unicode package seem to behave identically, at least in the following test code:

package main

import (
    "fmt"
    "unicode"
)

func main() {
    r := '1' // an untyped rune constant, so r has type rune
    fmt.Println(unicode.IsDigit(r))  // true
    fmt.Println(unicode.IsNumber(r)) // true
}

They both print true.

I tried reading their source code, but I still don't understand what the difference between them is.

// IsNumber reports whether the rune is a number (category N).
func IsNumber(r rune) bool {
    if uint32(r) <= MaxLatin1 {
        return properties[uint8(r)]&pN != 0
    }
    return isExcludingLatin(Number, r)
}


// IsDigit reports whether the rune is a decimal digit.
func IsDigit(r rune) bool {
    if r <= MaxLatin1 {
        return '0' <= r && r <= '9'
    }
    return isExcludingLatin(Digit, r)
}
Qian Chen asked Aug 28 '14 04:08


2 Answers

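Number is the general category, and decimal digit is one of its subcategories. IsNumber reports whether a rune is in category N (Nd, Nl, or No), while IsDigit reports only whether it is a decimal digit (Nd).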

Unicode Standard

4. Character Properties

4.5 General Category

Nd = Number, decimal digit
Nl = Number, letter
No = Number, other
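In Go, these three subcategories are exposed as the range tables unicode.Nd, unicode.Nl, and unicode.No. As a minimal sketch, the helper below (numberCategory is a name made up for this illustration, not part of the unicode package) reports which subcategory a rune falls into:

```go
package main

import (
	"fmt"
	"unicode"
)

// numberCategory is a hypothetical helper: it returns the Unicode general
// category abbreviation for a numeric rune, or "" if the rune is not in
// category N at all.
func numberCategory(r rune) string {
	switch {
	case unicode.Is(unicode.Nd, r):
		return "Nd"
	case unicode.Is(unicode.Nl, r):
		return "Nl"
	case unicode.Is(unicode.No, r):
		return "No"
	}
	return ""
}

func main() {
	// '1' is Nd, 'Ⅷ' is Nl, '½' is No, and 'a' is not a number.
	for _, r := range []rune{'1', 'Ⅷ', '½', 'a'} {
		fmt.Printf("%q -> %q\n", r, numberCategory(r))
	}
}
```

Only runes for which this helper returns "Nd" satisfy IsDigit; any non-empty result satisfies IsNumber.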

4.6 Numeric Value

Numeric_Value and Numeric_Type are normative properties of characters that represent numbers.

Decimal Digits.

Decimal digits, as commonly understood, are digits used to form decimal-radix numbers.

For example,

Unicode Characters in the 'Number, Decimal Digit' Category (Nd)

Unicode Characters in the 'Number, Letter' Category (Nl)

Unicode Characters in the 'Number, Other' Category (No)

package main

import (
    "fmt"
    "unicode"
)

func main() {
    // U+0031 DIGIT ONE, category Nd
    digit := rune('1')
    fmt.Println(unicode.IsDigit(digit))
    fmt.Println(unicode.IsNumber(digit))
    // U+2167 ROMAN NUMERAL EIGHT, category Nl
    letter := rune('Ⅷ')
    fmt.Println(unicode.IsDigit(letter))
    fmt.Println(unicode.IsNumber(letter))
    // U+00BD VULGAR FRACTION ONE HALF, category No
    other := rune('½')
    fmt.Println(unicode.IsDigit(other))
    fmt.Println(unicode.IsNumber(other))
}

Output:

true
true
false
true
false
true
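One practical consequence is worth a sketch: the Nd category is not limited to ASCII '0' through '9', so IsDigit also accepts decimal digits from other scripts. The example below uses U+0663 (ARABIC-INDIC DIGIT THREE) and a helper named isASCIIDigit, which is made up for this illustration, for code that wants ASCII digits only:

```go
package main

import (
	"fmt"
	"strconv"
	"unicode"
)

// isASCIIDigit is a hypothetical helper for callers that want only the
// ASCII digits '0' through '9'.
func isASCIIDigit(r rune) bool {
	return '0' <= r && r <= '9'
}

func main() {
	arabicThree := '\u0663' // ARABIC-INDIC DIGIT THREE, category Nd

	fmt.Println(unicode.IsDigit(arabicThree)) // true: it is a decimal digit
	fmt.Println(isASCIIDigit(arabicThree))    // false: it is not ASCII

	// strconv.Atoi only understands ASCII digits, so it rejects this input.
	_, err := strconv.Atoi(string(arabicThree))
	fmt.Println(err != nil) // true
}
```

So if the goal is to validate input before handing it to strconv, a plain range check may be a better fit than IsDigit.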
peterSO answered Oct 11 '22 08:10


As far as I know, the runes accepted by IsDigit() are a subset of those accepted by IsNumber(), so the result you are getting is expected: both should evaluate to true for '1'. IsNumber() determines whether a rune is in any numeric Unicode category, while IsDigit() checks whether it is a radix-10 digit.
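The subset claim can be checked by brute force. The following is a rough sketch (scanRunes is a name invented for this example) that walks every code point and counts how many runes each function accepts, plus any rune that IsDigit accepts but IsNumber rejects:

```go
package main

import (
	"fmt"
	"unicode"
)

// scanRunes is a hypothetical helper: it visits every code point up to
// unicode.MaxRune and counts the runes accepted by IsDigit, the runes
// accepted by IsNumber, and "violations", meaning runes that IsDigit
// accepts but IsNumber rejects.
func scanRunes() (digits, numbers, violations int) {
	for r := rune(0); r <= unicode.MaxRune; r++ {
		d, n := unicode.IsDigit(r), unicode.IsNumber(r)
		if d {
			digits++
		}
		if n {
			numbers++
		}
		if d && !n {
			violations++
		}
	}
	return digits, numbers, violations
}

func main() {
	digits, numbers, violations := scanRunes()
	fmt.Println("IsDigit runes: ", digits)
	fmt.Println("IsNumber runes:", numbers)
	fmt.Println("violations:    ", violations)
}
```

The exact counts depend on the Unicode version bundled with your Go release, but the violation count should be zero and the IsDigit count should be smaller than the IsNumber count.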

Rahul Tripathi answered Oct 11 '22 07:10