I'm looking for a way to get the Unicode category (RangeTable) of a rune in Go. For example, the character 'a' maps to the Ll category. The unicode package defines all of the categories (http://golang.org/pkg/unicode/#pkg-variables), but I don't see any way to look up the category for a given rune. Do I need to manually construct the RangeTable from the rune using the appropriate offsets?
The "unicode" package does not provide a function that returns the categories a rune belongs to, but it is not hard to write one:
func cat(r rune) (names []string) {
	names = make([]string, 0)
	for name, table := range unicode.Categories {
		if unicode.Is(table, r) {
			names = append(names, name)
		}
	}
	return
}
Here is an alternative version, based on the accepted answer, that returns only the two-letter Unicode general category:
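// UnicodeCategory returns the two-letter Unicode general category of the
// given rune, or "Cn" (unassigned) if no table matches.
func UnicodeCategory(r rune) string {
	for name, table := range unicode.Categories {
		// Skip the one-letter major classes, and also "LC" (cased letter),
		// which newer Go releases may include and which overlaps Lu, Ll and Lt.
		if len(name) == 2 && name != "LC" && unicode.Is(table, r) {
			return name
		}
	}
	return "Cn"
}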