Finding Unicode script of a Char in Haskell

Question

I wanted to write a function checking that a Char represents a Cyrillic letter, purely for pedagogical reasons. The simple approximation for Russian is

import Data.Char (toLower)

isCyrillic :: Char -> Bool
isCyrillic c =
    let lc = toLower c
    in 'а' <= lc && lc <= 'я'

but I don't like it because it doesn't handle other Cyrillic-using languages. I could hardcode the ranges:

U+0400–U+04FF Cyrillic
U+0500–U+052F Cyrillic Supplement
U+2DE0–U+2DFF Cyrillic Extended-A
U+A640–U+A69F Cyrillic Extended-B
U+1C80–U+1C8F Cyrillic Extended-C

but this doesn't seem like good practice either.
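For reference, the hardcoded-range approach could be sketched as below. This is only an illustration of the idea from the question, not a recommendation: the names `cyrillicBlocks` and `isCyrillicByRange` are made up for this example, and the table is copied from the list above, so it would need manual updating whenever Unicode adds another Cyrillic block. Note also that a block is not quite the same thing as a script: blocks can contain unassigned code points and non-letter characters.

```haskell
import Data.Char (ord)

-- Inclusive code point ranges of the Cyrillic blocks listed in the question.
-- (Hypothetical helper table; not taken from any library.)
cyrillicBlocks :: [(Int, Int)]
cyrillicBlocks =
  [ (0x0400, 0x04FF) -- Cyrillic
  , (0x0500, 0x052F) -- Cyrillic Supplement
  , (0x2DE0, 0x2DFF) -- Cyrillic Extended-A
  , (0xA640, 0xA69F) -- Cyrillic Extended-B
  , (0x1C80, 0x1C8F) -- Cyrillic Extended-C
  ]

-- True when the character's code point falls inside any listed block.
-- This tests block membership only; it does not check that the
-- code point is an assigned letter.
isCyrillicByRange :: Char -> Bool
isCyrillicByRange c = any inBlock cyrillicBlocks
  where
    n = ord c
    inBlock (lo, hi) = lo <= n && n <= hi
```

With this definition, `filter isCyrillicByRange` applied to a mixed Latin/Cyrillic string would keep only the characters from the listed blocks.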

Ideally the function would be just

isCyrillic c = unicodeScript c == Cyrillic

but this assumes the existence of a type enumerating Unicode scripts (Unicode ranges would do as well). Is there one somewhere?

duplode · Accepted Answer

The property function from text-icu's Data.Text.ICU.Char seems to fit the bill:

import Data.Text.ICU.Char

isCyrillic c = property Block c == Cyrillic

Tags: haskell, unicode

Alexey Romanov