I wanted to write a function that checks whether a Char represents a Cyrillic letter, purely for pedagogical reasons. The simple approximation for Russian is

    import Data.Char (toLower)

    isCyrillic c =
      let lc = toLower c
      in 'а' <= lc && lc <= 'я'

but I don't like it because it doesn't handle other Cyrillic-using languages. I could hardcode the ranges:
U+0400–U+04FF Cyrillic
U+0500–U+052F Cyrillic Supplement
U+2DE0–U+2DFF Cyrillic Extended-A
U+A640–U+A69F Cyrillic Extended-B
U+1C80–U+1C8F Cyrillic Extended-C
but this doesn't seem good practice either.
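
For concreteness, here is a minimal sketch of that hardcoded variant, using only base. The name cyrillicBlocks is made up for illustration, and the table contains just the five blocks listed above, so any block Unicode adds later would be missed. It tests block membership rather than letterhood (the Cyrillic block also contains symbols such as U+0482), so it would need to be combined with Data.Char.isLetter if only letters should pass.

```haskell
import Data.Char (ord)

-- Inclusive code point ranges of the five blocks listed above.
-- (Hypothetical name; hand-maintained, so it can go stale.)
cyrillicBlocks :: [(Int, Int)]
cyrillicBlocks =
  [ (0x0400, 0x04FF)  -- Cyrillic
  , (0x0500, 0x052F)  -- Cyrillic Supplement
  , (0x1C80, 0x1C8F)  -- Cyrillic Extended-C
  , (0x2DE0, 0x2DFF)  -- Cyrillic Extended-A
  , (0xA640, 0xA69F)  -- Cyrillic Extended-B
  ]

-- True when the character's code point lies in any listed block.
isCyrillic :: Char -> Bool
isCyrillic c = any inBlock cyrillicBlocks
  where
    n = ord c
    inBlock (lo, hi) = lo <= n && n <= hi
```

In GHCi, map isCyrillic "Жz" evaluates to [True,False].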
Ideally the function would be just

    isCyrillic c = unicodeScript c == Cyrillic

but this assumes the existence of a type enumerating Unicode scripts (Unicode ranges would do as well). Is there one somewhere?
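property from text-icu's Data.Text.ICU.Char seems to fit the bill:

    import Data.Text.ICU.Char

    isCyrillic :: Char -> Bool
    isCyrillic c = property Block c == Cyrillic

Note that Cyrillic here is the Unicode block U+0400–U+04FF only; the supplement and extended ranges are separate blocks.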