
Find if codepoint is upper case in Elixir

Tags: elixir

I need to detect if a codepoint is an upper case letter in Elixir. I have tried checking if its value is in the range 65..90, but this fails on non-Latin upper case letters. I have also tried checking if

String.upcase(cp) == cp

However, this fails on non-letters (i.e. numbers, punctuation), because they are unchanged by String.upcase and so also compare equal.

I really don't want to go through the entirety of Unicode and create a list of upper case codepoints. Is there a built-in function for this?

cjm asked May 01 '16 13:05


2 Answers

You can use the \p{Lu} Unicode character property regex escape sequence to match any uppercase letter:

iex(1)> "a" =~ ~r/^\p{Lu}$/u
false
iex(2)> "A" =~ ~r/^\p{Lu}$/u
true
iex(3)> "π" =~ ~r/^\p{Lu}$/u
false
iex(4)> "Π" =~ ~r/^\p{Lu}$/u
true
iex(5)> "!" =~ ~r/^\p{Lu}$/u
false

Make sure you pass the u flag to turn on Unicode matching in the regex.
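
If the input is an integer codepoint rather than a one-character string, the same regex can be wrapped in a small helper. This is only a sketch: the module name UpperCheck and the function upper?/1 are made up for illustration and are not part of Elixir's standard library. It assumes the integer is a valid Unicode scalar value, since <<cp::utf8>> raises otherwise.

```elixir
defmodule UpperCheck do
  # Hypothetical helper; the module and function names are illustrative only.

  # Integer codepoint (e.g. ?A or 0x03A0): encode it as a UTF-8 string first.
  def upper?(cp) when is_integer(cp), do: upper?(<<cp::utf8>>)

  # String holding a single codepoint: match it against the Lu property.
  def upper?(cp) when is_binary(cp), do: Regex.match?(~r/^\p{Lu}$/u, cp)
end

IO.inspect(UpperCheck.upper?(?A))      # => true
IO.inspect(UpperCheck.upper?(?a))      # => false
IO.inspect(UpperCheck.upper?(0x03A0))  # => true  (Π)
IO.inspect(UpperCheck.upper?(?1))      # => false
```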

You can find more information about the supported properties on this page. Search for the heading "Unicode character properties" on the page.

Dogbert answered Nov 18 '22 08:11


I think you could use something like this, where codepoint is the integer codepoint:

<<codepoint::utf8>> != String.downcase(<<codepoint::utf8>>)

There may be a better way, but that's a start.
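
A runnable sketch of this comparison follows. The module name DowncaseCheck is hypothetical, and the input is assumed to be a valid integer codepoint. Unlike the String.upcase(cp) == cp test from the question, digits and punctuation come out false here because downcasing leaves them unchanged. The result may still differ from \p{Lu} in edge cases: titlecase letters such as U+01C5 also change when downcased, and an upper case letter with no lower case mapping would be missed.

```elixir
defmodule DowncaseCheck do
  # Hypothetical helper; the name is illustrative only.
  # A codepoint counts as upper case here if downcasing changes it.
  def upper?(cp) when is_integer(cp) do
    str = <<cp::utf8>>
    str != String.downcase(str)
  end
end

IO.inspect(DowncaseCheck.upper?(?A))      # => true
IO.inspect(DowncaseCheck.upper?(?a))      # => false
IO.inspect(DowncaseCheck.upper?(?1))      # => false
IO.inspect(DowncaseCheck.upper?(0x03A0))  # => true  (Π)
```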

NoDisplayName answered Nov 18 '22 09:11