How do I check whether a grapheme is a letter (or something that is often used in words, like a hieroglyph)?
After looking through Elixir's String documentation, the only way I see is to check whether String.downcase and String.upcase return the same string. If and only if they do, the grapheme is not something that is used in words.
This is how I do it, but surely there should be a simpler way?
defmodule Words do
  defp all_letters_uppercase?(string) do
    String.upcase(string) == string
  end

  defp all_letters_downcase?(string) do
    String.downcase(string) == string
  end

  defp contains_letter?(string) do
    not (all_letters_uppercase?(string) and all_letters_downcase?(string))
  end

  def single_grapheme?(string) do
    with graphemes = String.graphemes(string) do
      length(graphemes) == 1 and hd(graphemes) == string
    end
  end

  @doc """
  Check whether string is a single letter.
  """
  def letter?(string) do
    single_grapheme?(string) and contains_letter?(string)
  end
end
Update: my code doesn't work for Japanese letters, which have no upper or lower case:
iex> Words.letter?("グ")
false
You can use regular expressions to check for some Unicode features, one of which is \p{Letter}, or \p{L} for short. You might want to add a \p{Mark}*, or \p{M}* for short, to also match any number of following combining diacritics. This would closely match the logic found in String.graphemes/1. Be sure to add the u modifier after the regex to enable these Unicode features. For example:
iex> String.match?("グ", ~r/\A\p{L}\p{M}*\z/u)
true
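Putting this together, the question's Words.letter?/1 could be rewritten around that regex. This is only a minimal sketch: the module and function names are taken from the question, the is_binary guard and the fallback clause are my own assumptions, and "one letter plus marks" only approximates "one grapheme" (some scripts build a single grapheme from several letter code points, which this pattern would reject).

```elixir
defmodule Words do
  @doc """
  Check whether string is a single letter, optionally followed by
  combining marks. Sketch only: relies on the Unicode properties
  \\p{L} (letter) and \\p{M} (mark) and the `u` regex modifier.
  """
  def letter?(string) when is_binary(string) do
    # \A and \z anchor the match to the whole string, so "ab" is rejected.
    String.match?(string, ~r/\A\p{L}\p{M}*\z/u)
  end

  # Assumption: anything that is not a binary is simply not a letter.
  def letter?(_other), do: false
end

IO.inspect(Words.letter?("グ"))       # caseless letter
IO.inspect(Words.letter?("e\u0301"))  # "e" followed by a combining acute accent
IO.inspect(Words.letter?("1"))        # digit, not a letter
IO.inspect(Words.letter?("ab"))       # two letters, not one
```

Unlike the upcase/downcase comparison, this does not depend on the letter having a case, so it also covers scripts such as katakana.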
Also see http://erlang.org/doc/man/re.html, section on "Unicode character properties" and http://www.regular-expressions.info/unicode.html#grapheme.