I would like to have a function clause that matches any single UTF-8 character.
I can match on specific characters like this:
def foo("a") do
  "It's an a"
end
But I cannot determine whether it is possible to do the same for any single UTF-8 character.
My current solution is to split the string into a char list and pattern match on that, but I was curious whether I could skip that step.
You can do this with:
def char?(<<_c::utf8>>), do: true
def char?(_), do: false
Note that this only matches a binary containing exactly one character. To match on the next character in a longer string, you can do:
def char?(<<_c::utf8, _rest::binary>>), do: true
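For illustration, here is a minimal sketch that puts both patterns into one module. The module name CharMatch and the helper next_char/1 are hypothetical names chosen for this example, not part of any library. Keep in mind that ::utf8 matches one Unicode code point, so a character built from several code points (a base letter plus a combining accent, for instance) will not match the single-character clause; String.next_grapheme/1 is probably the better tool in that case.

```elixir
defmodule CharMatch do
  # Hypothetical module name, used only for this sketch.

  # True only when the whole binary is exactly one UTF-8 encoded code point.
  def char?(<<_c::utf8>>), do: true
  def char?(_), do: false

  # Splits off the first code point and returns it as a string with the rest.
  def next_char(<<c::utf8, rest::binary>>), do: {<<c::utf8>>, rest}
  def next_char(_), do: :error
end

CharMatch.char?("a")          # true
CharMatch.char?("\u00E9")     # true, one code point encoded as two bytes
CharMatch.char?("ab")         # false
CharMatch.next_char("abc")    # {"a", "bc"}
CharMatch.next_char("")       # :error
```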
From the Regex docs:
The modifiers available when creating a Regex are: ...
unicode (u) - enables Unicode specific patterns like \p and changes modifiers like \w, \W, \s and friends to also match on Unicode. It expects valid Unicode strings to be given on match
dotall (s) - causes dot to match newlines and also set newline to anycrlf; the new line setting can be overridden by setting (*CR) or (*LF) or (*CRLF) or (*ANY) according to :re documentation
So you might try: ~r/./us
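One caveat: an unanchored ~r/./us matches any string that contains at least one character, not only single-character strings. As a rough sketch, assuming the goal is to accept exactly one character, the pattern can be anchored with \A and \z. The module name RegexChar is hypothetical and used only for this example.

```elixir
defmodule RegexChar do
  # Hypothetical module name, used only for this sketch.

  # \A and \z anchor the pattern to the whole subject, so the string must
  # consist of exactly one character. The s modifier lets the dot match
  # a newline as well.
  def char?(string) when is_binary(string) do
    Regex.match?(~r/\A.\z/us, string)
  end

  def char?(_), do: false
end

RegexChar.char?("\u00E9")          # true
RegexChar.char?("ab")              # false
Regex.match?(~r/./us, "ab")        # true, because the pattern is not anchored
```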
From http://elixir-lang.org/crash-course.html:
In Elixir, the word string means a UTF-8 binary and there is a String module that works on such data.
So I think you should be good to go.
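A quick way to see what "UTF-8 binary" means in practice is to compare byte counts with character counts. This is only a small sketch using standard Kernel and String functions; the variable name s is arbitrary.

```elixir
# U+00E9 (e with acute accent) is one character but two bytes in UTF-8.
s = "\u00E9"

is_binary(s)            # true, strings are binaries
byte_size(s)            # 2
String.length(s)        # 1
String.to_charlist(s)   # [233], the code point as an integer
```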