I need to use binaries that contain Cyrillic characters. I tried just writing <<"абвгд">>,
but I got a badarg error.
How can I work with Cyrillic (or Unicode) strings in Erlang?
If you want to enter the above expression in the Erlang shell, please read the user manual for the unicode module.
The functions characters_to_binary and characters_to_list convert in both directions, so each one reverses the other. Here is an example:
([email protected])37> io:getopts().
[{expand_fun,#Fun<group.0.33302583>},
{echo,true},
{binary,false},
{encoding,unicode}]
([email protected])40> A = unicode:characters_to_binary("上海").
<<228,184,138,230,181,183>>
([email protected])41> unicode:characters_to_list(A).
[19978,28023]
([email protected])45> io:format("~s~n",[ unicode:characters_to_list(A,utf8)]).
** exception error: bad argument
in function io:format/3
called as io:format(<0.30.0>,"~s~n",[[19978,28023]])
([email protected])46> io:format("~ts~n",[ unicode:characters_to_list(A,utf8)]).
上海
ok
If you want to use unicode:characters_to_binary("上海") directly in source code, it is a little more complex. Try it first to see the difference.
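As a minimal sketch of building such a binary in compiled code: the module name cyrillic_demo and its function names are made up for illustration. The code points for "абвгд" are written as integers so that the example does not depend on how the source file is encoded. A plain string segment in binary syntax treats every character as an 8-bit integer, which is why code points above 255 need either the utf8 segment type or the unicode module.

```erlang
-module(cyrillic_demo).
-export([from_code_points/0, with_utf8_segments/0, round_trip/0]).

%% Code points for "абвгд" (U+0430..U+0434), written as integers so the
%% example does not depend on the encoding of the source file.
code_points() ->
    [1072, 1073, 1074, 1075, 1076].

%% Encode a list of code points as a UTF-8 binary.
from_code_points() ->
    unicode:characters_to_binary(code_points()).

%% Same result using the utf8 segment type in binary syntax.
%% In a UTF-8 encoded source file this could be written <<"абвгд"/utf8>>.
with_utf8_segments() ->
    << <<C/utf8>> || C <- code_points() >>.

%% Decode back to code points; returns true if both encodings agree
%% and nothing was lost on the way back.
round_trip() ->
    Bin = from_code_points(),
    Bin =:= with_utf8_segments() andalso
        unicode:characters_to_list(Bin, utf8) =:= code_points().
```

After compiling the module, io:format("~ts~n", [cyrillic_demo:from_code_points()]) should print the text on a terminal that is set up for UTF-8.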
The Erlang compiler (before OTP 17, which made UTF-8 the default source encoding) interprets source code as ISO-8859-1 encoded text, which limits you to Latin-1 characters. Although you may be able to type in some ISO-8859-1 characters whose bytes happen to match the UTF-8 encoding you want, this is not a very good idea.
On those releases, make sure your editor reads and writes ISO-8859-1, and avoid non-ASCII literals as much as possible. Load such strings from files instead.
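Loading text from a file rather than from a literal might look like the following sketch. The module name utf8_file_demo, the function names, and the temporary file name are hypothetical, and the file is assumed to contain UTF-8 text.

```erlang
-module(utf8_file_demo).
-export([read_text/1, demo/0]).

%% Read a file and decode its contents from UTF-8 into a list of
%% Unicode code points. Invalid or truncated input is reported as an
%% error tuple instead of crashing.
read_text(Path) ->
    {ok, Bin} = file:read_file(Path),
    case unicode:characters_to_list(Bin, utf8) of
        Chars when is_list(Chars) -> {ok, Chars};
        {error, _Converted, _Rest} -> {error, invalid_utf8};
        {incomplete, _Converted, _Rest} -> {error, incomplete_utf8}
    end.

%% Write three Cyrillic letters (given as code points) to a temporary
%% file, read them back, and clean up.
demo() ->
    Path = "utf8_file_demo.txt",
    ok = file:write_file(Path, unicode:characters_to_binary([1072, 1073, 1074])),
    Result = read_text(Path),
    ok = file:delete(Path),
    Result.
```

Because the text never appears as a literal in the source, this approach behaves the same regardless of which encoding the compiler assumes for the source file.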