Given an Elixir bitstring encoded in UTF-16LE:
<<68, 0, 101, 0, 118, 0, 97, 0, 115, 0, 116, 0, 97, 0, 116, 0, 111, 0, 114, 0, 0, 0>>
how can I get this converted into a readable Elixir String (it spells out "Devastator")? The closest I've gotten is transforming the above into a list of the Unicode codepoints (["0044", "0065", ...]
) and trying to prepend the \u
escape sequence to them, but Elixir throws an error since it's an invalid sequence. I'm out of ideas.
The simplest way is using functions from the :unicode
module:
:unicode.characters_to_binary(utf16binary, {:utf16, :little})
For example
<<68, 0, 101, 0, 118, 0, 97, 0, 115, 0, 116, 0, 97, 0, 116, 0, 111, 0, 114, 0, 0, 0>>
|> :unicode.characters_to_binary({:utf16, :little})
|> IO.puts
#=> Devastator
(there's a null byte at the very end, so the binary display instead of string will be used in the shell, and depending on OS it may print some extra representation for the null byte)
You can make use of Elixir's pattern matching, specifically <<codepoint::utf16-little>>
:
defmodule Convert do
def utf16le_to_utf8(binary), do: utf16le_to_utf8(binary, "")
defp utf16le_to_utf8(<<codepoint::utf16-little, rest::binary>>, acc) do
utf16le_to_utf8(rest, <<acc::binary, codepoint::utf8>>)
end
defp utf16le_to_utf8("", acc), do: acc
end
<<68, 0, 101, 0, 118, 0, 97, 0, 115, 0, 116, 0, 97, 0, 116, 0, 111, 0, 114, 0, 0, 0>>
|> Convert.utf16le_to_utf8
|> IO.puts
<<192, 3, 114, 0, 178, 0>>
|> Convert.utf16le_to_utf8
|> IO.puts
Output:
Devastator
πr²
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With