I want to build a relatively long string (an SVG if that matters) and return it from a function. What is the best way to build such a string in Elixir? In other languages I'd use something like a StringBuilder class. Is there an equivalent thing in Elixir?
You can append strings with the <> operator, but isn't that just a list append? It seems like doing a lot of those would get really inefficient.
In Elixir the to_string function can convert a number into a string. There is already a function in the String module for reversing a string called String. reverse . Strings can be converted into integers using the String.
Strings in Elixir are UTF-8 encoded binaries. Strings in Elixir are a sequence of Unicode characters, typically written between double quoted strings, such as "hello" and "héllò" .
Erlang and Elixir have another type of strings called IO lists, which are lists that consist of a number of binaries (Strings in elixir) and IO lists, so it's a recursive type.
Examples:
helloworld = ["Hello" | ["World"]]
sentence = ["A ", "small", " bike" | [" is", " red."]]
paragraph = [helloworld, ". ", sentence]
As you can see, we combined the two IO lists helloworld
and sentence
to another IO list paragraph
. This happened without copying or appending any data to the actual binaries, as the new list paragraph
internally just contains address locations of the other lists and the ". "
string.
This allows you to cheaply combine huge lists of binaries. A lot of Elixir and Erlang functions work with IO lists, for example, Phoenix' EEx templates return IO lists which are eventually passed to the socket to send out, while still being IO lists.
Poison's encoder will also internally use IO lists so it won't have to combine the strings for every list or object it encodes.
Elixir Strings (called Binaries in Erlang) are represented as a contiguous sequence of bytes in memory, not a Linked List of bytes/characters, if that's what you mean by "list". They are immutable and a naive implementation of appending two binaries will be O(n+m)
if the strings are of length n
and m
, but the Erlang VM optimizes the use case of building a large string: if you have two strings, a
and b
, and a
has free memory after its allocation, and you concatenate them (a <> b
), the VM only copies b
and reuses the old value of a
. This optimization will for obvious reasons not be applied if you later concatenate another string c
to a
, but this optimization itself is enough to make the task of building a large binary as efficient as in languages with mutable strings. This optimization is explained in detail here.
Here's a demo of this optimization in action. In the first example, I'm creating a 10,000,000 byte string by appending to a base value. In the second example, I'm creating a 500,000 byte string by prepending to a base value, which takes over 10x more time than appending 10,000,000 bytes. In a naive implementation, both would take the same amount of time.
{time, _} = :timer.tc(fn ->
Enum.reduce(1..10_000_000, "", fn _, acc -> acc <> "." end)
end)
IO.inspect time
{time, _} = :timer.tc(fn ->
Enum.reduce(1..500_000, "", fn _, acc -> "." <> acc end)
end)
IO.inspect time
683621
7807815
In short, you should be fine building large string as long as you're only appending the value.
If you're going to write the resulting string to a socket or stream or similar, you might be able to do it significantly faster by creating iolists instead of a flat string. More info about those here and here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With