I'm considering solving a problem using Elixir, mainly because of the ability to spawn large numbers of processes cheaply.
In my scenario, I'd want to create several "original" processes, which load specific, immutable data into memory, then make copies of those processes as needed. The copies would all use the same base data, but do different, read-only tasks with it; eg, imagine that one "original" has the text of "War and Peace" in memory, and each copy of that original does a different kind of analysis on the text.
My questions:
There is no built-in way to copy processes. The easiest way to do it is to start the "original" process and the "copies" and send all the relevant data in messages to the copies. Processes don't share data so there is no more efficient way of doing it. Putting the data in ETS tables only partially helps with sharing as the data in the ETS tables are copied to the process when they are used, however, you don't need to have all the data in the process heap.
An Erlang process has no process-specific data apart from what's stored in variables (and the process dictionary), so to make a copy of the memory of a process, just spawn a new process passing all relevant variables as arguments to the function.
In general, memory is not shared between processes; everything is copied. The exceptions are ETS tables (though data is copied from ETS tables when processes read it), and binaries larger than 64 bytes. If you store "War and Peace" in a binary, and send it to each worker process (or pass it along when you spawn those worker processes), then the processes would share the memory, only copying it if they wanted to modify the binary. See the chapter on binaries in the Erlang efficiency guide for more details.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With