I'm working on a project where we have an array of atoms which acts as a hash. Whenever a user connects to the server, a certain value is hashed, and that hash is used as an index to look up the element in the array and return that element. "Outside forces" (which are handled by a long-running gen_server) are able to change this array, so I can't simply hardcode it. My problem is how to "host" this array.
My first implementation was a simple gen_server which kept a copy of the array around and sent it to whoever asked for it. The process asking for it could then traverse it and get the index it wanted. This implementation used an inordinate amount of memory, which I attributed to there being so many copies of this same array floating around.
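Roughly, that first implementation looked like this (module and function names are simplified placeholders, not my actual code):

```erlang
-module(list_holder).
-behaviour(gen_server).
-export([start_link/0, get_list/0, set_list/1]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Every caller receives its own full copy of the list in the reply.
get_list() ->
    gen_server:call(?MODULE, get_list).

set_list(NewList) ->
    gen_server:cast(?MODULE, {set_list, NewList}).

init([]) -> {ok, []}.

handle_call(get_list, _From, List) ->
    %% The reply message copies the whole list into the caller's heap,
    %% so N concurrent callers mean N copies of the list.
    {reply, List, List}.

handle_cast({set_list, NewList}, _State) ->
    {noreply, NewList}.
```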
My current implementation has a central gen_server which handles the state of this array, and children which handle the actual requests. When the state changes, the central gen_server updates the children. When a process wants to find its hash result, it sends its index number to the central gen_server, which forwards the request to one of the children. The child traverses its "local" list and sends the resulting atom back to the original process.
The problem with the current implementation is that it gets bogged down at high traffic. I've tried using more and more children, but I'm pretty sure the central gen_server is the bottleneck.
Does anyone have any ideas on a better solution to my problem?
EDIT: %s/array/list/g
I suggest that you use ETS Tables. I think that the array method is not efficient enough. With an ETS Table, created as public within the application backend, any process can look up an item as soon as it needs it. ETS Tables in newer versions of Erlang have the capability for concurrent access.
```erlang
%% Let's create a record structure whereby the key will be a value
%% in the array. For now, I do not know what to put in the field 'other'.
-record(element, {key, other}).

create_table(TableName) ->
    Options = [named_table, set, public,
               {keypos, 2},               %% because we are using a record, NOT a plain tuple
               {write_concurrency, true}],
    %% ets:new/2 raises badarg (e.g. if the table already exists)
    %% rather than returning an error tuple, so we catch it.
    try ets:new(TableName, Options) of
        TableName -> {success, true}
    catch
        error:badarg -> {error, badarg}
    end.

lookup_by_hash(TableName, HashValue) ->
    try ets:lookup(TableName, HashValue) of
        Value -> {value, Value}           %% Value is a list of matching records
    catch
        X:Y -> {error, {X, Y}}
    end.
```

With this kind of arrangement, you will avoid
a Single Point of Failure arising from a single gen_server holding the data. This data is needed by many processes and hence should not be held by a single process; that is where a table accessible by any process, as soon as it needs to make a lookup, comes in. Each value is wrapped in an element record and then inserted in the ETS Table. ETS Tables can be concurrently accessed by any process within reach, and hence you will not need a central process or server to handle the data.
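As a concrete illustration, any process can fill and query such a table directly; this assumes the element record above, and the table name, keys, and atoms are invented for the example:

```erlang
%% Illustrative only: table name, keys, and atoms are made up.
demo() ->
    T = ets:new(hash_table, [named_table, set, public,
                             {keypos, 2}, {write_concurrency, true}]),
    true = ets:insert(T, #element{key = 0, other = atom_a}),
    true = ets:insert(T, #element{key = 1, other = atom_b}),
    %% Any process that knows the table name can do this lookup itself,
    %% with no central server on the path:
    [#element{other = Atom}] = ets:lookup(hash_table, 1),
    Atom.
```

For a read-heavy workload like this one, the `{read_concurrency, true}` option is also worth enabling.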
ETS Table ownership can be transferred from process to process. When the owning process is crashing (only gen_servers can detect that they are dying [take note of this]), it can transfer the ETS Table to another process to take over. Check here: ETS Give Away

Not sure if this helps, but could you manage the central hash value in a distributed hash table (independent from your hash business) just as any other value? That way multiple processes can take the load instead of one central process.
From what I read, the array does not seem to really need to be an array.
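One hypothetical way to spread that load: run N replica workers, each holding a copy of the list, and let callers pick a worker directly from the hash, so no single process sits on the lookup path (the shard names and the number of replicas here are invented):

```erlang
%% Hypothetical sketch: N replica processes, each registered under a
%% shard name and holding its own copy of the list; callers route
%% themselves instead of going through one central gen_server.
-define(N_SHARDS, 4).  %% invented number of replicas

worker_for(HashValue) ->
    %% erlang:phash2/2 maps the value to 0..N_SHARDS-1
    Shard = erlang:phash2(HashValue, ?N_SHARDS),
    list_to_atom("hash_worker_" ++ integer_to_list(Shard)).

lookup(HashValue) ->
    %% Each shard worker is an ordinary gen_server that traverses its
    %% local list; lookups spread evenly across the shards.
    gen_server:call(worker_for(HashValue), {lookup, HashValue}).
```

In practice the shard names would be precomputed once rather than built with `list_to_atom/1` on every call, but the routing idea is the same.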