
In a mnesia cluster, which node is queried?

Tags: erlang, mnesia

Let's say you have a mnesia table replicated on nodes A and B. If on node C, which does not hold a copy of the table, I call mnesia:change_config(extra_db_nodes, [NodeA, NodeB]) and then mnesia:dirty_read(user, bob), how does node C choose which node's copy of the table to run the query against?

ryeguy asked Apr 06 '09 18:04



1 Answer

According to my own research, the answer is that it will choose the most recently connected node. I would be grateful if you pointed out any errors you find; mnesia is a really complex system!

As Dan Gudmundsson pointed out on the mailing list, the algorithm that selects the remote node to query is defined in mnesia_lib:set_remote_where_to_read/2. It is the following:

set_remote_where_to_read(Tab, Ignore) ->
    Active = val({Tab, active_replicas}),
    Valid =
       case mnesia_recover:get_master_nodes(Tab) of
           [] ->  Active;
           Masters -> mnesia_lib:intersect(Masters, Active)
       end,
    Available = mnesia_lib:intersect(val({current, db_nodes}), Valid -- Ignore),
    DiscOnlyC = val({Tab, disc_only_copies}),
    Prefered  = Available -- DiscOnlyC,
    if
       Prefered /= [] ->
           set({Tab, where_to_read}, hd(Prefered));
       Available /= [] ->
           set({Tab, where_to_read}, hd(Available));
       true ->
           set({Tab, where_to_read}, nowhere)
    end.

So it takes the list of active_replicas (i.e. the list of candidates), optionally shrinks it to the table's master nodes, removes the nodes to be ignored (for whatever reason), shrinks the list to the currently connected nodes, and then selects in the following order:

  1. The first node that does not hold the table as disc_only_copies
  2. Otherwise, the first available node
  3. nowhere, if no node is available
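
The selection order above can be sketched as a pure function over plain lists. This is only an illustration, not mnesia's code: the module name where_to_read_sketch, the function pick/5 and the local intersect/2 helper are hypothetical, and the sketch assumes that the intersection keeps the order of its second argument, so that the order of active_replicas decides the result.

```erlang
-module(where_to_read_sketch).
-export([pick/5]).

%% Hypothetical helper: keep the elements of L2 that also occur in L1,
%% in L2's order (assumed to mirror what mnesia_lib:intersect/2 does).
intersect(L1, L2) ->
    [X || X <- L2, lists:member(X, L1)].

%% pick(Active, Masters, Ignore, Connected, DiscOnly) -> Node | nowhere
%% Active    - active replicas of the table, most recently added first
%% Masters   - master nodes of the table ([] if none are set)
%% Ignore    - nodes to skip
%% Connected - currently connected db nodes
%% DiscOnly  - nodes holding the table as disc_only_copies
pick(Active, Masters, Ignore, Connected, DiscOnly) ->
    Valid = case Masters of
                [] -> Active;
                _  -> intersect(Masters, Active)
            end,
    Available = intersect(Connected, Valid -- Ignore),
    Preferred = Available -- DiscOnly,
    case {Preferred, Available} of
        {[P | _], _}  -> P;        % first non-disc_only_copies node
        {[], [A | _]} -> A;        % otherwise any available node
        {[], []}      -> nowhere   % no candidate left
    end.
```

With Active = [b, a] and both nodes connected, pick/5 returns b; if b holds the table only as disc_only_copies, it returns a instead.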

The most important part is in fact the list of active_replicas, since it determines the order of nodes in the list of candidates.

The list of active_replicas is built by remote calls to mnesia_controller:add_active_replica/* made from newly connected nodes to the old nodes (i.e. those that were already in the cluster), which boils down to the function add/1, which adds the item at the head of the list.
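
To see why prepending matters, here is a minimal sketch of how such a list evolves as nodes join. The module active_replicas_sketch and its functions are hypothetical stand-ins; removing an existing entry before prepending is my assumption about how duplicates are avoided, not something taken from the text above.

```erlang
-module(active_replicas_sketch).
-export([join_all/1]).

%% Put Node at the head of the list, dropping any earlier entry for it
%% (assumption: a node appears at most once in the list).
add_replica(Node, Replicas) ->
    [Node | lists:delete(Node, Replicas)].

%% Replay a sequence of joins, oldest first, and return the resulting list.
join_all(Nodes) ->
    lists:foldl(fun add_replica/2, [], Nodes).
```

Replaying the joins a and then b gives [b, a], so hd/1 of the candidate list is the node that connected last.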

Hence the answer to the question is: it will choose the most recently connected node.

Note: to check the list of active replicas on a given node, you can use this (dirty hack) code:

[ {T,X} || {{T,active_replicas}, X} <- ets:tab2list(mnesia_gvar) ]. 
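
A less hacky check, as far as I can tell, is to ask mnesia directly which node it currently reads a table from, using the where_to_read item of mnesia:table_info/2. The wrapper module read_node_check below is hypothetical and only there to make the call reusable; it assumes mnesia is already running on the node where it is called.

```erlang
-module(read_node_check).
-export([read_node/1]).

%% Return the node this node currently reads Tab from,
%% or nowhere if no replica is available.
%% Assumes mnesia has been started on the calling node.
read_node(Tab) ->
    mnesia:table_info(Tab, where_to_read).
```

On node C from the question, read_node_check:read_node(user) should report either NodeA or NodeB.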
gleber answered Oct 10 '22 05:10
