 

One replicated mnesia table has become out-of-sync

I have an Erlang application currently running on four nodes with a replicated Mnesia database that stores minimal data about connected clients. The Mnesia replication has worked seamlessly in the past (as far as I know, anyway), but a client recently noticed that one of the nodes is missing some ids related to his application.

I'm not really sure how this happened. Our network may have had a hiccup at the time. Maybe? But the more urgent matter right now is getting the data into a good state across all nodes. Is there a way to tell Mnesia to replicate from a known-good node?

asked Nov 13 '13 by RockyMountainHigh


2 Answers

Mnesia is legendary for this issue. It's a huge PITA.

Looking at it from the CAP theorem's point of view, most systems built with Mnesia end up being C-A (consistency and availability, with no partition tolerance) systems. Most of the time you have (and heavily rely on) its hard consistency. Then a network partition happens... The database is still available for writes, but those writes destroy consistency, and Mnesia has no mechanism for automatic data repair afterwards.

Everyone who uses Mnesia in a cluster should familiarize themselves with these tradeoffs. Your problem is a clear sign that using Mnesia was a poor choice. Doubly so if this data is critical to you.

I too use Mnesia in such a way (sometimes we all need speed, you know), but I make sure to only use it to store data that I can easily reconstruct. In general, if you need the data stored on disk, Mnesia is no good, except for toy projects.

I make sure to always have this function at hand:

reinit_mnesia_cluster() ->
    %% Stop Mnesia on every node in the cluster.
    rpc:multicall(mnesia, stop, []),
    AllNodes = [node() | nodes()],
    %% Erase the schema (and with it all replicas) on all nodes,
    %% then create a fresh schema and start Mnesia again everywhere.
    mnesia:delete_schema(AllNodes),
    mnesia:create_schema(AllNodes),
    rpc:multicall(mnesia, start, []).

Use it only after the network partition has been resolved and all nodes are reachable. This will erase all Mnesia replicas and start it anew. Again, if you can't live with what it does, then using Mnesia was a poor choice.
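
Note that this leaves you with an empty schema: the tables have to be created again (and repopulated from whatever source of truth you have) before the application can use Mnesia. A minimal sketch of that step, assuming a hypothetical client_session table whose name and attributes are made up for illustration:

recreate_tables() ->
    AllNodes = [node() | nodes()],
    %% client_session and its attribute list are assumptions, not the
    %% poster's actual schema; substitute your own table definitions.
    {atomic, ok} = mnesia:create_table(client_session,
        [{ram_copies, AllNodes},
         {attributes, [client_id, pid, connected_at]}]),
    ok.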

For important data that needs hard consistency, use SQL. For important data that needs availability, use Riak. For shared state that needs speed, use Redis. Mnesia is no replacement for these systems, although at first it does seem so.

Edit on 2014-11-16: Here is a much better article on the topic, explaining in detail what I said above: https://medium.com/@jlouis666/mnesia-and-cap-d2673a92850

answered Sep 18 '22 by loxs


Honestly, I think the cleanest way to get an out-of-sync Mnesia node to replicate from a known-good node is to shut down the application on the bad node, delete all of its Mnesia database files, and then do the following.

Write an escript that starts Mnesia up standalone using the "bad" node name and Mnesia directory, replicates the tables from a known good node, and shuts Mnesia down. Run that escript on the bad node.

The act of replicating the tables and shutting Mnesia down gracefully puts the node back in sync with the cluster. Then, when you start the application up on the bad node, it will join up and stay in sync with the cluster.

Of course, this description lacks precise details, but that's the gist of it. There are surely less brute-force ways of doing this, but unless you have massive amounts of data to replicate, I think this way is the quickest and cleanest.
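
To make the gist concrete, here is a rough sketch of such an escript. The node name, cookie, Mnesia directory, good-node name, and table list are all assumptions for illustration and would have to be adapted; it is a sketch of the approach, not a drop-in script.

#!/usr/bin/env escript
%%! -sname badnode -setcookie mycookie -mnesia dir '"/var/lib/myapp/mnesia"'
%% Assumed values: 'app@goodhost' (known-good node), client_session (table),
%% and the -sname/-setcookie/dir flags above.
main(_) ->
    GoodNode = 'app@goodhost',
    Tables   = [client_session],
    ok = mnesia:start(),
    %% Join the existing cluster; the schema is fetched from the good node.
    %% The match crashes the script if the good node could not be reached.
    {ok, [_ | _]} = mnesia:change_config(extra_db_nodes, [GoodNode]),
    %% Keep the schema on disc locally, then pull a copy of each table.
    mnesia:change_table_copy_type(schema, node(), disc_copies),
    [mnesia:add_table_copy(T, node(), disc_copies) || T <- Tables],
    ok = mnesia:wait_for_tables(Tables, 60000),
    %% A graceful stop leaves this replica consistent with the cluster.
    mnesia:stop().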

answered Sep 21 '22 by Edwin Fine