
MariaDB Galera - Error when a node shuts down: ERROR 1047 WSREP has not yet prepared node for application use

Tags: mariadb, galera

I installed 2 MariaDB Galera nodes (mariadb-galera-10.0.27-linux-x86_64.tar.gz) on 2 CentOS 6.6 servers.

After installation, I started node1 with the --wsrep-new-cluster parameter, then started node2 without it. They worked fine, and data synchronized successfully between the 2 nodes.

But when I shut down node1, node2 keeps running, yet when I try to access the database it shows this error:

use testdb;
ERROR 1047 (08S01): WSREP has not yet prepared node for application use 

What is happening in this case? Here is my configuration, which is the same on both nodes apart from the IP addresses:

[galera] 
wsrep_on=ON
wsrep_cluster_name='mysql-cluster'
wsrep_provider='/home/mariadb/mariadb-galera/lib/galera/libgalera_smm.so'
wsrep_provider_options="gcache.size=1G"
wsrep_cluster_address="gcomm://10.211.26.116:4567?pc.wait_prim=no"
wsrep_sst_method=rsync
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2
bind-address=0.0.0.0
wsrep_node_address=10.211.26.117:4567
wsrep_node_name='db2'

namdt55555 asked Nov 17 '16 11:11


2 Answers

TWO-NODE CLUSTERS

In a two-node cluster, a single-node failure causes the other to stop working.

Situation

You have a cluster composed of only two nodes. One of the nodes leaves the cluster ungracefully. That is, instead of being shut down through init or systemd, it crashes or suffers a loss of network connectivity. The node that remains becomes nonoperational. It remains so until some additional information is provided by a third party, such as a human operator or another node.

If the node remained operational after the other left the cluster ungracefully, there would be a risk that each of the two nodes considers itself to be the Primary Component. To prevent this, the node becomes nonoperational.

Solutions

There are two solutions available to you:

  • You can bootstrap the surviving node to form a new Primary Component, using the pc.bootstrap wsrep Provider option. To do so, log into the database client and run the following command:

SET GLOBAL wsrep_provider_options='pc.bootstrap=YES';

This bootstraps the surviving node as a new Primary Component. When the other node comes back online or regains network connectivity with this node, it will initiate a state transfer and catch up with this node.

  • In the event that you want the node to continue to operate, you can use the pc.ignore_sb wsrep Provider option. To do so, log into the database client and run the following command:

SET GLOBAL wsrep_provider_options='pc.ignore_sb=TRUE';

The node resumes processing updates and it will continue to do so, even in the event that it suspects a split-brain situation.

Warning: Enabling pc.ignore_sb is dangerous in a multi-master setup, due to the aforementioned risk of split-brain situations. However, it does simplify things in master-slave clusters, especially in cases where you only use two nodes.

In addition to the solutions provided above, you can avoid the situation entirely using Galera Arbitrator. Galera Arbitrator functions as an odd node in quorum calculations. This means that if you enable Galera Arbitrator on one node in a two-node cluster, that node remains the Primary Component even if the other node fails or loses network connectivity.
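The quorum arithmetic behind all of this can be sketched in a few lines of Python. This is only an illustration, not Galera's code: it assumes every member has the default weight of 1, and the function name has_quorum and its parameters are made up for this example. The rule it models is that the surviving group must hold strictly more than half of the members that did not leave gracefully.

```python
def has_quorum(remaining: int, previous: int, left_gracefully: int = 0) -> bool:
    """Simplified quorum test, assuming every member is weighted 1.

    remaining       -- members that can still see each other
    previous        -- members in the last Primary Component
    left_gracefully -- members that announced their departure
    (All names here are hypothetical, chosen for this illustration.)
    """
    # The surviving group needs MORE than half of the members that
    # did not leave gracefully; exactly half is not enough.
    return 2 * remaining > previous - left_gracefully


# Two nodes, one crashes or drops off the network: 1 of 2 is only half.
print(has_quorum(remaining=1, previous=2))                     # False
# Two nodes, one is shut down cleanly and announces that it is leaving.
print(has_quorum(remaining=1, previous=2, left_gracefully=1))  # True
# Two nodes plus an arbitrator, one data node crashes: 2 of 3 remain.
print(has_quorum(remaining=2, previous=3))                     # True
```

The first case is the one described under "Situation" above; the third shows why an arbitrator, acting as the odd member, lets the surviving node keep working.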

http://galeracluster.com/documentation-webpages/twonode.html

scarface_90 answered Nov 11 '22 13:11


The likely reason is that your node1 went down ungracefully, or at least node2 thought it did. In that case, a 2-node cluster ends up in a split-brain situation, where the remaining part of the cluster cannot decide whether it is supposed to be the primary component. That's why 2-node clusters are not recommended.

Check the logs of node1 to see whether it shut down normally, and if it did, then check the logs of node2 to see how it perceived the situation. If it saw node1 shut down normally, it would log something like

[Note] WSREP: forgetting xxxxxxx (tcp://X.X.X.X:XXXX)

etc.; but if it thought the other node was lost, it would be more like

[Note] WSREP: (70f85e74, 'tcp://x.x.x.x:xxxx') turning message relay requesting on, nonlive peers: tcp://X.X.X.X:XXXX

etc.
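If you want to scan node2's error log for these hints, a rough Python sketch could look like the following. It is a heuristic built only on the two message fragments quoted above ("forgetting" and "nonlive peers"); the function name classify_departure and its return labels are invented for this example, and real logs may well contain other or additional messages, so treat the result as a pointer for where to read, not as a verdict.

```python
def classify_departure(log_lines):
    """Guess how this node perceived a peer's departure.

    Heuristic based only on the two example messages quoted in this
    answer. Returns "lost", "left" or "unknown" (hypothetical labels).
    """
    text = "\n".join(log_lines)
    # A peer listed as non-live was not seen to leave cleanly.
    if "nonlive peers" in text:
        return "lost"
    # "forgetting" is the fragment quoted above for a normal shutdown.
    if "WSREP: forgetting" in text:
        return "left"
    return "unknown"


# Placeholder lines copied from the examples above, not real log output.
sample = ["[Note] WSREP: forgetting xxxxxxx (tcp://X.X.X.X:XXXX)"]
print(classify_departure(sample))
```

The "nonlive peers" check comes first on purpose: if both fragments show up, the sketch assumes the peer was considered lost at some point, which is the case worth investigating.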

See http://nirbhay.in/blog/2015/02/split-brain/ for more details and log examples of the split brain situation.

The cheapest way to avoid it is to use Galera arbitrator: http://nirbhay.in/blog/2013/11/what-is-galera-arbitrator/

elenst answered Nov 11 '22 14:11