
How to configure Cassandra to work across multiple EC2 regions with Ec2MultiRegionSnitch

I am new to Cassandra and have been tasked with getting it up and running in EC2 across multiple regions, so that if an entire EC2 region goes belly up our app will continue on its merry way. I've read as much documentation as I could find regarding Ec2MultiRegionSnitch and have come to a dead stop. I am running Cassandra 1.0.10.

My problems are as follows:

1) When I start bin/cassandra I get the error: Could not start register mbean in JMX. However, I can run bin/nodetool -h ring on any of the nodes and get the output you would expect from a healthy system. I have added the mx4j library to my Cassandra deployment; I suppose I could try removing it.

2) When I then start bin/cassandra-cli -h I am able to create the keyspace as follows:

    CREATE KEYSPACE mykeyspace 
    WITH placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = {us-east-1:2,us-west-1:2};

3) After I run 'use mykeyspace' I can create a column family as follows:

    CREATE COLUMN FAMILY people
        WITH comparator = UTF8Type
        AND key_validation_class = UTF8Type
        AND default_validation_class = UTF8Type
        AND column_metadata = [
            {column_name: FIRST_NAME, validation_class: UTF8Type},
            {column_name: LAST_NAME, validation_class: UTF8Type},
            {column_name: EMAIL, validation_class: UTF8Type},
            {column_name: LOGIN, validation_class: UTF8Type, index_type: KEYS}];

4) After I do this I can run bin/cassandra-cli -h on any of the 4 nodes, run use mykeyspace; describe; and each node correctly describes mykeyspace including the column family and seed list.

5) But when I try to perform a simple:

    set people['1']['FIRST_NAME'] = 'John'; 

I get a stack trace as follows:

    null
    UnavailableException()
        at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:15206)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:858)
        at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:830)
        at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:901)

My configuration:

I have performed ec2-authorize for ports 22, 7000, 7199 and 9160

I have 4 nodes in my cluster, one in each of the following region:availability-zone pairs:

    us-east-1:us-east-1a  (initial_token: 0)
    us-east-1:us-east-1c  (initial_token: 85070591730234615865843651857942052864)
    us-west-1:us-west-1a  (initial_token: 1)
    us-west-1:us-west-1c  (initial_token: 85070591730234615865843651857942052865)
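For reference, these tokens are consistent with splitting the token range evenly within each region and offsetting the second region by 1 so that no two nodes share a token. Below is a minimal Python sketch of that arithmetic. It assumes RandomPartitioner (token range 0 to 2**127), which is not stated above, and the function name `tokens_for_dc` is purely illustrative:

```python
# Sketch: evenly spaced initial_token values per data center.
# Assumption: RandomPartitioner, whose tokens range from 0 to 2**127.
RING_SIZE = 2 ** 127


def tokens_for_dc(node_count, offset=0):
    """Return one token per node, evenly spaced and shifted by `offset`.

    `offset` keeps tokens unique across data centers
    (0 for the first data center, 1 for the second, and so on).
    """
    return [i * RING_SIZE // node_count + offset for i in range(node_count)]


# Two nodes per region, second region offset by 1.
for region, offset in (("us-east-1", 0), ("us-west-1", 1)):
    for token in tokens_for_dc(2, offset):
        print(region, token)
```

With two nodes per region this reproduces the four initial_token values listed above.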

Each EC2 instance has been associated with a public IP address.

In each node I have configured cassandra.yaml as follows:

    seeds: <set to the public ip address for the us-east-1a and us-west-1a nodes>
    storage_port: 7000
    listen_address: <private ip address of this node>
    broadcast_address: <public ip address of this node>
    rpc_address: 0.0.0.0
    rpc_port: 9160
    endpoint_snitch: Ec2MultiRegionSnitch

Additionally in each node's cassandra-env.sh I've included:

    JVM_OPTS="$JVM_OPTS -Djava.rmi.server.hostname=<Node's local IP Address>"

My Plea

Hopefully I have provided enough information for someone to help me get this thing working as intended.

Additional Information

Stack trace from the first (mx4j) issue:

    WARN 22:07:17,651 Could not start register mbean in JMX
    java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.cassandra.utils.Mx4jTool.maybeLoad(Mx4jTool.java:66)
        at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:243)
        at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:356)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:107)
    Caused by: java.net.BindException: Cannot assign requested address
        at java.net.PlainSocketImpl.socketBind(Native Method)
        at java.net.AbstractPlainSocketImpl.bind(AbstractPlainSocketImpl.java:353)

My cassandra-topology.properties

    aaa.aaa.aaa.aaa=us-east-1:us-east-1a
    bbb.bbb.bbb.bbb=us-east-1:us-east-1c

    ccc.ccc.ccc.ccc=us-west-1:us-west-1a
    ddd.ddd.ddd.ddd=us-west-1:us-west-1c

    default=us-east-1:us-east-1a

My nodetool ring output

    Address         DC          Rack        Status State   Load            Owns    Token                                       
                                                                           85070591730234615865843651857942052865      
    aaa.aaa.aaa.aaa  us-east     1a          Up     Normal  11.09 KB        50.00%  0                                           
    bbb.bbb.bbb.bbb  us-west     1a          Up     Normal  6.68 KB         0.00%   1                                           
    ccc.ccc.ccc.ccc  us-east     1c          Up     Normal  11.09 KB        50.00%  85070591730234615865843651857942052864      
    ddd.ddd.ddd.ddd  us-west     1c          Up     Normal  15.5 KB         0.00%   85070591730234615865843651857942052865  

I'm pretty certain I've added the regions/availability zones correctly. At least I think I matched what appears in the documentation (see the Ec2MultiRegionSnitch section at http://www.datastax.com/docs/1.0/cluster_architecture/replication).

I don't think I can just list the regions as us-west and us-east because there are two regions out west (us-west-1 is the California region and us-west-2 is the Oregon region). So I don't think just putting us-west would successfully differentiate regions.

jspyeatt asked Jun 13 '12 22:06


1 Answer

My guess in my comment was right: your replication settings and datacenter names don't match. A couple of things:

1) cassandra-topology.properties is only used by the PropertyFileSnitch. That file is irrelevant while using the ec2 snitch.

2) The reason the snitch is currently reporting 'us-west' instead of 'us-west-1' is due to a bug: https://issues.apache.org/jira/browse/CASSANDRA-4026. If you added nodes in 'us-west-2' they will correctly get reported as that.
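To make the naming concrete, here is a rough Python sketch of how the snitch appears to turn an availability zone into a datacenter and rack name. It is based only on the nodetool ring output in the question and the behavior described here, not on the snitch source; the function name `snitch_names` is made up, and the rack value for regions that do not end in '-1' is an assumption:

```python
# Sketch of the naming the EC2 snitch appears to use, inferred from the
# ring output in the question. This is NOT the snitch's actual code.
def snitch_names(availability_zone):
    """Map an availability zone such as 'us-east-1a' to (datacenter, rack)."""
    region = availability_zone[:-1]            # 'us-east-1a' -> 'us-east-1'
    # Rack is taken as the last hyphen-separated piece: 'us-east-1a' -> '1a'.
    # For regions not ending in '-1' this is an assumption.
    rack = availability_zone.split("-")[-1]
    if region.endswith("-1"):
        # Regions ending in '-1' lose the suffix: 'us-east-1' -> 'us-east'.
        datacenter = region[:-2]
    else:
        # Other regions keep their full name, e.g. 'us-west-2'.
        datacenter = region
    return datacenter, rack


for az in ("us-east-1a", "us-east-1c", "us-west-1a", "us-west-1c", "us-west-2a"):
    print(az, "->", snitch_names(az))
```

The first four zones come out as us-east/us-west with racks 1a and 1c, which is what the ring output shows.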

So the solution here is to update your replication settings:

    CREATE KEYSPACE mykeyspace
    WITH placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = {us-east:2,us-west:2};
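The reason the names matter: NetworkTopologyStrategy places replicas per named datacenter, so if a name in strategy_options matches no datacenter the snitch reports, no node qualifies as a replica and writes fail with UnavailableException. A small Python sketch of a sanity check you could do by hand; the helper `unknown_datacenters` is hypothetical, and the datacenter list is copied from the DC column of the nodetool ring output in the question:

```python
# Sketch: compare the datacenter names used in strategy_options with the
# names the snitch actually reports (the DC column of `nodetool ring`).
def unknown_datacenters(strategy_options, ring_datacenters):
    """Return the strategy_options keys that no node in the ring belongs to."""
    return sorted(set(strategy_options) - set(ring_datacenters))


# DC column from the ring output in the question, one entry per node.
ring_dcs = ["us-east", "us-west", "us-east", "us-west"]

original = {"us-east-1": 2, "us-west-1": 2}   # settings from the question
corrected = {"us-east": 2, "us-west": 2}      # settings suggested here

print(unknown_datacenters(original, ring_dcs))   # ['us-east-1', 'us-west-1']
print(unknown_datacenters(corrected, ring_dcs))  # []
```

An empty list means every datacenter named in the keyspace definition has at least one node behind it.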

Also, I unfortunately do not know what is wrong with mx4j. Cassandra doesn't need it, though, so unless you actually need it for something you can just remove it.

nickmbailey answered Oct 23 '22 05:10