I am using the DataStax Java driver 3.1.0 to connect to a Cassandra cluster, and my Cassandra cluster version is 2.0.10. I am writing asynchronously with QUORUM consistency.
private final ExecutorService executorService = Executors.newFixedThreadPool(10);
private final Semaphore concurrentQueries = new Semaphore(1000);

public void save(String process, int clientid, long deviceid) {
    String sql = "insert into storage (process, clientid, deviceid) values (?, ?, ?)";
    try {
        BoundStatement bs = CacheStatement.getInstance().getStatement(sql);
        bs.setConsistencyLevel(ConsistencyLevel.QUORUM);
        bs.setString(0, process);
        bs.setInt(1, clientid);
        bs.setLong(2, deviceid);

        // cap the number of in-flight async writes at 1000
        concurrentQueries.acquire();
        ResultSetFuture future = session.executeAsync(bs);
        Futures.addCallback(future, new FutureCallback<ResultSet>() {
            @Override
            public void onSuccess(ResultSet result) {
                concurrentQueries.release();
                logger.logInfo("successfully written");
            }

            @Override
            public void onFailure(Throwable t) {
                concurrentQueries.release();
                logger.logError("error= ", t);
            }
        }, executorService);
    } catch (Exception ex) {
        logger.logError("error= ", ex);
    }
}
My save method above will be called from multiple threads at a very high rate. If I write faster than my Cassandra cluster can handle, it starts throwing errors, and I want all my writes to make it into Cassandra successfully, without any loss.
Question:
I was thinking of using some sort of queue or buffer to enqueue requests (e.g. java.util.concurrent.ArrayBlockingQueue). "Buffer full" would mean that clients should wait. The buffer would also be used to re-enqueue failed requests; to be fair, failed requests should probably be put at the front of the queue so they are retried first. We also need to handle the situation where the queue is full and new failed requests arrive at the same time. A single-threaded worker would then pick requests from the queue and send them to Cassandra. Since it does not do much work, it is unlikely to become a bottleneck. This worker can apply its own rate limit, e.g. based on timing with com.google.common.util.concurrent.RateLimiter.
What is the best way to implement this queue/buffer feature so that it can also apply Guava rate limiting while writing into Cassandra? If there is a better approach, please let me know as well. I want to write to Cassandra at 2000 requests per second (this should be configurable so that I can experiment to find the optimal setting).
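Here is a rough sketch of what I have in mind (WriteBuffer, MAX_PENDING and the other names are made up, not from the driver; I use a LinkedBlockingDeque instead of ArrayBlockingQueue so failed writes can be re-queued at the front):

import com.datastax.driver.core.BoundStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.ResultSetFuture;
import com.datastax.driver.core.Session;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.RateLimiter;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingDeque;

public class WriteBuffer {
    private static final int MAX_PENDING = 10_000;    // "buffer full" => producers block in enqueue()

    private final BlockingDeque<BoundStatement> queue = new LinkedBlockingDeque<>(MAX_PENDING);
    private final RateLimiter rateLimiter;             // Guava limiter shared by the single worker
    private final Session session;
    private final ExecutorService callbackExecutor = Executors.newSingleThreadExecutor();

    public WriteBuffer(Session session, double writesPerSecond) {   // e.g. 2000.0, configurable
        this.session = session;
        this.rateLimiter = RateLimiter.create(writesPerSecond);
        Thread worker = new Thread(this::drain, "cassandra-writer");
        worker.setDaemon(true);
        worker.start();
    }

    /** Producer threads call this instead of executeAsync(); it blocks while the buffer is full. */
    public void enqueue(BoundStatement bs) throws InterruptedException {
        queue.putLast(bs);
    }

    private void drain() {
        while (!Thread.currentThread().isInterrupted()) {
            try {
                final BoundStatement bs = queue.takeFirst();   // waits if the queue is empty
                rateLimiter.acquire();                         // enforces the writes-per-second limit
                ResultSetFuture future = session.executeAsync(bs);
                Futures.addCallback(future, new FutureCallback<ResultSet>() {
                    @Override
                    public void onSuccess(ResultSet rs) { /* written successfully */ }

                    @Override
                    public void onFailure(Throwable t) {
                        // retry first: put the failed write back at the head of the queue;
                        // offerFirst() is non-blocking so a full queue cannot stall the callback,
                        // but that means a fallback (log/drop/secondary store) is still needed
                        queue.offerFirst(bs);
                    }
                }, callbackExecutor);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}

With this shape, the single worker thread is the only place that talks to Cassandra, so the RateLimiter value is the effective write rate.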
As noted in the comments below, if memory keeps growing, we can use a Guava Cache or CLHM (ConcurrentLinkedHashMap) to keep dropping old records so that the program doesn't run out of memory. The box will have around 12 GB of memory and these records are very small, so I don't see this being a problem.
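For illustration, a size-bounded Guava cache would look roughly like this (the key type and the 1,000,000 limit are placeholders):

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Illustrative only: a size-bounded cache that evicts older entries once the limit is hit,
// so pending records cannot grow without bound.
Cache<Long, BoundStatement> pending = CacheBuilder.newBuilder()
        .maximumSize(1_000_000)
        .build();
pending.put(deviceid, bs);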
With a token bucket, the rate of inbound requests is not limited directly; instead, when the accumulated number of inbound requests exceeds the maximum capacity of the bucket, new requests are denied. The token bucket limits the average inflow rate while still allowing short bursts of traffic: a request is processed as long as a token is available for it.
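As a concrete illustration of that description, a minimal token bucket could look like this (a hand-rolled sketch, not taken from any library):

// Tokens refill continuously at ratePerSecond; a request proceeds only if a whole token is available.
class TokenBucket {
    private final long capacity;
    private final double refillPerNano;
    private double tokens;
    private long lastRefill = System.nanoTime();

    TokenBucket(long capacity, double ratePerSecond) {
        this.capacity = capacity;
        this.refillPerNano = ratePerSecond / 1_000_000_000.0;
        this.tokens = capacity;                       // start with a full bucket
    }

    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;                                 // bucket empty: deny (or queue) the request
    }
}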
Guava's RateLimiter is safe for concurrent use: it will restrict the total rate of calls from all threads. Note, however, that it does not guarantee fairness. Rate limiters are often used to restrict the rate at which some physical or logical resource is accessed.
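In the write path above, using it boils down to something like this (2000.0 matches the target rate from the question; bs is the bound statement):

RateLimiter limiter = RateLimiter.create(2000.0);   // permits per second, shared by all threads

// blocking style: wait for a permit before each write
limiter.acquire();
session.executeAsync(bs);

// non-blocking style: skip or re-queue the write if no permit is available right now
if (limiter.tryAcquire()) {
    session.executeAsync(bs);
}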
Resilience4j also provides a RateLimiter, which splits all nanoseconds from the start of the epoch into cycles. Each cycle has a duration configured by RateLimiterConfig.limitRefreshPeriod.
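A configuration sketch, assuming a recent Resilience4j version (the instance name and timeout are arbitrary; older versions used getPermission(Duration) instead of acquirePermission()):

import io.github.resilience4j.ratelimiter.RateLimiter;
import io.github.resilience4j.ratelimiter.RateLimiterConfig;
import java.time.Duration;

RateLimiterConfig config = RateLimiterConfig.custom()
        .limitRefreshPeriod(Duration.ofSeconds(1))   // one cycle = 1 second
        .limitForPeriod(2000)                        // permits allowed per cycle
        .timeoutDuration(Duration.ofMillis(100))     // how long a caller may wait for a permit
        .build();
RateLimiter limiter = RateLimiter.of("cassandra-writes", config);

if (limiter.acquirePermission()) {
    session.executeAsync(bs);
}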
The DataStax driver also lets you configure the number of connections per host and the number of concurrent requests per connection (see the PoolingOptions settings). Adjust these settings to reduce pressure on the Cassandra cluster.
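For the 3.x driver this looks roughly like the following (the connection counts, contact point and keyspace are placeholders to tune; Cassandra 2.0 speaks native protocol v2, which caps concurrent requests per connection at 128):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.Session;

PoolingOptions pooling = new PoolingOptions()
        .setConnectionsPerHost(HostDistance.LOCAL, 2, 4)        // core / max connections per host
        .setMaxRequestsPerConnection(HostDistance.LOCAL, 128);  // concurrent requests per connection

Cluster cluster = Cluster.builder()
        .addContactPoint("127.0.0.1")            // placeholder contact point
        .withPoolingOptions(pooling)
        .build();
Session session = cluster.connect("storage_ks"); // placeholder keyspace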