why it is so slow with 100,000 records when using pipeline in redis?

Tags:

It is said that pipeline is a better way when many set/get is required in redis, so this is my test code:

public class TestPipeline {

    /**
     * @param args
     */
    public static void main(String[] args) {
        // TODO Auto-generated method stub

        JedisShardInfo si = new JedisShardInfo("127.0.0.1", 6379);
        List<JedisShardInfo> list = new ArrayList<JedisShardInfo>();
        list.add(si);
        ShardedJedis jedis = new ShardedJedis(list);
        long startTime = System.currentTimeMillis();
        ShardedJedisPipeline pipeline = jedis.pipelined();
        for (int i = 0; i < 100000; i++) {
            Map<String, String> map = new HashMap<String, String>();
            map.put("id", "" + i);
            map.put("name", "lyj" + i);
            pipeline.hmset("m" + i, map);
        }
        pipeline.sync();
        long endTime = System.currentTimeMillis();
        System.out.println(endTime - startTime);
    }
}

When I ran it, there is no response with this program for a while, but when I don't work with pipe, it takes only 20073 ms, so I am confused why it is even better without pipeline and how a wide gap!

Thanks for answer me, a few questions, how do you calculate 6MB data? When I send 10K data, pipeline is always faster than normal mode, but with 100k, pipeline would no response.I think 100-1000 operations is a advisable choice as below said.Is there anyting with JIT since I don't understand it?

233

asked May 22 '13 16:05

znlyj

1 Answers

There are a few points you need to consider before writing such a benchmark (and especially a benchmark using the JVM):

on most (physical) machines, Redis is able to process more than 100K ops/s when pipelining is used. Your benchmark only deals with 100K item, so it does not last long enough to produce meaningful results. Furthermore, there is no time for the successive stages of the JIT to kick in.
the absolute time is not a very relevant metric. Displaying the throughput (i.e. the number of operation per second) while keeping the benchmark running for at least 10 seconds would be a better and more stable metric.
your inner loop generates a lot of garbage. If you plan to benchmark Jedis+Redis, then you need to keep the overhead of your own program low.
because you have defined everything into the main function, your loop will not be compiled by the JIT (depending on the JVM you use). Only the inner method calls may be. If you want the JIT to be efficient, make sure to encapsulate your code into methods that can be compiled by the JIT.
optionally, you may want to add a warm-up phase before performing the actual measurement to avoid accounting the overhead of running the first iterations with the bare-bone interpreter, and the cost of the JIT itself.

Now, regarding Redis pipelining, your pipeline is way too long. 100K commands in the pipeline means Jedis has to build a 6MB buffer before sending anything to Redis. It means the socket buffers (on client side, and perhaps server-side) will be saturated, and that Redis will have to deal with 6 MB communication buffers as well.

Furthermore, your benchmark is still synchronous (using a pipeline does not magically make it asynchronous). In other words, Jedis will not start reading replies until the last query of your pipeline has been sent to Redis. When the pipeline is too long, it has the potential to block things.

Consider limiting the size of the pipeline to 100-1000 operations. Of course, it will generate more roundtrips, but the pressure on the communication stack will be reduced to an acceptable level. For instance, consider the following program:

import redis.clients.jedis.*;
import java.util.*;

public class TestPipeline {

    /**
     * @param args
     */

    int i = 0; 
    Map<String, String> map = new HashMap<String, String>();
    ShardedJedis jedis;  

    // Number of iterations
    // Use 1000 to test with the pipeline, 100 otherwise
    static final int N = 1000;

    public TestPipeline() {
      JedisShardInfo si = new JedisShardInfo("127.0.0.1", 6379);
      List<JedisShardInfo> list = new ArrayList<JedisShardInfo>();
      list.add(si);
      jedis = new ShardedJedis(list);
    } 

    public void push( int n ) {
     ShardedJedisPipeline pipeline = jedis.pipelined();
     for ( int k = 0; k < n; k++) {
      map.put("id", "" + i);
      map.put("name", "lyj" + i);
      pipeline.hmset("m" + i, map);
      ++i;
     }
     pipeline.sync(); 
    }

    public void push2( int n ) {
     for ( int k = 0; k < n; k++) {
      map.put("id", "" + i);
      map.put("name", "lyj" + i);
      jedis.hmset("m" + i, map);
      ++i;
     }
    }

    public static void main(String[] args) {
      TestPipeline obj = new TestPipeline();
      long startTime = System.currentTimeMillis();
      for ( int j=0; j<N; j++ ) {
       // Use push2 instead to test without pipeline
       obj.push(1000); 
       // Uncomment to see the acceleration
       //System.out.println(obj.i);
     }
     long endTime = System.currentTimeMillis();
     double d = 1000.0 * obj.i;
     d /= (double)(endTime - startTime);
     System.out.println("Throughput: "+d);
   }
 }

With this program, you can test with or without pipelining. Be sure to increase the number of iterations (N parameter) when pipelining is used, so that it runs for at least 10 seconds. If you uncomment the println in the loop, you will realize that the program is slow at the begining and will get quicker as the JIT starts to optimize things (that's why the program should run at least several seconds to give a meaningful result).

On my hardware (an old Athlon box), I can get 8-9 times more throughput when the pipeline is used. The program could be further improved by optimizing key/value formatting in the inner loop and adding a warm-up phase.

180

answered Oct 28 '22 19:10

Didier Spezia

Related questions
                            
                                Why can't I create an array of a generic type?
                            
                                What Java utilities exist for benchmarking a machine's CPU, memory, disk and network I/O performance?
                            
                                Why is a Scala companion object compiled into two classes(both Java and .NET compilers)?
                            
                                JAXB Bindings file: validation error
                            
                                Defining a a generic method for two subclasses that implement the same interface
                            
                                Convert Windows-style path into Unix path [closed]
                            
                                Returning an InputStream from a public method
                            
                                Binding to a specific IP address and port to receive UDP data
                            
                                How to keep HttpClient Connection Keep-Alive?
                            
                                singleton and public static variable Java
                            
                                IntelliJ IDEA 12.0 JVM Startup Error
                            
                                How to unmarshall repeated nested classes with JAXB?
                            
                                Generate random numbers except certain values
                            
                                How to start transaction and rollback with JOOQ?
                            
                                Adding additional metrics to Dropwizard
                            
                                How to make a line curve through points
                            
                                JRE Expiration Date
                            
                                I can not run C3P0ConnectionProvider
                            
                                Custom exception message using JUnit assertEquals?
                            
                                Why doesn't ArrayList.contains(Object.class) work for finding instances types?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

why it is so slow with 100,000 records when using pipeline in redis?

Tags:

java

redis

pipeline

znlyj

People also ask

1 Answers

Didier Spezia

Recent Activity

Donate For Us