
Redis slow query with pipelined hgetall

So I have a small and simple Redis database. It contains 136689 keys whose values are hash maps containing 27 fields. I'm accessing the table through the Python interface on the server node, and need to load about 1000-1500 values per call (eventually I will see around 10 requests per second). A simple call looks something like this:

# below keys is a list of approximately 1000 integers, 
# not all of which are in the table

import redis
db = redis.StrictRedis(
  host='127.0.0.1',
  port=6379,
  db=0,
  socket_timeout=1,
  socket_connect_timeout=1,
  decode_responses=True
)

with db.pipeline() as pipe:
  for key in keys: 
    pipe.hgetall(key)
  results = zip(keys,pipe.execute())

The overall time for this is ~328ms with an average time per request of ~0.25ms.
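For reference, the timing above could be reproduced with something like the sketch below. It is only an illustration: `timed_pipelined_hgetall` is a hypothetical helper name, `db` is assumed to be a redis-py client like the one created above, and it relies on redis-py returning an empty dict from `hgetall` for keys that are not in the table.

```python
import time


def timed_pipelined_hgetall(db, keys):
    """Fetch HGETALL for every key in one pipeline and time the call.

    `db` is assumed to be a redis-py client (e.g. redis.StrictRedis).
    Returns (results, total_ms, avg_ms_per_key).
    """
    start = time.perf_counter()
    with db.pipeline() as pipe:
        for key in keys:
            pipe.hgetall(key)      # queued on the client, nothing is sent yet
        replies = pipe.execute()   # the whole batch is sent to the server here
    total_ms = (time.perf_counter() - start) * 1000.0

    # Keys that are not in the table come back as empty dicts; drop them.
    results = {key: reply for key, reply in zip(keys, replies) if reply}
    avg_ms = total_ms / len(keys) if keys else 0.0
    return results, total_ms, avg_ms
```

Called as `results, total_ms, avg_ms = timed_pipelined_hgetall(db, keys)`, this measures the client-side wall time, so it includes response parsing and decoding in Python as well as the server's work.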

Question: This is very slow for a small database and relatively few queries per second. Is there something wrong with my configuration or the way I'm calling the server? Can something be done to make this faster? I don't expect the table to get much bigger so I'm perfectly happy sacrificing disk space for speed.


Additional Information

Calling hget on each key (without the pipeline) is slower (as expected) and reveals that the time distribution is bimodal. The smaller peak corresponds to keys that are not in the table and the bigger one corresponds to keys that are.

[Figure: Redis response time distribution]
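A per-key measurement of this kind could be sketched as follows. This is a rough illustration, not the exact script used for the plot: the helper names are hypothetical, `db` is assumed to be a redis-py client, and the per-key call is assumed to be `hgetall` so that a missing key shows up as an empty reply.

```python
import time
from statistics import median


def time_individual_hgetall(db, keys):
    """Time one non-pipelined HGETALL per key, split into hits and misses."""
    hit_ms, miss_ms = [], []
    for key in keys:
        start = time.perf_counter()
        reply = db.hgetall(key)  # one full round trip per key
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        # An empty reply means the key is not in the table.
        (hit_ms if reply else miss_ms).append(elapsed_ms)
    return hit_ms, miss_ms


def summarize(samples_ms):
    """Return count, median and max of a list of timings, or None if empty."""
    if not samples_ms:
        return None
    return {
        "count": len(samples_ms),
        "median_ms": median(samples_ms),
        "max_ms": max(samples_ms),
    }
```

Comparing `summarize(hit_ms)` with `summarize(miss_ms)` gives a quick numeric view of the two peaks without needing a histogram.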

My conf file is as follows:

port 6379
daemonize yes 
save ""
bind 127.0.0.1
tcp-keepalive 300 
dbfilename mytable.rdb
dir .
rdbcompression yes 

appendfsync no
no-appendfsync-on-rewrite yes 
loglevel notice

I launch the server with:

> echo never > /sys/kernel/mm/transparent_hugepage/enabled
> redis-server myconf.conf

I've also measured the intrinsic latency with redis-cli --intrinsic-latency 100 which gives:

Max latency so far: 1 microseconds.
Max latency so far: 10 microseconds.
Max latency so far: 11 microseconds.
Max latency so far: 12 microseconds.
Max latency so far: 18 microseconds.
Max latency so far: 32 microseconds.
Max latency so far: 34 microseconds.
Max latency so far: 38 microseconds.
Max latency so far: 48 microseconds.
Max latency so far: 52 microseconds.
Max latency so far: 60 microseconds.
Max latency so far: 75 microseconds.
Max latency so far: 94 microseconds.
Max latency so far: 120 microseconds.
Max latency so far: 281 microseconds.
Max latency so far: 413 microseconds.
Max latency so far: 618 microseconds.

1719069639 total runs (avg latency: 0.0582 microseconds / 58.17 nanoseconds per run).
Worst run took 10624x longer than the average latency.

This suggests that I should be able to get much better latency. However, when I check the server latency with:

> redis-cli --latency -h 127.0.0.1 -p 6379

I get:

min: 0, max: 2, avg: 0.26 (2475 samples)

This suggests that ~0.25ms is the round-trip latency of my server, and that the latency per request I'm seeing from Python is the same as what the CLI reports. Still, it all seems very slow.

The hashmap associated with each key (after decoding) has a size of ~1200 bytes. So I ran the following benchmark:

redis-benchmark -h 127.0.0.1 -p 6379 -d 1500 hmset hgetall myhash rand_int rand_string
====== hmset hgetall myhash rand_int rand_string ======
  100000 requests completed in 1.45 seconds
  50 parallel clients
  1500 bytes payload
  keep alive: 1

100.00% <= 1 milliseconds
100.00% <= 1 milliseconds
69060.77 requests per second

This seems to support that my latency is very high, but doesn't really tell me why.
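To put the figures quoted so far side by side, here is a back-of-the-envelope calculation. It only combines numbers already given above; the results are upper bounds because not every requested key is in the table, and the last line assumes (perhaps wrongly) that the benchmark throughput is comparable to this workload.

```python
# Figures quoted in the question.
total_ms = 328.0             # overall time of one pipelined call
avg_ms_per_request = 0.25    # reported average time per request
fields_per_hash = 27         # fields in each hash
bytes_per_hash = 1200        # approximate decoded size of one hash
benchmark_rps = 69060.77     # redis-benchmark requests per second

# Number of keys implied by total time / average time per request.
keys_per_call = round(total_ms / avg_ms_per_request)

# HGETALL returns every field name followed by its value, so the client
# parses (and, with decode_responses=True, decodes) two strings per field.
strings_per_call = keys_per_call * fields_per_hash * 2

# Rough payload carried by one call if every key were present.
payload_bytes = keys_per_call * bytes_per_hash

# Time the same number of requests would take at the benchmark's throughput.
benchmark_ms = keys_per_call / benchmark_rps * 1000.0

print(f"keys per call:      {keys_per_call}")
print(f"strings to decode:  {strings_per_call}")
print(f"payload per call:   {payload_bytes / 1e6:.2f} MB")
print(f"at benchmark rate:  {benchmark_ms:.1f} ms (observed: {total_ms:.0f} ms)")
```

The gap between the last two numbers is what the question is about: the server appears able to answer this many requests far faster than the Python call completes.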

Paul asked Oct 31 '19 22:10



1 Answer

One conclusion I drew from the way I was using Redis is that we should not store each transaction inside its own hash, that is, one hash per transaction.

Each hget request incurs a network round trip, which slows the query down.

Given the way Redis is designed, I think it is faster to store everything inside one hash, that is, all transactions under the same hash.

Furthermore, the granular data can be stored as JSON in each key:value.

For 140 MB worth of data, these are the times I got for retrieving all hashes versus retrieving values stored inside one hash:

  • 3 seconds to iterate over each hash and get its key:values, vs
  • 0.008 seconds to get one hash and then search for a key:value inside that hash, vs
  • 0.008 seconds to get all data stored under one hash.

Instead of 1 000 000 000 iterations in your for loop (if you have 1 000 000 000 hashes), the proposed solution needs only 1 (more if you can segregate your data based on an intrinsic value), which significantly reduces the query time.
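A minimal sketch of this single-hash layout might look like the following. The hash name `transactions` and both helper functions are made up for illustration, `db` is assumed to be a redis-py client, and `hset(..., mapping=...)` assumes redis-py 3.5 or newer.

```python
import json

HASH_NAME = "transactions"  # hypothetical name for the single shared hash


def store_records(db, records):
    """Store every record as a JSON string in one field of a single hash."""
    mapping = {str(key): json.dumps(fields) for key, fields in records.items()}
    if mapping:  # redis-py rejects an HSET with no fields
        db.hset(HASH_NAME, mapping=mapping)


def load_records(db, keys):
    """Fetch many records with a single HMGET; missing keys are skipped."""
    raw = db.hmget(HASH_NAME, [str(key) for key in keys])
    return {
        key: json.loads(value)
        for key, value in zip(keys, raw)
        if value is not None  # HMGET yields None for fields that do not exist
    }
```

With this layout a call such as `load_records(db, keys)` sends one command for the whole batch instead of one `hgetall` per key, at the cost of parsing JSON on the client and losing per-field access on the server.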

Pelican answered Sep 18 '22 19:09