Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to get count of values in redis hash?

Tags:

redis

i got a redis hash eg

key field value

1000 state "ca" 
1000 zip "95054"
2000 state "ca" 
2000 zip "95050"
3000 state "ny" 
3000 zip "12345"

how can i answer questions like how many state are "CA" . i need to do count of values for a field. is it possible ? help would be appreciated.

-Avi

like image 516
Avinash Avatar asked Apr 28 '11 21:04

Avinash


3 Answers

I think you need to store a separate count for the unique values. You can get the length of a single hash, but not when you in this case got three different keys.

command key field value

HSET 1000 state "ca"

HSET 1000 zip "95054"

INCR ca

HSET 2000 state "ca"

HSET 2000 zip "95050"

INCR ca

HSET 3000 state "ny"

HSET 3000 zip "12345"

INCR ny

If you want to get how many hash keys with state "ca" use:

GET ca

like image 187
ptz0n Avatar answered Nov 15 '22 12:11

ptz0n


Assuming that you are in charge of inserting values in the hash, then use MULTI/EXEC to wrap hash insertions and appropriate INCRS. So take ptzOn's approach but don't forget to wrap it all up in MULTI/EXEC to have atomic semantics. Do the same thing on hash deletes and use a DECR.

like image 43
alphazero Avatar answered Nov 15 '22 11:11

alphazero


I'm going to assume that the data above is a list of cities. Really, you might consider storing these like...

key              value
city:1000:state  "ca"
city:1000:zip    "95054"
city:2000:state  "ca"
city:2000:zip    "95050"

Now back to your question... In SQL, you might want to do something like: SELECT Count(*) FROM cities WHERE state='CA'. Or maybe if you wanted the count for each state... SELECT state, COUNT(*) FROM cities GROUP BY state.

In my opinion, this is something that Redis struggles with a bit, but it gives you the ability to re-think your database design. If you plan on executing a query like this one a lot, consider creating a new Sorted Set of states. The score would be your count in this case. Use ZINCRBY on this set (let's call it city_count_by_state) when you add/remove records to/from your cities "table". Now your query is something like: ZRANK city_count_by_state 'CA'. Adding/removing/getting rank on a sorted set runs in O(log(n)) time.

If you don't want waste memory with a sorted set, you can use KEYS to return a large list of all of your keys in the cities "table". Then, your program can iterate through these keys automatically and count how many cities have a state of 'CA'. This is a bit of manual labor, but it will still run in O(n) time.

If you already have city data pre-populated in your database, you can generate your sorted set using KEYS, as well (see the method outlined above). Hopefully that helps!

See:

  • http://redis.io/commands#sorted_set
  • http://redis.io/commands/keys
like image 38
BMiner Avatar answered Nov 15 '22 11:11

BMiner