i got a redis hash eg
key field value
1000 state "ca"
1000 zip "95054"
2000 state "ca"
2000 zip "95050"
3000 state "ny"
3000 zip "12345"
how can i answer questions like how many state are "CA" . i need to do count of values for a field. is it possible ? help would be appreciated.
-Avi
I think you need to store a separate count for the unique values. You can get the length of a single hash, but not when you in this case got three different keys.
command key field value
HSET 1000 state "ca"
HSET 1000 zip "95054"
INCR ca
HSET 2000 state "ca"
HSET 2000 zip "95050"
INCR ca
HSET 3000 state "ny"
HSET 3000 zip "12345"
INCR ny
If you want to get how many hash keys with state "ca" use:
GET ca
Assuming that you are in charge of inserting values in the hash, then use MULTI/EXEC to wrap hash insertions and appropriate INCRS. So take ptzOn's approach but don't forget to wrap it all up in MULTI/EXEC to have atomic semantics. Do the same thing on hash deletes and use a DECR.
I'm going to assume that the data above is a list of cities. Really, you might consider storing these like...
key value
city:1000:state "ca"
city:1000:zip "95054"
city:2000:state "ca"
city:2000:zip "95050"
Now back to your question... In SQL, you might want to do something like: SELECT Count(*) FROM cities WHERE state='CA'. Or maybe if you wanted the count for each state... SELECT state, COUNT(*) FROM cities GROUP BY state.
In my opinion, this is something that Redis struggles with a bit, but it gives you the ability to re-think your database design. If you plan on executing a query like this one a lot, consider creating a new Sorted Set of states. The score would be your count in this case. Use ZINCRBY on this set (let's call it city_count_by_state) when you add/remove records to/from your cities "table". Now your query is something like: ZRANK city_count_by_state 'CA'. Adding/removing/getting rank on a sorted set runs in O(log(n)) time.
If you don't want waste memory with a sorted set, you can use KEYS to return a large list of all of your keys in the cities "table". Then, your program can iterate through these keys automatically and count how many cities have a state of 'CA'. This is a bit of manual labor, but it will still run in O(n) time.
If you already have city data pre-populated in your database, you can generate your sorted set using KEYS, as well (see the method outlined above). Hopefully that helps!
See:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With