 

What is the better data structure for storing user profiles in Redis?

I want to store users' profiles in Redis, as I frequently have to read multiple users' profiles at once. There are two options I see at present:

Option 1: store a separate hash key per user's profile

  • [hash] - u1 profile {id: u1, name: user1, email: [email protected], photo: url}
  • [hash] - u2 profile {id: u2, name: user2, email: [email protected], photo: url}
  • Here each user's id forms the hash key, and the profile is stored either as a JSON-serialized object in a single field or as individual field-value pairs.

Option 2: use a single hash key to store all users' profiles

  • [hash] - users-profile u1 {id: u1, name: user1, email: [email protected], photo: url}
  • [hash] - users-profile u2 {id: u2, name: user2, email: [email protected], photo: url}
  • Here there is one users-profile hash key; each user's id is a field, and the values are JSON-serialized profile objects.

Please tell me which option is best, considering the following:

  1. performance
  2. memory utilization
  3. reading multiple users' profiles: for batch processing I should be able to read users 1-100, then 101-200, at a time
  4. larger datasets: what if there are millions of user profiles?
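For reference, here is a minimal sketch of Option 1 with batched reads. The key naming (`profile:u1`) and the tiny in-memory stand-in for Redis are illustrative assumptions so the sketch is self-contained; with a real server you would use redis-py and issue the same HGETALLs in one `pipeline()` round trip.

```python
class FakeRedis:
    """In-memory stand-in implementing only the hash commands used below."""
    def __init__(self):
        self.store = {}

    def hset(self, key, mapping):
        self.store.setdefault(key, {}).update(mapping)

    def hgetall(self, key):
        return dict(self.store.get(key, {}))


def profile_key(user_id):
    # One hash key per user, e.g. "profile:u1" (naming is an assumption)
    return f"profile:{user_id}"


def save_profile(r, profile):
    r.hset(profile_key(profile["id"]), mapping=profile)


def read_profiles(r, user_ids):
    # With redis-py these HGETALLs would go into a single pipeline()
    # so that a 100-profile batch costs one network round trip.
    return [r.hgetall(profile_key(uid)) for uid in user_ids]


r = FakeRedis()
for i in range(1, 201):
    save_profile(r, {"id": f"u{i}", "name": f"user{i}", "photo": "url"})

# Batch reads: users 1-100, then 101-200
first_page = read_profiles(r, [f"u{i}" for i in range(1, 101)])
second_page = read_profiles(r, [f"u{i}" for i in range(101, 201)])
```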
asked Oct 18 '22 by Suyog Kale


1 Answer

As Sergio Tulentsev pointed out, it's not a good idea to store all the users' data (especially if the dataset is huge) inside one single hash by any means.

Storing the users' data as individual top-level keys is also not preferred if you're looking for memory optimization, as pointed out in this blog post.
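The idea behind that kind of memory optimization is that Redis stores small hashes in a compact encoding, so grouping many users into moderately sized hashes is cheaper than millions of top-level keys. A hedged sketch of the bucketing scheme (numeric ids and a bucket size of 1000 are assumptions; tune the size against `hash-max-listpack-entries`):

```python
BUCKET_SIZE = 1000  # assumption: keep each hash small enough for compact encoding

def bucket_key(user_id):
    # Users 0-999 share the hash "profiles:0", users 1000-1999 "profiles:1", ...
    return f"profiles:{user_id // BUCKET_SIZE}"

def field_name(user_id):
    # Field inside the bucket hash that identifies this user
    return str(user_id % BUCKET_SIZE)

# With redis-py this would look like:
#   r.hset(bucket_key(uid), field_name(uid), json.dumps(profile))
#   profile = json.loads(r.hget(bucket_key(uid), field_name(uid)))
```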

Reading the users' data with a pagination mechanism demands a database rather than a simple caching system like Redis. Hence it's recommended to use a NoSQL database such as MongoDB for this.

But reading from the database each time is a costly operation, especially if you're reading a lot of records.

Hence the best solution would be to cache the most active users' data in Redis to eliminate the database-fetch overhead.

I recommend looking into walrus.

It basically follows this pattern:

@cache.cached(timeout=expiry_in_secs)
def function_name(param1, param2, ..., param_n):
    # `cache` is obtained from db.cache() (see the setup below)
    # perform the database fetch
    # return the user data

This ensures that frequently accessed or requested user data stays in Redis, and the function automatically returns the value from Redis instead of making the database call. The cached key also expires after the configured timeout, so stale entries are evicted automatically.
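Walrus itself needs a running Redis server, so as an illustration only, here is a self-contained, in-process sketch of the same cache-aside pattern. The `ttl_cached` decorator and `fetch_profile_from_db` are made-up names for this sketch, not walrus APIs; walrus stores the cached values in Redis rather than a local dict.

```python
import time
from functools import wraps

def ttl_cached(timeout):
    """Cache-aside decorator: serve a cached result until it expires,
    otherwise call the wrapped function and cache the fresh value."""
    def decorator(func):
        store = {}  # key -> (expires_at, value); walrus keeps this in Redis
        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]            # cache hit: no database call
            value = func(*args)          # cache miss: do the costly fetch
            store[args] = (now + timeout, value)
            return value
        return wrapper
    return decorator

calls = []  # records how many "database" fetches actually happen

@ttl_cached(timeout=60)
def fetch_profile_from_db(user_id):
    calls.append(user_id)  # stands in for the real database query
    return {"id": user_id, "name": f"user{user_id}"}

first = fetch_profile_from_db("u1")
second = fetch_profile_from_db("u1")  # served from cache; no new fetch
```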

You set it up as follows:

from walrus import Database

db = Database(host='localhost', port=6379, db=0)
cache = db.cache()  # provides the @cache.cached decorator shown above

where host can take the domain name of a Redis instance running remotely.

Hope this helps.

answered Oct 21 '22 by Adarsh