Designing "social-feed" in DynamoDB

Question

This question might be relevant for any document-based NoSQL database.

I'm building an interest-specific social network and decided to go with DynamoDB because of its scalability and painless administration. There are only two main entities in the database: users and posts.

The requirements for common queries are very simple:

Home feed (feed of people I'm following)
My/User feed (feed of mine, or specific user feed)
List of users that I (or a given user) follow
List of followers

Here is the database schema I have come up with so far (legend: __thisIsHashKey and _thisIsRangeKey):

timeline = { // post 
    __username:"totocaster",
    _date:"1245678901345",
    record_type:"collection",
    items: ["2d931510-d99f-494a-8c67-87feb05e1594","2d931510-d99f-494a-8c67-87feb05e1594","2d931510-d99f-494a-8c67-87feb05e1594","2d931510-d99f-494a-8c67-87feb05e1594","2d931510-d99f-494a-8c67-87feb05e1594"],
    number_of_likes:123,
    description:"Hello, this is cool"
} 

timeline = { // new follower 
    __username:"totocaster",
    _date:"1245678901345",
    record_type:"follow",
    follower:"tamuna123"
}

timeline = { // new like 
    __username:"totocaster",
    _date:"1245678901345",
    record_type:"like",
    liker:"tamuna123",
    like_date:"123255634567456"
}

users = {
    __username:"totocaster",
    avatar_url:"2d931510-d99f-494a-8c67-87feb05e1594",
    followers:["don_gio","tamuna123","barbie","mikecsharp","bassman"],
    following:["tamuna123","barbie","mikecsharp"],
    likes:[
    {
        username:'barbie',
        date:"123255634567456"
    },
    {
        username:"mikecsharp",
        date:"123255634567456"
    }],
    full_name:"Toto Tvalavadze",
    password:"Hashed Key",
    email:"totocaster@myemailprovider.com"
}

As you can see, I ended up storing all my posts directly in the timeline collection. This way I can query for posts by username and date (the hash and range keys). Everything seems fine, but here is the problem:

I cannot query for the user timeline in one go. This will be one of the most frequently used queries in the system, and I cannot find an efficient way to support it. Please help. Thanks.

Thierry · Accepted Answer

I happen to work with news feeds daily. (I'm the author of Stream-Framework and the founder of getstream.io.)

The most common solutions I see are:

Cassandra (Instagram)
Redis (expensive, but easy)
MongoDB
DynamoDB
RocksDB (LinkedIn)

Most people use either fanout on write or fanout on read. Either one makes it easier to build a working solution, but it can get expensive quickly. Your best bet is to combine the two approaches: do a fanout on write in most cases, but keep very popular feeds in memory.
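To make the hybrid concrete, here is a minimal in-memory sketch in Python. It is one common reading of the idea, not how Stream-Framework or getstream.io implement it: posts by ordinary authors are copied into each follower's home feed when they are written, while posts by very popular authors stay in a single per-author list and are merged in when the feed is read. The names `FeedService` and `POPULAR_THRESHOLD` and the threshold value are assumptions made up for this illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
import heapq

# Hypothetical cut-off: authors with more followers than this are not fanned out.
POPULAR_THRESHOLD = 2


@dataclass(frozen=True)
class Post:
    author: str
    timestamp: int
    text: str


class FeedService:
    def __init__(self):
        self.followers = defaultdict(set)    # author -> users who follow them
        self.following = defaultdict(set)    # user -> authors they follow
        self.user_posts = defaultdict(list)  # author -> their own posts (user feed)
        self.home_feeds = defaultdict(list)  # user -> precomputed home feed

    def follow(self, user, author):
        self.followers[author].add(user)
        self.following[user].add(author)

    def is_popular(self, author):
        return len(self.followers[author]) > POPULAR_THRESHOLD

    def publish(self, post):
        self.user_posts[post.author].append(post)
        if not self.is_popular(post.author):
            # Fanout on write: copy the post into every follower's home feed.
            for follower in self.followers[post.author]:
                self.home_feeds[follower].append(post)

    def home_feed(self, user, limit=10):
        # Start from the precomputed feed; a set removes duplicates if an
        # author became popular after some posts were already fanned out.
        candidates = set(self.home_feeds[user])
        # Fanout on read: merge in posts of popular authors at query time.
        for author in self.following[user]:
            if self.is_popular(author):
                candidates.update(self.user_posts[author])
        return heapq.nlargest(limit, candidates, key=lambda p: p.timestamp)
```

The sketch deliberately ignores backfilling a feed when someone follows a new author, and it does not handle an author dropping back below the threshold. A real system would also cap the length of each stored feed.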

Stream-Framework is an open-source Python library that supports Cassandra and Redis.

getstream.io is a hosted solution built on top of Go and RocksDB.

If you do end up using DynamoDB, be sure to set up the right partition key: https://shinesolutions.com/2016/06/27/a-deep-dive-into-dynamodb-partitions/

Also note that a Redis or DynamoDB based solution will get expensive pretty quickly. You'll get the lowest cost per user by leveraging Cassandra or RocksDB.
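As a rough illustration of the partition key advice above, the sketch below assumes a separate, hypothetical `home_feed` table whose partition key `feed_owner` is the user reading the feed and whose sort key combines a zero-padded timestamp with the post id. The table and attribute names are invented for this example. Combining the timestamp with the id keeps the hash-plus-range key unique, which a bare timestamp does not guarantee when two events land in the same millisecond. The function only builds the parameters of a DynamoDB `Query` request; it does not call AWS.

```python
def sort_key(timestamp_ms: int, post_id: str) -> str:
    # Zero-pad so that string order matches chronological order.
    return f"{timestamp_ms:013d}#{post_id}"


def feed_item(owner: str, timestamp_ms: int, post_id: str) -> dict:
    # One row per (reader, post), written during fanout on write.
    # Values use DynamoDB's low-level attribute-value format.
    return {
        "feed_owner": {"S": owner},
        "sort_key": {"S": sort_key(timestamp_ms, post_id)},
        "post_id": {"S": post_id},
    }


def home_feed_query(owner: str, limit: int = 25) -> dict:
    # Parameters for a single Query call: one partition, newest items first.
    return {
        "TableName": "home_feed",  # hypothetical table name
        "KeyConditionExpression": "#owner = :owner",
        "ExpressionAttributeNames": {"#owner": "feed_owner"},
        "ExpressionAttributeValues": {":owner": {"S": owner}},
        "ScanIndexForward": False,  # descending sort-key order
        "Limit": limit,
    }
```

With boto3, the resulting dictionary could be passed as keyword arguments to the low-level client's `query` method. Because every reader has their own partition, reads spread across users instead of concentrating on one popular author's key.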

ryan1234 · Answer

I would check out the Titan graph database (http://thinkaurelius.github.com/titan/) and Neo4j (http://www.neo4j.org/).

I know Titan claims to scale pretty well with large data sets.

Ultimately I think your model maps well to a graph. Users and posts would be nodes, and then you can connect them arbitrarily via edges. A user (node) is a friend (edge) of another user (node).

A user (node) has many posts (nodes) in their timeline. Then you can run interesting traversals via the graph.
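The node-and-edge model described above can be sketched with a tiny in-memory graph in Python. This is not the Titan or Neo4j API; the `Graph` class, the `follows` and `posted` edge labels, and the `user:` / `post:` node naming are all assumptions made for the illustration. The home feed becomes a two-hop traversal: from a user along `follows` edges, then along `posted` edges.

```python
from collections import defaultdict


class Graph:
    def __init__(self):
        self.outgoing = defaultdict(set)  # (source, label) -> target nodes
        self.incoming = defaultdict(set)  # (target, label) -> source nodes

    def add_edge(self, source, label, target):
        self.outgoing[(source, label)].add(target)
        self.incoming[(target, label)].add(source)

    def out(self, nodes, label):
        # Follow edges with the given label forward from a set of nodes.
        result = set()
        for node in nodes:
            result |= self.outgoing.get((node, label), set())
        return result

    def into(self, nodes, label):
        # Follow edges with the given label backward to their sources.
        result = set()
        for node in nodes:
            result |= self.incoming.get((node, label), set())
        return result


def home_feed(graph, timestamps, user):
    # user -follows-> authors -posted-> posts, newest first.
    authors = graph.out({user}, "follows")
    posts = graph.out(authors, "posted")
    return sorted(posts, key=lambda post: timestamps[post], reverse=True)


def followers(graph, user):
    # Everyone with a "follows" edge pointing at this user.
    return graph.into({user}, "follows")
```

The four queries from the question map onto traversals: the user feed is `out({user}, "posted")`, the following list is `out({user}, "follows")`, and the follower list walks the same edges in reverse.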

Tags: database, amazon-web-services, database-schema, database-design, amazon-dynamodb

totocaster
