Efficient NoSQL modeling in Google App Engine Datastore

I'm writing an app on Google App Engine to help me learn it better. I'm persisting my data in the Datastore.

The application is modeled on something similar to Stack Overflow: you have a Story entity, which has a collection of Comment entities, which in turn can be liked/hated by many users. The way I'm modeling this right now is as follows:

class Story {
    Comment[] comments;
    ...
}

class Comment {
    User[] likes;
    User[] hates;
    ...
}

So when you load a given story, you can list all the comments, plus the percentage of likes and hates for each comment. You can also keep track of whether a given user has voted on a comment.

I'm assuming I can lazy load all the actual users in the Comment entity, but even then, I kind of get the idea that there's a better way of doing this.

How would this handle a story with hundreds of comments, each with hundreds of thousands of votes?!

What is a common way of modeling such a concept in NoSQL?

asked Jan 06 '13 01:01

rodrigo-silveira

1 Answer

Possible answers:

(1) How would this handle hundreds of comments?

You seem to have already answered this by suggesting that you lazy load the comments in the UI. I know document databases like MongoDB and CouchDB give you the option to page data as it comes out of the database, using things like "limit" and "skip".

Hundreds of comments shouldn't be too hard to store and I wouldn't imagine they'd be slow in a query.
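
As a rough illustration of the "limit" and "skip" idea, here is a minimal Java sketch that pages an in-memory list. CommentPager and page are made-up names, and the list only stands in for a query result; a real database would apply the limit and skip on its side rather than in application code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: pages over a list that stands in for a query result. */
final class CommentPager {
    /** Returns at most `limit` items, starting at index `skip`. */
    static <T> List<T> page(List<T> all, int skip, int limit) {
        if (skip < 0 || limit < 0) {
            throw new IllegalArgumentException("skip and limit must be non-negative");
        }
        // Clamp both ends so that skipping past the end yields an empty page.
        int from = Math.min(skip, all.size());
        int to = (int) Math.min((long) from + limit, (long) all.size());
        return new ArrayList<>(all.subList(from, to));
    }
}
```

Calling CommentPager.page(comments, 40, 20) would return the third page of twenty comments, so the UI only ever handles a small slice at a time.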

(2) How to handle hundreds of thousands of votes?

I think the best way is to simply pre-compute this. When a user votes on something, you might consider doing two operations: 1) Increment the comment's like (or hate) counter by one. 2) Write a record of the user's vote somewhere else.

The first step would be very fast and easy and it would show users the total number of likes immediately.

The second operation (storing what a user did, i.e. which comment they liked/disliked) might be a bit slower, but it is a single small write and easy to do.

It's important to keep in mind that with NoSQL we aren't worried about normalizing the data, so redundant information is ok!
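
The two operations can be sketched roughly like this in Java. This is only an in-memory stand-in, not Datastore code: VoteStore, its methods, and the "commentId/userId" key format are all assumptions made for illustration. The point is that the counters live with the comment, while each vote is its own small record that can be looked up by key.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a datastore; every name here is hypothetical. */
final class VoteStore {
    // Redundant counters kept with each comment: commentId -> {likes, hates}.
    private final Map<String, long[]> counters = new HashMap<>();
    // One small record per (comment, user) pair: "commentId/userId" -> liked?
    private final Map<String, Boolean> votes = new HashMap<>();

    /** Records a vote; returns false if this user already voted on the comment. */
    boolean vote(String commentId, String userId, boolean like) {
        String voteKey = commentId + "/" + userId;
        if (votes.containsKey(voteKey)) {
            return false; // a single key lookup, no scan over all votes
        }
        // Operation 1: bump the counter stored with the comment.
        long[] c = counters.computeIfAbsent(commentId, k -> new long[2]);
        c[like ? 0 : 1]++;
        // Operation 2: write the record of the user's vote somewhere else.
        votes.put(voteKey, like);
        return true;
    }

    long likes(String commentId) {
        return counters.getOrDefault(commentId, new long[2])[0];
    }

    long hates(String commentId) {
        return counters.getOrDefault(commentId, new long[2])[1];
    }

    /** Percentage of likes among all votes, or 0 when nobody has voted. */
    double likePercent(String commentId) {
        long total = likes(commentId) + hates(commentId);
        return total == 0 ? 0.0 : 100.0 * likes(commentId) / total;
    }

    /** TRUE for a like, FALSE for a hate, null if the user has not voted. */
    Boolean voteOf(String commentId, String userId) {
        return votes.get(commentId + "/" + userId);
    }
}
```

With this shape, showing the like percentage for a comment reads two numbers instead of loading hundreds of thousands of users, and checking whether a given user has voted is one lookup. In a real datastore the two writes would be separate operations, so the counter and the vote records could briefly disagree.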

(3) What is the common way of modeling these concepts?

As I mentioned in (2), and from my experience, a good way to model this is to keep counters that can be incremented quickly and to also store redundant information.

It's especially useful to store data many times in various documents because joins in databases like MongoDB and CouchDB are very difficult to do. It's best to store that information next to the entity that needs it.
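
To make "store that information next to the entity that needs it" concrete, here is a small Java sketch. StoryDocument and CommentSummary are hypothetical names, and plain objects stand in for stored documents. Each comment carries a copy of the author's name and its vote counts, so rendering a story touches only that one document.

```java
import java.util.ArrayList;
import java.util.List;

/** A comment as stored inside its story; all names are hypothetical. */
final class CommentSummary {
    final String text;
    final String authorName; // copied from the user's profile, redundant on purpose
    final long likeCount;    // pre-computed counter, not a list of users
    final long hateCount;

    CommentSummary(String text, String authorName, long likeCount, long hateCount) {
        this.text = text;
        this.authorName = authorName;
        this.likeCount = likeCount;
        this.hateCount = hateCount;
    }
}

/** A story that embeds everything needed to display it. */
final class StoryDocument {
    final String title;
    final List<CommentSummary> comments = new ArrayList<>();

    StoryDocument(String title) {
        this.title = title;
    }

    /** Renders the story from this one document, with no lookups elsewhere. */
    String render() {
        StringBuilder sb = new StringBuilder(title).append('\n');
        for (CommentSummary c : comments) {
            sb.append(c.authorName).append(": ").append(c.text)
              .append(" (+").append(c.likeCount)
              .append(" / -").append(c.hateCount).append(")\n");
        }
        return sb.toString();
    }
}
```

The trade-off is that a copied value, such as the author's name, has to be updated in every place it was copied to if it ever changes.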

Another quality of NoSQL databases is that they are allowed to be temporarily inconsistent. It's ok for a comment's like/dislike count to show one number in the comments section and a different number on the page listing what the user has liked/disliked.

(The only part of your model that might be scary is splitting entities. Always remember that if you split things up, the way you would in a traditional RDBMS, you'll have to join them later! That can be very tough with NoSQL.)

answered Sep 27 '22 18:09

ryan1234


Tags: data-modeling, nosql, google-app-engine, google-cloud-datastore
