I'm trying to model a basic linear commenting system for my blog in App Engine (you can see it at http://codeinsider.us). My main classes of objects are:
Users, Articles, Comments
One user will have many comments and should be able to view their comments at a glance.
One article will have many comments and should be visible at a glance.
One comment will be associated with exactly one user and exactly one article.
I know how I might build this in a standard relational database - I might have, say, separate tables for comments, users, and articles, with foreign keys to tie them together, uniqueness constraints on articles and users, and none on comments, etc. Nothing fancy.
What's the best way of modeling this in Python App Engine with NDB? ndb.KeyProperty seems interesting, as does StructuredProperty. I don't think I can use StructuredProperty though, since a comment can "belong" to both a User and an Article. But with ndb.KeyProperty, it seems like the keyProperty doesn't do any checking or validation logic, so I'd have to implement that on my own.
The other thing I can do is just throw in the towel, and store giant JSON blobs in Users and Articles representing the Keys and Kinds of comments. That may not be a bad solution.
Any thoughts?
Edit: This is going to be high-read, low-write. I may add some engagement on comments (upvotes/downvotes), but even then, it will be heavily weighted towards reads.
I recommend to you thinking carefully on what features are you planning to provide since structuring your models in some way may difficult some changes in the future.
I will do this as follows:
First, assume some eventual consistency. No matter how you design this, you will have some eventual consistency in some queries.
Make a KeyProperty "owner" in article to store the user_key. If you want to achieve strong consistency when querying the articles of a single user then instead of using the "owner" KeyProperty just make the user_key the parent of the Article (this will create an entity group for the user and it's articles and is fine here).
With comments you can do more things.
If you expect less than 100 (depending on Article size on the datastore can be more) comments for each article create a comments KeyProperty(repeated=True) in Article to store all the comments keys and then get them with get_multi (strong consistency).
To create the comment and also modify the Article comments property you may need a transaction, because you will want to accomplish the two operations or non of them. But.. the two entities are not in the same entity group so: 1) use cross group transaction or 2) make the parent of the comment the Article (this second option will have some consequences discussed later) Counts of comments are easy but limited to 100 or more comments as said before.
Create a Comment ndb model with two KeyProperties, "owner" and "article". The article will fetch comments with a query. To query all the comments within an Article you will have eventual consistency unless you make the article the parent of the comment (in that case don't create the article KeyProperty of course). This approach allows lots of comments.
The problem of using entity groups are that for example, if you allow to vote on comments, then a single write operation on each comment will block any write in the hole entity group of the Article affected. So creation and voting by other users may be affected. But don't really care about this if you expect few votes and you keep entity groups small.
If you want to allow comment votes this can get quite complicated as you may want for example only one vote per user. This will require extra relationships that need to be thought before.
Personally I prefer to assume eventual consistency almost always.
More approaches are possible but I like this two.
High read, low write scenario is the specialty on GAE, so that's a good thing for your purpose.
I'd take advantage of the ancestry feature of GAE Model as it assures you transactional/atomic operations within an entity group. I guess you don't need much of that but it's a good thing to have still.
The right structure is determined by the way you are going to treat/use your data. I'm assuming the typical case in your blog would be to show comments for an article, thus, I'd make your comment model a child of your article model - you could then query comments for a certain (article) ancestor and that would scale magnificently.
I'd include a KeyProperty for the author on the comment, as that would be used mainly to fetch a user from the key I assume. If you want to extend KeyProperty functionality you can do so. Here's an example on how to make KeyProperty behave as ReferenceProperty used to in db. (point 1.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With