Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

GAE - How to live with no joins?

Example Problem:

Entities:

  • User contains name and a list of friends (User references)
  • Blog Post contains title, content, date and Writer (User)

Requirement:

I want a page that displays the title and a link to the blog of the last 10 posts by a user's friend. I would also like the ability to keep paging back through older entries.

SQL Solution:

So in sql land it would be something like:

select * from blog_post where user_id in (select friend_id from user_friend where user_id = :userId) order by date

GAE solutions i can think of are:

  • Load user, loop through the list of friends and load their latest blog posts. Finally merge all the blog posts to find the latest 10 blog entries
  • In a blog post have a list of all users that have the writer as a friend. This would mean a simple read but would result in quota overload when adding a friend who has lots of blog posts.

I don't believe either of these solutions will scale.

Im sure others have hit this problem but I've searched, watched google io videos, read other's code ... What am i missing?

like image 607
Sam Avatar asked Jan 15 '09 06:01

Sam


2 Answers

If you look at how the SQL solution you provided will be executed, it will go basically like this:

  1. Fetch a list of friends for the current user
  2. For each user in the list, start an index scan over recent posts
  3. Merge-join all the scans from step 2, stopping when you've retrieved enough entries

You can carry out exactly the same procedure yourself in App Engine, by using the Query instances as iterators and doing a merge join over them.

You're right that this will not scale well to large numbers of friends, but it suffers from exactly the same issues the SQL implementation has, it just doesn't disguise them as well: Fetching the latest 20 (for example) entries costs roughly O(n log n) work, where n is the number of friends.

like image 126
Nick Johnson Avatar answered Sep 21 '22 16:09

Nick Johnson


This topic is covered in a Google io talk: http://code.google.com/events/io/sessions/BuildingScalableComplexApps.html

Basically the Google team suggest using list properties and what they call relational index entities, an example application can be found here: http://pubsub-test.appspot.com/

like image 35
Sam Avatar answered Sep 21 '22 16:09

Sam