
Objectify Relationships: One-to-Many, Can I do this efficiently?

I'm fairly new to Objectify, and I have a quick question about the best way to do something:

Let's say I have an application that allows people to send and receive messages (think e-mail, for simplicity). When my app loads, I don't want to load every single message from every contact that has sent a message to a given user. That would be a waste. Instead, I want to load only the contacts that a user has messages from (read or unread) so that I can display a list of contacts in my app. When the user clicks on a given contact, I want to load all of the messages from that contact and display them.

I can't find a good way of doing this without loading all of the Messages for an account. I read the Objectify wiki on many-to-one relationships, and I still can't think of a way to do this that isn't extremely inefficient. It seems that, with the approach the Objectify site recommends, I would have to load all of the messages for a given user and then parse them for unique contacts.

I'm trying to use as few App Engine Reads and Writes as possible, and where possible I'm trying to use Smalls instead of Reads (the overall cost of running my app is a big concern while I'm building it).

With Objectify, how should I be doing this?

spierce7 asked Feb 16 '12 09:02



1 Answer

This is copied from my response on the objectify-appengine google group: https://groups.google.com/forum/?fromgroups#!topic/objectify-appengine/LlOyRJZRbnk

There are three main options when dealing with "aggregation data" like what you describe:

1) Calculate it when you need it

You've concluded, rightly I think, that this is too expensive.

2) Calculate it at batch intervals and store the result

Not very satisfying since it involves a delay. Plus you don't want to comb through your entire database every night.

3) Update the aggregation when the data changes

This approach involves a little more work every time the data changes, but it's almost certainly what you want to do.

Create some sort of collection of contacts for each user. When a message arrives, make sure a sender contact exists for that recipient. Maybe you also want to delete the contact when the recipient deletes the last message from a sender.
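As a rough illustration of "update the aggregation when the data changes", here is a minimal in-memory sketch in plain Java. It is not Objectify code: the class and method names (ContactIndex, onMessageStored, onMessageDeleted) are made up for this example, and a HashMap stands in for the datastore. It assumes you keep a per-sender message count so the contact can be dropped when its last message is deleted.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for the stored contact collection.
class ContactIndex {
    // recipientId -> (senderId -> number of stored messages from that sender)
    private final Map<Long, Map<Long, Integer>> counts = new HashMap<>();

    // Call whenever a message is stored for a recipient.
    void onMessageStored(long recipientId, long senderId) {
        counts.computeIfAbsent(recipientId, k -> new HashMap<>())
              .merge(senderId, 1, Integer::sum);
    }

    // Call whenever the recipient deletes a message from a sender.
    void onMessageDeleted(long recipientId, long senderId) {
        Map<Long, Integer> bySender = counts.get(recipientId);
        if (bySender == null) {
            return;
        }
        // Returning null removes the entry, so the contact disappears
        // once its last message is gone.
        bySender.computeIfPresent(senderId, (k, n) -> n > 1 ? n - 1 : null);
    }

    // The contact list is read directly; no messages are scanned.
    Set<Long> contactsOf(long recipientId) {
        return new TreeSet<>(
            counts.getOrDefault(recipientId, Collections.emptyMap()).keySet());
    }
}
```

The point of the sketch is only that reading the contact list never touches the messages themselves; the extra cost is paid on each write.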

Be careful not to bump into the entity group write rate limit (roughly one write per second per entity group). I'll walk you through some options:

1) You could store a list of contacts in each recipient:

class Person {
   @Id Long id;
   Set<Key<Person>> contacts;
}

The entity group write rate would be a distinct problem here if, say, the recipient received mail from 20 new people all at once, since each new sender means another write to the same entity. For that reason this is almost certainly a bad idea. On the other hand, it's blazingly fast and efficient to look up who your contacts are. A minor improvement would be to move the set into a separate entity parented by the person so you aren't always loading that data:

class Contacts {
   @Parent Key<Person> owner;
   @Id long id = 1;   // there's only ever one of these per person, and it should have a predictable key for fetching
   Set<Key<Person>> contacts;
}

Of course, the Set in a single entity gives you a 50,000 entry limit. It might be slightly smaller than this if you hit the 1 MB entity size limit first. If your keys are ~20 chars, the two limits work out to about the same number of entries. If this is an issue you could allow multiple Contacts entities, at which point you have something that looks like the Relation Index Entity pattern from Brett Slatkin's 2009 Google I/O talk: http://www.youtube.com/watch?v=AgaL6NGpkB8
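To make the "multiple Contacts entities" idea a little more concrete, here is a simplified in-memory sketch in plain Java. It is not the Relation Index Entity pattern itself and it uses no Objectify calls; ShardedContacts, maxPerShard and the other names are hypothetical, and each inner Set simply stands for one Contacts entity with a capped number of entries.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: each inner Set plays the role of one Contacts entity.
class ShardedContacts {
    private final int maxPerShard;
    private final List<Set<Long>> shards = new ArrayList<>();

    ShardedContacts(int maxPerShard) {
        if (maxPerShard < 1) {
            throw new IllegalArgumentException("maxPerShard must be positive");
        }
        this.maxPerShard = maxPerShard;
    }

    // Returns true if the contact was newly added, false if already present.
    boolean add(long contactId) {
        for (Set<Long> shard : shards) {
            if (shard.contains(contactId)) {
                return false;
            }
        }
        // Reuse the first shard that still has room.
        for (Set<Long> shard : shards) {
            if (shard.size() < maxPerShard) {
                shard.add(contactId);
                return true;
            }
        }
        // Every shard is full (or none exists yet): start a new one.
        Set<Long> fresh = new LinkedHashSet<>();
        fresh.add(contactId);
        shards.add(fresh);
        return true;
    }

    int shardCount() {
        return shards.size();
    }

    // Union of all shards, i.e. the full contact list.
    Set<Long> all() {
        Set<Long> result = new LinkedHashSet<>();
        for (Set<Long> shard : shards) {
            result.addAll(shard);
        }
        return result;
    }
}
```

In a real datastore each shard would be its own entity under the same parent, so reading the full list costs one fetch per shard rather than one fetch in total.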

2) You could store a list of contacts in the other direction:

class Person {
   @Id Long id;
   @Index Set<Key<Person>> contactOf;   // the people who have this person as a contact
}

This makes it a bit more expensive to find out who your contacts are - you need a keys-only query, not a simple get-by-key. But you aren't really limited by the entity write rate anymore. People probably don't send more than one message per second, and if they send out 1000 messages in bulk, you can update the contactOf in a single transaction.

As above, you probably want to move this index into a separate entity:

class Contacts {
   @Parent Key<Person> person;
   @Id long id = 1;   // there's only ever one of these per person, and it should have a predictable key for fetching
   @Index Set<Key<Person>> of;   // indexed, as with contactOf above, so the keys-only query can filter on it
}
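A small in-memory sketch may help show why the reverse direction spreads the writes out. This is plain Java rather than Objectify code, and ReverseContacts, recordSend and contactsOf are hypothetical names. The loop in contactsOf only stands in for the indexed keys-only query; a real datastore would answer it from the index rather than by scanning.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of option 2: the data lives on the sender's side.
class ReverseContacts {
    // senderId -> recipients who have this sender as a contact
    private final Map<Long, Set<Long>> contactOf = new HashMap<>();

    // A send, even a bulk send, writes only to the sender's own record.
    void recordSend(long senderId, Collection<Long> recipientIds) {
        contactOf.computeIfAbsent(senderId, k -> new HashSet<>())
                 .addAll(recipientIds);
    }

    // Stand-in for a keys-only query filtering on the indexed field.
    Set<Long> contactsOf(long recipientId) {
        Set<Long> result = new TreeSet<>();
        for (Map.Entry<Long, Set<Long>> entry : contactOf.entrySet()) {
            if (entry.getValue().contains(recipientId)) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

Twenty different senders mailing the same recipient touch twenty different records here, so no single record absorbs all the writes.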

3) You could also store these contacts in a completely separate entity:

class Contact {
   @Parent Key<Person> person;
   @Id Long id;
   @Index Key<Person> owner;
}

This is really just a less-space-efficient way of doing solution #2.

The important thing is to keep this structure up to date whenever a message is sent or received.

stickfigure answered Sep 29 '22 11:09
