Effective implementation of one-to-many relationship with Python NDB

Question

I would like to hear your opinions on how to implement a one-to-many relationship effectively with Python NDB (e.g. one Person to many Tasks).

In my understanding, there are three ways to implement it.

1. Use the 'parent' argument
2. Use a 'repeated' StructuredProperty
3. Use a 'repeated' KeyProperty

I usually choose among them using the logic below. Does it make sense to you? If you have a better approach, please let me know.

1. Use the 'parent' argument
- Transactional operations are required across these entities
- Bidirectional references are required between these entities
- The relationship is strongly 'parent-child' in nature
2. Use a 'repeated' StructuredProperty
- The 'many' entities never need to be used individually (they are always used together with the 'one' entity)
- The 'many' entities are referenced only by the 'one' entity
- The number of repeated values is less than 100
3. Use a 'repeated' KeyProperty
- The 'many' entities need to be used individually
- The 'many' entities can be referenced by other entities
- The number of repeated values is more than 100
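
To make the three shapes concrete, here is a minimal sketch using plain Python stand-ins rather than NDB itself, so it runs outside App Engine. The class names, the Key tuple, and child_key are hypothetical and only for illustration; the comments name the NDB construct each shape is meant to mirror.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Stand-in for an NDB key path, e.g. ("Person", "alice", "Task", "t1").
Key = Tuple[str, ...]


@dataclass
class Task:
    title: str


# 1. 'parent' argument: the Task's key is nested under the Person's key.
#    NDB equivalent (roughly): Task(parent=person_key, title=...)
def child_key(person_key: Key, task_id: str) -> Key:
    return person_key + ("Task", task_id)


# 2. Repeated structured property: tasks live inside the Person entity.
#    NDB equivalent (roughly): tasks = ndb.StructuredProperty(Task, repeated=True)
@dataclass
class PersonWithEmbeddedTasks:
    name: str
    tasks: List[Task] = field(default_factory=list)


# 3. Repeated key property: the Person stores only the keys of separate Tasks.
#    NDB equivalent (roughly): task_keys = ndb.KeyProperty(kind='Task', repeated=True)
@dataclass
class PersonWithTaskKeys:
    name: str
    task_keys: List[Key] = field(default_factory=list)


alice_key: Key = ("Person", "alice")
nested = child_key(alice_key, "t1")
embedded = PersonWithEmbeddedTasks("alice", [Task("write report")])
linked = PersonWithTaskKeys("alice", [("Task", "t1")])
```

In shape 2 the tasks are read and written together with the person; in shapes 1 and 3 each Task is its own entity and costs its own datastore operations.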

No. 2 increases the size of the entity, but it saves datastore operations (although we need a projection query to reduce the CPU time spent on deserialization). Therefore, I use this approach as much as I can.

I would really appreciate your opinions.

dragonx · Accepted Answer

One key thing you are missing: how are you going to read the data?

If each request displays all the tasks for a given person, option 2 makes sense: you fetch the person and show all of their tasks.

However, if you need to query for, say, all tasks due at a certain time, querying on repeated structured properties is terrible. You will want individual entities for your Tasks.

There's a fourth option, which is to use a KeyProperty in your Task that points to your Person. When you need a list of Tasks for a person you can issue a query.

If you need to search for individual Tasks, then you probably want to go with #4. You can use it in combination with #3 as well.
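
As a rough illustration of #4, the sketch below models Tasks as standalone records that each point back to their Person. It uses a plain in-memory list instead of the datastore so it runs anywhere; the TASKS list, the field names, and the helper functions are all hypothetical, and the comments show what I would expect the NDB equivalents to look like.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

Key = Tuple[str, str]  # stand-in for an NDB key


@dataclass
class Task:
    title: str
    due: date
    owner: Key  # NDB equivalent (roughly): owner = ndb.KeyProperty(kind='Person')


# Hypothetical in-memory "datastore": every Task is its own entity.
TASKS: List[Task] = [
    Task("write report", date(2024, 1, 10), ("Person", "alice")),
    Task("book flights", date(2024, 1, 12), ("Person", "bob")),
    Task("review budget", date(2024, 1, 10), ("Person", "bob")),
]


def tasks_for(owner: Key) -> List[Task]:
    # NDB equivalent (roughly): Task.query(Task.owner == owner_key)
    return [t for t in TASKS if t.owner == owner]


def tasks_due(day: date) -> List[Task]:
    # NDB equivalent (roughly): Task.query(Task.due == day)
    return [t for t in TASKS if t.due == day]


bob_titles = [t.title for t in tasks_for(("Person", "bob"))]
due_titles = [t.title for t in tasks_due(date(2024, 1, 10))]
```

Both lookups are ordinary queries on the Task kind, so searching across people (tasks_due) is as easy as listing one person's tasks (tasks_for). With embedded tasks, the second kind of lookup would have to go through the Person entities.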

Also, the limit on repeated values has nothing to do with 100. It depends entirely on the size of your Person and Task entities and how many of them fit into the 1MB entity size limit. This is potentially dangerous: if a Task can be large, you might run out of space in your Person entity sooner than you expect.
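
A back-of-the-envelope sketch of that size budget, in plain Python. The byte counts are made-up inputs, the limit is taken as roughly 1MB, and max_embedded_tasks is a hypothetical helper; real serialized sizes also include property names and index-related overhead, so treat the result as an upper bound at best.

```python
# Roughly the 1MB per-entity limit mentioned above (approximate).
ENTITY_LIMIT_BYTES = 1024 * 1024


def max_embedded_tasks(person_base_bytes, task_bytes, limit=ENTITY_LIMIT_BYTES):
    """Estimate how many embedded tasks fit beside the Person's own fields."""
    if task_bytes <= 0:
        raise ValueError("task_bytes must be positive")
    remaining = limit - person_base_bytes
    return max(remaining // task_bytes, 0)


# Small tasks (a title and a date): assumed 200 bytes each.
small = max_embedded_tasks(person_base_bytes=2000, task_bytes=200)

# Tasks carrying a long description: assumed 50,000 bytes each.
large = max_embedded_tasks(person_base_bytes=2000, task_bytes=50000)

print(small, large)
```

With these assumed sizes the ceiling is in the thousands for small tasks and about twenty for large ones, which is why a fixed threshold like 100 is not a reliable rule.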

Sudhir Jonathan · Answer

One thing that most GAE users will come to realize (sooner or later) is that the datastore does not encourage design according to the formal normalization principles that would be considered a good idea in relational databases. Instead it often seems to encourage design that is unintuitive and anathema to established norms. Although relational database design principles have their place, they just don't work here.

I think the basis for datastore design instead comes down to two questions:

1. How am I going to read this data, and how do I read it with the minimum number of read operations?
2. Is storing it that way going to lead to an explosion in the number of write and indexing operations?

If you answer these two questions with as much foresight and actual tests as you can, I think you're doing pretty well. You could formalize other rules and specific cases, but these questions will work most of the time.

Tags: python, google-app-engine, app-engine-ndb

Chikashi Kato
