We have a Java application running on google app engine. Having a kind called Contact. following is the sample schema
Contact
{
long id
String firstName
String lastName
...
}
The above is the existig model,for supporting few requirements we are storing this object both in datastore and text search
Now we want to integrate contacts with their page views data.
Each contact can have thousands of page views records or even millions for some contacts
Following is the sample page visit object [Note : We don't have this object as of now, this is just give information about page visit]
PageVisit
{
long id
String url
String refUrl
int country
String city
....
}
We have a requirement , which needs a query on contact core properties and his page visited data
for ex :
select * from Contact where firstName = 'abc' and url = 'cccccc.com';
select * from Contact where firstName = 'abc' or url = 'cccccc.com';
To write this kind of queries we need both contact core properties and their page visited need to available in Contact object itself but contact can have huge number page views. So this will cross entity maximum size limit
So how to design contact model in this kind of situation both in datastore and text search.
Thanks
Cloud Datastore doesn't support joins, so you will need to handle this in some manner from the client code.
2 possible ways to handle this are:
Denormalize the Contact you need to search into PageVisit:
PageVisit
{
long id
String firstName // Denormalized from Contact
String url
String refUrl
int country
String city
....
}
This requires you to create a composite index:
- kind: PageVisit
ancestor: no
properties:
- name: firstName
- name: url
Or run multiple queries
select id from Contact where firstName = 'abc'
select * from PageVisit where contactId={id} and url = 'cccccc.com';
select * from PageVisit where contactId={id} or url = 'cccccc.com';
This requires you to create a composite index:
- kind: PageVisit
ancestor: no
properties:
- name: contactId
- name: url
Final aside: Depending on how large your site is, it might be worth looking into Cloud Bigtable for the PageView data. It's a better solution for high write OLAP-style workloads.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With