
When should I NOT use App Engine's Full Text Search API?


So far, I've used App Engine's Full Text Search to help search through existing entities in my datastore. This involves creating at least one Document per entity, and linking the two together somehow. And every time I change the entity, I must change the corresponding Documents.

My question is, why not just store all my data in Documents and forget about Datastore entities? The search API supports a much richer query language that can handle multiple inequality filters and boolean operators, unlike the datastore.

Am I missing something about the design of the search API that would preclude using it to replace the Datastore entirely?

Nick Farina asked Jun 08 '12 15:06



2 Answers

According to the Java docs:

"However, an index search can find no more than 10,000 matching documents. The App Engine Datastore may be more appropriate for applications that need to retrieve very large result sets."

That said, I don't see retrieving more than 10,000 results as a common use case.

More realistically, getting entities by key will be a lot cheaper with the Datastore (presumably faster as well). With the search API, you can either use Index.get() to find a document by ID, or duplicate the ID by storing it in a field and searching on that field.

Here's a cost breakdown:

- Index.get():     $0.10 /  10,000, or $0.00001 per get
- Index.search():  $0.13 /  10,000, or $0.000013 per search
- Datastore get(): $0.06 / 100,000, or $0.0000006 per get

As you can see, a Datastore get is much cheaper than either Search API option (roughly 16.7x cheaper than Index.get()).

If your data is structured in a way that makes use of a lot of direct gets and few complex searches, the Datastore will be a clear winner in terms of cost.
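To make that concrete, here is a rough sketch of the arithmetic using only the per-operation prices listed above. The function name `workload_cost` and the workload of 1,000,000 gets plus 10,000 searches are made-up illustrations, not measurements from a real app, and the prices are the ones quoted in this answer rather than current pricing.

```python
# Per-operation prices in USD, taken from the breakdown in this answer.
PRICE_INDEX_GET = 0.10 / 10_000
PRICE_INDEX_SEARCH = 0.13 / 10_000
PRICE_DATASTORE_GET = 0.06 / 100_000


def workload_cost(gets, searches, use_datastore_for_gets):
    """Estimate the cost in USD of a workload of key lookups and searches.

    `gets` and `searches` are hypothetical operation counts. In this model,
    searches always go through the Search API; only the key lookups switch
    between the Datastore and Index.get().
    """
    get_price = PRICE_DATASTORE_GET if use_datastore_for_gets else PRICE_INDEX_GET
    return gets * get_price + searches * PRICE_INDEX_SEARCH


if __name__ == "__main__":
    gets, searches = 1_000_000, 10_000  # assumed workload, for illustration only
    search_only = workload_cost(gets, searches, use_datastore_for_gets=False)
    mixed = workload_cost(gets, searches, use_datastore_for_gets=True)
    print(f"Search API only:    ${search_only:.2f}")
    print(f"Datastore + Search: ${mixed:.2f}")
    print(f"Index.get() vs Datastore get: {PRICE_INDEX_GET / PRICE_DATASTORE_GET:.1f}x")
```

With those assumed counts, the get-heavy workload comes to about $10.13 through the Search API alone versus about $0.73 when the gets go to the Datastore.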

Note: I did not include the extra cost for storing duplicate data with the Index.search() method, since that depends on how many entities you store.

Pixel Elephant answered Nov 08 '22 20:11



Just put the data in both. Storage is cheap, and depending on how many writes your app does, updates could be cheap as well. For simple queries and for getting single entities by key, use memcache and the Datastore. For complex queries, use the Search API. You'll have to make the tradeoff once pricing is announced.
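As a rough sketch of the "put it in both" idea, the class below writes every record to two places and routes reads by type. It uses plain in-memory dicts as stand-ins for the Datastore and a search index, so `DualStore` and all of its methods are hypothetical and are not App Engine APIs. The point is only the shape of the pattern: one write path that keeps both copies in sync, key lookups against the primary store, and searches against the index.

```python
class DualStore:
    """Toy model of dual writes: a primary store plus a search index."""

    def __init__(self):
        self.entities = {}   # stand-in for the Datastore: key -> record
        self.documents = {}  # stand-in for a search index: doc id -> lowercased fields

    def put(self, key, record):
        """Write the record and its searchable copy together."""
        self.entities[key] = dict(record)
        # Reuse the entity key as the document id so the two stay linked.
        self.documents[key] = {name: str(value).lower() for name, value in record.items()}

    def delete(self, key):
        """Remove both copies so the index never points at a missing record."""
        self.entities.pop(key, None)
        self.documents.pop(key, None)

    def get(self, key):
        """Cheap path: direct lookup by key in the primary store."""
        return self.entities.get(key)

    def search(self, field, term):
        """Complex path: substring match in the index, then fetch by key."""
        term = term.lower()
        keys = [k for k, doc in self.documents.items() if term in doc.get(field, "")]
        return [self.entities[k] for k in keys]


if __name__ == "__main__":
    store = DualStore()
    store.put("post-1", {"title": "Hello World", "author": "nick"})
    print(store.get("post-1"))
    print(store.search("title", "world"))
```

In a real app the two writes would not be atomic, so a failed index update would need a retry or a periodic reindex; this sketch ignores that.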

aloo answered Nov 08 '22 21:11
