
Data Migrations and AppEngine

I've done a lot of development in Rails, and am looking into developing projects using Python and App Engine.

From the demo project and what I've seen so far, I've got a question/concern about App Engine projects:

How is data migration handled in App Engine? For example, if I change the name of an entity/table (e.g. Texts to Documents), or rename a column in an existing table (e.g. age to dob), what happens to the old data?

Thanks!

stringo0 asked Jul 31 '11 21:07



1 Answer

The short answer is: it doesn't handle it. You can't change the name of an entity (its kind). You can rename a property, but you'll have to update the existing data manually.

Your Model definitions are just your application's "view" of how to interpret the entities stored in the datastore. If I had a definition like:

from google.appengine.ext import db

class MyEntity(db.Model):
    text = db.TextProperty()

And ran my application for a while, filling up the text property of my entities, then later renamed the property to:

class MyEntity(db.Model):
    description = db.TextProperty()

All my existing data would stay exactly as it was (lots of entities in the datastore with populated text properties). But when I loaded those entities into model instances, I would see them as empty (no description set, and no way to access the text data that still exists). Saving (putting) such an entity back into the datastore would then overwrite the old data, and it would be lost.
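The behaviour described above can be imitated without App Engine at all. This is only a rough sketch: plain dicts stand in for datastore entities, and load and put are hypothetical helpers, not the db API.

```python
# Plain dicts stand in for stored entities; this is NOT the App Engine db API.
datastore = {1: {"text": "hello"}}


def load(entity_id, known_properties):
    """Return only the properties the current model definition knows about."""
    stored = datastore[entity_id]
    return {name: stored.get(name) for name in known_properties}


def put(entity_id, instance):
    """Overwrite the stored entity with the model instance's properties."""
    datastore[entity_id] = dict(instance)


# Old model definition: the "text" property is visible.
old_view = load(1, ["text"])

# Renamed model definition: "description" is empty, "text" is unreachable.
new_view = load(1, ["description"])

# Saving through the renamed model drops the old "text" value for good.
put(1, new_view)
```

After the final put, the stored entity holds only an empty description and the original text is gone.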

If you make a schema change like this, or more likely just change a field's type, it is up to you to pre-process your data to handle the change. The model layer will raise errors if you try to load an entity that no longer conforms to your model definitions.
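As a minimal sketch of what that pre-processing might look like for a rename, assuming entities can be treated as plain dicts (rename_property is a hypothetical helper, not part of any App Engine library):

```python
def rename_property(entity, old_name, new_name):
    """Return a copy of entity with the value of old_name moved to new_name.

    Entities without old_name come back unchanged, so the migration can be
    re-run safely. If new_name is already set, its value is kept.
    """
    if old_name not in entity:
        return dict(entity)
    migrated = {key: value for key, value in entity.items() if key != old_name}
    migrated.setdefault(new_name, entity[old_name])
    return migrated


before = {"text": "some body text"}
after = rename_property(before, "text", "description")
```

Making the step safe to repeat matters in practice, since a one-off data fix over a live datastore is often interrupted and restarted.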

To help with this manual task of updating your data, the weapons of choice are:

  1. remote_api / remote_api_shell.py
  2. The mapreduce library (especially the "mapper" part)

With remote_api[1] set up, you can open an interactive Python session against your live data and run scripts locally (mostly) as if they were running directly on the production servers. I find this is the fastest, easiest way to fix or clean up data for smallish one-off tasks.

The mapper API[2] could be employed if you have a much larger task, say altering millions of entities, and want to do as much of the work in parallel as possible.
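The idea behind a mapper, one small function applied to each entity independently across several shards, can be sketched locally with threads. This is an illustration only: it does not use the mapreduce library, and migrate, split_into_shards and run_mapper are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def migrate(entity):
    """Hypothetical map function: move "text" to "description" on one entity."""
    migrated = dict(entity)
    if "text" in migrated:
        migrated["description"] = migrated.pop("text")
    return migrated


def split_into_shards(entities, shard_count):
    """Deal entities round-robin into shard_count lists."""
    return [entities[i::shard_count] for i in range(shard_count)]


def run_mapper(entities, map_fn, shard_count=4):
    """Apply map_fn to every entity, using one worker thread per shard."""
    shards = split_into_shards(entities, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        parts = pool.map(lambda shard: [map_fn(e) for e in shard], shards)
        # Results come back shard by shard, not in the original input order.
        return [entity for part in parts for entity in part]
```

Because each entity is handled on its own, the map function needs no knowledge of how the work is split, which is what makes this style easy to run in parallel.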

Chris Farmiloe answered Sep 30 '22 15:09