
How can I export data from Google App Engine High Replication datastore?

I am looking into using Google App Engine for a project and would like to make sure I have a way to export all my data if I ever decide to leave GAE (or if GAE shuts down).

Everything I search about exporting data from GAE points to https://developers.google.com/appengine/docs/python/tools/uploadingdata. However, that page contains this note:

Note: This document applies to apps that use the master/slave datastore. If your app uses the High Replication datastore, it is possible to copy data from the app, but Google does not currently support this use case. If you attempt to copy from a High Replication datastore, you'll see a high_replication_warning error in the Admin Console, and the downloaded data might not include recently saved entities.

The problem is that the master/slave datastore was recently deprecated in favor of the High Replication datastore. I understand that the master/slave datastore will still be supported for a little while, but I don't feel comfortable using something that has officially been deprecated and is on its way out. That leaves me with the High Replication datastore, and the only way to export its data seems to be the method above, which is not officially supported (and thus gives me no guarantee that I can get my data out).

Is there any other (officially supported) way of exporting data from the High Replication datastore? I don't feel comfortable using Google App Engine if it means my data could be locked in there forever.

Pixel Elephant asked Oct 22 '22 18:10



2 Answers

It took me quite a long time to set up the download of data from GAE, as the documentation is not as clear as it should be.

If you are extracting data from a Unix server, you may be able to reuse the script below.

Also, if you do not provide the "config_file" parameter, it will extract all your data for this kind but in a proprietary format which can only be used for restoring data afterwards.

#!/bin/sh
#------------------------------------------------------------------
#-- Param 1 : Namespace
#-- Param 2 : Kind (table id)
#-- Param 3 : Directory in which the csv file should be stored
#-- Param 4 : Output file name (without the .csv extension)
#-- Reads BACKUP_USERID, BACKUP_WEBSITE and BACKUP_PASSWORD from the environment
#------------------------------------------------------------------
appcfg.py download_data --secure --email="$BACKUP_USERID" \
    --config_file=configClientExtract.yml --filename="$3/$4.csv" \
    --kind="$2" --url="$BACKUP_WEBSITE/remote_api" --namespace="$1" \
    --passin <<-EOF
$BACKUP_PASSWORD
EOF
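
Since the script above handles one kind per call, exporting several kinds means calling it in a loop. The following is only a sketch under assumptions: the file name export_kind.sh, the EXPORT_SCRIPT variable, and the kind names in the usage line are hypothetical and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical wrapper around the export script shown above.
# EXPORT_SCRIPT is an assumed path to that script saved as a file.
EXPORT_SCRIPT="${EXPORT_SCRIPT:-./export_kind.sh}"

# export_kinds NAMESPACE OUTPUT_DIR KIND [KIND...]
# Calls the export script once per kind and stops at the first failure.
export_kinds() {
    namespace=$1
    outdir=$2
    shift 2
    for kind in "$@"; do
        # The output file is named after the kind, so each kind gets its own CSV.
        "$EXPORT_SCRIPT" "$namespace" "$kind" "$outdir" "$kind" || return 1
    done
}
```

For example, `export_kinds myNamespace /var/backups/gae Customer Order` would run the script twice and, if both runs succeed, leave Customer.csv and Order.csv in /var/backups/gae.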
Hugues answered Oct 27 '22 09:10



The App Engine datastore now supports another option as well. The backup feature can copy selected data into the Blobstore or Google Cloud Storage. It is available under Datastore Admin in the App Engine console. If required, the backed-up data can then be downloaded from the blob viewer or from Cloud Storage. When backing up a High Replication datastore, it is recommended that datastore writes be disabled before taking the backup.
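
If the backup was written to Google Cloud Storage, one way to pull it down to a local machine is the gsutil command-line tool. This is a rough sketch, assuming gsutil is installed and authenticated; the bucket name and paths are hypothetical, and the downloaded files stay in whatever format the backup feature wrote them.

```shell
#!/bin/sh
# Sketch: copy a Datastore Admin backup from Cloud Storage to local disk.
# GSUTIL can be overridden (for instance to do a dry run).
GSUTIL="${GSUTIL:-gsutil}"

# download_backup SOURCE_URL DEST_DIR
download_backup() {
    src=$1    # e.g. gs://my-backup-bucket/datastore-backups (hypothetical)
    dest=$2   # local directory to copy into
    mkdir -p "$dest" || return 1
    # -m runs transfers in parallel; cp -r copies the folder recursively.
    "$GSUTIL" -m cp -r "$src" "$dest"
}
```

A call such as `download_backup gs://my-backup-bucket/datastore-backups ./gae-backup` would then copy everything under that path into ./gae-backup.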

tony m answered Oct 27 '22 11:10
