
External Backups/Snapshots for Google Cloud Spanner

Is it possible to snapshot a Google Cloud Spanner database/table(s)? For compliance reasons we have to take daily snapshots of the current database that can be rolled back to in the event of a disaster: is this possible in Spanner? If not, is there any intention to support it?

For those who might ask why we would need this given that Spanner is replicated/redundant etc.: replication doesn't guard against human error (dropping a table by accident) or sabotage/espionage, hence the question and requirement.

Thanks, M

asked Feb 28 '17 by user3707


People also ask

Does Google use Spanner internally?

History. Spanner was first described in 2012 for internal Google data centers. Spanner's SQL capability was added in 2017 and documented in a SIGMOD 2017 paper. It became available as part of Google Cloud Platform in 2017, under the name "Cloud Spanner".

Where are GCP snapshots stored?

Snapshots can be stored in either one Cloud Storage multi-regional location, such as asia, or one Cloud Storage regional location, such as asia-south1. A multi-regional storage location provides higher availability and might reduce network costs when creating or restoring a snapshot.

How do GCP snapshots work?

How Google Cloud Snapshots Work. Google Cloud lets you take snapshots of persistent disks attached to your instances. A snapshot is an incremental copy of your data—the first snapshot contains all the data, while the next snapshots only save data blocks that changed in the interim.


2 Answers

Google Cloud Spanner now has two methods for taking backups.

https://cloud.google.com/spanner/docs/backup

You can either use the built-in backups or do an export/import using a Dataflow job.
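For illustration, a rough sketch of the built-in backup route using the google-cloud-spanner Python client. The instance, database, and backup names here are assumptions, and the client calls are commented out because they need a live instance and credentials; the live part only shows the naming and expiry arithmetic a daily-backup job would need:

```python
from datetime import datetime, timedelta, timezone

# A daily backup kept for 7 days; Spanner requires an expire time
# on every backup it creates.
now = datetime.now(timezone.utc)
expire_time = now + timedelta(days=7)
backup_id = "my-db-" + now.strftime("%Y%m%d")  # e.g. my-db-20240301

# Hypothetical client usage (assumed resource names, needs credentials):
# from google.cloud import spanner
# client = spanner.Client()
# instance = client.instance("my-instance")
# backup = instance.backup(backup_id, database="my-db",
#                          expire_time=expire_time)
# operation = backup.create()   # long-running operation
# operation.result()            # block until the backup finishes

print(backup_id)
```

Run daily (e.g. from cron or Cloud Scheduler), this yields one date-stamped backup per day, with old ones expiring automatically after a week.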

answered Oct 21 '22 by onionjake


Today, you can stream out a consistent snapshot by reading all the data with your favorite tool (MapReduce, Spark, Dataflow), using reads at a specific timestamp (timestamp bounds).

https://cloud.google.com/spanner/docs/timestamp-bounds

You have about an hour to do the export before the data gets garbage collected.
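A minimal sketch of such a timestamp-bounded export read with the google-cloud-spanner Python client. The instance, database, and table names are assumptions, and the client calls are commented out since they need a real instance; the live part shows the one timestamp every reader must share and the garbage-collection deadline:

```python
from datetime import datetime, timedelta, timezone

# Pick one timestamp and use it for every read so the export is a
# globally consistent snapshot of the database.
snapshot_ts = datetime.now(timezone.utc)

# Spanner garbage-collects old versions after roughly an hour, so the
# whole export must finish before this deadline.
gc_deadline = snapshot_ts + timedelta(hours=1)

# Hypothetical client usage (assumed resource names, needs credentials):
# from google.cloud import spanner
# client = spanner.Client()
# database = client.instance("my-instance").database("my-db")
# with database.snapshot(read_timestamp=snapshot_ts) as snap:
#     for row in snap.execute_sql("SELECT * FROM MyTable"):
#         write_to_backup(row)  # hypothetical sink

print(gc_deadline - snapshot_ts)  # prints 1:00:00
```

Parallel workers (one per table or key range) would all pass the same snapshot_ts so their reads compose into one consistent view.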

In the future, we will provide an Apache Beam/Dataflow connector to do this in a more scalable fashion. This will be our preferred method for importing/exporting data into and out of Cloud Spanner.

Longer term, we will support backups and the ability to restore from a backup, but that functionality is not currently available.

answered Oct 21 '22 by Dominic Preuss