Here's the situation: with Heroku and Postgres, you can have automatically generated backup dump files. But what can you do with them?
I struggled so much with that last point that I wanted to share how I did it.
How do you restore specific data from a previous backup on Heroku Postgres?
Heroku is built on and runs on AWS. Since Heroku is an AWS-centric Platform as a Service, "offsite" here really means "not stored on AWS", and the default Heroku Postgres backups are stored on AWS, using the S3 service.
If you use the --clean option of pg_restore, the old tables will be dropped before the new ones are created. If you do not use --clean, you will get errors saying the tables already exist, but pg_restore will keep processing unless you also pass --exit-on-error.
In three steps, you'll very simply be able to execute:
INSERT INTO production_db.table_name
SELECT * FROM backup_db.table_name -- backup_db being remote
First, restore the backup locally; second, get a SQL script to query it remotely; third, open your localhost to the outside world with ngrok.
A graphical client is the most straightforward way, but psql will also do (by reading this post, for example). Create a database, then right-click on it and use the Restore function. Select your dump file, click Restore, and you're all set: your backup data is available locally. Good job!
I wanted to do the following:
SELECT * FROM backup_db.table_name
-- So I could then do
INSERT INTO production_db.table_name
SELECT * FROM backup_db.table_name
And I would be all set. Super easy, right? Pretty obvious? This must have been done hundreds of times already. Well, no!
There is an extension called dblink in Postgres 9.1+, but it is pretty constraining, as the following syntax applies:
SELECT fname, lname FROM dblink('host=localhost dbname=backup-28-08', 'SELECT fname, lname FROM users') AS remote(fname varchar(255), lname varchar(255))
Every column name needs to be repeated, along with its type. Pretty heavy; we are far from the simple SELECT * FROM backup_db.table_name
So the idea here is to use the information_schema tables, which describe each table with its column names, types, etc. I found this question on SO: Specify dblink column definition list from a local existing type, which helped me a lot (thanks bentrm).
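To give an idea of the trick (a sketch only: public and users are example schema and table names), the column definition list that dblink requires can be generated from information_schema like this:
-- Builds the "name type, name type, ..." list for dblink's AS remote(...) clause
-- (data_type holds the standard type names, e.g. character varying, which is enough here)
SELECT string_agg(column_name || ' ' || data_type, ', ' ORDER BY ordinal_position)
FROM information_schema.columns
WHERE table_schema = 'public' AND table_name = 'users'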
But the solution from that SO answer is a two-step process, first generating a function, then querying it:
SELECT dblink_star_func('dbname=ben', 'public', 'test');
SELECT * FROM star_test() WHERE data = 'success';
And I was still aiming at a one-liner. After a little pain (not being a SQL guru), here is the gist: https://gist.github.com/augnustin/d30973ea8b5bf0067841
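The gist is the reference; below is only a rough sketch of the shape such a function can take (the hard-coded connection string, the public schema and the simplified type handling are assumptions for illustration). The function is declared with a polymorphic anyelement parameter, which is why it gets called with a typed NULL such as NULL::users: that cast is what tells Postgres which row type, and therefore which columns, to expect back.
CREATE EXTENSION IF NOT EXISTS dblink;

-- Rough sketch of a remote_db() helper: it builds dblink's column definition
-- list from information_schema and runs SELECT * on the remote database.
CREATE OR REPLACE FUNCTION remote_db(_row anyelement)
RETURNS SETOF anyelement AS $$
DECLARE
  _table text := pg_typeof(_row)::text;  -- e.g. 'users' when called with NULL::users
  _cols  text;
BEGIN
  -- Column definition list, e.g. 'id integer, fname character varying, ...'
  -- (simplified: array or enum columns would need extra handling)
  SELECT string_agg(quote_ident(column_name) || ' ' || data_type, ', '
                    ORDER BY ordinal_position)
    INTO _cols
    FROM information_schema.columns
   WHERE table_schema = 'public' AND table_name = _table;

  RETURN QUERY EXECUTE format(
    'SELECT * FROM dblink(%L, %L) AS remote(%s)',
    'host=localhost port=5432 dbname=backup-28-08',  -- adapt to your backup database
    'SELECT * FROM ' || quote_ident(_table),
    _cols
  );
END;
$$ LANGUAGE plpgsql;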
I can now do:
SELECT * FROM remote_db(NULL::users) -- (Still not 100% sure why I need the NULL::)
-- And also
INSERT INTO users
SELECT * FROM remote_db(NULL::users)
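And because remote_db() is a plain set-returning function, the remote data can be filtered like any local table, which is handy when you only want to restore specific rows (the id column and the condition below are just an example):
-- Restore only the rows that are missing locally (e.g. deleted by mistake)
INSERT INTO users
SELECT * FROM remote_db(NULL::users) AS backup
WHERE backup.id NOT IN (SELECT id FROM users)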
Awesome, right?
If your remote database is already reachable from the internet (i.e. it has an IP address or a domain name; for Heroku it will look like ec2-54-217-229-169.eu-west-1.compute.amazonaws.com:5672/df68cfpbufjd9p), you can skip this step. But if you use your local database, you need to make it reachable from the outside world (so that the Heroku database can access it).
For this, I use the wonderful ngrok.
Once it is installed, I only need to enter the following command:
ngrok -proto=tcp 5432 #5432 being the default port for Postgresql. (Adapt if necessary)
Tunnel Status online
Version 1.7/1.6
Forwarding tcp://ngrok.com:51727 -> 127.0.0.1:5432
Web Interface 127.0.0.1:4040
# Conn 0
Avg Conn Time 0.00ms
And you'd only need to point dblink (in the gist) to host=ngrok.com port=51727 and you are good to go!
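Before wiring the tunnel into the function, a quick connectivity test from the Heroku side (e.g. via heroku pg:psql) can be reassuring; the host and port come from your own Forwarding line, and dbname, user and password are placeholders for your local credentials:
-- One-off test: open and close a named dblink connection through the ngrok tunnel
SELECT dblink_connect('local_backup', 'host=ngrok.com port=51727 dbname=backup-28-08 user=postgres password=secret');
SELECT dblink_disconnect('local_backup')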
There are many possible improvements to this, the dblink function being the first one I see.
Hope I was clear! Please ask for more details otherwise.