I've run into an issue with restoring my database on Heroku. I have restored the database multiple times using pg:pull
as well as the pg:backups restore
method. But with a recent increase in db size, it is not working. I note that restoring from one local machine to another using a dump file DOES work. But it doesn't work on Heroku.
My specs:
To give you a sense of my database size, prior to this latest restore, there were 83 tables, around 10mm rows, and the dump file size is around 350MB.
I use my local database to collect large data sets. Periodically, I will use pg_restore to replace this data to production on Heroku.
I recently downloaded a lot of data to my local database for one of the tables which I will call rankings
. This table has increased from 7mm rows to 60mm rows. The pg_dump file is now 1.24GB.
Per usual, I did the following to transfer this new database to heroku:
pg_dump
heroku pg:backups restore AWS_FILE_URL HEROKU_DATABASE_URL
It keeps failing with the following message (from heroku pg:backups info r023
):
//after successfully restoring all tables and data prior to this://
2015-08-09 19:37:33 +0000: pg_restore: processing data for table "rankings"
2015-08-09 19:41:49 +0000: 2015/08/09 19:41:49 aborting: could not write to output stream: Expected HTTP Status 206 or 200, received: "403 Forbidden"
2015-08-09 19:41:49 +0000: waiting for restore to complete
2015-08-09 19:41:49 +0000: pg_restore: [custom archiver] could not read from input file: end of file
2015-08-09 19:41:49 +0000: restore done
2015-08-09 19:41:49 +0000: waiting for download to complete
2015-08-09 19:41:49 +0000: download done
Heroku docs points out that this is a timeout from the application, not the database.
I note, however, that after many tries this timeout occurs only when the data for the newly updated rankings
table is being processed.
I read this SO post that maybe it was a timeout from unicorn. But I have set my unicorn timeout to 18000 seconds.
I am stumped and could use some guidance.
Below is output from logs:
015-08-09T17:52:34Z app[postgres.11]: [GREEN] checkpoint starting: xlog
2015-08-09T17:52:39Z app[postgres.11]: [GREEN] checkpoint complete: wrote 53 buffers (0.2%); 0 transaction log file(s) added, 0 removed, 40 recycled; write=1.105 s, sync=4.111 s, total=5.224 s; sync files=3, longest=4.048 s, average=1.370 s
2015-08-09T17:52:43.777815+00:00 heroku[web.1]: source=web.1 dyno=heroku.22448404.22d13b5c-8725-43f4-9f0b-1b8e219825a9 sample#load_avg_1m=0.00 sample#load_avg_5m=0.00 sample#load_avg_15m=0.01
2015-08-09T17:52:43.777815+00:00 heroku[web.1]: source=web.1 dyno=heroku.22448404.22d13b5c-8725-43f4-9f0b-1b8e219825a9 sample#memory_total=546.99MB sample#memory_rss=529.55MB sample#memory_cache=2.30MB sample#memory_swap=15.13MB sample#memory_pgpgin=152151pages sample#memory_pgpgout=15996pages
2015-08-09T17:52:31+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_GREEN sample#current_transaction=366264 sample#db_size=3561754388bytes sample#tables=83 sample#active-connections=6 sample#waiting-connections=0 sample#index-cache-hit-rate=0 sample#table-cache-hit-rate=0.66952 sample#load-avg-1m=1.245 sample#load-avg-5m=0.845 sample#load-avg-15m=0.62 sample#read-iops=1.2 sample#write-iops=2554.6 sample#memory-total=15405668kB sample#memory-free=171092kB sample#memory-cached=14145576kB sample#memory-postgres=490544kB
2015-08-09T17:52:46Z app[postgres.11021]: [GREEN] canceling statement due to user request
Query: COPY rankings (id, person_id, max, min, current, date, period_years_min_max, created_at, updated_at) FROM stdin;
2015-08-09T17:52:46Z app[postgres.11021]: [GREEN] could not receive data from client: Connection reset by peer
2015-08-09T17:52:46Z app[postgres.11021]: [GREEN] incomplete message from client
below is the full output from heroku logs -t:
015-08-10T02:55:04Z app[postgres.3977]: [GREEN] statement: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3675]: [GREEN] canceling autovacuum task
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] sending cancel to blocking autovacuum PID 3675
Detail: Process 3977 waits for AccessExclusiveLock on relation 35169 of database 16385.
Query: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 still waiting for AccessExclusiveLock on relation 35169 of database 16385 after 1000.367 ms
Detail: Process holding the lock: 3675. Wait queue: 3977.
Query: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 acquired AccessExclusiveLock on relation 35169 of database 16385 after 1000.779 ms
Query: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] drop cascades to 97 other objects
Detail: drop cascades to extension hstore
drop cascades to type daily_box_input
//<--84 tables, materialized views and views redacted for brevity-->//
drop cascades to table users
drop cascades to extension pg_stat_statements
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: CREATE SCHEMA public
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: ALTER SCHEMA public OWNER TO uca46vrebkj2l9
2015-08-10T02:54:33+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_GREEN sample#current_transaction=368590 sample#db_size=4203753236bytes sample#tables=83 sample#active-connections=3 sample#waiting-connections=0 sample#index-cache-hit-rate=0 sample#table-cache-hit-rate=0.66674 sample#load-avg-1m=0.45 sample#load-avg-5m=0.71 sample#load-avg-15m=0.66 sample#read-iops=17.058 sample#write-iops=498.2 sample#memory-total=15405668kB sample#memory-free=283204kB sample#memory-cached=14065608kB sample#memory-postgres=477864kB
2015-08-10T02:55:57.359270+00:00 heroku[router]: at=info code=H80 desc="Maintenance mode" method=GET path="/" host=www.studiocomps.com request_id=e0d6bbd7-076f-4a56-93a1-2f80a45d28c4 fwd="74.86.158.110" dyno= connect= service= status=503 bytes=
2015-08-10T02:56:02Z app[postgres.4009]: [GREEN] statement: CREATE EXTENSION IF NOT EXISTS pg_stat_statements
2015-08-10T02:56:03.118589+00:00 heroku[router]: at=info code=H80 desc="Maintenance mode" method=GET path="/" host=test.studiocomps.com request_id=25fbbab0-8005-454c-b432-0f9b91cd5a75 fwd="69.64.56.47" dyno= connect= service= status=503 bytes=
2015-08-10T02:55:43+00:00 app[heroku-postgres]: source=HEROKU_POSTGRESQL_GREEN sample#current_transaction=368608 sample#db_size=9137940bytes sample#tables=0 sample#active-connections=2 sample#waiting-connections=0 sample#index-cache-hit-rate=0 sample#table-cache-hit-rate=0 sample#load-avg-1m=0.625 sample#load-avg-5m=0.7 sample#load-avg-15m=0.66 sample#read-iops=234.98 sample#write-iops=226.24 sample#memory-total=15405668kB sample#memory-free=4200392kB sample#memory-cached=10112136kB sample#memory-postgres=470496kB
and the tail end output from heroku logs -t --ps postgres:
2015-08-10T02:46:35Z app[postgres.2427]: [GREEN] duration: 3538.409 ms statement: COPY flows (id, created_at, updated_at, medium_id, value_subunit, timing_id, ultimate_ratio_id, sheet_id, company_id, value_to_date_subunit, period_start, period_end, currency_code, local_title, kind_id, theaters, screens, flowable_id, flowable_type) FROM stdin;
2015-08-10T02:47:15Z app[postgres.11]: [GREEN] checkpoint starting: xlog
2015-08-10T02:47:53Z app[postgres.11]: [GREEN] checkpoint complete: wrote 14930 buffers (47.8%); 0 transaction log file(s) added, 0 removed, 55 recycled; write=30.562 s, sync=7.386 s, total=38.138 s; sync files=219, longest=5.843 s, average=0.033 s
2015-08-10T02:47:57Z app[postgres.2829]: [GREEN] statement: CREATE EXTENSION IF NOT EXISTS pg_stat_statements
2015-08-10T02:48:13Z app[postgres.11]: [GREEN] checkpoint starting: xlog
2015-08-10T02:48:42Z app[postgres.11]: [GREEN] checkpoint complete: wrote 3766 buffers (12.1%); 0 transaction log file(s) added, 0 removed, 38 recycled; write=26.669 s, sync=1.943 s, total=28.658 s; sync files=36, longest=1.118 s, average=0.053 s
2015-08-10T02:49:11Z app[postgres.11]: [GREEN] checkpoint starting: xlog
2015-08-10T02:49:25Z app[postgres.11]: [GREEN] checkpoint complete: wrote 80 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 26 recycled; write=2.021 s, sync=11.500 s, total=13.529 s; sync files=3, longest=6.301 s, average=3.833 s
2015-08-10T02:50:15Z app[postgres.11]: [GREEN] checkpoint starting: xlog
2015-08-10T02:50:20Z app[postgres.11]: [GREEN] checkpoint complete: wrote 81 buffers (0.3%); 0 transaction log file(s) added, 0 removed, 38 recycled; write=1.586 s, sync=3.825 s, total=5.477 s; sync files=2, longest=3.823 s, average=1.912 s
2015-08-10T02:51:08Z app[postgres.2427]: [GREEN] canceling statement due to user request
Query: COPY rankings (id, person_id, max, min, current, date, period_years_min_max, created_at, updated_at) FROM stdin;
2015-08-10T02:51:08Z app[postgres.2427]: [GREEN] could not receive data from client: Connection reset by peer
2015-08-10T02:51:08Z app[postgres.2427]: [GREEN] incomplete message from client
2015-08-10T02:51:27Z app[postgres.11]: [GREEN] checkpoint starting: xlog
2015-08-10T02:54:17Z app[postgres.11]: [GREEN] checkpoint complete: wrote 2098 buffers (6.7%); 0 transaction log file(s) added, 0 removed, 58 recycled; write=169.538 s, sync=0.952 s, total=170.500 s; sync files=11, longest=0.887 s, average=0.086 s
2015-08-10T02:55:04Z app[postgres.3977]: [GREEN] statement: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3675]: [GREEN] canceling autovacuum task
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] sending cancel to blocking autovacuum PID 3675
Detail: Process 3977 waits for AccessExclusiveLock on relation 35169 of database 16385.
Query: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 still waiting for AccessExclusiveLock on relation 35169 of database 16385 after 1000.367 ms
Detail: Process holding the lock: 3675. Wait queue: 3977.
Query: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] process 3977 acquired AccessExclusiveLock on relation 35169 of database 16385 after 1000.779 ms
Query: DROP SCHEMA public CASCADE
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] drop cascades to 97 other objects
Detail: drop cascades to extension hstore
drop cascades to type daily_box_input
//<--84 tables, materialized views and views redacted for brevity-->//
drop cascades to table users
drop cascades to extension pg_stat_statements
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: CREATE SCHEMA public
2015-08-10T02:55:05Z app[postgres.3977]: [GREEN] statement: ALTER SCHEMA public OWNER TO uca46vrebkj2l9
2015-08-10T02:56:02Z app[postgres.4009]: [GREEN] statement: CREATE EXTENSION IF NOT EXISTS pg_stat_statements
OK this is hard for me to post because I feel so stupid, but the direct solution to my problem was that I didn't make my link public on Amazon S3. That's why I was getting a 403 error.
The potential confusion is that you can use the non-public-link, which is a long string of encoded characters. But it uses a cached URL. Because the cache is temporary, the cache was thrown away before my restore process could finish.
So the answer is to make the Amazon S3 link public, and then to use the public link, e.g.: https://s3-us-west-2.amazonaws.com/your_bucket_name/your_file_name.dump
.
Then: heroku pg:backups restore 'https://s3-us-west-2.amazonaws.com/your_bucket_name/your_file_name.dump' HEROKU_DATABASE_URL
.
Having said that, concurrent to this problem I believe there still is a 'deadlocking' issue related to AccessExclusiveLock
that I need to resolve separately, that was making the pg_restore take long enough for the cache to timeout, but that's a separate issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With