Using Debian Wheezy, PostgreSQL 9.3.
My database went down because the partition where it keeps the WAL files filled up.
So I deleted everything inside ./pg_xlog/ because I didn't know what those files were (yeah, incredibly stupid of me). Now the Postgres service won't start. The errors, according to syslog:
00000: could not open tablespace directory "pg_tblspc/16386/PG_9.3_201306121": File or directory not found
LOCAL: RelationCacheInitFileRemoveInDir, relcache.c:4895
00000: Primary checkpoint record is invalid
LOCAL: ReadCheckpointRecord, xlog.c:6543
00000: Secondary checkpoint record is invalid
LOCAL: ReadCheckpointRecord, xlog.c:6547
PANIC: XX000: could not locate a valid checkpoint record
LOCAL: StartupXLOG, xlog.c:5228
I'm not entirely sure whether the problem is that it can't find the proper pg_tblspc or the total lack of checkpoint WAL files. The actual path to where the databases are stored is /dados/PG_9.3_201306121. What can I do to make the service start again?
EDIT1:
Okay, I've managed to get the thing back online. Some databases were corrupted. I managed to DROPDB two of them (I couldn't even connect to them without triggering a service restart). I tried the same on another corrupted database, but got an error related to xlog again. I tried a clean restore over it, but the restore was incomplete. Then I created a new database and tried to restore an older backup of this database into it. That also came out incomplete.
Now I can't drop any databases or create new ones; I always get an "xlog flush request not satisfied" error. I tried running pg_resetxlog, but it didn't seem to do anything. The error also shows "cannot write to block 1 of pg_tblspc/16385/PG_9.3_201306121/36596452/11773", "write error may be permanent".
EDIT2: Part of the problem above was with that 11773 file. I've renamed it to 11773.corrupt and now the database allows me to create and drop again.
Postgres won't start after deleting pg_xlog files
Um, yeah. Don't do that.
What can I do to make the service start again?
Well, you've corrupted your database. Restore from backups. You have backups, right? Preferably a handy PITR archive like from PgBarman where you can restore up to 5 mins ago. No?
OK, first, archive the damaged copy. https://wiki.postgresql.org/wiki/Corruption
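As a rough illustration of "archive the damaged copy", here is a minimal sketch of a file-level copy taken with the server stopped. The data directory path is only the usual Debian default and the backup target is hypothetical; the /dados tablespace path is the one mentioned in the question. Adjust all three before relying on it.

```shell
#!/bin/sh
# Sketch: file-level copy of a damaged cluster, taken before any repair attempt.
# All three paths are assumptions; override them to match the real install.
PGDATA="${PGDATA:-/var/lib/postgresql/9.3/main}"   # assumed Debian default
TABLESPACE_DIR="${TABLESPACE_DIR:-/dados}"         # tablespace path from the question
BACKUP_DIR="${BACKUP_DIR:-/mnt/backup/pg-damaged}" # hypothetical target with free space

archive_damaged_cluster() {
    # A postmaster.pid usually means the server is still running. It can also
    # be left behind by a crash, so check for postgres processes before
    # removing it by hand.
    if [ -f "$PGDATA/postmaster.pid" ]; then
        echo "postmaster.pid found in $PGDATA; stop PostgreSQL first" >&2
        return 1
    fi

    mkdir -p "$BACKUP_DIR" || return 1

    # Archive the data directory as-is, damaged files included.
    tar -C "$(dirname "$PGDATA")" -cf "$BACKUP_DIR/pgdata.tar" \
        "$(basename "$PGDATA")" || return 1

    # Tablespaces live outside the data directory, so copy them as well.
    if [ -n "$TABLESPACE_DIR" ] && [ -d "$TABLESPACE_DIR" ]; then
        tar -C "$(dirname "$TABLESPACE_DIR")" -cf "$BACKUP_DIR/tablespace.tar" \
            "$(basename "$TABLESPACE_DIR")" || return 1
    fi

    echo "archived to $BACKUP_DIR"
}
```

Calling archive_damaged_cluster as root or as the postgres user, with the service stopped, leaves two tar files in BACKUP_DIR. Every later repair step can then be retried from that copy if it makes things worse.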
Now. If you're lucky, pg_resetxlog will get you up and running enough to successfully do a pg_dump of the database, so you can then move the old damaged install's datadir aside, initdb a new one, and restore the database to it.
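The sequence just described could look roughly like the sketch below. It is not a tested recovery procedure. The paths, the dump file location, and the DRY_RUN switch are all assumptions for illustration, and it uses pg_dumpall rather than per-database pg_dump to keep the example short. It also assumes pg_ctl can find postgresql.conf inside the data directory, which is not the stock Debian layout (there the pg_ctlcluster wrappers are the usual route). By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the reset -> dump -> re-init -> restore sequence.
# Run as the postgres OS user, with the server stopped, and only after a
# file-level copy of the damaged cluster exists. All paths are assumptions.
PGBIN="${PGBIN:-/usr/lib/postgresql/9.3/bin}"
PGDATA="${PGDATA:-/var/lib/postgresql/9.3/main}"
DUMP_FILE="${DUMP_FILE:-/mnt/backup/salvage.sql}"   # hypothetical location
DRY_RUN="${DRY_RUN:-1}"                             # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

salvage_cluster() {
    # Force a WAL reset; this discards whatever was only in the deleted WAL.
    run "$PGBIN/pg_resetxlog" -f "$PGDATA" || return 1
    # Start the damaged cluster just long enough to dump it.
    run "$PGBIN/pg_ctl" start -w -D "$PGDATA" || return 1
    run "$PGBIN/pg_dumpall" -f "$DUMP_FILE" || return 1
    run "$PGBIN/pg_ctl" stop -w -m fast -D "$PGDATA" || return 1
    # Move the damaged datadir aside and build a fresh cluster in its place.
    run mv "$PGDATA" "$PGDATA.damaged" || return 1
    run "$PGBIN/initdb" -D "$PGDATA" || return 1
    run "$PGBIN/pg_ctl" start -w -D "$PGDATA" || return 1
    # Load the dump into the new cluster.
    run "$PGBIN/psql" -f "$DUMP_FILE" postgres || return 1
}
```

Calling salvage_cluster with DRY_RUN=1 prints the eight commands so they can be reviewed first; DRY_RUN=0 executes them and stops at the first failure. If the cluster uses tablespaces, their old directories would probably need to be moved aside as well before the restore can recreate them.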
If you're unlucky, pg_dump won't succeed, or you'll get restore failures due to things like duplicate primary keys. In the latter case you might have to repair the dump by hand. If pg_dump fails, the appropriate action will depend on why it fails.
So yeah. Don't delete pg_xlog.
There are discussions within the PostgreSQL community about renaming pg_xlog to something that makes it more obvious that it's an important component of the database, and hopefully it'll get done in the 9.7 release.