Our database seems to be broken. Normally it uses about 1-2% of CPU, but when we run some additional backend services that issue UPDATE and INSERT queries against a 10M-row table (about 1 query every 3 seconds), everything goes to hell (including CPU usage jumping from 2% to 98%).
We decided to debug what's going on by running VACUUM and ANALYZE to learn what's wrong with the DB, but...
production=# ANALYZE VERBOSE users_user;
INFO: analyzing "public.users_user"
INFO: "users_user": scanned 280 of 280 pages, containing 23889 live rows and 57 dead rows; 23889 rows in sample, 23889 estimated total rows
INFO: analyzing "public.users_user"
INFO: "users_user": scanned 280 of 280 pages, containing 23889 live rows and 57 dead rows; 23889 rows in sample, 23889 estimated total rows
ERROR: tuple already updated by self
We are not able to finish ANALYZE on ANY of the tables and could not find any information about this issue. Any suggestions as to what could be wrong?
PostgreSQL 9.6.8 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
Additional info as requested in comments:
Maybe you have a corrupted pg_class
SELECT * FROM pg_class WHERE relname = 'users_user';
Output: https://pastebin.com/WhmkH34U
So the first thing to do would be to kick out all other sessions and try again
There are no additional sessions. We dumped the whole DB to a new testing server and the issue still occurs; there are no clients connected to this DB.
I'd recommend you to start the server with the following parameters before searching for duplicated rows:
enable_indexscan = off
enable_bitmapscan = off
ignore_system_indexes = on
If your server crashed, indexes could be out of sync with the table data. This happens when corruption affects transaction visibility (pg_clog), for example.
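One way to confirm that these settings are actually in effect is to read them back from pg_settings. This is only a sketch; it assumes the three parameters were set in postgresql.conf or on the server command line before the session started.

```sql
-- Read back the planner/catalog settings recommended above.
-- Expect 'off', 'off' and 'on' respectively in the "setting" column.
SELECT name, setting, source, context
FROM pg_catalog.pg_settings
WHERE name IN ('enable_indexscan',
               'enable_bitmapscan',
               'ignore_system_indexes')
ORDER BY name;
```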
Then search for a duplicated row in pg_class or pg_statistic, as mentioned in the comments.
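The duplicate search could look something like the queries below. They are a sketch, not a definitive diagnostic: they assume a superuser session (pg_statistic is not readable otherwise) and that index scans are disabled as described, so that rows hidden by a damaged unique index show up in a sequential scan. The grouping columns mirror the unique keys of each catalog in 9.6.

```sql
-- Duplicate OIDs in pg_class (oid must be unique).
SELECT oid, count(*) AS copies
FROM pg_catalog.pg_class
GROUP BY oid
HAVING count(*) > 1;

-- Duplicate (relname, relnamespace) pairs in pg_class.
SELECT relname, relnamespace, count(*) AS copies
FROM pg_catalog.pg_class
GROUP BY relname, relnamespace
HAVING count(*) > 1;

-- Duplicate statistics rows; (starelid, staattnum, stainherit) must be unique.
SELECT starelid::regclass AS table_name, staattnum, stainherit, count(*) AS copies
FROM pg_catalog.pg_statistic
GROUP BY starelid, staattnum, stainherit
HAVING count(*) > 1;

-- Physical location and visibility info for the table from the question.
SELECT ctid, xmin, xmax, oid, relname
FROM pg_catalog.pg_class
WHERE relname = 'users_user';
```

Any row returned by the first three queries points to a catalog entry that exists more than once, which would be consistent with the "tuple already updated by self" error.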
You could also try to clean up pg_statistic. First, start the server with:
allow_system_table_mods = on
And then issue a TRUNCATE TABLE and ANALYZE afterward:
--Cleaning pg_statistic
TRUNCATE TABLE pg_catalog.pg_statistic;
--Analyze entire database
ANALYZE VERBOSE;
If the problem is in pg_statistic this should be enough.
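To check that the statistics were rebuilt, something like the following could be run after ANALYZE finishes. It is a sketch that assumes the table from the question (public.users_user); the pg_stats view is the readable front end to pg_statistic.

```sql
-- Total number of statistics rows; should be non-zero after ANALYZE.
SELECT count(*) AS stat_rows
FROM pg_catalog.pg_statistic;

-- Per-column statistics for the table from the question.
SELECT attname, null_frac, n_distinct
FROM pg_catalog.pg_stats
WHERE schemaname = 'public'
  AND tablename = 'users_user'
ORDER BY attname;
```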