PostgreSQL DELETE/INSERT throughput issue

I have a throughput problem with DELETE/INSERT sequences on PostgreSQL 9.0. I am looking for ideas to improve the situation.

On the hardware available to us, I can INSERT new rows into a database at a sustained rate of 3000/s (evenly across 10 tables) well beyond the 1m rows in each table that I usually test to. However, if I switch to a mode where we DELETE a row and re-INSERT it with different data, performance drops by more than an order of magnitude to 250 rows/s (again, evenly across 10 tables).

There are no constraints on any table. There are 2 indexed columns in each table, with a total index size (at 1m rows per table) of 1GB, which fits comfortably within shared_buffers (2GB). Total data size (at 1m rows per table) is 12GB, which is much less than total system RAM. This is a shadow database which we can afford to rebuild in an emergency, so we run with fsync off.
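For reference, the settings just described correspond roughly to this postgresql.conf fragment (a sketch based on the figures above, not our actual config file):

shared_buffers = 2GB   # both indexed columns (~1GB of index in total) fit in cache
fsync = off            # acceptable only because this shadow database can be rebuilt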

It would appear that when we're in populate mode we benefit from very low disk seek times, because data is simply being appended. However, when we switch to update mode there is a lot of seeking going on (presumably to find and delete the old rows). Random disk seeks cost ~8 ms each (roughly 125 per second). Is there any way (without a change of hardware) that we can significantly improve the performance of the DELETE/re-INSERT operations?

EDIT1: I am running perf tests on two hardware platforms with different specs. The numbers quoted above are from the higher-spec platform. I have just completed a test run on the lower-spec platform. In this test I insert new rows as fast as possible, logging the insert rate every 10 seconds, until I have inserted 1 million rows. At that point my test script switches to updating random rows.

Perf results graph

This graph shows that the measured rate was ~150 updates/second across all 10 tables during population, and fell to fewer than 10 updates/second across all 10 tables once the script switched to updating existing rows.

@wildplasser - The machine is a real machine, not a VM. The 10 tables all have the following schema.

CREATE TABLE objecti_servicea_item1
(
  iss_scs_id text,
  iss_generation bigint,
  boolattr1 boolean,
  boolattr2 boolean,
  boolattr3 boolean,
  boolattr4 boolean,
  boolattr5 boolean,
  boolattr6 boolean,
  boolattr7 boolean,
  boolattr8 boolean,
  boolattr9 boolean,
  boolattr10 boolean,
  boolattr11 boolean,
  boolattr12 boolean,
  boolattr13 boolean,
  boolattr14 boolean,
  boolattr15 boolean,
  boolattr16 boolean,
  boolattr17 boolean,
  intattr1 bigint,
  intattr2 bigint,
  intattr3 bigint,
  intattr4 bigint,
  intattr5 bigint,
  intattr6 bigint,
  intattr7 bigint,
  intattr8 bigint,
  intattr9 bigint,
  intattr10 bigint,
  intattr11 bigint,
  intattr12 bigint,
  intattr13 bigint,
  intattr14 bigint,
  intattr15 bigint,
  intattr16 bigint,
  intattr17 bigint,
  strattr1 text[],
  strattr2 text[],
  strattr3 text[],
  strattr4 text[],
  strattr5 text[],
  strattr6 text[],
  strattr7 text[],
  strattr8 text[],
  strattr9 text[],
  strattr10 text[],
  strattr11 text[],
  strattr12 text[],
  strattr13 text[],
  strattr14 text[],
  strattr15 text[],
  strattr16 text[],
  strattr17 text[]
)
WITH (
  OIDS=FALSE
);
CREATE INDEX objecti_servicea_item1_idx_iss_generation
  ON objecti_servicea_item1
  USING btree
  (iss_generation );
CREATE INDEX objecti_servicea_item1_idx_iss_scs_id
  ON objecti_servicea_item1
  USING btree
  (iss_scs_id );

The "Updates" being performed involve the following SQL for each of the 10 tables.

DELETE FROM ObjectI_ServiceA_Item1 WHERE iss_scs_id = 'ObjUID39';
INSERT INTO ObjectI_ServiceA_Item1 
VALUES ('ObjUID39', '2', '0', NULL, '0'
, NULL, NULL, NULL, '1', '1', NULL, '0'
, NULL, NULL, NULL, NULL, '0', '1', '1'
, '-70131725335162304', NULL, NULL, '-5241412302283462832'
, NULL, '310555201689715409', '575266664603129486'
, NULL, NULL, NULL, NULL, NULL, NULL
, '-8898556182251816700', NULL, '3325820251460628173'
, '-3434461681822953613'
, NULL
, E'{pvmo2mt7dma37roqpuqjeu4p8b,"uo1kjt1b3eu9g5vlf0d02l6iaq\\\\\\",",45kfns1j80gc7fri0dm29hnrjo}'
, NULL, NULL
, E'{omjv460do8cb7abn8t3eg5b6ki,"a7hrlninbk1rmu6h3rd4787l7f\\\\\\",",24n3ipfua5spma2vrj2aji98g3}'
, NULL
, E'{1821v2n2ermm4jujrucu5tekmm,"ukgst224964uhthkhjj9v189ft\\\\\\",",6dfsaniq9mftvbdr8g1sr8e6as}'
, E'{c2a9gvf0fnd38m8vprlhkp2n74,"ts86vbat12lfr0d7l4tc29k9uk\\\\\\",",32b5j9r5evmrie4h21hi10dpot}'
, E'{18pve4cmcbrjiom9bpvoo1l4n0,"hrqcsane6r0n7u2oj79bj605rh\\\\\\",",32q5n18q3qbkuit605fv47270o}'
, E'{l3bf96shrpnnqgt35m7574t5n4,"cpol4k8296hbdqc9kac79oj0ua\\\\\\",",eqioulmb7vav10lbnc5jg752df}'
, E'{5fai108h163hpjcv0ofgfi7c28,"ci958009ddak3li7bp37slcs8i\\\\\\",",2itstj01tkprlul8f530uhs6s2}'
, E'{ueqfkdold8vc84jllr4b2cakt5,"t5vbea4r7tva091pa8j6886t60\\\\\\",",ul82aovhil1lpd290s14vd0p3i}'
, NULL, NULL, NULL, NULL, NULL);

Note that during the first phase of my perf test the DELETE command never matches an existing row, so it effectively does nothing.

@Frank Heikens - In the perf test I am running, the updates are issued from 10 threads. However, updates are assigned to threads in a way that ensures multiple updates to the same row are always handled by the same thread.

Asked by mchr

2 Answers

This data model isn't a beauty, and neither is the DELETE - INSERT pattern. What's wrong with an UPDATE? If iss_generation and iss_scs_id don't change in the UPDATE, the database can do a HOT update (Heap-Only Tuple) to increase performance. UPDATE will also benefit from a lower fillfactor.

When you DELETE a record, that record might be in a different block than the one the INSERT will go into. Using a lower fillfactor together with UPDATE gives the database the option to keep the new version of the record in the same block on disk, which results in less random I/O. When HOT can be used, things get even better because there is no need to update the indexes at all.
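As a rough sketch only (the column values here are illustrative placeholders, not the exact ones from the question's INSERT), the DELETE/INSERT pair for a row could be collapsed into a single UPDATE on the table shown above:

-- Update the row in place instead of DELETE + INSERT.
-- As long as the indexed columns (iss_scs_id, iss_generation) keep their
-- values and the page has free space, PostgreSQL can perform a HOT update
-- and avoid touching either index.
UPDATE objecti_servicea_item1
   SET boolattr1 = '0',
       boolattr8 = '1',
       intattr1  = '-70131725335162304',
       strattr1  = NULL
 WHERE iss_scs_id = 'ObjUID39';

Any column can appear in the SET list; only the indexed ones need to keep their old values for HOT to apply.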

Answered by Frank Heikens

Not sure, but maybe changing the fillfactor will help?
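As a sketch (the 70% figure is only an illustrative starting point and would need tuning for this workload), the fillfactor of an existing table could be lowered like this:

-- Leave ~30% of each heap page free so new row versions produced by
-- UPDATEs can stay on the same page (a prerequisite for HOT updates).
ALTER TABLE objecti_servicea_item1 SET (fillfactor = 70);

-- The new setting only affects pages written from now on; rewriting the
-- table applies it to the existing data as well.
VACUUM FULL objecti_servicea_item1;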

Answered by Michael Krelin - hacker