I was trying out PostgreSQL on Google Cloud SQL and loaded a simple school schema:
CREATE TABLE school (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE class (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT,
    school_id INTEGER NOT NULL REFERENCES school
);
CREATE TABLE student (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT,
    class_id INTEGER NOT NULL REFERENCES class
);
-- All id and foreign key columns have indexes
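The foreign-key indexes mentioned in that comment might look like this (the primary keys are indexed automatically; the index names here are only assumptions for illustration):
CREATE INDEX class_school_id_idx ON class (school_id);
CREATE INDEX student_class_id_idx ON student (class_id);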
I loaded ~15 million rows in total: 1500 schools, 500 classes per school, and 200 students per class.
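One way data of that shape could be generated (a rough sketch with generate_series, following the per-level counts stated above; names are arbitrary):
INSERT INTO school (name)
SELECT 'school ' || g FROM generate_series(1, 1500) g;

INSERT INTO class (name, school_id)
SELECT 'class ' || g, s.id FROM school s, generate_series(1, 500) g;

INSERT INTO student (name, class_id)
SELECT 'student ' || g, c.id FROM class c, generate_series(1, 200) g;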
After that I created a simple pgbench script:
\setrandom sId1 1 20000000
\setrandom sId2 1 20000000
\setrandom sId3 1 20000000
select count(*) from school s
join class c on s.id=c.school_id
join student stu on c.id=stu.class_id where s.id=:sId1;
select count(*) from school s
join class c on s.id=c.school_id
join student stu on c.id=stu.class_id where s.id=:sId2;
select count(*) from school s
join class c on s.id=c.school_id
join student stu on c.id=stu.class_id where s.id=:sId3;
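(Side note: \setrandom was removed from pgbench in PostgreSQL 9.6; on newer versions the equivalent, assuming the same range, would be
\set sId1 random(1, 20000000)
and likewise for sId2 and sId3.)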
Now I run the script with:
pgbench -c 90 -f ./sql.sql -n -t 1000
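(Here -c 90 opens 90 concurrent client connections, -f points at the script above, -n skips vacuuming the standard pgbench tables, and -t 1000 runs 1000 transactions per client.)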
2 cores, 7.5 GB, 90 clients:
OUTPUT:
number of transactions actually processed: 90000/90000
tps = 1519.690555 (including connections establishing)
tps = 2320.408683 (excluding connections establishing)
26 cores, 30 GB, 90 clients:
number of transactions actually processed: 90000/90000
tps = 1553.721286 (including connections establishing)
tps = 2405.664795 (excluding connections establishing)
Question: why do we see only an ~80 tps increase going from 2 cores to 26 cores?
PostgreSQL uses only one core per connection: each client connection is served by a single backend process, so an individual query here runs on one core.
PostgreSQL attempts to do a lot of its work in memory and to spread writing to disk out over time to minimize bottlenecks, but on an overloaded system with heavy writing it is easy for heavy reads and writes to slow the whole system down as it catches up on the demands.
If you're simply filtering the data and the data fits in memory, Postgres can scan roughly 5-10 million rows per second (assuming a reasonable row size of, say, 100 bytes). If you're aggregating, you're down to about 1-2 million rows per second.
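To check whether a query like the one in the question actually stays in memory, one option (a sketch, substituting a literal id for the pgbench variable) is:
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM school s
JOIN class c ON s.id = c.school_id
JOIN student stu ON c.id = stu.class_id
WHERE s.id = 42;
In the resulting plan, "shared hit" buffers were found in cache, while "read" buffers had to come from disk.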
Some of the tricks we have used to speed up SELECTs in PostgreSQL: LEFT JOIN with redundant conditions, VALUES, extended statistics, primary-key type conversion, CLUSTER, pg_hint_plan, and more.
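For example, the extended-statistics trick (PostgreSQL 10+) could be applied to this schema along these lines (a hypothetical sketch; the statistics name and column choice are only for illustration):
CREATE STATISTICS student_class_name_stats (ndistinct, dependencies)
    ON class_id, name FROM student;
ANALYZE student;
This gives the planner multi-column statistics it would not otherwise collect.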
I asked the same question on the PostgreSQL IRC channel. The community was fairly sure that I was maxing out the pgbench client itself and suggested using -j 4 in pgbench; tps increased to ~23k per second.
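In other words, something along these lines (4 threads is what was suggested; adjust to the client machine's cores):
pgbench -c 90 -j 4 -f ./sql.sql -n -t 1000
The -j flag sets the number of pgbench worker threads, so the benchmark driver itself is no longer limited to a single thread.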