Configuration parameter work_mem in PostgreSQL on Linux

Tags:

I have to optimize queries by tuning basic PostgreSQL server configuration parameters. In documentation I've came across the work_mem parameter. Then I checked how changing this parameter would influence performance of my query (using sort). I measured query execution time with various work_mem settings and was very disappointed.

The table on which I perform my query contains 10,000,000 rows and there are 430 MB of data to sort. (Sort Method: external merge Disk: 430112kB).

With work_mem = 1MB, EXPLAIN output is:

Total runtime: 29950.571 ms (sort takes about 19300 ms).
Sort  (cost=4032588.78..4082588.66 rows=19999954 width=8) 
(actual time=22577.149..26424.951 rows=20000000 loops=1)
                 Sort Key: "*SELECT* 1".n
                 Sort Method:  external merge  Disk: 430104kB

With work_mem = 5MB:

Total runtime: 36282.729 ms (sort: 25400 ms).
Sort  (cost=3485713.78..3535713.66 rows=19999954 width=8) 
      (actual time=25062.383..33246.561 rows=20000000 loops=1)
      Sort Key: "*SELECT* 1".n
      Sort Method:  external merge  Disk: 430104kB

With work_mem = 64MB:

Total runtime: 42566.538 ms (sort: 31000 ms).
Sort  (cost=3212276.28..3262276.16 rows=19999954 width=8) 
(actual time=28599.611..39454.279 rows=20000000 loops=1)
                 Sort Key: "*SELECT* 1".n
                 Sort Method:  external merge  Disk: 430104kB

Can anyone explain why performance gets worse? Or suggest any other methods to makes queries execution faster by changing server parameters?

My query (I know it's not optimal, but I have to benchmark this kind of query):

SELECT n
FROM   (
    SELECT n + 1 AS n FROM table_name
    EXCEPT
    SELECT n FROM table_name) AS q1
ORDER BY n DESC;

Full execution plan:

Sort  (cost=5805421.81..5830421.75 rows=9999977 width=8) (actual time=30405.682..30405.682 rows=1 loops=1)
Sort Key: q1.n
Sort Method:  quicksort  Memory: 25kB
->  Subquery Scan q1  (cost=4032588.78..4232588.32 rows=9999977 width=8) (actual time=30405.636..30405.637 rows=1 loops=1)
    ->  SetOp Except  (cost=4032588.78..4132588.55 rows=9999977 width=8) (actual time=30405.634..30405.634 rows=1 loops=1)
           ->  Sort  (cost=4032588.78..4082588.66 rows=19999954 width=8) (actual time=23046.478..27733.020 rows=20000000 loops=1)
                 Sort Key: "*SELECT* 1".n
                 Sort Method:  external merge  Disk: 430104kB
                 ->  Append  (cost=0.00..513495.02 rows=19999954 width=8) (actual time=0.040..8191.185 rows=20000000 loops=1)
                       ->  Subquery Scan "*SELECT* 1"  (cost=0.00..269247.48 rows=9999977 width=8) (actual time=0.039..3651.506 rows=10000000 loops=1)
                             ->  Seq Scan on table_name  (cost=0.00..169247.71 rows=9999977 width=8) (actual time=0.038..2258.323 rows=10000000 loops=1)
                       ->  Subquery Scan "*SELECT* 2"  (cost=0.00..244247.54 rows=9999977 width=8) (actual time=0.008..2697.546 rows=10000000 loops=1)
                             ->  Seq Scan on table_name  (cost=0.00..144247.77 rows=9999977 width=8) (actual time=0.006..1079.561 rows=10000000 loops=1)
Total runtime: 30496.100 ms

677

asked Nov 12 '11 17:11

Grzes

1 Answers

I posted your query plan on explain.depesz.com, have a look.

The query planner's estimates are terribly wrong in some places. Have you run ANALYZE recently?

Read the chapters in the manual on Statistics Used by the Planner and Planner Cost Constants. Pay special attention to the chapters on random_page_cost and default_statistics_target.
You might try:

ALTER TABLE diplomas ALTER COLUMN number SET STATISTICS 1000;
ANALYZE diplomas;

Or go even a higher for a table with 10M rows. It depends on data distribution and actual queries. Experiment. Default is 100, maximum is 10000.

For a database of that size, only 1 or 5 MB of work_mem are generally not enough. Read the Postgres Wiki page on Tuning Postgres that @aleroot linked to.

As your query needs 430104kB of memory on disk according to EXPLAIN output, you have to set work_mem to something like 500MB or more to allow in-memory sorting. In-memory representation of data needs some more space than on-disk representation. You may be interested in what Tom Lane posted on that matter recently.

Increasing work_mem by just a little, like you tried, won't help much or can even slow down. Setting it to high globally can even hurt, especially with concurrent access. Multiple sessions might starve one another for resources. Allocating more for one purpose takes away memory from another if the resource is limited. The best setup depends on the complete situation.

To avoid side effects, only set it high enough locally in your session, and temporarily for the query:

SET work_mem = '500MB';

Reset it to your default afterwards:

RESET work_mem;

Or use SET LOCAL to set it just for the current transaction to begin with.

182

answered Oct 09 '22 10:10

Erwin Brandstetter

Related questions
                            
                                What package to use for database migrations in Go? [closed]
                            
                                Cannot connect to postgres server in docker
                            
                                Postgres closes connection during query after a few hundred seconds when using Psycopg2
                            
                                Writing more than 50 millions from Pyspark df to PostgresSQL, best efficient approach
                            
                                SQL Server - conditional aggregation with correlation
                            
                                Easy way to view data in pgAdmin (PostgreSQL)
                            
                                Referencing current row in FILTER clause of window function
                            
                                Postgres full text search across multiple related tables
                            
                                Why am I getting a "cross-database references are not implemented"?
                            
                                Rails migrations for postgreSQL schemas
                            
                                Postgres - How to debug/trace 'Idle in transaction' connection
                            
                                Rails 5 SQL Injection
                            
                                What does "tuple (0,79)" in postgres log file mean when a deadlock happened?
                            
                                Setting up PostgreSQL in GitLab CI
                            
                                Inserting and selecting PostGIS Geometry with Gorm
                            
                                Hibernate schema parameter doesn't work in @SequenceGenerator annotation
                            
                                Postgres 9.5+: UPSERT to return the count of updated and inserted rows
                            
                                postgres and docker-compose : can't create a custom role and database
                            
                                How to implement OneToMany, ManyToOne and ManyToMany with R2DBC in a project which previously used JPA?
                            
                                PostgreSQL under Lion: What's the password?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Configuration parameter work_mem in PostgreSQL on Linux

Tags:

postgresql

server-configuration

postgresql-performance

Grzes

People also ask

1 Answers

Erwin Brandstetter

Recent Activity

Donate For Us