I'm a relative Postgres newbie, though I have some basic experience with MSSQL.
I have a table in PostgreSQL (PostGIS, it's spatial) that contains about 10,000,000 polygons. The machine it's sitting on has 64 GB of RAM, 16 cores and a 1 TB spinning HDD. This is one of the only tables in the database at the moment. Because access to the table is infrequent (perhaps once every few hours), I'm noticing that the table will not stay in RAM, as I would expect with MSSQL. Instead, the table seems to be released from memory and sits on disk even though the database is active. This leads to 100% HDD utilization for 15+ minutes when I want to query / join / interrogate it / etc. When the table does appear to be in memory, subsequent operations are notably faster (seconds rather than minutes).
Is there a way to ask Postgres to keep a certain table in memory, or to have the scheduler / whatever bits of Postgres perform the smarts keep the table in RAM, rather than letting it go to disk and then having to recall it into memory when required?
I have spatial indexes (and indexes on a couple of other columns that often require filtering / sorting), so when the table is served from memory it's quite fast.
This same issue also seems to massively affect JOINs, because they too require the table to be read first. That's a separate issue for me, but it seems to stem from the same root problem: being disk-I/O bound.
My DB settings are as follows, so as far as I can tell I'm not generally limited by available memory / RAM.
Edit: The table is 26 GB.
Postgres 13.2 with PostGIS 3.1.1
max_connections = '20';
shared_buffers = '8GB';
effective_cache_size = '24GB';
maintenance_work_mem = '2047MB';
checkpoint_completion_target = '0.9';
wal_buffers = '16MB';
default_statistics_target = '500';
random_page_cost = '4';
work_mem = '26214kB';
min_wal_size = '4GB';
max_wal_size = '16GB';
max_worker_processes = '16';
max_parallel_workers_per_gather = '8';
max_parallel_workers = '16';
max_parallel_maintenance_workers = '4';
You can't run Pg in-process or in-memory: PostgreSQL is implemented in C and compiled to platform code. Unlike H2 or Derby, you can't just load a jar and fire it up as a throwaway in-memory DB. And unlike SQLite, which is also written in C and compiled to platform code, PostgreSQL can't be loaded in-process either.
PostgreSQL normally stores its table data in blocks of 8KB. The number of these blocks per table is limited to an unsigned 32-bit integer (about four billion), giving a maximum table size of 32TB with the default block size.
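If you want to verify the block size on your own installation, it can be read back with a simple query:

SHOW block_size;  -- 8192 bytes on a default build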
Since plain VACUUM does not take an exclusive lock on the table, it has little to no impact on other database work. Autovacuum should be configured based on the usage pattern of the database.
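As a rough sketch of what per-table tuning can look like (the table name and values are placeholders, not recommendations), autovacuum thresholds can be overridden via storage parameters:

ALTER TABLE my_polygons SET (autovacuum_vacuum_scale_factor = 0.05, autovacuum_analyze_scale_factor = 0.02);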
The easiest and most efficient way to export data from a Postgres table to a CSV file is the COPY command. COPY generates the CSV file on the database server. You can export an entire table, or the results of a query, with COPY ... TO.
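For example (the table name, columns and paths are placeholders, and the server-side path must be writable by the postgres process):

COPY my_polygons TO '/tmp/my_polygons.csv' WITH (FORMAT csv, HEADER);
COPY (SELECT gid, name FROM my_polygons LIMIT 100) TO '/tmp/my_polygons_subset.csv' WITH (FORMAT csv, HEADER);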
You didn't say how large the table is.
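For reference, the on-disk size including indexes and TOAST can be checked like this (the table name is a placeholder):

SELECT pg_size_pretty(pg_total_relation_size('my_polygons'));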
Data is never evicted from shared buffers just out of boredom. Only to make room for other things, or because they became invalid (table dropped or truncated, etc.). So if you make shared buffers large enough and read the whole table in with pg_prewarm, it will stay there until displaced. (There is a mechanism for sequential scans of large tables that will preferentially evict data it just read to make room for more data from the same table, but pg_prewarm is not subject to that). If the entire database fits in shared buffers, it will all be retained indefinitely.
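A minimal sketch of that approach, assuming the pg_prewarm extension is available and using placeholder table/index names:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('my_polygons');           -- load the heap into shared buffers
SELECT pg_prewarm('my_polygons_geom_idx');  -- prewarm the spatial index as well

For the whole table to stay resident, shared_buffers would need to be considerably larger than the current 8GB, since the table alone is 26GB.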
Stock PostgreSQL offers no other way to pin a table or list of tables into memory.