Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Slow PostgreSQL sequential scans on RDS?

I have an RDS PostgreSQL instance that's running simple queries, much slower than I would expect - particularly sequential scans, like copying a table or counting a table.

Eg. create table copied_table as (select * from original_table) or select count(*) from some_table

Running count(*) on a 30GB table takes ~15 minutes (with indexes, immediately following a vaccuum).

It's an RDS db.r3.large, 15 GB memory, 400GB SSD. Watching the metrics logs, I've never seen Read IOPS exceed 1,400 and it's usually around 500, well below my expected base.

Configuration: work_mem: 2GB, shared_buffers: 3GB, effective_cache_size: 8GB wal_buffers: 16MB, checkpoint_segments: 16

Is this the expected timing? Should I be seeing higher IOPS?

like image 244
jastr Avatar asked Sep 29 '16 03:09

jastr


People also ask

Why is Postgres using seq scan instead of index?

There are a few (normally good) reasons for Postgres choosing a sequential scan even when it could use an index scan: If the table is small. If a large proportion of the rows are being returned. If there is a LIMIT clause and it thinks it can abort early.

What is sequential scan in PostgreSQL?

Seq Scan. The Seq Scan operation scans the entire relation (table) as stored on disk (like TABLE ACCESS FULL ). Index Scan. The Index Scan performs a B-tree traversal, walks through the leaf nodes to find all matching entries, and fetches the corresponding table data.

How do I connect to a Postgres database from AWS?

Sign in to the AWS Management Console and open the Amazon RDS console at https://console.aws.amazon.com/rds/ . Open the RDS console and then choose Databases to display a list of your DB instances. Choose the PostgreSQL DB instance name to display its details. On the Connectivity & security tab, copy the endpoint.


1 Answers

There is not much you can do around plain count queries like that in Postgres, except in 9.6 that implemented parallel sequential scans, which is not available yet in RDS.

Event though, there is a some tips that you can find here. Generally, it's recommended to try to make Postgres to use Index Only Scan, by creating indexes and it's columns in the projection.

SELECT id FROM table WHERE id > 6 and id <100;
-- or
SELECT count(id) FROM table ...

Table should have an index on that column.

The queries that you exposed as example, won't avoid the sequential scan. For the CREATE TABLE, if you don't care about the order in the table, you can open a few backends and import in parallel by filtering by a key range. Also, the only way to speed up this on RDS is increasing IOPs.

like image 168
3manuek Avatar answered Sep 29 '22 21:09

3manuek