When running a SELECT query against a single table l (no joins) with billions of rows, is it a good idea to run concurrent queries by splitting the query into multiple queries over distinct ranges of an indexed column, say the integer primary key id?
Or does Postgres internally do this already, leading to no significant gain in speed for the end user?
I have two use cases:
getting the total count of rows
getting the list of ids
Edit 1: The query has a WHERE clause on several columns; one of them is not indexed, the others are indexed:
SELECT id
FROM l
WHERE indexed_column_1 = 'A'
  AND indexed_column_2 = 'B'
  AND not_indexed_column_1 = 'C'
Postgres has had parallel query built in since version 9.6, and it has been improved in every version since. It will be much more efficient than manually splitting a SELECT on a big table.
You can tune max_parallel_workers (and max_parallel_workers_per_gather) to match your hardware and workload.
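A minimal sketch of checking and adjusting these settings and verifying that the planner actually picks a parallel plan, using the (hypothetical) column names from the question; the worker count is illustrative and depends on your hardware:

-- Check the current parallel query settings
SHOW max_parallel_workers;
SHOW max_parallel_workers_per_gather;

-- Allow more workers for this session (value is illustrative)
SET max_parallel_workers_per_gather = 8;

-- Verify the plan: look for a Gather node with
-- Parallel Seq Scan / Parallel Index Scan nodes below it
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM l
WHERE indexed_column_1 = 'A'
  AND indexed_column_2 = 'B'
  AND not_indexed_column_1 = 'C';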
Since you are only interested in the id column, it may help to have an index that contains id (which you already have if it is the primary key) and to fulfill the prerequisites for an index-only scan, most importantly a well-vacuumed table with an up-to-date visibility map.
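For the filtered query from the edit, an index-only scan additionally needs the filter columns in the index. A hedged sketch of such a covering index, using the hypothetical column names from the question (the index name is made up; INCLUDE requires PostgreSQL 11 or later):

-- Composite index on the filter columns, with id included so the query
-- can be answered from the index alone (index-only scan), provided the
-- visibility map is reasonably up to date (run VACUUM on the table).
CREATE INDEX CONCURRENTLY l_filter_covering_idx
    ON l (indexed_column_1, indexed_column_2, not_indexed_column_1)
    INCLUDE (id);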
In the case where you want to count the number of rows, you can just let PostgreSQL's internal query parallelization do the work. It will be faster, and the result will be consistent, because it is computed under a single snapshot (counts gathered from separate sessions could see different data).
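For illustration, a count under the same hypothetical filter; on a large table the plan typically shows Partial Aggregate nodes in the workers and a Finalize Aggregate above the Gather, i.e. the counting itself runs in parallel:

EXPLAIN (ANALYZE)
SELECT count(*)
FROM l
WHERE indexed_column_1 = 'A'
  AND indexed_column_2 = 'B'
  AND not_indexed_column_1 = 'C';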
In the case where you want to get the list of primary keys, it depends on the WHERE conditions of the query. If you are selecting only a few rows, parallel query will do nicely.
If you want all ids of the table, PostgreSQL will probably not choose a parallel plan, because the cost of passing so many values from the worker processes to the leader will outweigh the advantages of parallelization. In that case, you may indeed be faster with parallel sessions, as you envision.
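A sketch of that manual split, assuming a reasonably dense integer primary key id and the hypothetical filter from the question; each statement runs in its own session/connection, and the range boundaries are illustrative:

-- Session 1
SELECT id
FROM l
WHERE indexed_column_1 = 'A'
  AND indexed_column_2 = 'B'
  AND not_indexed_column_1 = 'C'
  AND id >= 1 AND id < 500000000;

-- Session 2
SELECT id
FROM l
WHERE indexed_column_1 = 'A'
  AND indexed_column_2 = 'B'
  AND not_indexed_column_1 = 'C'
  AND id >= 500000000 AND id < 1000000000;

-- ... and so on, until the ranges cover min(id) .. max(id) without overlap.

Keep in mind that separate sessions use separate snapshots; if the combined result must be consistent, you can share one snapshot across the sessions with pg_export_snapshot() and SET TRANSACTION SNAPSHOT.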