In some databases, such as MySQL, you can use the FORCE INDEX hint to instruct the query optimizer to use an index it would otherwise ignore. In that syntax, you put the FORCE INDEX clause after the FROM clause, followed by a list of named indexes that the query optimizer must use. As the answers below explain, PostgreSQL deliberately offers no equivalent hint.
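For instance, a minimal MySQL sketch (the posts table and posts_created_idx index are hypothetical names used only for illustration):

SELECT *
FROM posts FORCE INDEX (posts_created_idx)  -- MySQL-only hint; not valid in PostgreSQL
WHERE created_at > '2024-01-01';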
As we saw above, running a couple of queries on our posts table reveals that even when an index is available, Postgres will not always choose to use it. The reason is that indexes have a cost to create, to maintain (on writes), and to use (on reads).
If the SELECT returns more than approximately 5-10% of all rows in the table, a sequential scan is much faster than an index scan. This is because an index scan requires several IO operations for each row (look up the row in the index, then retrieve the row from the heap).
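To see the planner weigh this trade-off, a quick sketch (assuming a hypothetical posts table with an index on id):

EXPLAIN SELECT * FROM posts WHERE id = 42;  -- likely an Index Scan: matches one row
EXPLAIN SELECT * FROM posts WHERE id > 0;   -- likely a Seq Scan: matches nearly every row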
PostgreSQL automatically creates a unique index when a unique constraint or primary key is defined for a table. The index covers the columns that make up the primary key or unique constraint (a multicolumn index, if appropriate), and is the mechanism that enforces the constraint.
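For example (hypothetical table; the index names below follow PostgreSQL's default naming convention for these automatically created indexes):

CREATE TABLE orders (
    id  bigint PRIMARY KEY,  -- creates the unique index orders_pkey
    sku text UNIQUE          -- creates the unique index orders_sku_key
);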
Assuming you're asking about the common "index hinting" feature found in many databases, PostgreSQL doesn't provide such a feature. This was a conscious decision made by the PostgreSQL team. The reasoning is basically that index hints are a performance hack that tends to cause more problems later down the line as your data changes, whereas PostgreSQL's optimizer can re-evaluate the plan based on up-to-date statistics. In other words, what might be a good query plan today probably won't be a good query plan for all time, and index hints force a particular query plan for all time.
As a very blunt hammer, useful for testing, you can use the enable_seqscan and enable_indexscan parameters; see the documentation on the enable_* planner parameters. These are not suitable for ongoing production use. If you have issues with query plan choice, you should see the documentation for tracking down query performance issues. Don't just set the enable_* parameters and walk away.
Unless you have a very good reason for using the index, Postgres may be making the correct choice. Why?
See also this old newsgroup post.
Probably the only valid reason for using SET enable_seqscan = false is when you're writing queries and want to quickly see what the query plan would actually be were there large amounts of data in the table(s), or of course when you need to quickly confirm that your query is not using an index simply because the dataset is too small.
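A session-local sketch of that testing workflow (the posts table is hypothetical; never ship this in production code):

SET enable_seqscan = off;  -- heavily penalize sequential scans for this session only
EXPLAIN SELECT * FROM posts WHERE created_at > now() - interval '1 day';
RESET enable_seqscan;      -- restore the default behavior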
This problem typically happens when the query planner's estimated cost of an index scan is too high and doesn't correctly reflect reality. To fix this, you need to lower the random_page_cost configuration parameter. From the Postgres documentation:
Reducing this value [...] will cause the system to prefer index scans; raising it will make index scans look relatively more expensive.
You can do a quick test whether this will actually make Postgres use the index:
EXPLAIN <query>;  -- uses a sequential scan
SET random_page_cost = 1;
EXPLAIN <query>;  -- may use an index scan now
You can restore the default value afterwards with SET random_page_cost = DEFAULT; and you can change the global default permanently with ALTER SYSTEM SET random_page_cost = 1;
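A minimal sketch of the permanent change (the value 1.1 is only an example, following the SSD guidance below):

ALTER SYSTEM SET random_page_cost = 1.1;  -- persisted in postgresql.auto.conf
SELECT pg_reload_conf();                  -- apply the new setting without a restart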
Index scans require non-sequential disk page fetches. Postgres uses random_page_cost to estimate the cost of such non-sequential fetches relative to sequential fetches. The default value is 4.0, thus assuming an average cost factor of 4 compared to sequential fetches (taking caching effects into account).
The problem however is that this default value is unsuitable in the following scenarios:
1) Cached indices
If an index is already cached in RAM, then an index scan will always be significantly faster than a sequential scan. The query planner however doesn't exactly know which parts of the index are already cached, and thus might make an incorrect decision. The Postgres documentation says:
If your data is likely to be completely in cache, [...] decreasing random_page_cost can be appropriate.
So, how do you know whether "your data is likely to be cached"? Well, if a specific index is frequently used, and if the system has sufficient RAM, then the data is likely to be cached eventually, and random_page_cost should be set to a lower value. You'll have to experiment with different values and see what works for you.
You could also use the pg_prewarm extension for explicit data caching.
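A minimal sketch of that approach (using the transactions_date_idx index name from the example further below; substitute your own index):

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
SELECT pg_prewarm('transactions_date_idx');  -- loads the index into the buffer cache and returns the number of blocks read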
2) Solid-state drives
As per the documentation:
Storage that has a low random read cost relative to sequential, e.g., solid-state drives, might also be better modeled with a lower value for random_page_cost, e.g., 1.1.
This slide from a talk at PostgresConf 2018 also says that random_page_cost should be set to something between 1.0 and 2.0 for solid-state drives.
Sometimes PostgreSQL fails to make the best choice of indexes for a particular condition. As an example, suppose there is a transactions table with several million rows, of which there are several hundred for any given day, and the table has four indexes, on transaction_id, client_id, date, and description. You want to run the following query:
SELECT client_id, SUM(amount)
FROM transactions
WHERE date >= 'yesterday'::timestamp AND date < 'today'::timestamp AND
description = 'Refund'
GROUP BY client_id
PostgreSQL may choose to use the index transactions_description_idx instead of transactions_date_idx, which may lead to the query taking several minutes rather than less than one second. If this is the case, you can force using the index on date by fudging the description condition: appending an empty string turns the column into an expression, which no longer matches transactions_description_idx, so the planner falls back to the index on date:
SELECT client_id, SUM(amount)
FROM transactions
WHERE date >= 'yesterday'::timestamp AND date < 'today'::timestamp AND
description||'' = 'Refund'
GROUP BY client_id
The question in itself is very much invalid. Forcing a plan (by setting enable_seqscan=off, for example) is a very bad idea. It might be useful to check whether the forced plan would be faster, but production code should never use such tricks.
Instead, run EXPLAIN ANALYZE on your query, read the output, and find out why PostgreSQL chooses a (in your opinion) bad plan.
There are tools on the web that help with reading EXPLAIN ANALYZE output; one of them is explain.depesz.com, written by me.
Another option is to join the #postgresql channel on the freenode IRC network and talk to the people there, as optimizing a query is not a matter of "ask a question, get an answer, be happy". It's more like a conversation, with many things to check and many things to be learned.
One thing to note with PostgreSQL, when you expect an index to be used and it is not, is to VACUUM ANALYZE the table.
VACUUM ANALYZE schema.table;
This updates the statistics used by the planner to determine the most efficient way to execute a query, which may result in the index being used.
There is a trick to push Postgres to prefer a seqscan: adding an OFFSET 0 to the subquery. This is handy for optimizing requests joining big/huge tables when all you need is only the first/last n elements.
Let's say you are looking for the first/last 20 elements involving multiple tables with 100k (or more) entries each; there is no point building the whole query over all the data when what you're looking for is in the first 100 or 1000 entries. In this scenario, for example, it turns out to be over 10x faster to do a sequential scan.
See How can I prevent Postgres from inlining a subquery?
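A minimal sketch of the trick, using hypothetical posts and comments tables: the OFFSET 0 acts as an optimization fence, so the planner evaluates the subquery as written instead of inlining it into the outer query.

SELECT sub.*
FROM (
    SELECT p.id, p.created_at, c.body
    FROM posts p
    JOIN comments c ON c.post_id = p.id
    OFFSET 0  -- optimization fence: prevents inlining of this subquery
) sub
ORDER BY sub.created_at DESC
LIMIT 20;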