
PostgreSQL query very slow with limit 1

My query gets very slow when I add a LIMIT 1.

I have a table object_values with timestamped values for objects:

 timestamp  | objectID | value
------------+----------+--------
 2014-01-27 |      234 | ksghdf

Per object I want to get the latest value:

SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC LIMIT 1; 

(I cancelled the query after more than 10 minutes)

This query is very slow when there are no rows for the given objectID (it is fast if there are results). If I remove the limit, it tells me almost instantly that there are no results:

SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC;
...
Time: 0.463 ms

An EXPLAIN shows me that the query without the limit uses the index, whereas the query with LIMIT 1 does not make use of the index:

Slow query:

explain SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC limit 1;

                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..2350.44 rows=1 width=126)
   ->  Index Scan Backward using object_values_timestamp on object_values  (cost=0.00..3995743.59 rows=1700 width=126)
         Filter: (objectID = 53708)

Fast query:

explain SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC;

                                                  QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Sort  (cost=6540.86..6545.11 rows=1700 width=126)
   Sort Key: timestamp
   ->  Index Scan using object_values_objectID on working_hours_t  (cost=0.00..6449.65 rows=1700 width=126)
         Index Cond: (objectID = 53708)

The table contains 44,884,559 rows and 66,762 distinct objectIDs.
I have separate single-column indexes on both fields: timestamp and objectID.
I have run VACUUM ANALYZE on the table and I have reindexed it.
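For reference, a schema along these lines would reproduce the plans above; the index names are taken from the EXPLAIN output, but the column types are my assumptions, since the post does not show the DDL:

-- Assumed DDL; only the table and index names are confirmed by the plans.
CREATE TABLE object_values (
    "timestamp" timestamptz,
    objectID    integer,
    value       text
);
CREATE INDEX object_values_timestamp ON object_values ("timestamp");
CREATE INDEX object_values_objectID  ON object_values (objectID);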

Additionally, the slow query becomes fast when I set the limit to 3 or higher:

explain SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC limit 3;

                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=6471.62..6471.63 rows=3 width=126)
   ->  Sort  (cost=6471.62..6475.87 rows=1700 width=126)
         Sort Key: timestamp
         ->  Index Scan using object_values_objectID on object_values  (cost=0.00..6449.65 rows=1700 width=126)
               Index Cond: (objectID = 53708)

In general I assume it has to do with the planner making wrong assumptions about the execution costs and therefore choosing a slower execution plan.
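One way to check that assumption is to compare the planner's estimated row counts with the actual ones via EXPLAIN ANALYZE. A minimal sketch (EXPLAIN ANALYZE really executes the query, so I guard it with a statement_timeout; the 30s value is arbitrary):

-- EXPLAIN ANALYZE executes the statement, so cap the runtime in case the
-- pathological plan is chosen (the timeout aborts the statement with an error).
SET statement_timeout = '30s';
EXPLAIN ANALYZE SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC LIMIT 1;
-- Compare the planner's estimate (rows=1700) with the actual rows reported
-- per node; a large gap confirms a misestimate.
SET statement_timeout = 0;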

Is this the real reason? Is there a solution for this?

asked Jan 27 '14 by pat




1 Answer

You can avoid this issue by adding a second, otherwise unneeded column to the ORDER BY clause. With two sort keys, the single-column index on timestamp can no longer deliver the requested ordering by itself, so the planner stops choosing the backward scan over that index:

SELECT * FROM object_values WHERE (objectID = 53708) ORDER BY timestamp DESC, objectID DESC LIMIT 1;
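A common alternative fix (not from the original answer; the index name below is made up) is a composite index matching both the filter and the sort, so the planner can satisfy LIMIT 1 with a single index probe:

-- Hypothetical composite index; the DESC is optional, since a plain
-- (objectID, timestamp) index can also be scanned backward to produce
-- timestamp DESC within one objectID.
CREATE INDEX object_values_objectid_ts ON object_values (objectID, "timestamp" DESC);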
answered Sep 19 '22 by Brendan Nee