Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Speeding up PostgreSQL query where data is between two dates

I have a large table (> 50m rows) which has some data with an ID and timestamp:

id, timestamp, data1, ..., dataN

...with a multi-column index on (id, timestamp).

I need to query the table to select all rows with a certain ID where the timestamp is between two dates, which I am currently doing using:

SELECT * FROM mytable WHERE id = x AND timestamp BETWEEN y AND z

This currently takes over 2 minutes on a high end machine (2x 3Ghz dual-core Xeons w/HT, 16GB RAM, 2x 1TB drives in RAID 0) and I'd really like to speed it up.

I have found this tip which recommends using a spatial index, but the example it gives is for IP addresses. However, the speed increase (436s to 3s) is impressive.

How can I use this with timestamps?

like image 831
Roger Avatar asked Mar 17 '10 11:03

Roger


People also ask

How do I SELECT data between two dates in PostgreSQL?

In Postgresql, we can extract the date between two dates inclusively using BETWEEN clause. Let's understand through an example. SELECT name,end_date as left_date FROM employee WHERE end_date BETWEEN '1998-01-07' AND '2016-08-01';

How make PostgreSQL query run faster?

Some of the tricks we used to speed up SELECT-s in PostgreSQL: LEFT JOIN with redundant conditions, VALUES, extended statistics, primary key type conversion, CLUSTER, pg_hint_plan + bonus. Photo by Richard Jacobs on Unsplash.

How do I get the difference between two dates in a month in Postgres?

Use the PostgreSQL AGE() function to retrieve the interval between two timestamps or dates.

Does datediff work in PostgreSQL?

PostgreSQL provides a datediff function to users. The datediff means we can return the difference between two dates based on their specified interval. The datediff function plays an important role in the database management system because datediff functions as a calendar and it is very helpful to users.


2 Answers

That tip is only suitable when you have two columns A and B and use queries like:

where 'a' between A and B

That's not:

where A between 'a' and 'b'

Using index on date(column) rather than column could speed it up a little bit.

like image 59
Konrad Garus Avatar answered Sep 28 '22 04:09

Konrad Garus


Could you EXPLAIN the query for us? Then we know how the database executes your query. And what about the configuration? What are the settings for shared_buffers and work_mem? And when did you (or your system) the last vacuum and analyze? And last thing, what OS and pgSQL-version are you using?

You can create wonderfull indexes but without proper settings, the database can't use them very efficient.

like image 44
Frank Heikens Avatar answered Sep 28 '22 04:09

Frank Heikens