
Using BETWEEN operator with timestamp values in Postgres

Recently a StackOverflow user told me that the BETWEEN operator should not be used with values of data type timestamp without time zone. Below is the quote.

Between means >= and <= and shall not be used with ranges that contain timestamps.

When I asked for an explanation of this claim, or for a link to the Postgres documentation that states it, I got this answer:

Why would such a simple thing need a site with documentation. I am sure you can find many anyway if you google (at least my detailed posts on various forums demonstrating the case)

Well I googled. And found nothing that would advise against using this operator with timestamp values. In fact this answer on SO uses them and so does this mailing group post.

I was informed that all these years I was doing it wrong. Is it really the case?

As far as I know, the maximum precision of a Postgres timestamp is 1 microsecond (correct me if I'm wrong). So aren't the statements below equivalent?

sample_date BETWEEN x AND y::timestamp - INTERVAL '1 microsecond'

and

sample_date >= x AND sample_date < y
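One way to check this is to evaluate both forms on either side of the upper bound. The following is only a sketch: the literal dates are made up for illustration, and it assumes the default timestamp precision of 6 fractional digits. Both columns are expected to agree: true for the last microsecond before the bound, false for the bound itself.

```sql
-- Illustrative values only: x = 2020-01-01, y = 2020-02-01.
SELECT ts,
       ts BETWEEN timestamp '2020-01-01'
              AND timestamp '2020-02-01' - interval '1 microsecond' AS closed_form,
       ts >= timestamp '2020-01-01'
          AND ts < timestamp '2020-02-01'                           AS half_open_form
FROM (VALUES
        (timestamp '2020-01-31 23:59:59.999999'),  -- last representable instant before y
        (timestamp '2020-02-01 00:00:00')          -- y itself
     ) AS t(ts);
```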

Edit: The sample is just a consideration of the difference. I'm aware of the fact that developers can miss the time part, but assuming one knows how it behaves, why should it not be used? Generally speaking, this is merely a sample, but I'm wondering about the bigger scope. I've been investigating the planner and it seems to be parsing BETWEEN to >= AND <=.

Why would one prefer writing >= AND <= over BETWEEN as far as the results are concerned, leaving aside the time needed to translate it?

Kamil Gosciminski asked May 22 '26 13:05


1 Answer

There is absolutely nothing wrong with using ts BETWEEN validfrom AND validto instead of ts >= validfrom AND ts <= validto. They are the same.
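A quick way to convince yourself is to compare the plans for both spellings. This is a sketch using a hypothetical scratch table (between_demo is not part of the question); the two EXPLAIN statements should print the same filter condition, written with >= and <=.

```sql
-- Hypothetical scratch table, used only for this illustration.
CREATE TEMP TABLE between_demo (ts timestamp NOT NULL);

-- Spelling 1: BETWEEN
EXPLAIN (COSTS OFF)
SELECT * FROM between_demo
WHERE ts BETWEEN '2020-01-01' AND '2020-02-01';

-- Spelling 2: explicit inclusive comparisons
EXPLAIN (COSTS OFF)
SELECT * FROM between_demo
WHERE ts >= '2020-01-01' AND ts <= '2020-02-01';
```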

I can only guess, but I'd say that the warning targets something different, namely whether either of the (identical) clauses above is the right thing to use.

Now this of course depends on what you are trying to do, but very often clauses like this are used to identify the one valid row for a particular timestamp. In that case the clause as above is wrong, because for a value of ts that falls exactly on the moment the row was changed, you would get two results.

Consider this:

CREATE TABLE names (
   id integer PRIMARY KEY,
   val text NOT NULL,
   validfrom timestamptz NOT NULL,
   validto timestamptz NOT NULL
);

INSERT INTO names VALUES (1, 'Smith', '1985-05-02 00:00:00', '2009-01-30 00:00:00');
INSERT INTO names VALUES (2, 'Jones', '2009-01-30 00:00:00', 'infinity');

This is meant to be a historized table of names for a person.

If you use a WHERE clause like the one above to query for the name valid at a certain time, it would work well for

SELECT val FROM names
WHERE current_timestamp BETWEEN validfrom AND validto;

But it would do the wrong thing for

SELECT val FROM names
WHERE '2009-01-30' BETWEEN validfrom AND validto;
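To see the problem, it helps to select the boundary columns as well. Assuming the names table and the two rows inserted above, this variant of the query returns both rows, because 2009-01-30 00:00:00 is at the same time the validto of 'Smith' and the validfrom of 'Jones', and BETWEEN includes both end points.

```sql
-- Assumes the names table and the two rows from the example above.
SELECT id, val, validfrom, validto
FROM names
WHERE timestamptz '2009-01-30' BETWEEN validfrom AND validto
ORDER BY id;
```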

That is because the end point of the interval of validity for a name is not part of the interval. For this case, it would be correct to write:

SELECT val FROM names
WHERE '2009-01-30' >= validfrom AND '2009-01-30' < validto;
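The difference between the two conditions can be shown side by side by counting matches for a few probe timestamps around the change. This is a sketch that assumes the names table and rows from above; the probe values and column aliases are made up for illustration. Only the probe that falls exactly on 2009-01-30 00:00:00 should give different counts: two matches for the closed interval, one for the half-open interval.

```sql
-- Assumes the names table and the two rows from the example above.
SELECT probe,
       count(*) FILTER (WHERE probe BETWEEN validfrom AND validto)      AS closed_matches,
       count(*) FILTER (WHERE probe >= validfrom AND probe < validto)   AS half_open_matches
FROM names
CROSS JOIN (VALUES
        (timestamptz '2009-01-29 23:59:59.999999'),  -- just before the change
        (timestamptz '2009-01-30 00:00:00'),         -- exactly at the change
        (timestamptz '2009-01-30 00:00:00.000001')   -- just after the change
     ) AS p(probe)
GROUP BY probe
ORDER BY probe;
```

With the half-open condition every probe matches exactly one row, which is what a historized table needs.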

Laurenz Albe answered May 25 '26 04:05