I have a query like this that nicely generates a series of dates between 2 given dates:
select date '2004-03-07' + j - i as AllDate
from generate_series(0, extract(doy from date '2004-03-07')::int - 1) as i,
generate_series(0, extract(doy from date '2004-08-16')::int - 1) as j
It generates 162 dates between 2004-03-07 and 2004-08-16, and this is what I want. The problem with this code is that it wouldn't give the right answer when the two dates are from different years, for example when I try 2007-02-01 and 2008-04-01.
Is there a better solution?
Enter the simple but handy set-returning function of Postgres: generate_series. As the name implies, generate_series allows you to generate a set of data starting at some point, ending at another point, and optionally set the incrementing value. generate_series works on two kinds of data: numbers (integer, bigint, numeric) and timestamps.
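A minimal sketch of both flavours, assuming nothing beyond a stock Postgres install. The aliases n and ts are only illustrative names:

```sql
-- Integers: start, stop, optional step (the step defaults to 1).
SELECT n FROM generate_series(1, 10, 3) AS n;   -- 1, 4, 7, 10

-- Timestamps: the step is mandatory and must be an interval.
SELECT ts
FROM   generate_series(timestamp '2004-03-07 00:00'
                     , timestamp '2004-03-07 12:00'
                     , interval '6 hours') AS ts;  -- 00:00, 06:00, 12:00
```

The stop value is included whenever the step lands on it exactly, as in both calls here.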
In PostgreSQL, the make_interval() function creates an interval from years, months, weeks, days, hours, minutes and seconds fields. You provide the years, months, weeks, days, hours, minutes and/or seconds fields, and it will return an interval in the interval data type.
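As a rough illustration of how make_interval() can feed generate_series(), here is a small sketch. It assumes Postgres 9.4 or later, where make_interval() is available, and uses the function's named parameters (years, months, weeks, days, hours, mins, secs). The alias d is just an illustrative name:

```sql
-- make_interval() takes named arguments; every omitted field defaults to 0.
SELECT make_interval(days => 1);                -- 1 day
SELECT make_interval(weeks => 1, hours => 12);  -- 7 days 12:00:00

-- Used as the step of generate_series(): one row per week.
SELECT d::date
FROM   generate_series(timestamp '2004-03-07'
                     , timestamp '2004-03-28'
                     , make_interval(weeks => 1)) AS d;
```

This is mostly useful when the step is computed from a variable or a column. For a fixed step, a plain literal such as interval '1 week' does the same job.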
This can be done without conversion to/from int (but to/from timestamp instead):
SELECT date_trunc('day', dd)::date
FROM generate_series
( '2007-02-01'::timestamp
, '2008-04-01'::timestamp
, '1 day'::interval) dd
;
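A quick sanity check for the cross-year range from the question, written as a sketch. The column aliases first_day, last_day and days are only illustrative:

```sql
-- The range from the question spans a year boundary and Feb 29, 2008.
SELECT min(dd::date) AS first_day
     , max(dd::date) AS last_day
     , count(*)      AS days       -- 426, both endpoints included
FROM   generate_series(timestamp '2007-02-01'
                     , timestamp '2008-04-01'
                     , interval '1 day') AS dd;
```

The count matches date '2008-04-01' - date '2007-02-01' + 1, so no day is skipped or duplicated at the year change.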
To generate a series of dates this is the optimal way:
SELECT t.day::date
FROM generate_series(timestamp '2004-03-07'
, timestamp '2004-08-16'
, interval '1 day') AS t(day);
An additional date_trunc() is not needed. The cast to date (day::date) does that implicitly.
But there is also no point in casting date literals to date as input parameter. Au contraire, timestamp is the best choice. The advantage in performance is small, but there is no reason not to take it. And you do not needlessly involve DST (daylight saving time) rules coupled with the conversion from date to timestamp with time zone and back. See below.
Equivalent, less explicit short syntax:
SELECT day::date
FROM generate_series(timestamp '2004-03-07', '2004-08-16', '1 day') day;
Or with the set-returning function in the SELECT list:
SELECT generate_series(timestamp '2004-03-07', '2004-08-16', '1 day')::date AS day;
The AS keyword is required in the last variant; Postgres would misinterpret the column alias day otherwise. And I would not advise that variant before Postgres 10 - at least not with more than one set-returning function in the same SELECT list.
(That aside, the last variant is typically fastest by a tiny margin.)
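To make the caution about several set-returning functions in one SELECT list more concrete, here is a small sketch. The row counts in the comments reflect my understanding of the behaviour change in Postgres 10 and are worth verifying on your own version. The aliases a and b are illustrative:

```sql
-- Two set-returning functions of different length in one SELECT list.
-- Postgres 10 or later: 3 rows, evaluated in lockstep, the shorter set
-- padded with NULL: (1,1), (2,2), (NULL,3).
-- Before Postgres 10: 6 rows (least common multiple of 2 and 3).
SELECT generate_series(1, 2) AS a
     , generate_series(1, 3) AS b;

-- Moving the functions to the FROM list makes the intent explicit
-- and behaves the same in every version: a plain cross product, 6 rows.
SELECT a, b
FROM   generate_series(1, 2) AS a
CROSS  JOIN generate_series(1, 3) AS b;
```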
Why timestamp [without time zone]?
There are a number of overloaded variants of generate_series(). Currently (Postgres 11):
SELECT oid::regprocedure AS function_signature
, prorettype::regtype AS return_type
FROM pg_proc
WHERE proname = 'generate_series';
function_signature                                                               | return_type
:------------------------------------------------------------------------------- | :--------------------------
generate_series(integer,integer,integer)                                         | integer
generate_series(integer,integer)                                                 | integer
generate_series(bigint,bigint,bigint)                                            | bigint
generate_series(bigint,bigint)                                                   | bigint
generate_series(numeric,numeric,numeric)                                         | numeric
generate_series(numeric,numeric)                                                 | numeric
generate_series(timestamp without time zone,timestamp without time zone,interval) | timestamp without time zone
generate_series(timestamp with time zone,timestamp with time zone,interval)      | timestamp with time zone
(numeric variants were added with Postgres 9.5.) The relevant ones are the last two, taking and returning timestamp / timestamptz.
There is no variant taking or returning date. An explicit cast is needed to return date. The call with timestamp arguments resolves to the best variant directly without descending into function type resolution rules and without additional cast for the input.
timestamp '2004-03-07' is perfectly valid, btw. The omitted time part defaults to 00:00 with ISO format.
Thanks to function type resolution we can still pass date. But that requires more work from Postgres. There is an implicit cast from date to timestamp as well as one from date to timestamptz. That would be ambiguous, but timestamptz is "preferred" among "date/time types". So the match is decided at step 4d:
Run through all candidates and keep those that accept preferred types (of the input data type's type category) at the most positions where type conversion will be required. Keep all candidates if none accept preferred types. If only one candidate remains, use it; else continue to the next step.
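One way to see which variant the resolution rules end up picking is to ask for the type of the returned column. This is only a sketch. The aliases d and picked_type are illustrative, and the comments state what I would expect given that no date variant exists:

```sql
-- date input: no exact match, so the preferred timestamptz variant is chosen.
SELECT pg_typeof(d) AS picked_type
FROM   generate_series(date '2004-03-07', date '2004-03-08', interval '1 day') AS d
LIMIT  1;   -- timestamp with time zone

-- timestamp input: exact match, no extra cast on the way in.
SELECT pg_typeof(d) AS picked_type
FROM   generate_series(timestamp '2004-03-07', timestamp '2004-03-08', interval '1 day') AS d
LIMIT  1;   -- timestamp without time zone
```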
In addition to the extra work in function type resolution, this adds an extra cast to timestamptz - which not only adds more cost, it can also introduce problems with DST leading to unexpected results in rare cases. (DST is a moronic concept, btw, can't stress this enough.)
I added demos to the fiddle showing the more expensive query plan: db<>fiddle here
You can generate series directly with dates. No need to use ints or timestamps:
select date::date
from generate_series(
'2004-03-07'::date,
'2004-08-16'::date,
'1 day'::interval
) date;