
Postgres birthdays selection

I work with a Postgres database. It has a users table with a birthdate column (a date field). Now I want to get all users who have their birthday in the upcoming week.

My first attempt: SELECT id FROM public.users WHERE id IN (long list) AND birthdate > NOW() AND birthdate < NOW() + interval '1 week'

But this returns no rows, obviously because of the year. How can I work around this problem?

And does anyone know how Postgres would handle birthdays on 29-02?

Mike asked Aug 02 '11 14:08


4 Answers

We can use a Postgres function to do this in a really nice way.

Assuming we have a table people, with a date of birth in the column dob, which is a date, we can create a function that will allow us to index this column ignoring the year. (Thanks to Zoltán Böszörményi):

CREATE OR REPLACE FUNCTION indexable_month_day(date) RETURNS TEXT as $BODY$
  SELECT to_char($1, 'MM-DD');
$BODY$ language 'sql' IMMUTABLE STRICT;

CREATE INDEX person_birthday_idx ON people (indexable_month_day(dob));

Now, we need to query against the table, and the index. For instance, to get everyone who has a birthday in April of any year:

SELECT * FROM people 
WHERE 
    indexable_month_day(dob) >= '04-01'
AND 
    indexable_month_day(dob) < '05-01';

There is one gotcha: if our start/finish period crosses over a year boundary, we need to change the query:

SELECT * FROM people 
WHERE 
    indexable_month_day(dob) >= '12-29'
OR 
    indexable_month_day(dob) < '01-04';
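
The switch between AND and OR can also be sketched outside the database. The following is a minimal Python illustration and not part of the original answer; month_day and in_window are hypothetical helper names. It relies on the same property the index uses: zero-padded 'MM-DD' strings sort in calendar order.

```python
import datetime

def month_day(d):
    """Mirror of to_char(d, 'MM-DD'): zero-padded, so string order is calendar order."""
    return d.strftime('%m-%d')

def in_window(dob, start, finish):
    """True if dob's month-day lies in [start, finish), given 'MM-DD' strings.

    A window whose start sorts after its finish wraps past New Year,
    so the two tests are joined with OR instead of AND.
    """
    md = month_day(dob)
    if start <= finish:
        return start <= md < finish
    return md >= start or md < finish

print(in_window(datetime.date(1990, 4, 15), '04-01', '05-01'))  # True
print(in_window(datetime.date(1985, 1, 2), '12-29', '01-04'))   # True
print(in_window(datetime.date(1985, 6, 2), '12-29', '01-04'))   # False
```

Comparing the two boundary strings is enough to detect the wrap, so the caller does not need to know which years the window falls in.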

To make sure we match leap-day birthdays, we need to know if we will 'move' them a day forward or backwards. In my case, it was simpler to just match on both days, so my general query looks like:

SELECT * FROM people 
WHERE 
    indexable_month_day(dob) > '%(start)%'
%(AND|OR)%
    indexable_month_day(dob) < '%(finish)%';

I have a django queryset method that makes this all much simpler:

import datetime

def birthday_between(self, start, finish):
    """Return the members of this queryset whose birthdays
    lie on or between start and finish."""
    # Widen the range by one day on each side, because the SQL tests are strict (< and >).
    start = start - datetime.timedelta(1)
    finish = finish + datetime.timedelta(1)
    return self.extra(where=["indexable_month_day(dob) < '%(finish)s' %(andor)s indexable_month_day(dob) > '%(start)s'" % {
        'start': start.strftime('%m-%d'),
        'finish': finish.strftime('%m-%d'),
        # Same year: one range (and). Crossing New Year: two open-ended ranges (or).
        'andor': 'and' if start.year == finish.year else 'or'
    }])

def birthday_on(self, date):
    return self.birthday_between(date, date)

Now, I can do things like:

Person.objects.birthday_on(datetime.date.today())

Matching leap-day birthdays only on the day before, or only on the day after, is also possible: you just need to change the SQL test to '>=' or '<=', and not adjust the start/finish in the Python function.
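
To see why widening the range by a day with strict comparisons catches leap-day birthdays on both neighbouring days, here is a small Python sketch. It is an illustration only; strict_bounds and matches are hypothetical names, and the year-boundary case is ignored for brevity.

```python
import datetime

def strict_bounds(day):
    """Widen a single day by one day on each side and return the
    (start, finish) 'MM-DD' strings to be used with strict > and <."""
    one = datetime.timedelta(days=1)
    return (day - one).strftime('%m-%d'), (day + one).strftime('%m-%d')

def matches(md, day):
    """True if the 'MM-DD' string md falls strictly between the widened bounds."""
    start, finish = strict_bounds(day)
    return start < md < finish

# 2011 is not a leap year, so Feb 28 is followed directly by Mar 1.
print(matches('02-29', datetime.date(2011, 2, 28)))  # True
print(matches('02-29', datetime.date(2011, 3, 1)))   # True
print(matches('02-29', datetime.date(2011, 3, 2)))   # False
```

In a non-leap year the string '02-29' sorts between '02-28' and '03-01', so it slips into the open interval around either of those days.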

Matthew Schinckel answered Sep 28 '22 05:09


I'm not overly confident in this, but it seems to work in my testing. The key here is the OVERLAPS operator, and some date arithmetic.

I assume you have a table:

create temporary table birthdays (name varchar, bday date);

Then I put some stuff into it:

insert into birthdays (name, bday) values 
('Aug 24', '1981-08-24'), ('Aug 04', '1982-08-04'), ('Oct 10', '1980-10-10');

This query will give me the people with birthdays in the next week:

select * from 
  (select *, bday + date_trunc('year', age(bday)) + interval '1 year' as anniversary from birthdays) bd 
where 
  (current_date, current_date + interval '1 week') overlaps (anniversary, anniversary)

The date_trunc truncates the age to whole years, so adding it to bday moves the date forward by the number of full years elapsed. I wound up having to add one more year, which suggested an off-by-one to me. The likely explanation is that the truncated age lands on the most recent birthday, which is already in the past, so one more year gives the next one. In any case, there are other ways to do this calculation. age gives you the interval from the date or timestamp to today; I'm adding the years between the birthday and today to get the upcoming anniversary.

The real key is using overlaps to find records whose dates overlap. I use the anniversary date twice to get a point-in-time.
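
The same "next anniversary" idea can be sketched in Python. This is an illustration under stated assumptions, not a translation of the query: next_anniversary and birthday_within are hypothetical names, a Feb 29 birthday is assumed to fall back to Feb 28 in non-leap years, and a birthday that falls today counts as upcoming.

```python
import datetime

def next_anniversary(bday, today):
    """Next occurrence of bday's month and day on or after today.

    Assumption: a Feb 29 birthday falls back to Feb 28 in non-leap years.
    """
    for year in (today.year, today.year + 1):
        try:
            candidate = bday.replace(year=year)
        except ValueError:  # Feb 29 in a non-leap year
            candidate = datetime.date(year, 2, 28)
        if candidate >= today:
            return candidate

def birthday_within(bday, today, days=7):
    """True if the next anniversary lies in [today, today + days]."""
    return next_anniversary(bday, today) <= today + datetime.timedelta(days=days)

today = datetime.date(2011, 8, 2)
print(birthday_within(datetime.date(1982, 8, 4), today))    # True
print(birthday_within(datetime.date(1981, 8, 24), today))   # False
print(birthday_within(datetime.date(1980, 10, 10), today))  # False
```

Trying the current year first and then the following year avoids the separate "add one year" step, because the candidate is only accepted once it is not in the past.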

Daniel Lyons answered Sep 28 '22 05:09


Finally, to show the upcoming birthdays of the next 14 days I used this:

SELECT 
    -- 14 days before birthday of 2000
    to_char( to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') - interval '14 days' , 'YYYY-MM-dd')  as _14b_b2000,
    -- birthday of 2000
    to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') as date_b2000,
    -- current date of 2000
    to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') as date_c2000,
    -- 14 days after current date of 2000
    to_char( to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') + interval '14 days' , 'YYYY-MM-dd') as _14a_c2000,
    -- 1 year after birthday of 2000
    to_char( to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') + interval '1 year' , 'YYYY-MM-dd') as _1ya_b2000
FROM c
WHERE 
    -- the condition 
    -- current date of 2000 between 14 days before birthday of 2000 and birthday of 2000
    to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') between 
        to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') - interval '14 days' and 
        to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') 
    or 
    -- 1 year after birthday of 2000 between current date of 2000 and 14 days after current date of 2000
    to_date(to_char(c.birthdate, '2000-MM-dd'), 'YYYY-MM-dd') + interval '1 year' between 
        to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') and 
        to_date(to_char(current_date, '2000-MM-dd'), 'YYYY-MM-dd') + interval '14 days' 
;

So: to solve the leap-year issue, I set both the birthdate and the current date to the year 2000 (a leap year), and compute the intervals only from these normalized dates.

To take care of dates near the end or beginning of the year, I first compare the 2000 current date to the 2000 birthday interval. In case the current date is at the end of the year and the birthday is at the beginning, I also compare the 2001 birthday to the 2000 current-date interval.
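
The two-branch condition may be easier to follow as a short Python sketch. It is only an illustration of the same reasoning; to_2000, plus_one_year and upcoming are hypothetical names, and the Feb 29 to Feb 28 fallback when adding a year is an assumption of this sketch.

```python
import datetime

def to_2000(d):
    """Move a date into the leap year 2000, keeping month and day."""
    return d.replace(year=2000)

def plus_one_year(d):
    """Add one year; assumption: Feb 29 falls back to Feb 28."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        return d.replace(year=d.year + 1, day=28)

def upcoming(birthdate, today, days=14):
    """True if the birthday falls within the next `days` days."""
    b = to_2000(birthdate)
    c = to_2000(today)
    window = datetime.timedelta(days=days)
    # Branch 1: the birthday is still ahead in the same normalized year.
    if b - window <= c <= b:
        return True
    # Branch 2: today is near year end and the birthday is early next year.
    return c <= plus_one_year(b) <= c + window

print(upcoming(datetime.date(1976, 2, 29), datetime.date(2011, 2, 20)))  # True
print(upcoming(datetime.date(1990, 1, 5), datetime.date(2011, 12, 28)))  # True
print(upcoming(datetime.date(1980, 6, 10), datetime.date(2011, 8, 2)))   # False
```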

Cosmin Marian answered Sep 28 '22 04:09


Here's a query that gets the right result, most of the time.

SELECT 
    (EXTRACT(MONTH FROM DATE '1980-08-05'),
     EXTRACT(DAY FROM DATE '1980-08-05')) 
IN (
    SELECT EXTRACT(MONTH FROM CURRENT_DATE + s.a) AS m,
           EXTRACT(DAY FROM CURRENT_DATE + s.a) AS d 
    FROM GENERATE_SERIES(0, 6) AS s(a)
);

(It doesn't take care of leap years correctly, but you could use extract again to work the subselect in terms of a leap year instead of the current year.)

EDIT: Got it working for all cases, and as a useful query rather than a scalar select. I'm using some extra subselects so that I don't have to type the same date or expression twice for month and day, and of course the actual data would be in a table instead of the values expression. You might adapt this differently. It might still stand to improve by making a more intelligent series for weeks containing leap days, since sometimes that interval will only contain 6 days (for non-leap years).

I'll try to explain this from the inside out. The first thing I do is normalize the target date (CURRENT_DATE usually, but explicit in this code) into a year that I know is a leap year, so that February 29th appears among the dates. The next step is to generate a relation with all of the month-day pairs that are under consideration. Since there's no easy way to do an interval check in terms of month-day, it all happens using generate_series.

From there it's a simple matter of extracting the month and day from the target relation (the people alias) and filtering just the rows that are in the subselect.

SELECT * 
FROM 
    (select column1 as birthdate, column2 as name
    from (values 
        (date '1982-08-05', 'Alice'),
        (date '1976-02-29', 'Bob'),
        (date '1980-06-10', 'Carol'),
        (date '1992-06-13', 'David')
    ) as birthdays) as people 
WHERE 
    ((EXTRACT(MONTH FROM people.birthdate), 
     EXTRACT(DAY FROM people.birthdate)) IN (
        SELECT EXTRACT(MONTH FROM thedate.theday + s.a) AS m,
               EXTRACT(DAY FROM thedate.theday + s.a) AS d
        FROM 
                (SELECT date (v.column1 - 
                        (extract (YEAR FROM v.column1)-2000) * INTERVAL '1 year'
                       ) as theday
                 FROM (VALUES (date '2011-06-09')) as v) as thedate,
                 GENERATE_SERIES(0, 6) AS s(a)
        )
    )

Operating on days, as I've done here, should work splendidly all the way up to a two-month interval (if you wanted to look out that far), since December 31 plus two months and change should include the leap day. On the other hand, it's almost certainly more useful to just work on whole months for such a query, in which case you don't really need anything more than extract(month from ...).
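
For comparison, the month-day-pair approach can be sketched in Python with the same sample data. This is an illustrative sketch rather than a port of the query; month_day_pairs is a hypothetical name.

```python
import datetime

def month_day_pairs(target, days=7):
    """(month, day) pairs for `days` consecutive days starting at target,
    computed in the leap year 2000 so that Feb 29 can appear."""
    start = target.replace(year=2000)
    pairs = set()
    for n in range(days):
        d = start + datetime.timedelta(days=n)
        pairs.add((d.month, d.day))
    return pairs

people = [
    (datetime.date(1982, 8, 5), 'Alice'),
    (datetime.date(1976, 2, 29), 'Bob'),
    (datetime.date(1980, 6, 10), 'Carol'),
    (datetime.date(1992, 6, 13), 'David'),
]

pairs = month_day_pairs(datetime.date(2011, 6, 9))
upcoming = [name for birthdate, name in people
            if (birthdate.month, birthdate.day) in pairs]
print(upcoming)  # ['Carol', 'David']
```

Because the pairs are generated in a leap year, a window that spans the end of February always contains (2, 29), which also means it covers one real day fewer in a non-leap year.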

SingleNegationElimination answered Sep 28 '22 06:09