I am storing data on journal papers from PubMed in Postgres, and I want to store the publication date of the paper. However, sometimes the date is just a year (see the PubDate
field), sometimes it is a month and a year, and sometimes it is a day, month and year.
What's a sensible way to store this data in my database?
If I use a Postgres date field, I obviously need to specify day and month as well, but I'll need to invent them in some cases, and I don't like the idea of losing the information about the date's precision.
For my purposes I think I will only need the year, but I don't know that to be true for all possible future purposes.
Perhaps I should just have text fields for day
, month
and year
, and then I can always convert them to a date field in the future if I need them?
I would store it as date
and store the precision as well.
For example:
CREATE TYPE date_prec AS ENUM ('day', 'month', 'year');
CREATE TABLE pub (
pub_id integer PRIMARY KEY,
pub_date date NOT NULL,
pub_date_prec date_prec NOT NULL
);
Then you can query the table like this:
SELECT pub_id, date_trunc(pub_date_prec::text, pub_date)::date FROM pub;
to ignore any “random” day and month values in pub_date
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With