Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to design Date-of-Birth in DB and ORM for mix of known and unknown date parts

Note up front, my question turns out to be similar to SO question 1668172.


This is a design question that surely must have popped up for others before, yet I couldn't find an answer that fits my situation. I want to record date-of-birth in my application, with several 'levels' of information:

  • NULL value, i.e. DoB is unkown
  • 1950-??-?? Only the DoB year value is known, date/month aren't
  • ????-11-23 Just a month, day, or combination of the two, but without a year
  • 1950-11-23 Full DoB is known

The technologies I'm using for my app are as follows:

  • Asp.NET 4 (C#), probably with MVC
  • Some ORM solution, probably Linq-to-sql or NHibernate's
  • MSSQL Server 2008, at first just Express edition

Possibilities for the SQL bit that crossed my mind so far:

  • 1) Use one nullable varchar column e.g. 1950-11-23, and replace unkowns with 'X's, e.g. XXXX-11-23 or 1950-XX-XX
  • 2) Use three nullable int columns e.g. 1950, 11, and 23
  • 3) Use an INT column for year, plus a datetime column for full known DoBs

For the C# end of this problem I merely got to these two options:

  • A) Use a string property to represent DoB, convert only for view purposes.
  • B) Use a custom(?) struct or class for DoB with three nullable integers
  • C) Use a nullable DateTime alongside a nullable integer for year

The solutions seem to form matched pairs at 1A, 2B or 3C. Of course 1A isn't a nice solution, but it does set a baseline.

Any tips and links are highly appreciated. Well, if they're related, anyhow :)


Edit, about the answers: I marked one answer as accepted, because I think it will work for me. It's worth looking at the other answers too though, if you've stumbled here with the same question.

like image 275
Jeroen Avatar asked Jun 21 '11 21:06

Jeroen


2 Answers

The SQL Side

My latest idea on this subject is to use a range for dates that are uncertain or can have different specificity. Given two columns:

DobFromDate (inclusive)
DobToDate (exclusive)

Here's how it would work with your scenarios:

Specificity   DobFromDate   DobToDate
-----------   -----------   ----------
YMD            2006-05-05   2006-05-06
YM             2006-05-01   2006-06-01
Y              2006-01-01   2007-01-01
Unknown        0000-01-01   9999-12-31
-> MD, M, D not supported with this scheme

Note that there's no reason this couldn't be carried all the way to hour, minute, second, millisecond, and so on.

Then when querying for people born on a specific day:

DECLARE @BornOnDay date = '2006-05-16'

-- Include lower specificity:
SELECT *
FROM TheTable
WHERE
   DobFromDate <= @BornOnDay
   AND @BornOnDay < DobToDate;

-- Exclude lower specificity:
SELECT *
FROM TheTable
WHERE
   DobFromDate = @BornOnDay
   AND DobToDate = DateAdd(Day, 1, @BornOnDay);

This to me has the best mix of maintainability, ease of use, and expressive power. It won't handle loss of precision in the more significant values (e.g., you know the month and day but not the year) but if that can be worked around then I think it is a winner.

If you will ever be querying by date, then in general the better solutions (in my mind) are going to be those that preserve the items as dates on the server in some fashion.

Also, note that if you're looking for a date range rather than a single day, with my solution you still only need two conditions, not four:

DECLARE
   @FromBornOnDay date = '2006-05-16',
   @ToBornOnDay date = '2006-05-23';

-- Include lower specificity:
SELECT *
FROM TheTable
WHERE
   DobFromDate < @ToBornOnDay
   AND @FromBornOnDay < DobToDate;

The C# Side

I would use a custom class with all the methods needed to do appropriate date math and date comparisons on it. You know the business requirements for how you will use dates that are unknown, and can encode the logic within the class. If you need something before a certain date, will you use only known or unknown items? What will ToString() return? These are things, in my mind, best solved with a class.

like image 179
ErikE Avatar answered Oct 17 '22 14:10

ErikE


I like the idea of 3 int nullable columns and a struct of 3 nullable int in C#.

it does take some effort in db handling but you can avoid parsing around strings and you can also query with SQL directly by year or year and month and so on...

like image 37
Davide Piras Avatar answered Oct 17 '22 14:10

Davide Piras