
Is it worth breaking out address information into a separate database table?

I have a table called "Person" with the following fields:

  • Id (Primary Key)
  • FirstName
  • LastName
  • DateOfBirth
  • City
  • State
  • Country

Should things like City, State or Country be normalized and broken out into their own tables, with this table then holding CityId and StateId columns? We were debating whether this was a good or bad decision.

To add, I do have a City and a State table (for other reasons not related to this Person table). I am curious about answers with or without this additional fact.

leora asked Jun 21 '13 01:06



2 Answers

Normalizing addresses into a hierarchy is a questionable proposition. It really depends on what you mean to do with your address data.

The idea of normalizing to avoid update anomalies is a little dubious. How often do cities, states or countries actually change names? Furthermore, if this were to happen, how likely is it that the change would be wholesale (i.e. every instance of old name X changes to new name Y)? What happened in practice in Canada, when there was a flurry of municipal amalgamations in the 2000s, was that boundaries were redrawn and lots of old names stuck around, just with smaller territories than before.

The fact is that things like municipality names can be loosely defined. For example, where I grew up, my address had three officially recognized municipality names according to the postal authority: WILLOWDALE, NORTH YORK, TORONTO - all of which were valid options, although one was "more official" than the others. The problem is that all of Willowdale is in North York, but North York also contains "Downsview" and others.
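To make the overlap concrete, here is a minimal Python sketch. The three names come from the example above; the `ACCEPTED_NAMES` mapping, the `"example-address"` key and the `is_valid_municipality` helper are hypothetical, invented purely for illustration.

```python
# Hypothetical illustration: one address, several valid municipality names.
# A single CityId foreign key on the address could hold only one of them.
ACCEPTED_NAMES = {
    "example-address": {"WILLOWDALE", "NORTH YORK", "TORONTO"},
}


def is_valid_municipality(address_key: str, name: str) -> bool:
    """Return True if `name` is an accepted municipality name for the address."""
    return name.strip().upper() in ACCEPTED_NAMES.get(address_key, set())
```

A strict City table would force you to pick one of these names and reject the other two, even though all three are acceptable for the same address.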

Other frequent arguments for normalizing addresses include ensuring proper spelling and providing a basis for territory management. Given the vagaries of address data quality, these arguments are not convincing.

The best way to ensure address data quality is to keep your addresses in a relatively flat, relatively simple structure and to employ one or more address quality tools that use postal authority data to match and standardize your addresses. Keep city, state and postal code in their own fields, by all means, but don't keep them in distinct tables. This is actually more flexible than a normalized structure while producing more reliable results overall.
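As a rough sketch of what "flat" means here, the following uses Python's built-in `sqlite3` module. The `PostalCode` column is an assumption (it is not in the question's table), the sample row is made up, and the standardization step an address quality tool would perform is not shown.

```python
import sqlite3


def create_flat_person_table(conn: sqlite3.Connection) -> None:
    """Create a Person table that keeps address parts as plain columns."""
    conn.execute(
        """
        CREATE TABLE Person (
            Id          INTEGER PRIMARY KEY,
            FirstName   TEXT NOT NULL,
            LastName    TEXT NOT NULL,
            DateOfBirth TEXT,
            City        TEXT,   -- own field, but no City table behind it
            State       TEXT,
            PostalCode  TEXT,   -- assumed extra column, not in the question
            Country     TEXT
        )
        """
    )


conn = sqlite3.connect(":memory:")
create_flat_person_table(conn)

# Made-up sample row; in practice the values would already have been
# standardized by an address quality tool before being stored.
conn.execute(
    "INSERT INTO Person (FirstName, LastName, City, State, PostalCode, Country) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("Ada", "Example", "TORONTO", "ON", "A1A 1A1", "CA"),
)

rows = conn.execute(
    "SELECT City, State FROM Person WHERE Country = ?", ("CA",)
).fetchall()
```

Each address part is still individually queryable, but no joins are needed and no row is rejected because its city name is missing from a lookup table.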

Similarly, territory management is best done at a more granular level than municipality. Some municipalities are enormous and names can be ambiguous. Instead, use a postal code or ZIP+4 (depending on jurisdiction), which is much more granular and unambiguous. Again, an address data quality tool will ensure that you have proper postal coding on your addresses.
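A minimal sketch of keying territories on postal code rather than city name might look like this. The territory names, the placeholder postal codes and both helper functions are hypothetical, and the formatting rule assumes Canadian-style six-character codes.

```python
# Hypothetical territory lookup keyed on postal code instead of municipality.
TERRITORY_BY_POSTAL_CODE = {
    "A1A 1A1": "East",
    "B2B 2B2": "West",
}


def normalize_postal_code(raw: str) -> str:
    """Uppercase the code and format six-character codes as 'A1A 1A1'."""
    compact = "".join(raw.split()).upper()
    if len(compact) == 6:
        return f"{compact[:3]} {compact[3:]}"
    return compact  # leave anything else untouched


def territory_for(raw_postal_code: str, default: str = "Unassigned") -> str:
    """Look up the territory for a postal code, tolerating spacing and case."""
    key = normalize_postal_code(raw_postal_code)
    return TERRITORY_BY_POSTAL_CODE.get(key, default)
```

For example, `territory_for("a1a1a1")` resolves to the same territory as `territory_for("A1A 1A1")`, whatever municipality name was typed on the address.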

Joel Brown answered Nov 02 '22 05:11


From my experience, yes.

1. The city, state and country are entities in the real world, so it is good to have them as entities in your database model. It keeps the names consistent, as the other answerers have already mentioned.

2. You may populate and validate them from external open sources or standards bodies. For countries, for example, there is the international standard ISO 3166.

3. In present or future versions of your app, you may even connect directly to external sources to maintain them.

4. If you ever go multilingual, you will already have all the names to translate in one place.

5. If you ever exchange or interface data with other parties or apps, you will need the common classifications.
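A minimal sketch of points 1, 2 and 4, using Python's built-in `sqlite3` module, could look like the following. The table layout, the `CountryName` translation table and the `localized_country` helper are assumptions for illustration, not a prescribed design; only the country codes themselves come from ISO 3166-1 alpha-2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript(
    """
    CREATE TABLE Country (
        Code TEXT PRIMARY KEY CHECK (length(Code) = 2),  -- ISO 3166-1 alpha-2
        Name TEXT NOT NULL
    );
    -- Assumed translation table: one row per country and language.
    CREATE TABLE CountryName (
        CountryCode TEXT NOT NULL REFERENCES Country(Code),
        Language    TEXT NOT NULL,
        Name        TEXT NOT NULL,
        PRIMARY KEY (CountryCode, Language)
    );
    CREATE TABLE Person (
        Id          INTEGER PRIMARY KEY,
        FirstName   TEXT NOT NULL,
        LastName    TEXT NOT NULL,
        CountryCode TEXT REFERENCES Country(Code)
    );
    """
)

conn.execute("INSERT INTO Country VALUES ('DE', 'Germany')")
conn.execute("INSERT INTO CountryName VALUES ('DE', 'de', 'Deutschland')")
conn.execute(
    "INSERT INTO Person (FirstName, LastName, CountryCode) "
    "VALUES ('Ada', 'Example', 'DE')"
)


def localized_country(person_id: int, language: str):
    """Return the person's country name in `language`, or None if untranslated."""
    row = conn.execute(
        """
        SELECT cn.Name
        FROM Person p
        JOIN CountryName cn ON cn.CountryCode = p.CountryCode
        WHERE p.Id = ? AND cn.Language = ?
        """,
        (person_id, language),
    ).fetchone()
    return row[0] if row else None
```

With the foreign key in place, inserting a Person whose `CountryCode` is not in the Country table fails with `sqlite3.IntegrityError`, which is the consistency benefit point 1 describes.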

Chris Allen answered Nov 02 '22 04:11