
How can I handle this generalization design issue?

In our database model we have a Beneficiary entity. A beneficiary can be either a physical person or a corporate beneficiary; a physical beneficiary has a number of attributes such as name, surname, sex, etc. In addition, a beneficiary (corporate or physical) can be foreign or not. This further distinction translates into different domain values for a "common" set of attributes (for example, tax IDs in Italy, where I live, may have a different format from UK tax IDs).

We are now re-engineering our Beneficiary table, since the developer who initially worked on DB analysis & modeling made an (IMO) short-sighted choice. He put the primary key constraint on the attribute BeneficiaryName, which has been used to store either the corporate name (e.g. "Microsoft Corporation") for a corporate beneficiary or the surname (e.g. "Smith") for a physical beneficiary. As a result we have the (unacceptable) constraint that we CAN'T have more than one beneficiary with the surname "Smith" (or a corporation named "Smith") in our DB.

My approach for this "re-factoring" would introduce a generalization for the Beneficiary entity; I would:

  1. Clean the Beneficiary table, keeping only common data;
  2. Add a surrogate primary key to the Beneficiary table, let's call it BeneficiaryID;
  3. Split the Beneficiary table, creating two sub-entities (CorporateBeneficiary & PhysicalBeneficiary, discriminated by a flag in the master Beneficiary table), each with a 1..1 association to the Beneficiary table (a foreign key will reference BeneficiaryID);
  4. Find (meaningful) primary keys for CorporateBeneficiary & PhysicalBeneficiary.

This should address the aforementioned problem of uniqueness on BeneficiaryName. Does this seem OK so far?

The real problem I have is: how can/should I handle the further complication added by the "foreign" attribute in this model? Should I leave Foreign as it is, i.e. a flag attribute in Beneficiary? If so, how can I handle the need for different attributes for a conceptually similar piece of information (e.g. zipcode, tax id) without duplicating the attributes (zipcode_foreign, zipcode, taxid_foreign, taxid, etc.)? Should I really strive to accommodate different domain values in one field?

Any suggestion would be welcome...

Andrea Pigazzini asked Jan 31 '26 12:01



1 Answer

"Clean Beneficiary table, keeping only common data;"

That is exactly what needs to be done.

"Add a surrogate primary key to Beneficiary table, let's call it BeneficiaryID;"

May be useful, but don't forget that if a "natural" identifier exists, its uniqueness should be enforced too.
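As a rough illustration of that point, here is a minimal sketch using Python's built-in sqlite3 module. The TaxId column is an assumption standing in for whatever natural identifier your data really has; the table layout is hypothetical, not your actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Beneficiary (
        BeneficiaryID   INTEGER PRIMARY KEY,   -- surrogate key
        TaxId           TEXT NOT NULL UNIQUE,  -- assumed natural identifier, still enforced
        BeneficiaryName TEXT NOT NULL          -- no longer a key, so duplicates are allowed
    )
""")

# Two different beneficiaries named "Smith" are now fine.
conn.execute("INSERT INTO Beneficiary (TaxId, BeneficiaryName) VALUES ('A1', 'Smith')")
conn.execute("INSERT INTO Beneficiary (TaxId, BeneficiaryName) VALUES ('B2', 'Smith')")

# Reusing a natural identifier is still rejected by the DBMS.
try:
    conn.execute("INSERT INTO Beneficiary (TaxId, BeneficiaryName) VALUES ('A1', 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

smith_count = conn.execute(
    "SELECT COUNT(*) FROM Beneficiary WHERE BeneficiaryName = 'Smith'"
).fetchone()[0]
```

The surrogate key removes the "only one Smith" restriction, while the UNIQUE constraint keeps the real-world identity rule in the database.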

"Split Beneficiary table, creating two sub-entities (CorporateBeneficiary & PhysicalBeneficiary)"

Yup. Observe that it will be hard to enforce "absolute" data integrity (enforcing at the same time that all NaturalBeneficiaries (your PhysicalBeneficiary) are Beneficiaries, that all NonNaturalBeneficiaries (your CorporateBeneficiary) are Beneficiaries too, and that all Beneficiaries are either Natural or NonNatural Beneficiaries).
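To make the difficulty concrete, here is a small sketch (Python with sqlite3, hypothetical table and column names). Foreign keys cover the "every subtype row is a Beneficiary" direction, but nothing declarative in this layout stops a Beneficiary from having zero or two subtype rows; here that is only detected by a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.executescript("""
    CREATE TABLE Beneficiary (BeneficiaryID INTEGER PRIMARY KEY);
    CREATE TABLE PhysicalBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY REFERENCES Beneficiary(BeneficiaryID),
        Surname       TEXT NOT NULL
    );
    CREATE TABLE CorporateBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY REFERENCES Beneficiary(BeneficiaryID),
        CorporateName TEXT NOT NULL
    );
    INSERT INTO Beneficiary VALUES (1), (2), (3);
    INSERT INTO PhysicalBeneficiary VALUES (1, 'Smith');
    INSERT INTO CorporateBeneficiary VALUES (2, 'Microsoft Corporation');
    -- Accepted, although beneficiary 1 is already recorded as a physical person:
    INSERT INTO CorporateBeneficiary VALUES (1, 'Smith Ltd');
""")

# Direction 1: a subtype row without a parent Beneficiary is rejected by the foreign key.
try:
    conn.execute("INSERT INTO PhysicalBeneficiary VALUES (99, 'Ghost')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Direction 2: "exactly one subtype per Beneficiary" has to be checked separately.
counts = conn.execute("""
    SELECT b.BeneficiaryID,
           (SELECT COUNT(*) FROM PhysicalBeneficiary p WHERE p.BeneficiaryID = b.BeneficiaryID)
         + (SELECT COUNT(*) FROM CorporateBeneficiary c WHERE c.BeneficiaryID = b.BeneficiaryID)
    FROM Beneficiary b
    ORDER BY b.BeneficiaryID
""").fetchall()
violations = [(bid, n) for bid, n in counts if n != 1]
```

In this sketch beneficiary 1 ends up with two subtype rows and beneficiary 3 with none, and both slip past the declared constraints. Closing that gap usually takes triggers or deferred constraints, depending on what your DBMS offers.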

"discriminated by a flag in master Beneficiary table"

Nope. Wouldn't do that. The flag is redundant, and redundancy adds complexity without adding value. If you want to know whether a Beneficiary is Natural or NonNatural, check the table where that fact is recorded.
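One possible way to "check the table where that fact is recorded" is a view that derives the kind from subtype membership. This is only a sketch in Python with sqlite3; the view name BeneficiaryKind and the Kind labels are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Beneficiary (BeneficiaryID INTEGER PRIMARY KEY);
    CREATE TABLE PhysicalBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY REFERENCES Beneficiary(BeneficiaryID)
    );
    CREATE TABLE CorporateBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY REFERENCES Beneficiary(BeneficiaryID)
    );

    -- The kind is derived from which subtype table holds the row; no stored flag.
    CREATE VIEW BeneficiaryKind AS
        SELECT BeneficiaryID, 'Physical' AS Kind FROM PhysicalBeneficiary
        UNION ALL
        SELECT BeneficiaryID, 'Corporate' AS Kind FROM CorporateBeneficiary;

    INSERT INTO Beneficiary VALUES (1), (2);
    INSERT INTO PhysicalBeneficiary VALUES (1);
    INSERT INTO CorporateBeneficiary VALUES (2);
""")

# Each view row is an (id, kind) pair, so dict() builds a lookup directly.
kinds = dict(conn.execute("SELECT BeneficiaryID, Kind FROM BeneficiaryKind"))
```

Because the kind is computed rather than stored, it cannot drift out of sync with the subtype tables the way a flag column could.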

"Find (meaningful) primary keys for CorporateBeneficiary & PhysicalBeneficiary;"

If you introduce a surrogate for Beneficiaries in general, you don't need to replicate the natural identifiers in these other tables. That is, once again, redundancy that adds complexity without adding value.

"The real problem I have is: how can/should I handle the further complication added by "foreign" attribute in this model?"

You could apply the same approach, distinguishing National and ExtraNational (for both Corporate and Physical Beneficiaries). That might be anything from advisable to absolutely required if data integrity is of key importance for, say, at least the National Beneficiaries. For example, legislation might force you to verify that National SSNs or National corporation identifying numbers are "valid" according to the National rules. If such legislation applies, it is likely to be crucial that such rules are checked in and by the DBMS, not just by your app. Of course, for Non-Nationals, similar checks are typically not required, or not even possible in general.
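A sketch of what "checked in and by the DBMS" could look like, again in Python with sqlite3. The rule used here (a tax ID of exactly 11 digits) is an assumption chosen purely for illustration; real national rules may include check digits and other conditions, and the table names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- National rows: the format rule lives in the table definition.
    CREATE TABLE NationalCorporateBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY,
        TaxId TEXT NOT NULL
            CHECK (length(TaxId) = 11 AND TaxId NOT GLOB '*[^0-9]*')  -- assumed rule
    );
    -- Non-national rows: no format can be verified in general, so none is declared.
    CREATE TABLE NonNationalCorporateBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY,
        TaxId TEXT NOT NULL
    );
""")

def try_insert(table, tax_id):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        conn.execute(f"INSERT INTO {table} (TaxId) VALUES (?)", (tax_id,))
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "national_ok": try_insert("NationalCorporateBeneficiary", "12345678901"),
    "national_bad": try_insert("NationalCorporateBeneficiary", "GB123456"),
    "foreign_any": try_insert("NonNationalCorporateBeneficiary", "GB123456"),
}
```

Splitting the tables is what lets the strict rule apply to national rows only, without a conditional constraint keyed on a flag.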

If you take such a distinction between National and Non-National into account in your database structure, you will very likely also want to create a view that "unions" the two (National and Non-National) together. For that you will have to "transform" your data to a "unified", "common" format, which will likely be just CHAR (even if you know that, say, for the National PhysicalBeneficiaries, the contents will be their SSN, which you know consists of some fixed number of digits).
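Such a unioning view might look roughly like the following sketch (Python with sqlite3). The table names, the integer Ssn column, and the IsNational column are all assumptions made for the example; storing an SSN as an integer would also drop leading zeros, so treat the typing as illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NationalPhysicalBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY,
        Ssn           INTEGER NOT NULL   -- assumed strictly numeric national format
    );
    CREATE TABLE NonNationalPhysicalBeneficiary (
        BeneficiaryID INTEGER PRIMARY KEY,
        ForeignTaxId  TEXT NOT NULL      -- free-form, no national rule applies
    );

    -- The view converts both sides to one common character format.
    CREATE VIEW PhysicalBeneficiaryTaxId AS
        SELECT BeneficiaryID, CAST(Ssn AS TEXT) AS TaxId, 1 AS IsNational
        FROM NationalPhysicalBeneficiary
        UNION ALL
        SELECT BeneficiaryID, ForeignTaxId, 0
        FROM NonNationalPhysicalBeneficiary;

    INSERT INTO NationalPhysicalBeneficiary VALUES (1, 123456789);
    INSERT INTO NonNationalPhysicalBeneficiary VALUES (2, 'AB-12-34');
""")

rows = conn.execute(
    "SELECT BeneficiaryID, TaxId, IsNational "
    "FROM PhysicalBeneficiaryTaxId ORDER BY BeneficiaryID"
).fetchall()
```

Readers of the view see one text column for every beneficiary, while the base tables keep their stricter, type-specific definitions.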

If you don't have to take such a distinction between National and Non-National into account in your database structure, then you will be forced to use that same "unified", "common" format in the single table that holds the data for both National and Foreign.

Erwin Smout answered Feb 03 '26 00:02
