I'm on an optimization kick, at the moment.
I tend to use multiple tables, so I don't have empty columns.
My question is, are empty columns a big deal? I'm not talking for space. I'm referring to speed of indexing, data retrieval, etc...
My bet example is when I have a simple customers table, and some columns are not always filled. Like email, dob, ssn, or pic. I'd say most of the time they are not filled in.
That causes me to create a new table to house just the ancillary data. but would it really make a difference if I put these columns in the same table with the rest of the customer info?
If I do this, then there will be many records with empty columns. Which causes me to wonder how much this affects performance when the record count is huge.
If you're on an optimization kick, my advice is to get off it :-)
Optimization is something that should be done in response to a performance problem, not a whim. If there's no performance problem, all optimization is wasted effort.
Empty fields rarely make a large difference to data retrieval in a properly designed schema since most queries should, as much as possible, use indexes only for deciding which rows to get. Once the rows are discovered, that's when you go to the table to get the actual data.
And speed of indexing won't change just because the column is stored in another table. If it needs to be indexed, then it needs to be indexed.
I prefer my schema to be as simple as possible (while still mostly following 3NF) so as to avoid unnecessary joins.
If you store them as a variable-length field (eg: VARCHAR
), empty columns won't take up as much (any?) space. This comes at the expense of slower lookups compared to tables that only have fixed length fields.
I personally think it's fine to have empty columns, even when you have many of them (also known as a sparse table). Some databases even have optimizations for sparse tables. If you start having many extra tables your logic becomes more complex, and it makes it harder to maintain referential integrity.
What you could do in your customers
table is to have one extra customer_profiles
table with a 1:1 relationship with the customers
table. Store the essential information in customers
and the rest (ie: stuff you don't need every time you look up a customer) in the customer_profiles
table.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With