I currently have a table and need to start adding new data columns to it. Not every record will have data in those columns (not even new records created after the columns are added). So I am wondering whether this is better suited to a new table, since it is really an extension of some of the rows and does not apply to every row.
In other words, since those new columns will be empty for a lot of rows, it seems like this would be better suited to a new table?
EDIT (figured this was too limited)
The first table is a record of page views (currently 2 million records) with these columns: id, IP address, times viewed, created_at timestamp, date.
For every IP address, one record is made per day, and subsequent page views on that day are added to that record's times viewed.
The additional field(s) would be for point-of-origin tracking (i.e. Google Analytics source/medium/campaign).
Not every visit will have that info. I would assume about 10% of the rows will have the data (as it is usually only attributed on the first visit).
The main use for the data would be to attribute where people came from. This may wind up being used more frequently (which then seems to lend itself to the single table).
Appreciate the feedback - can add more if needed
The basic rule is this (simplified from more stringent normalisation rules).
If the attribute/column depends on the entire primary key and nothing else, it belongs in the table.
If it depends on something other than, or in addition to, the primary key, it belongs elsewhere and the tables it belongs in should have a relationship with the current table.
For example, your name depends on your SSN so, if SSN was the primary key, your name would belong in that table. Your car or phone number does not depend entirely on your SSN (since you may have more than one car or phone), so it would go in a different table (though your primary phone number may go in the first table).
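The SSN example above can be sketched as two related tables. This is only an illustration, assuming SQLite (via Python's built-in sqlite3 module) so that it runs anywhere; the table names `person` and `phone`, the column names, and the sample values are all made up for the sketch.

```python
import sqlite3

# In-memory database; SQLite is an assumption made only to keep this runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- name depends on the whole primary key (ssn) and nothing else,
-- so it lives in the person table. A single primary phone may too.
CREATE TABLE person (
    ssn           TEXT PRIMARY KEY,
    name          TEXT NOT NULL,
    primary_phone TEXT
);

-- A person may have several phones, so a phone number is not determined
-- by ssn alone. It goes in its own table, related back to person.
CREATE TABLE phone (
    ssn    TEXT NOT NULL REFERENCES person(ssn),
    number TEXT NOT NULL,
    PRIMARY KEY (ssn, number)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO person VALUES ('123-45-6789', 'Alice', '555-0100')")
conn.executemany(
    "INSERT INTO phone VALUES (?, ?)",
    [("123-45-6789", "555-0100"), ("123-45-6789", "555-0101")],
)

# One person row, many phone rows.
name = conn.execute(
    "SELECT name FROM person WHERE ssn = ?", ("123-45-6789",)
).fetchone()[0]
phones = [
    row[0]
    for row in conn.execute(
        "SELECT number FROM phone WHERE ssn = ? ORDER BY number",
        ("123-45-6789",),
    )
]
print(name, phones)
```

The composite primary key on `phone` is one possible choice; a surrogate id column would work as well.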
If you really want to learn about database design, forget about the syntax of the select command and have a look into normalisation. My advice to others is that all database schemas should start at 3NF and only revert if needed for performance.
And then, only if you understand (and mitigate) the problems inherent in doing that.
If most of the columns are of data type varchar, then the approach is fine, because a varchar column takes up space in the table according to the size of the content in each cell.
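Reading "the approach" as adding the new columns to the existing table, a minimal sketch could look like the one below. It assumes SQLite through Python's sqlite3 module purely so it can be run as-is; the table name `page_views`, the column names, and the sample rows are hypothetical, loosely based on the question. It shows only the schema shape and a query, not storage size, which depends on the database engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Existing table, roughly as described in the question (names assumed).
CREATE TABLE page_views (
    id           INTEGER PRIMARY KEY,
    ip_address   TEXT NOT NULL,
    times_viewed INTEGER NOT NULL DEFAULT 1,
    created_at   TEXT NOT NULL,
    view_date    TEXT NOT NULL,
    UNIQUE (ip_address, view_date)
);

-- New nullable columns; rows without origin data simply hold NULL.
ALTER TABLE page_views ADD COLUMN source   TEXT;
ALTER TABLE page_views ADD COLUMN medium   TEXT;
ALTER TABLE page_views ADD COLUMN campaign TEXT;
""")

# Hypothetical rows: only the first one carries origin data.
rows = [
    ("203.0.113.5", 3, "2024-01-01 10:00:00", "2024-01-01",
     "google", "cpc", "spring_sale"),
    ("203.0.113.6", 1, "2024-01-01 11:00:00", "2024-01-01", None, None, None),
    ("203.0.113.7", 2, "2024-01-01 12:00:00", "2024-01-01", None, None, None),
]
conn.executemany(
    "INSERT INTO page_views "
    "(ip_address, times_viewed, created_at, view_date, source, medium, campaign) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# Attribution query: no join needed, just skip rows with no origin data.
attributed = conn.execute(
    "SELECT source, medium, campaign, SUM(times_viewed) "
    "FROM page_views WHERE source IS NOT NULL "
    "GROUP BY source, medium, campaign"
).fetchall()
unattributed = conn.execute(
    "SELECT COUNT(*) FROM page_views WHERE source IS NULL"
).fetchone()[0]
print(attributed, unattributed)
```

The trade-off is that every attribution query has to filter out the NULL rows, but it avoids a join against a second table.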
You can define the new columns as SPARSE if you are using SQL Server 2008.
Refer to the SQL Server documentation to learn more about the pros and cons of SPARSE columns.