Many table designs I see have an id column as the primary key: for example, log_id in a logging table, event_id in an event table, and so on. This column has no dependency on any other column in any other table and uniquely identifies the record. From a lookup perspective, the columns actually used to look up information are often other columns in the table that could be indexed as well (status, event_type, etc.). So what is the need for such an id column representing the record in the table? If I were to remove the id column from a log table and use a composite key instead, what crime would I be committing? Why is it such a prevalent practice to have a unique id column in a table when that column is otherwise not used by the application? Hoping to hear experts' views. :o
UPDATE: Thank you all for the quick replies! Primarily I would like to understand why it is such common practice to have a surrogate key instead of a composite key in tables such as audit tables (there are other examples, but I am trying to keep the conversation focused). In such a table I could easily identify a unique record by the combination of event, userid and timestamp. Still, most of the designs I researched online use keys such as event_id. I am trying to understand whether there is any real reason for that. In fact, wouldn't it mean consuming database storage unnecessarily?
I make a distinction between tables that implement a real Relation in my data model, and tables that are just data-dumps for temporary data, logging, audit trails, etc.
The data-dump tables have no Natural key - i.e. there is no combination of columns that can be guaranteed unique, and the duplicates have meaning; there is not even a theoretical, logical Natural key that could be applied. In other words, such a table is not a real Relation per the relational model of data. We're just using a table for convenience.
In rare cases a table needs no key at all - a simple example is a log table that merely records events as they happen. It is only ever inserted into, and purging is done based on a timestamp (which, by the way, cannot be guaranteed unique). If there is no need for a key, natural or surrogate, and there are no referential constraints, then I'll omit it.
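As a rough sketch of that insert-only case, assuming SQLite through Python's sqlite3 module: the table names (debug_log, audit_composite), the columns and the sample rows are all made up for illustration. It shows a keyless log accepting two identical rows and being purged by timestamp, and the same rows failing against a composite key on (event, user_id, logged_at). Note that SQLite still keeps an internal rowid for the table; the point is only that the schema declares no key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical insert-only log table: no primary key, no unique constraint.
conn.execute(
    "CREATE TABLE debug_log ("
    " logged_at TEXT NOT NULL, event TEXT NOT NULL, user_id INTEGER)"
)
# A plain (non-unique) index is enough to make timestamp purges cheap.
conn.execute("CREATE INDEX debug_log_logged_at ON debug_log (logged_at)")

rows = [
    ("2024-01-01 10:00:00", "login", 1),
    ("2024-01-01 10:00:00", "login", 1),   # identical row, still meaningful
    ("2024-03-01 09:30:00", "logout", 1),
]
conn.executemany("INSERT INTO debug_log VALUES (?, ?, ?)", rows)

# The same data against a composite key: the second row is rejected.
conn.execute(
    "CREATE TABLE audit_composite ("
    " logged_at TEXT NOT NULL, event TEXT NOT NULL, user_id INTEGER,"
    " PRIMARY KEY (event, user_id, logged_at))"
)
try:
    conn.executemany("INSERT INTO audit_composite VALUES (?, ?, ?)", rows)
    composite_rejected = False
except sqlite3.IntegrityError:
    composite_rejected = True

# Purging is done purely by timestamp; no key is involved.
cutoff = "2024-02-01 00:00:00"
purged = conn.execute(
    "DELETE FROM debug_log WHERE logged_at < ?", (cutoff,)
).rowcount
remaining = conn.execute("SELECT COUNT(*) FROM debug_log").fetchone()[0]
```

Whether a rejected duplicate is a bug or a feature depends on the table: for a raw event log, two events in the same clock tick are real data, which is why a timestamp-based composite key is a poor fit there.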
But as soon as a table needs to be referred to by the application - e.g. if we need to point at a particular record from elsewhere - it is now part of the data model and we need to think about it as a Relation - i.e. work out what its Natural key is. Once that's established we can decide whether a surrogate key is needed or not.
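One way this might look once a record has to be referenced from elsewhere, again assuming SQLite via Python's sqlite3 module, with hypothetical tables (event, event_note) and columns: the Natural key is declared as a UNIQUE constraint, and a surrogate event_id is added so that other tables can reference a row with a single small column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default

conn.execute("""
    CREATE TABLE event (
        event_id    INTEGER PRIMARY KEY,               -- surrogate key
        event_type  TEXT    NOT NULL,
        user_id     INTEGER NOT NULL,
        occurred_at TEXT    NOT NULL,
        UNIQUE (event_type, user_id, occurred_at)      -- assumed Natural key
    )
""")
# Hypothetical referencing table: it carries one integer, not three columns.
conn.execute("""
    CREATE TABLE event_note (
        note_id  INTEGER PRIMARY KEY,
        event_id INTEGER NOT NULL REFERENCES event (event_id),
        note     TEXT    NOT NULL
    )
""")

insert_event = (
    "INSERT INTO event (event_type, user_id, occurred_at) VALUES (?, ?, ?)"
)
sample = ("login", 1, "2024-01-01 10:00:00")
event_id = conn.execute(insert_event, sample).lastrowid
conn.execute(
    "INSERT INTO event_note (event_id, note) VALUES (?, ?)",
    (event_id, "flagged for review"),
)

# The Natural key is still enforced alongside the surrogate.
try:
    conn.execute(insert_event, sample)
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A note pointing at a non-existent event is rejected by the foreign key.
try:
    conn.execute(
        "INSERT INTO event_note (event_id, note) VALUES (?, ?)", (999, "orphan")
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The surrogate here does not replace the Natural key; it only gives referencing tables something compact and stable to point at.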
Generally the only tables in my schemas that have no ID are ones that have no constraints at all - e.g. debug logs and audit trails (tables that log every insert/update/delete on a table). Everything else gets at least one unique constraint, if not more.