Imagine we have three tables in a MySQL database: posts, categories, and category_post.
There is a one-to-many relationship between posts and categories so that a single post may have many categories.
The category_post table is the pivot table between categories and posts and has the following columns: id, category_id, and post_id.
Let's also imagine that we have 1,000,000 rows in our category_post table.
My question is:
Is there any performance benefit to having the id column in the category_post table or does it just take up extra space?
An auto-increment PK makes it easy to create a key that never needs to change, which in turn makes it easy to reference from other tables. If your data already has natural columns that are unique and can never change, you can use them just as well.
In MySQL, the AUTO_INCREMENT attribute on a primary key column automatically generates the next number in a unique sequence whenever a new row is inserted into the table.
Auto-increment should be used as a unique key when no unique key already exists for the items you are modelling.
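As a minimal sketch of what that looks like, assuming a hypothetical posts table (the title column and the column types are illustrative, not taken from the question):

```sql
-- Hypothetical posts table; column names and types are assumptions.
CREATE TABLE posts (
    id    INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- id is omitted from the INSERT; MySQL assigns the next values in the sequence.
INSERT INTO posts (title) VALUES ('First post'), ('Second post');

-- Returns the id generated for the first row of the most recent INSERT
-- on this connection.
SELECT LAST_INSERT_ID();
```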
It is not obligatory for a table to have a primary key constraint. Where a table does have a primary key, it is not obligatory for that key to be automatically generated. In some cases, there is no meaningful sense in which a given primary key even could be automatically generated.
The relationship between posts and categories is probably many-to-many, not one-to-many.
A many-to-many relationship table is best defined something like this:
CREATE TABLE a_b (
    a_id ... NOT NULL,
    b_id ... NOT NULL,
    PRIMARY KEY (a_id, b_id),
    INDEX(b_id, a_id)  -- include this if you need to go both directions
) ENGINE = InnoDB;
With that, you automatically get "clustered" lookups in both directions, and you avoid the unnecessary artificial id for the table.
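Applied to the tables in the question, that pattern might look like the sketch below. The INT UNSIGNED types, the index name post_category, and the literal ids in the queries are assumptions for illustration; use whatever types your posts.id and categories.id columns actually have.

```sql
-- Pivot table without a surrogate id; the pair itself is the primary key.
CREATE TABLE category_post (
    category_id INT UNSIGNED NOT NULL,
    post_id     INT UNSIGNED NOT NULL,
    PRIMARY KEY (category_id, post_id),
    INDEX post_category (post_id, category_id)  -- for lookups starting from a post
) ENGINE = InnoDB;

-- All posts in one category: a range scan on the primary key.
SELECT post_id FROM category_post WHERE category_id = 7;

-- All categories of one post: a range scan on the secondary index,
-- which already contains both columns.
SELECT category_id FROM category_post WHERE post_id = 42;
```

Running EXPLAIN on each query is a quick way to check which index MySQL actually chooses on your data.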
(By the way, N.B., an implicit PK is 6 bytes, not 8. There is a lengthy post by Jeremy Cole on the topic.)
A one-to-many relationship does not need this extra table. Instead, have one id inside the other table. For example, a City table will have the id for the Country in it.
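A rough sketch of the City/Country case, with hypothetical table and column names (countries, cities, country_id, name), could look like this:

```sql
CREATE TABLE countries (
    id   INT UNSIGNED NOT NULL AUTO_INCREMENT,
    name VARCHAR(100) NOT NULL,
    PRIMARY KEY (id)
) ENGINE = InnoDB;

-- Each city stores the id of the one country it belongs to,
-- so no separate relationship table is needed.
CREATE TABLE cities (
    id         INT UNSIGNED NOT NULL AUTO_INCREMENT,
    country_id INT UNSIGNED NOT NULL,
    name       VARCHAR(100) NOT NULL,
    PRIMARY KEY (id),
    INDEX (country_id),
    FOREIGN KEY (country_id) REFERENCES countries (id)
) ENGINE = InnoDB;
```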
Having category_id and post_id as a compound primary key will perform better than having an extra id as the primary key, because declaring the pair as the primary key automatically creates an index on it. If you really want an extra id column, you can improve performance by manually defining an index on category_id and post_id. There is no benefit to the extra key column, though, and it is generally bad practice.
Not having an id is good, but if you care about ordering by the pivot table, you will need an id or a timestamp column in it.
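If the id column stays, the manual index could be added along these lines. This assumes category_post already exists with id as its primary key; the index name is made up, and making the index UNIQUE is an extra choice here (it also rejects duplicate pairs) rather than something the answer requires.

```sql
-- Assumes category_post(id, category_id, post_id) already exists
-- with id as the primary key.
ALTER TABLE category_post
    ADD UNIQUE INDEX category_post_unique (category_id, post_id);
```

The statement will fail if the table already contains duplicate (category_id, post_id) pairs, so those would need to be cleaned up first.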
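One way to get an ordering without a surrogate id is a timestamp column, sketched below. The created_at name, the column types, and the literal category id are assumptions.

```sql
CREATE TABLE category_post (
    category_id INT UNSIGNED NOT NULL,
    post_id     INT UNSIGNED NOT NULL,
    -- Filled in automatically when the row is inserted.
    created_at  TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (category_id, post_id)
) ENGINE = InnoDB;

-- Most recently attached posts first; post_id breaks ties between
-- rows inserted within the same second.
SELECT post_id
FROM category_post
WHERE category_id = 7
ORDER BY created_at DESC, post_id DESC;
```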