Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is it good to have 4 Columns as Primary key?

I have in my database tables Students with PK Student_ID, Course with PK Course_ID.

And two tables to save a feedback result for each course, table Questions I saved in the questions for the feedback with PK question_ID, and table feedback.

I'm wondering if I can use the 3 foreign keys in the feedback table (course_ID, student_Id, question_ID) with PK feedback_ID

I think it's useful to have result for each question or student or course but I don't know if using 4 Columns as Primary key possible and good or not.

like image 635
Interaoi Avatar asked Dec 27 '16 12:12

Interaoi


2 Answers

Because most people assume the Primary Key is the Clustering Key, I am going to interpret your question as "Is it good or bad to have 4 columns as a Clustering Key".

The situation you are considering is related to debates like The Clustered Index Debate and Surrogate Key vs. Natural Key.

In this situation, I would want to consider the impact of a 12 byte wide composite clustering key vs a 4 byte wide clustering key (double those if you are going to use bigint). My decision tree would look something like this:

  1. Will we be using Hekaton (In-Memory OLTP)?

    • Yes => composite key. Run away.

    • No => Good call, continue...

  2. How many rows will this table have?

    • Tens of millions, maybe more! => surrogate key (probably).

      • If the data length of each row will be variable and not narrow => surrogate key.

      • If the data length of the row will be fixed and narrow, and it results in optimal page usage => continue...

    • Less than that => continue...

  3. How will the feedback table be queried?

    • Various combinations of and not always all of course_id, student_Id, question_id => surrogate key.

      • In this case you may want to be able to have multiple supporting indexes the combinations of course_id, student_Id, question_id for your queries. The clustering key is included in all non-clustered indexes, and the larger it is the more space/pages each index entry will require. => surrogate key.
    • Almost always by all three course_id, student_Id, question_id or almost always by course_id or course_id, student_Id; but not by student_id without course_id and not question_id without course_id, student_id (zero or only a couple of non-clustered indexes on this table) => continue...

  4. Will any other table reference this table?

    • Yes: e.g. Course Instructors will be able to leave a comment or remark regarding the response to a feedback question. => surrogate key.

    • Kind of... e.g. Audit/History table will be tracking inserts/updates/deletes to rows in this table.

      • a surrogate key may make change tracking less complex to review, and the audit table's clustering key would most likely be the surrogate key & datetime or surrogate key vs composite key & datetime or surrogate key.
    • No => composite key is a reasonable option


Even if my first run through of the above decision tree leads me to a composite key, I would probably start my design using a surrogate key because it is easier to get rid of it (because it isn't being used) than to go back and add it and implement its use.

Just to clarify, I have had cases where I did find that the composite key was a better solution and did refactor the design to drop the surrogate key. I don't want to leave the impression that the surrogate key is always the better solution, even if it is a common default for many designers (including myself).


I would start out with something like this:

create table feedback (
    feedback_id    int       not null identity(1,1)
  , course_id      int       not null
  , student_id     int       not null
  , question_id    int       not null
  , response_added datetime  not null
      constraint df_feedback_response_added_gd default getdate()
  , response       nvarchar(max) null
  , constraint pk_feedback primary key clustered (feedback_id)
  , constraint fk_feedback_course   foreign key (course_id)   
      references course(course_id)
  , constraint fk_feedback_students foreign key (student_id)  
      references student(student_id)
  , constraint fk_feedback_question foreign key (question_id) 
      references question(question_id)
  , constraint uq_feedback_course_student_question 
      unique (course_id, student_Id, question_id)
   /* or create a unique index to use include() instead */
  );

  /* unique index that includes response */
  create unique nonclustered index ux_feedback_course_student_question_covering
      on feedback (course_id, student_Id, question_id) 
        include (response_added, response);

Reference:

  • A Simple Start – Table Creation Best Practices - Kimberly Tripp - concerning row size and page utilization
  • Ever-increasing clustering key – the Clustered Index Debate……….again! - Kimberly Tripp
  • The Clustered Index Debate Continues… - Kimberly Tripp
  • More considerations for the clustering key – the clustered index debate continues! - Kimberly Tripp
  • How much does that key cost? (plus sp_helpindex9) - Kimberly Tripp
  • 101 Things I Wish You Knew About Sql Server - Thomas LaRock
  • SQL Server: Natural Key Verses Surrogate Key - Database Journal - Gregory A. Larsen
  • Ten Common Database Design Mistakes - Simple Talk - Louis Davidson
  • Hekaton (In-Memory OLTP) - dbareactions.com
like image 126
SqlZim Avatar answered Nov 09 '22 03:11

SqlZim


If Feedback_Id uniquely identifies the record, that having it as a Primary Key should work just fine. Including multiple columns in the PK can create trouble in the future.

Let's say you want to persist other feedback details (like comments). You want to define a table called FeedbackComment that should have Feedback as a parent. FK can only go to one or more columns that have an UNIQUE constraint defined on them. Generally, the PK is the target of a PK.

Of course, you can have all columns of the PK defined (feedback_id, course_id etc.) in the child table, but this will make joins more complicated.

Also, if you are using some kind of ORM (i.e. Entity Framework) in the service layer of the application, having a single integer primary key might be useful (e.g. have generic methods that retrieve entity based on an integer identifier).

As Gordon mentioned, there is nothing wrong with composite primary keys, but think about what you do with the table and not to complicate your life when application extensions must be made.

like image 27
Alexei - check Codidact Avatar answered Nov 09 '22 04:11

Alexei - check Codidact