In my database, I have a lot of courses that are compulsory. Some are elective. However, there are courses of a third kind: A list from which you have to choose X courses. The list (and the number X) is different for each study program. How would you represent this relationally?
I find it interesting that the accepted answer says, "There is no way to represent the 'X of Y' relationally" when that is essentially what the question was asking for. It seems to me that 'X of Y' can indeed be modelled (and largely enforced) using SQL and here's a suggested way:
Example scenario: students taking the course 'French' must choose two components (x) out of a total of three possible compnents (y).
CREATE TABLE Components
(
component_name VARCHAR(100) NOT NULL,
UNIQUE (component_name)
);
INSERT INTO Components (component_name) VALUES
('Oral'),
('Writing'),
('Vocab'),
('Databases');
Clearly, 'Databases' doesn't belong on a course on French so we need a tables for course designers to model courses [these tables have many relevant candidate keys so for clarity I'll define them at the 'bottom' of the CREATE TABLE
statement):
CREATE TABLE XofYCourses
(
course_name VARCHAR(100) NOT NULL,
x_components_choice_tally INTEGER NOT NULL
CHECK (x_components_choice_tally > 0),
y_components_tally INTEGER NOT NULL
CHECK (y_components_tally > 0),
CHECK (x_components_choice_tally < y_components_tally),
UNIQUE (course_name),
UNIQUE (course_name, y_components_tally),
UNIQUE (course_name, x_components_choice_tally)
);
INSERT INTO XofYCourses (course_name, y_components_tally,
x_components_choice_tally) VALUES
('French', 2, 3);
The above allows us to model the 'two out of three' attribute of the French course. Now we need a table to model what the three possible components of that course actually are:
CREATE TABLE XofYCourseComponents
(
course_name VARCHAR(100) NOT NULL,
y_components_tally INTEGER NOT NULL,
FOREIGN KEY (course_name, y_components_tally)
REFERENCES XofYCourses (course_name, y_components_tally),
component_sequence INTEGER NOT NULL
CHECK (component_sequence > 0),
component_name VARCHAR(100) NOT NULL
REFERENCES Components (component_name),
CHECK (component_sequence <= y_components_tally),
UNIQUE (course_name, component_sequence),
UNIQUE (course_name, component_name)
);
INSERT INTO XofYCourseComponents (course_name,
component_sequence, y_components_tally, component_name)
VALUES
('French', 1, 3, 'Oral'),
('French', 2, 3, 'Writing'),
('French', 3, 3, 'Vocab');
Now for enrolment. Billy want to do the French course...
CREATE TABLE Students
(
student_name VARCHAR(20) NOT NULL,
UNIQUE (student_name)
);
INSERT INTO Students (student_name) VALUES ('Billy');
...and chooses 'Oral' and 'Vocab':
CREATE TABLE XofYCourseComponentChoices
(
student_name VARCHAR(20) NOT NULL
REFERENCES Students (student_name),
course_name VARCHAR(100) NOT NULL,
x_components_choice_tally INTEGER NOT NULL,
FOREIGN KEY (course_name, x_components_choice_tally)
REFERENCES XofYCourses (course_name, x_components_choice_tally),
component_name VARCHAR(100) NOT NULL,
FOREIGN KEY (course_name, component_name)
REFERENCES XofYCourseComponents (course_name, component_name),
x_component_sequence INTEGER NOT NULL
CHECK (x_component_sequence > 0),
CHECK (x_component_sequence <= x_components_choice_tally),
UNIQUE (student_name, course_name, component_name),
UNIQUE (student_name, course_name, x_component_sequence)
);
INSERT INTO XofYCourseComponentChoices (student_name, course_name,
component_name, x_component_sequence, x_components_choice_tally)
VALUES
('Billy', 'French', 'Oral', 1, 2),
('Billy', 'French', 'Vocab', 2, 2);
The above structure works a good way of enforcing the maximum values i.e. no more than three components for the French course and no more than two choices for each student.
What it doesn't do, however, is to ensure exact amounts e.g. that Billy doesn't choose just one component. Standard SQL has solutions to this problem e.g. CHECK
constraints that support subqueries (e.g. to count that there are a total of two rows for Billy...) and DEFERRABLE
constraints (...but delay the count until the point when the transaction is being committed). Having a 'multiple assignment' feature would be even better. However, most SQL products don't have these features.
Does this lack of support for the full solution mean we don't do anything and just trust the application will refrain from writing invalid data? of course not!
A good interim approach is to revoke the privileges from the base tables and provide helper stored procedures e.g. one to enrol a student which takes their chosen course components as parameters: the count is done at after the INSERT
s and if it violates the data rules (e.g. less than two for French) then the transaction is rolled back and an error returned.
You need 3 tables here: StudyPrograms, Courses and Components. Components represents the Courses which comprise each StudyProgram and is a junction table between Courses and StudyPrograms.
Each Component record can contain a field indicating if the Course is a compulsory part of the StudyProgram. You can also include a field to indicate whether the Course is one of a list that can be chosen.
There is no way to represent the 'X of Y' relationally, you will need some logic in your stored procedures to ensure this business rule is followed (or possibly in your data access code layer, depending on how you want to organise the application).
You have two options: you can model the data closer to reality, where some are single-course requirements, and others are X from Y courses requirements, or you could model all requirements as X from Y, where the single-course requirements are "1 from 1" requirements.
I would recommend something like this:
Course
---------------
CourseID
Description
...
Program
---------------
ProgramID
Description
...
CourseGroup
---------------
CourseGroupID
CourseID
ProgramCourseGroup
---------------
ProgramID
CourseGroupID
RequiredCourses
Course
and Program
are the two top-level tables. They define the simple list of all courses and programs, respectively, without any sort of relation between the two.
CourseGroup
defines a group of courses. This relates to Course
, but no other tables.
ProgramCourseGroup
relates course groups to programs. A program indicates that a particular group of courses is required, then RequiredCourses
indicates how many courses from that group must be taken in order to satisfy the requirement.
For instance, say you have a program called "Basket Weaving" that requires:
and two of the following four courses:
Your data would look like:
Course
------------------------------------
CourseID Description
1 Intro to baskets
2 Basic weaving techniques
3 Easter baskets
4 Handbaskets
5 Picnic baskets
6 SCUBA Diving
Program
--------------------------
ProgramID Description
1 Basket Weaving
CourseGroup
--------------------------
CourseGroupID CourseID
1 1
2 2
3 3
3 4
3 5
3 6
ProgramCourseGroup
-----------------------------------------
ProgramID CourseGroupID RequiredCourses
1 1 1
1 2 1
1 3 2
In principle one would expect to be able to create a constraint something like this to enforce the rule:
CHECK
(NOT EXISTS
(SELECT 1
FROM CourseEnrolement c, ProgramEnrolement p
WHERE c.StudentId = p.StudentId
AND c.ProgramId = p.ProgramId
GROUP BY p.StudentId, p.ProgramId, p.NumberOfCoursesRequired
HAVING COUNT(*) <> p.NumberOfCoursesRequired
))
Unfortunately, SQL makes it all but impossible to implement this, or at least makes it very difficult to update the database while the constraint is enforced. So if you really want to represent such a rule in the database then you may need a better model than SQL can offer. In practice such rules will often be enforced in application code instead.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With