Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Normalization: What does "repeating groups" mean?

Tags:

I have read different tutorials and seen different examples of normalization, specially the notion of "repeating groups" in the first normal form. From them I have have gathered that repeating groups are "kind-of" multi-valued attributes (e.g. here and here).

But we already make separate tables for each multi-valued attribute by including foreign keys from the parent table during the process of mapping an ERM (Entity relationship Model) to a RDM (Relational Data Model)? Reference: this

Secondly, are those "repeating groups" essentially laid out horizontally in the same row, or can the same value occurring in the same column again and again, i.e. the same value of an attribute again and again, also a repeating group and should be eliminated?

enter image description here In this example, the value English is repeating again and again. Is this a repeating group? If I eliminate it to make another table SUBJECT with Subject Name and Module_ID(Foreign key), this is what I get. Sure it gets rid of the repeating value, but I am not sure if this is the right thing. Is it right? enter image description here

like image 856
Solace Avatar asked Apr 21 '14 09:04

Solace


1 Answers

The term "repeating group" originally meant the concept in CODASYL and COBOL based languages where a single field could contain an array of repeating values. When E.F.Codd described his First Normal Form that was what he meant by a repeating group. The concept does not exist in any modern relational or SQL-based DBMS.

The term "repeating group" has also come to be used informally and imprecisely by database designers to mean a repeating set of columns, meaning a collection of columns containing similar kinds of values in a table. This is different to its original meaning in relation to 1NF. For instance in the case of a table called Families with columns named Parent1, Parent2, Child1, Child2, Child3, ... etc the collection of Child N columns is sometimes referred to as a repeating group and assumed to be in violation of 1NF even though it is not a repeating group in the sense that Codd intended.

This latter sense of a so-called repeating group is not technically a violation of 1NF if each attribute is only single-valued. The attributes themselves do not contain repeating values and therefore there is no violation of 1NF for that reason. Such a design is often considered an anti-pattern however because it constrains the table to a predetermined fixed number of values (maximum N children in a family) and because it forces queries and other business logic to be repeated for each of the columns. In other words it violates the "DRY" principle of design. Because it is generally considered poor design it suits database designers and sometimes even teachers to refer to repeated columns of this kind as a "repeating group" and a violation of the spirit of the First Normal Form.

This informal usage of terminology is slightly unfortunate because it can be a little arbitrary and confusing (when does a set of columns actually constitute a repetition?) and also because it is a distraction from a more fundamental issue, namely the Null problem. All of the Normal Forms are concerned with relations that don't permit the possibility of nulls. If a table permits a null in any column then it doesn't meet the requirements of a relation schema satisfying 1NF. In the case of our Families table, if the Child columns permit nulls (to represent families who have fewer than N children) then the Families table doesn't satisfy 1NF. The possibility of nulls is often forgotten or ignored in normalization exercises but the avoidance of unnecessary nullable columns is one very good reason for avoiding repeating sets of columns, whether or not you call them "repeating groups".

See also this article.

like image 101
nvogel Avatar answered Oct 16 '22 10:10

nvogel