
Representation of dimension units in a standardized way

Suppose you want to record in a database that something is 30 meters long, or 50 feet long, or that the temperature was 50 kelvin, or that the speed was 50 kilometers per hour. How would you represent the units?

To clarify, two points:

  • I mean any kind of unit, not a predefined, well-defined subset of them.
  • My question is really about whether an ontology of units exists. I used the database example because it was the first that came to mind, but scenarios such as representing the unit in XML or JSON are equally likely.
Stefano Borini asked Feb 28 '23 07:02



2 Answers

One of the fundamental concepts of relational database design is that all values in a given column should represent one logically compatible type of data. Formally, a column has exactly one type, and any two values of that type can be compared with each other in an equality predicate. This is a crucial part of type theory.

So if the measurements are not comparable, e.g. length vs. temperature, you shouldn't store them in the same column.
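As a rough illustration of that rule, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `observation` and the columns `length_m` and `temperature_k` are made up for this example; the only point is that each column holds one kind of quantity in one agreed unit, so any two values in it can be compared.

```python
import sqlite3

# Hypothetical schema: one column per kind of quantity, each in one agreed unit.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE observation (
        id            INTEGER PRIMARY KEY,
        length_m      REAL,  -- lengths only, always in meters
        temperature_k REAL CHECK (temperature_k >= 0)  -- temperatures only, in kelvin
    )
    """
)
conn.executemany(
    "INSERT INTO observation (length_m, temperature_k) VALUES (?, ?)",
    [(30.0, None), (15.24, None), (None, 50.0)],  # 15.24 m is 50 feet
)

# Every non-NULL value in length_m is comparable with every other one,
# so an aggregate such as MAX is meaningful.
longest = conn.execute("SELECT MAX(length_m) FROM observation").fetchone()[0]
```

Mixing lengths and temperatures in a single `value` column would make a query like `MAX(value)` meaningless, which is the situation the rule is meant to prevent.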

You might want to look at ISO 2955, "Information processing - Representation of SI and other units in systems with limited character sets."

Also see "Joe Celko's SQL Programming Style," chapter 4, Scales and Measurements.

Bill Karwin answered Mar 05 '23 17:03



Relational theory has it that each relvar ("table") has an associated predicate that defines the meaning of the tuples therein. That predicate ought to be part of the formal documentation of the database, such that no one who actually consults the documentation can have any excuse for "having misunderstood something" (unless the documentation is incomplete of course).

Including the definition of units in that predicate (e.g. "The length of person ... is FEET.", "The measured temperature was ... KELVIN", ...) achieves that completeness and avoids having to encode the unit in rather ugly attribute ("column") names.

I don't understand why "just storing the numbers" (in a standard unit that is agreed upon by all users) would be "not easy".

If foobaricity exists as a unit, and someone comes up with a new unit fluffyperception, then that someone will first have to formally establish the correspondence between quantities of foobaricity and quantities of fluffyperception anyway, or nothing they state can be understood by anyone.
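To make the idea of a "formally established correspondence" concrete, here is a small sketch in Python. The registry `TO_CANONICAL` and the function `to_canonical` are hypothetical names invented for this example, and it only covers units that relate to the canonical one by a plain factor (lengths, with the meter as the agreed standard). A unit with no registered correspondence simply cannot be interpreted.

```python
# Hypothetical registry: unit name -> (kind of quantity, factor to the canonical unit).
TO_CANONICAL = {
    "meter": ("length", 1.0),
    "foot": ("length", 0.3048),      # exact by definition of the international foot
    "kilometer": ("length", 1000.0),
}


def to_canonical(value: float, unit: str) -> tuple[str, float]:
    """Return (kind, value in the canonical unit); fail if no correspondence is known."""
    try:
        kind, factor = TO_CANONICAL[unit]
    except KeyError:
        raise ValueError(f"no known correspondence for unit {unit!r}") from None
    return kind, value * factor
```

Calling `to_canonical(50, "foot")` yields a length of roughly 15.24 meters, while `to_canonical(3, "fluffyperception")` raises `ValueError` until someone adds an entry for that unit.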

EDIT

I saw this was added: "I need to preserve the information about the original unit."

Nothing stops you from doing that. Add two extra columns (original quantity and original unit name) alongside the "canonicalized" value. You can constrain "original unit name" as strictly or as loosely as you want.
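A possible shape for that layout, sketched with Python's built-in `sqlite3` module. The table and column names (`measurement`, `length_m`, `original_quantity`, `original_unit`) and the list of allowed units are assumptions made for this example. The `CHECK` constraint is the strict variant; leaving it out gives the lax one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE measurement (
        id                INTEGER PRIMARY KEY,
        length_m          REAL NOT NULL,   -- canonicalized value, always meters
        original_quantity REAL NOT NULL,   -- value as originally reported
        original_unit     TEXT NOT NULL
            CHECK (original_unit IN ('meter', 'foot'))  -- strict variant; drop for a lax one
    )
    """
)
insert_sql = (
    "INSERT INTO measurement (length_m, original_quantity, original_unit) "
    "VALUES (?, ?, ?)"
)

# 50 feet, stored both as reported and in the canonical unit.
conn.execute(insert_sql, (15.24, 50.0, "foot"))
row = conn.execute(
    "SELECT length_m, original_quantity, original_unit FROM measurement"
).fetchone()

# A unit name outside the allowed list is rejected by the CHECK constraint.
try:
    conn.execute(insert_sql, (1.0, 1.0, "fluffyperception"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

Queries and comparisons then run against `length_m` only, while the other two columns preserve what was originally reported.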

Erwin Smout answered Mar 05 '23 17:03
