I have a database which stores (among other things), the following pieces of information:
BIGINT
sBIGINT
sVARCHAR
sVARCHAR
sI'd like to be able to capture a more refined definition of these datatypes. For instance, the hardware IDs have no numerical significance, so I don't care how they are formatted when displayed. The Storage Capacities, however, are cardinal numbers and, at a user's request, I'd like to present them with thousands and decimal separators, e.g. 123,456.789. Thus, I'd like to refine BIGINT
into, say ID_NUMBER
and CARDINAL
.
The same with Hardware Names, which are simple text and WWPNs, which are hexstrings, e.g. 24:68:AC:E0. Thus, I'd like to refine VARCHAR
into ENGLISH_WORD
and HEXSTRING
.
The specific datatypes I made up are just for illustrative purposes.
I'd like to keep all this information in one place and I'm wondering if anybody knows of a good way to hold this all in my MySQL table definitions. I could use the Comment field of the table definition, but that smells fishy to me.
One approach would be to define the data structure elsewhere and use that definition to generate my CREATE TABLE
s, but that would be a major rework of the code that I currently have, so I'm looking for alternatives.
Any suggestions? The application language in use is Perl, if that helps.
MySQL supports SQL data types in several categories: numeric types, date and time types, string (character and byte) types, spatial types, and the JSON data type.
A user-defined data type (UDT) is a data type that derived from an existing data type. You can use UDTs to extend the built-in types already available and create your own customized data types.
The only real advantage to using user-defined data types is that they add the ability to self-document your data structures. For example, in several different tables you might have columns defined as varchar(30), and those columns might have nothing to do with each other.
In MySQL there are three main data types: string, numeric, and date and time.
A good way to do this is to use views. For example, to insert commas into a cardinal number, you can use:
mysql> create table foo (id int);
Query OK, 0 rows affected (0.12 sec)
mysql> insert into foo (id) values ( 123456789);
Query OK, 1 row affected (0.00 sec)
mysql> create view v_foo as select format(id, 0) as id from foo;
Query OK, 0 rows affected (0.10 sec)
mysql> select * from v_foo;
+---------------+
| id |
+---------------+
| 123,456,789 |
+---------------+
1 row in set (0.02 sec)
You can use other string functions to format your other fields, and store them in the view definition.
I'll propose an answer that questions the question.
One of the mantras that people who model the databases like to hum is the separation of the presentation layer (formatting) and data and I believe that relevant part from like goes something like:
'Thou shall not store formatted data in your databases, nor shall you discriminate against any formatting choice. Thou shall store the data in the native supported data types. Thou applications shall provide presentation layer and format your columns.'
Well, friedo's answer does not go directly against this - data is only presented through a view, the storage is still native.
Still, it depends how do you define presentation layer there - if the view and the server settings are considered part of presentation layer then all is fine, otherwise there is potential trouble as I, possible user of your system, will not be able to specify the fact that my thousand separator is a single quote (and it is, at least at the place of my current residence).
Also, once you go that road, how long do you think it will pass until you will have to deal with requests to re-parse the data back from text into a number and possibly end up in situations where this might be ambiguous (such as DD/MM/YY vs MM/DD/YY)?
The above rant is only regarding formatting, determining the number of decimal digits defines the domain of your data and is a good thing as it limits down possibility of inconsistent data entering your database.
EDIT: (entertaining the purist point of view a bit further, regarding number bases) Saying that hexadecimal number data has no meaning in other bases is generally a false statement. Number values have no base and can be represented in any base. Their domain (the set of allowed values) is the same.
The choice of hexadecimal for MAC address is a natural one due to historical reasons and the fact that it is for example easy to read the vendor part in that format. The choice of 'funny' format for IPv4 addresses is a historical with probably an anecdotal reason.
But both are only a choice and internally a good system will store them without bias (for example storing IPv4 as text is not a good thing). When RDBMS present you the results of a query (on a screen) it already takes a role of an application and format the results in some way.
This is not significant and the format you'll use in your application should not influence how you store the Storage Capacities or other entity properties.
So I am saying that this is application configuration data (metadata to the core date) and of course it can/should be stored in the database, but with MySQL (which is not so rich in defining custom types) it can't fit in the table definition and should be simply stored in another table that application will read and apply to your columns when presenting data to the user and not in some hackish way which will not be portable.
For example the view idea works, but can you query the view easily to get the formats that are applied to fields? Or lets say you want to change the formatting in all occurrences of field WWPN in all queries that use it (hexstring also sounds as already wrong), would that be easy? Or if there are other queries that transform the data and write it down in another table will you write it down with applied format or without it (re-parsing)? Etc...
Now if you had a table that stores application configuration data such as FieldFormatting: Table, Field, Format, CheckRules, LongFormat (or whatever makes most sense in your situation) then the above questions become a bit easier to deal with and you get to choose extra options for your application and business logic.
If you really (really, really) have to provide direct access to the database and the native types would make data unreadable for the users and you simply must preformat then you could even use the above table to generate and update the views/queries semi-automatically.
NOTE: I am taking a purist point of view here since I have a feeling that you are making design decisions here and not chasing last drop of performance or convenience (for example between application data types and database data types) when practical issues can be more important than modelling guidelines and rules. But the questions from the last paragraph still stand.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With