I have a budding developer who is very enthusiastic about something he is calling “the matrix”.
I am looking for peer insight.
In a nutshell this is what we have:
- 1 highly denormalized table with about 120 columns
- Data points range from account, customer, household, relationship, product, employee, etc…
- One index per column: about 120 non-clustered indexes
- About 90% of all index space in the database today is taken up by indexes on this table
- Today about 1.5 million rows with a lot of nulls
- Table loaded with a stored procedure whose core is dynamic SQL
- All field names are generic and do not describe the data
- A data dictionary type table is used with the dynamic SQL to load any data point to any field
- Field mapping is not static: today column dim_0001 is customer name, but tomorrow maybe something else
- No primary key
- No foreign keys
- No real constraints (for example, every field except as_of_dt is nullable)
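To illustrate what the non-static mapping means for anyone reading from the table, here is a rough sketch of the lookup a consumer would need before every query. The dimName value 'customer_name' is an assumption for illustration, not something taken from the real configuration data, and picking the latest row by modDate is only a guess at how the mapping would be resolved.

```sql
-- Sketch only: 'customer_name' is a hypothetical dimName value.
DECLARE @col sysname, @sql nvarchar(max);

-- Step 1: find out which generic column currently holds the data point.
SELECT TOP (1) @col = colName
FROM [z005497].[tblMatrixCfg]
WHERE dimName = 'customer_name'
ORDER BY modDate DESC;

-- Step 2: build and run the real query, since the column name is not known in advance.
IF @col IS NOT NULL
BEGIN
    -- QUOTENAME brackets the column name before it is spliced into the statement.
    SET @sql = N'SELECT as_of_dt, ' + QUOTENAME(@col) + N' AS customer_name '
             + N'FROM [z005497].[tblMatrix] '
             + N'WHERE ' + QUOTENAME(@col) + N' IS NOT NULL;';

    EXEC sys.sp_executesql @sql;
END
```

So the join is not really gone; it has been replaced by a lookup against the configuration table plus dynamic SQL.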
The argument for the table:
- Makes writing queries simpler because it eliminates the need to write some joins
The intended use:
- An end-user layer that would be a core component of a Universe built in Business Objects
- Post ETL process development
My recommendation will either kill the process where it is today (early development in a test environment) or move it to the next step in test.
Based on the research I have done, my education, and my experience, I do not support it. I want the tables dropped as soon as the one or two processes that depend on them have been migrated to another solution.
The script is below for your reference (I limited it to one index example).
Any insight you can offer (even just a one-word opinion) is valuable.
-- The Matrix
CREATE TABLE [z005497].[tblMatrix](
[as_of_dt] [datetime] NOT NULL,
[dim_0001] [varchar](100) NULL,
[dim_0002] [varchar](103) NULL,
[dim_0003] [varchar](100) NULL,
[dim_0004] [varchar](100) NULL,
[dim_0005] [varchar](100) NULL,
[dim_0006] [varchar](100) NULL,
[dim_0007] [varchar](100) NULL,
[dim_0008] [varchar](100) NULL,
[dim_0009] [varchar](100) NULL,
[dim_0010] [varchar](100) NULL,
[dim_0011] [varchar](100) NULL,
[dim_0012] [varchar](100) NULL,
[dim_0013] [varchar](100) NULL,
[dim_0014] [varchar](100) NULL,
[dim_0015] [varchar](100) NULL,
[dim_0016] [varchar](100) NULL,
[dim_0017] [varchar](103) NULL,
[dim_0018] [varchar](103) NULL,
[dim_0019] [varchar](103) NULL,
[dim_0020] [varchar](103) NULL,
[dim_0021] [varchar](103) NULL,
[dim_0022] [varchar](103) NULL,
[dim_0023] [varchar](103) NULL,
[dim_0024] [varchar](103) NULL,
[dim_0025] [varchar](103) NULL,
[dim_0026] [varchar](11) NULL,
[dim_0027] [varchar](11) NULL,
[dim_0028] [varchar](11) NULL,
[dim_0029] [varchar](11) NULL,
[dim_0030] [varchar](11) NULL,
[dim_0031] [varchar](11) NULL,
[dim_0032] [varchar](11) NULL,
[dim_0033] [varchar](11) NULL,
[dim_0034] [varchar](11) NULL,
[dim_0035] [varchar](11) NULL,
[dim_0036] [varchar](11) NULL,
[dim_0037] [varchar](11) NULL,
[dim_0038] [varchar](11) NULL,
[dim_0039] [varchar](11) NULL,
[dim_0040] [varchar](11) NULL,
[dim_0041] [varchar](11) NULL,
[dim_0042] [varchar](11) NULL,
[dim_0043] [varchar](11) NULL,
[dim_0044] [varchar](11) NULL,
[dim_0045] [varchar](11) NULL,
[dim_0046] [varchar](11) NULL,
[dim_0047] [varchar](11) NULL,
[dim_0048] [varchar](11) NULL,
[dim_0049] [varchar](11) NULL,
[dim_0050] [varchar](11) NULL,
[dim_0051] [varchar](11) NULL,
[dim_0052] [varchar](11) NULL,
[dim_0053] [varchar](11) NULL,
[dim_0054] [varchar](5) NULL,
[dim_0055] [varchar](5) NULL,
[dim_0056] [varchar](5) NULL,
[dim_0057] [varchar](5) NULL,
[dim_0058] [varchar](5) NULL,
[dim_0059] [varchar](5) NULL,
[dim_0060] [varchar](5) NULL,
[dim_0061] [varchar](5) NULL,
[dim_0062] [varchar](5) NULL,
[dim_0063] [varchar](5) NULL,
[dim_0064] [varchar](5) NULL,
[dim_0065] [varchar](5) NULL,
[dim_0066] [varchar](5) NULL,
[dim_0067] [varchar](5) NULL,
[dim_0068] [varchar](5) NULL,
[dim_0069] [varchar](5) NULL,
[dim_0070] [varchar](5) NULL,
[dim_0071] [varchar](5) NULL,
[dim_0072] [varchar](5) NULL,
[dim_0073] [varchar](5) NULL,
[dim_0074] [varchar](5) NULL,
[dim_0075] [varchar](5) NULL,
[dim_0076] [varchar](5) NULL,
[dim_0077] [varchar](5) NULL,
[dim_0078] [varchar](5) NULL,
[dim_0079] [varchar](5) NULL,
[dim_0080] [varchar](5) NULL,
[dim_0081] [varchar](5) NULL,
[dim_0082] [varchar](5) NULL,
[dim_0083] [varchar](5) NULL,
[dim_0084] [int] NULL,
[dim_0085] [int] NULL,
[dim_0086] [int] NULL,
[dim_0087] [int] NULL,
[dim_0088] [int] NULL,
[dim_0089] [int] NULL,
[dim_0090] [int] NULL,
[dim_0091] [int] NULL,
[dim_0092] [int] NULL,
[dim_0093] [int] NULL,
[dim_0094] [varchar](12) NULL,
[dim_0095] [varchar](12) NULL,
[dim_0096] [varchar](12) NULL,
[dim_0097] [varchar](120) NULL,
[dim_0098] [varchar](120) NULL,
[dim_0099] [varchar](120) NULL,
[dim_0100] [numeric](20, 0) NULL,
[dim_0101] [varchar](20) NULL,
[dim_0102] [varchar](20) NULL,
[dim_0103] [varchar](20) NULL,
[dim_0104] [varchar](20) NULL,
[dim_0105] [varchar](20) NULL,
[dim_0106] [varchar](20) NULL,
[dim_0107] [varchar](20) NULL,
[dim_0108] [varchar](20) NULL,
[dim_0109] [varchar](20) NULL,
[dim_0110] [varchar](20) NULL,
[dim_0111] [varchar](20) NULL,
[dim_0112] [varchar](20) NULL,
[dim_0113] [varchar](20) NULL,
[dim_0114] [varchar](20) NULL,
[dim_0115] [varchar](20) NULL,
[dim_0116] [varchar](20) NULL,
[dim_0117] [varchar](20) NULL,
[dim_0118] [varchar](20) NULL,
[dim_0119] [varchar](20) NULL,
[dim_0120] [varchar](20) NULL,
[lastLoad] [datetime] NULL
) ON [PRIMARY]
-- Index example
CREATE NONCLUSTERED INDEX [idx_dim_0001 (not unique)] ON [z005497].[tblMatrix]
(
[dim_0001] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
-- The configuration table from which developers would find out what is in the Matrix
CREATE TABLE [z005497].[tblMatrixCfg](
[dimId] [int] IDENTITY(100000,1) NOT NULL,
[colName] [varchar](25) NOT NULL,
[dataType] [varchar](25) NOT NULL,
[dimName] [varchar](25) NOT NULL,
[dimDesc] [varchar](500) NOT NULL,
[dimpath] [varchar](5000) NOT NULL,
[loadDate] [datetime] NOT NULL,
[modUser] [varchar](100) NOT NULL,
[modDate] [datetime] NOT NULL,
CONSTRAINT [PK_tblMatrixCfg_1] PRIMARY KEY CLUSTERED
(
[dimId] ASC,
[colName] ASC,
[dimName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Kill it if you can.
Also, that developer needs a lot more experience. And he/she should get it at another company.
It violates so many design principles that I don't know where to start.
Even if you end up fighting a highly normalized model that slavishly follows someone's best practices, it won't compare to the disaster this design is going to create.
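If the real goal is to spare report writers from writing joins, a view over properly keyed tables is one conventional way to get there. What follows is a minimal sketch under assumptions: every table, column, and view name in it is hypothetical and not taken from your schema, and GO is the batch separator used by SSMS and sqlcmd.

```sql
-- Hypothetical names throughout; none of these objects exist in the original schema.
CREATE TABLE dbo.Customer (
    customer_id   int          NOT NULL CONSTRAINT PK_Customer PRIMARY KEY,
    customer_name varchar(100) NOT NULL
);

CREATE TABLE dbo.Account (
    account_id  int      NOT NULL CONSTRAINT PK_Account PRIMARY KEY,
    customer_id int      NOT NULL
        CONSTRAINT FK_Account_Customer REFERENCES dbo.Customer (customer_id),
    opened_dt   datetime NOT NULL
);
GO

-- The view hides the join, so report queries stay simple
-- while the base tables keep their keys and constraints.
CREATE VIEW dbo.vw_AccountCustomer
AS
SELECT a.account_id, a.opened_dt, c.customer_id, c.customer_name
FROM dbo.Account AS a
JOIN dbo.Customer AS c
    ON c.customer_id = a.customer_id;
GO

-- What an end user (or the reporting layer) would write:
SELECT account_id, customer_name
FROM dbo.vw_AccountCustomer;
```

Column names describe the data, the mapping never shifts underneath the reports, and you index only what the queries actually filter on.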