Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Denormalizing for sanity or performance?

I've started a new project and they have a very normalized database. everything that can be a lookup is stored as the foreign key to the lookup table. this is normalized and fine, but I end up doing 5 table joins for the simplest queries.

    from va in VehicleActions
    join vat in VehicleActionTypes on va.VehicleActionTypeId equals vat.VehicleActionTypeId
    join ai in ActivityInvolvements on va.VehicleActionId equals ai.VehicleActionId
    join a in Agencies on va.AgencyId equals a.AgencyId
    join vd in VehicleDescriptions on ai.VehicleDescriptionId equals vd.VehicleDescriptionId
    join s in States on vd.LicensePlateStateId equals s.StateId
    where va.CreatedDate > DateTime.Now.AddHours(-DateTime.Now.Hour)
    select new {va.VehicleActionId,a.AgencyCode,vat.Description,vat.Code,
vd.LicensePlateNumber,LPNState = s.Code,va.LatestDateTime,va.CreatedDate}

I'd like to recommend that we denormaize some stuff. like the state code. I don't see the state codes changing in my lifetime. similar story with the 3-letter agency code. these are handed out by the agency of agencies and will never change.

When I approached the DBA with the state code issue and the 5 table joins. i get the response that "we are normalized" and that "joins are fast".

Is there a compelling argument to denormalize? I'd do it for sanity if nothing else.

the same query in T-SQL:

    SELECT VehicleAction.VehicleActionID
      , Agency.AgencyCode AS ActionAgency
      , VehicleActionType.Description
      , VehicleDescription.LicensePlateNumber
      , State.Code AS LPNState
      , VehicleAction.LatestDateTime AS ActionLatestDateTime
      , VehicleAction.CreatedDate
FROM VehicleAction INNER JOIN
     VehicleActionType ON VehicleAction.VehicleActionTypeId = VehicleActionType.VehicleActionTypeId INNER JOIN
     ActivityInvolvement ON VehicleAction.VehicleActionId = ActivityInvolvement.VehicleActionId INNER JOIN
     Agency ON VehicleAction.AgencyId = Agency.AgencyId INNER JOIN
     VehicleDescription ON ActivityInvolvement.VehicleDescriptionId = VehicleDescription.VehicleDescriptionId INNER JOIN
     State ON VehicleDescription.LicensePlateStateId = State.StateId
Where VehicleAction.CreatedDate >= floor(cast(getdate() as float))
like image 335
MarkDav.is Avatar asked Oct 15 '09 21:10

MarkDav.is


People also ask

What is the purpose of denormalization?

Denormalization is the process of adding precomputed redundant data to an otherwise normalized relational database to improve read performance of the database. Normalizing a database involves removing redundancy so only a single copy exists of each piece of information.

How denormalization improves the performance?

Denormalization can improve performance by: Minimizing the need for joins. Precomputing aggregate values, that is, computing them at data modification time, rather than at select time. Reducing the number of tables, in some cases.

What is the drawback of denormalization?

Disadvantages of DenormalizationUpdates and inserts are more expensive. If a piece of data is updated in one table, all values duplicated in other tables need to be updated as well. Similarly, when inserting new values, we need to store data both in the normalized table and in the denormalized table.


2 Answers

I don't know if I would even call what you want to do denormalization -- it looks more like you just want to replace artificial foreign keys (StateId, AgencyId) with natural foreign keys (State Abbreviation, Agency Code). Using varchar fields instead of integer fields will slow down join/query performance, but (a) if you don't even need to join the table most of the time because the natural FK is what you want anyway it's not a big deal and (b) your database would need to be pretty big/have a high load for it to be noticeable.

But djna is correct in that you need a complete understanding of current and future needs before making a change like this. Are you SURE the three letter agency codes will never change, even five years from now? Really, really sure?

like image 119
Paul Abbott Avatar answered Sep 21 '22 17:09

Paul Abbott


Some denormalization can be needed for performance (and sanity) reasons at some times. Hard to tell wihout seeing all your tables / needs etc...

But why not just build a few convenience views (to do a few joins) and then use these to be able to write simpler queries?

like image 21
ChristopheD Avatar answered Sep 19 '22 17:09

ChristopheD