I've got a SQL Server table holding alarm data from an iFix SCADA system, designed as follows:
CREATE TABLE [dbo].[FIXALARMS](
    [ALM_ID] [int] IDENTITY(1,1) NOT NULL,
    [ALM_NATIVETIMEIN] [datetime] NOT NULL,
    [ALM_PHYSLNODE] [char](8) NOT NULL,
    [ALM_TAGNAME] [varchar](32) NOT NULL,
    [ALM_VALUE] [varchar](16) NOT NULL,
    [ALM_MSGTYPE] [varchar](8) NOT NULL,
    [ALM_DESCR] [varchar](128) NOT NULL,
    [ALM_ALMSTATUS] [varchar](4) NOT NULL,
    [ALM_ALMPRIORITY] [varchar](8) NOT NULL,
    [ALM_ALMAREA] [varchar](32) NOT NULL
)
The SCADA dictates the column names and datatypes and offers no option to split the data across multiple tables. I'm essentially forced to take in the data in this format, and it's become a pain: many of my queries use multiple LIKEs and string comparisons on data that really should be keyed by ID.
Now, I would like to normalize this table and split it into multiple tables with keyed relationships so that I can save database space, improve query performance, and gain some flexibility.
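For illustration, the target shape I have in mind is roughly this (every table and column name below is just a hypothetical sketch on my part, not anything iFix dictates; I've only broken out two of the repeated-value columns):

CREATE TABLE [dbo].[ALM_TAG](
    [TAG_ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [TAGNAME] [varchar](32) NOT NULL UNIQUE
)

CREATE TABLE [dbo].[ALM_AREA](
    [AREA_ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [AREANAME] [varchar](32) NOT NULL UNIQUE
)

-- The event table keeps only per-alarm values plus foreign keys.
-- (ALM_PHYSLNODE and ALM_DESCR are omitted here for brevity.)
CREATE TABLE [dbo].[ALARM_EVENT](
    [ALM_ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [TIMEIN] [datetime] NOT NULL,
    [TAG_ID] [int] NOT NULL REFERENCES [dbo].[ALM_TAG](TAG_ID),
    [AREA_ID] [int] NOT NULL REFERENCES [dbo].[ALM_AREA](AREA_ID),
    [VALUE] [varchar](16) NOT NULL,
    [MSGTYPE] [varchar](8) NOT NULL,
    [ALMSTATUS] [varchar](4) NOT NULL,
    [ALMPRIORITY] [varchar](8) NOT NULL
)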
Would this be best achieved with database triggers? The database is growing at about 40 MB/day (~300k rows), and my lack of SQL experience makes me wary of adding extra load to the server on every insert. Or should I instead use a SQL Server Agent job to clean the table up periodically? What other options might I not know about?
If you don't need the data in real time, use SSIS to set up an ETL process that transforms the data into your normalized form.
If you must have the data in real time, then use triggers, but use them very carefully and make sure they handle sets of data: SQL Server triggers should never be written to expect only one row. You'll also want the trigger code to be as performant as possible, since the table it reads from has a high insert rate. That means reading a good book on performance tuning so you know which query forms to avoid, such as correlated subqueries, non-sargable WHERE clauses, and cursors. You'll want to test extensively under load for blocking issues as well.
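To make the set-based point concrete, something along these lines is a rough sketch only, written against the hypothetical ALM_TAG / ALM_AREA / ALARM_EVENT tables from the question rather than anything iFix defines:

CREATE TRIGGER [dbo].[trg_FIXALARMS_Normalize]
ON [dbo].[FIXALARMS]
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    -- Add any tag names we haven't seen yet, as a set, never row by row.
    INSERT INTO [dbo].[ALM_TAG] (TAGNAME)
    SELECT DISTINCT i.ALM_TAGNAME
    FROM inserted AS i
    WHERE NOT EXISTS (SELECT 1 FROM [dbo].[ALM_TAG] t
                      WHERE t.TAGNAME = i.ALM_TAGNAME);

    -- Same pattern for areas.
    INSERT INTO [dbo].[ALM_AREA] (AREANAME)
    SELECT DISTINCT i.ALM_ALMAREA
    FROM inserted AS i
    WHERE NOT EXISTS (SELECT 1 FROM [dbo].[ALM_AREA] a
                      WHERE a.AREANAME = i.ALM_ALMAREA);

    -- Write the normalized event rows, joining back to resolve the IDs.
    INSERT INTO [dbo].[ALARM_EVENT]
        (TIMEIN, TAG_ID, AREA_ID, VALUE, MSGTYPE, ALMSTATUS, ALMPRIORITY)
    SELECT i.ALM_NATIVETIMEIN, t.TAG_ID, a.AREA_ID,
           i.ALM_VALUE, i.ALM_MSGTYPE, i.ALM_ALMSTATUS, i.ALM_ALMPRIORITY
    FROM inserted AS i
    JOIN [dbo].[ALM_TAG] AS t ON t.TAGNAME = i.ALM_TAGNAME
    JOIN [dbo].[ALM_AREA] AS a ON a.AREANAME = i.ALM_ALMAREA;
END

Note that the NOT EXISTS lookup-insert pattern can race under concurrent inserts, which is one more reason to load-test before trusting it.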
Triggers will add processing overhead and can introduce contention and transactional locks. If your complicated queries don't need data that is complete up to the minute, you could get by with a regularly scheduled ETL process that extracts the data and transforms it into a more usable form. You could schedule the ETL to run daily or every few hours as required.
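If SSIS is more than you need, the same extract-and-transform step can be a plain stored procedure run from a SQL Server Agent job. A minimal sketch, assuming a one-row watermark table that remembers the last identity value processed (all object names here are hypothetical, reusing the sketch schema from the question):

CREATE TABLE [dbo].[ETL_WATERMARK](
    [LastProcessedId] [int] NOT NULL
)
INSERT INTO [dbo].[ETL_WATERMARK] VALUES (0)
GO

CREATE PROCEDURE [dbo].[usp_NormalizeAlarms]
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @LastId int, @MaxId int;
    SELECT @LastId = LastProcessedId FROM [dbo].[ETL_WATERMARK];
    SELECT @MaxId = ISNULL(MAX(ALM_ID), @LastId) FROM [dbo].[FIXALARMS];

    BEGIN TRANSACTION;

    -- New lookup values first, restricted to the unprocessed ID range.
    INSERT INTO [dbo].[ALM_TAG] (TAGNAME)
    SELECT DISTINCT f.ALM_TAGNAME
    FROM [dbo].[FIXALARMS] AS f
    WHERE f.ALM_ID > @LastId AND f.ALM_ID <= @MaxId
      AND NOT EXISTS (SELECT 1 FROM [dbo].[ALM_TAG] t
                      WHERE t.TAGNAME = f.ALM_TAGNAME);

    INSERT INTO [dbo].[ALM_AREA] (AREANAME)
    SELECT DISTINCT f.ALM_ALMAREA
    FROM [dbo].[FIXALARMS] AS f
    WHERE f.ALM_ID > @LastId AND f.ALM_ID <= @MaxId
      AND NOT EXISTS (SELECT 1 FROM [dbo].[ALM_AREA] a
                      WHERE a.AREANAME = f.ALM_ALMAREA);

    -- Then the normalized event rows for the same range.
    INSERT INTO [dbo].[ALARM_EVENT]
        (TIMEIN, TAG_ID, AREA_ID, VALUE, MSGTYPE, ALMSTATUS, ALMPRIORITY)
    SELECT f.ALM_NATIVETIMEIN, t.TAG_ID, a.AREA_ID,
           f.ALM_VALUE, f.ALM_MSGTYPE, f.ALM_ALMSTATUS, f.ALM_ALMPRIORITY
    FROM [dbo].[FIXALARMS] AS f
    JOIN [dbo].[ALM_TAG] AS t ON t.TAGNAME = f.ALM_TAGNAME
    JOIN [dbo].[ALM_AREA] AS a ON a.AREANAME = f.ALM_ALMAREA
    WHERE f.ALM_ID > @LastId AND f.ALM_ID <= @MaxId;

    UPDATE [dbo].[ETL_WATERMARK] SET LastProcessedId = @MaxId;

    COMMIT TRANSACTION;
END

Schedule that as an Agent job at whatever interval your reporting queries tolerate; because it only touches rows above the watermark, each run stays cheap even as FIXALARMS grows.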