I am working on a legacy small-business automation system (inventory, sales, procurement, etc.) that has a single database hosted by SQL Server 2005 and a bunch of client applications. The main client (used by all users) is an MS Access 2003 application (ADP), and other clients include various VB/VBA applications like Excel add-ins and command-line utilities.
In addition to 60 or so tables (mostly in 3NF), the database contains about 200 views, about 170 UDFs (mostly scalar and table-valued inline ones), and about 50 stored procedures. As you might have guessed, some portion of so-called "business logic" is encapsulated in this mass of T-SQL code (and thus is shared by all clients).
Overall, the system's code (including the T-SQL code) is not very well organized and is very refactoring-resistant, so to speak. In particular, schemata of most of the tables cry for all kinds of refactorings, small (like column renamings) and large (like normalization).
FWIW, I have pretty long and decent application development experience (C/C++, Java, VB, and whatnot), but I am not a DBA. So, if the question will look silly to you, now you know why it is so. :-)
While thinking about refactoring all this mess (in a peacemeal fashion of course), I've come up with the following idea:
For each table, create a "wrapper" view that (a) has all the columns that the table has; and (b) in some cases, has some additional computed columns based on the table's "real" columns.
A typical (albeit simplistic) example of such additional computed column would be sale price of a product derived from the product's regular price and discount.
Reorganize all the code (both T-SQL and VB/VBA client code) so that only the "wrapper" views refer to tables directly.
So, for example, even if an application or a stored procedure needed to insert/update/delete records from a table, they'd do that against the corresponding "table wrapper" view, not against the table directly.
So, essentially this is about isolating all the tables by views from the rest of the system.
This approach seems to provide a lot of benefits, especially from maintainability viewpoint. For example:
When a table column is to be renamed, it can be done without rewriting all the affected client code at once.
It is easier to implement derived attributes (easier than using computed columns).
You can effectively have aliases for column names.
Obviously, there must be some price for all these benefits, but I am not sure that I am seeing all the catches lurking out there.
Did anybody try this approach in practice? What are the major pitfalls?
One obvious disadvantage is the cost of maintaining "wrapper" views in sync with their corresponding tables (a new column in a table has to be added to a view too; a column deleted from a table has to be deleted from the view too; etc.). But this price seems to be small and fair for making the overall codebase more resilient.
Does anyone know any other, stronger drawbacks?
For example, usage of all those "wrapper" views instead of tables is very likely to have some adverse performance impact, but is this impact going to be substantial enough to worry about it? Also, while using ADODB, it is very easy to get a recordset that is not updateable even when it is based just on a few joined tables; so, are the "wrapper" views going to make things substantially worse? And so on, and so forth...
Any comments (especially shared real experience) would be greatly appreciated.
Thank you!
P.S. I stepped on the following old article that discusses the idea of "wrapper" views:
The Big View Myth
The article advises to avoid the approach described above. But... I do not really see any good reasons against this idea in the article. Quite the contrary, in its list of good reasons to create a view, almost each item is exactly why it is so tempting to create a "wrapper" view for each and every table (especially in a legacy system, as a part of refactoring process).
The article is really old (1999), so whatever reasons were good then may be no longer good now (and vice versa). It would be really interesting to hear from someone who considered or even tried this idea recently, with the latest versions of SQL Server and MS Access...
The falsehood is that Views are slower because the database has to calculate them BEFORE they are used to join to other tables and BEFORE the where clauses are applied. If there are a lot of tables in the View, then this process slows everything down.
Although there are many advantages to views, the main disadvantage to using views rather than real tables is performance degradation. Because views only create the appearance of a table, not a real table, the query processor must translate queries against the view into queries against the underlying source tables.
Yes, when you update a view it affects original table, because SELECT opens an implicit cursor which points to the records from the base table. When ever you update the records it updates the original data since they are pointers to the original.
When designing a database, I prefer the following:
this allows the DBA to work directly with the table (to add columns, clean things up, inject data, etc.) without disturbing the code base, and it insulates the code base from any changes made to the table (temporary or otherwise)
there may be performance penalties for doing things this way, but so far they have not been significant - and the benefits of the layer of insulation have been life-savers several times
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With