
Best practices for creating a data model [closed]

For a current project I'm creating a data model. Are there any sources where I can find "best practices" for a good data model? "Good" means flexible, efficient, with good performance, style, ... Some example questions would be "naming of columns", "what data should be normalized", or "which attributes should be moved into their own table". The source should be a book :-)

strauberry asked Aug 17 '11 15:08



2 Answers

Personally, I think you should read a book on performance tuning before beginning to model a database. The right design can make a world of difference. If you are not an expert in performance tuning, you aren't qualified to design a database.

These books are database-specific; here is one for SQL Server: http://www.amazon.com/Server-Performance-Tuning-Distilled-Experts/dp/1430219025/ref=sr_1_1?s=books&ie=UTF8&qid=1313603282&sr=1-1

Another book that you should read before starting to design is about antipatterns. It is always good to know what you should avoid doing. http://www.amazon.com/SQL-Antipatterns-Programming-Pragmatic-Programmers/dp/1934356557/ref=sr_1_1?s=books&ie=UTF8&qid=1313603622&sr=1-1

Do not get stuck in the trap of designing for flexibility. People use that as a way to get out of doing the work of designing correctly, and flexible databases almost always perform badly. If more than 5% of your database design depends on flexibility, you haven't modeled correctly, in my opinion. All the worst COTS products I've had to work with were designed for flexibility first.

Any decent database book will discuss normalization. You can also find that information easily on the web. Be sure to actually create the primary key and foreign key (PK/FK) relationships as constraints in the database.
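To make the PK/FK point concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module purely for illustration (the answer above is about SQL Server), and the customer and customer_order tables are hypothetical names that do not come from the question. With the keys declared, the database itself refuses an order that points at a customer who doesn't exist.

```python
import sqlite3


def build_schema(conn):
    # SQLite only enforces foreign keys when this pragma is on for the connection.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customer (
            customer_id   INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL
        );
        CREATE TABLE customer_order (
            customer_order_id INTEGER PRIMARY KEY,
            customer_id       INTEGER NOT NULL REFERENCES customer (customer_id),
            order_total_cents INTEGER NOT NULL
        );
    """)


def try_orphan_order(conn):
    """Return True if the database rejects an order for a missing customer."""
    try:
        conn.execute(
            "INSERT INTO customer_order (customer_id, order_total_cents) VALUES (?, ?)",
            (999, 1500),  # hypothetical customer 999 was never inserted
        )
    except sqlite3.IntegrityError:
        return True
    return False


conn = sqlite3.connect(":memory:")
build_schema(conn)
conn.execute("INSERT INTO customer (customer_id, customer_name) VALUES (1, 'Ann')")
conn.execute("INSERT INTO customer_order (customer_id, order_total_cents) VALUES (1, 1500)")
orphan_rejected = try_orphan_order(conn)
order_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
```

Without the REFERENCES clause the second insert would succeed and leave an orphaned row for someone to clean up later.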

As for naming columns, pick a standard and stick with it consistently. Consistency is more important than the actual standard. Don't name columns ID (see the SQL Antipatterns book). Use the same name and datatype when a column appears in several different tables. What you are going for is never having to use conversion functions in joins because of datatype mismatches.
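A small sketch of what consistent naming buys you, again assuming SQLite via Python's sqlite3 module and hypothetical author and book tables. Because the key is called author_id with the same datatype in both tables (rather than a bare id in one of them), the join needs no casts or functions and can even use the USING shorthand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (
        author_id   INTEGER PRIMARY KEY,   -- not just "id"
        author_name TEXT NOT NULL
    );
    CREATE TABLE book (
        book_id    INTEGER PRIMARY KEY,
        author_id  INTEGER NOT NULL REFERENCES author (author_id),  -- same name, same type
        book_title TEXT NOT NULL
    );
    INSERT INTO author (author_id, author_name) VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO book (book_id, author_id, book_title) VALUES
        (10, 1, 'Book A'),
        (11, 2, 'Book B');
""")

# The shared column name and type make the join condition trivial.
rows = conn.execute("""
    SELECT author_name, book_title
    FROM book
    JOIN author USING (author_id)
    ORDER BY book_id
""").fetchall()
```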

Always remember that databases can (and will) be changed outside the application. Anything that is needed for data integrity must be in the database, not in the application code. The data will be there long after the application has been replaced.
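As an illustration of keeping integrity rules in the database, here is a hedged sketch using SQLite through Python's sqlite3 module. The account table and its rules are made up for the example. The writes below stand in for someone changing data outside the application, and each one is rejected by a constraint instead of by application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_id    INTEGER PRIMARY KEY,
        account_email TEXT    NOT NULL UNIQUE,
        balance_cents INTEGER NOT NULL CHECK (balance_cents >= 0)
    )
""")
conn.execute(
    "INSERT INTO account (account_email, balance_cents) VALUES ('a@example.com', 100)"
)


def rejected(sql):
    """Return True if the database refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Direct writes that bypass any application-level validation.
bad_writes = {
    "negative balance": rejected(
        "UPDATE account SET balance_cents = -5 WHERE account_id = 1"
    ),
    "duplicate email": rejected(
        "INSERT INTO account (account_email, balance_cents) VALUES ('a@example.com', 0)"
    ),
    "missing email": rejected(
        "INSERT INTO account (account_email, balance_cents) VALUES (NULL, 0)"
    ),
}
balance = conn.execute(
    "SELECT balance_cents FROM account WHERE account_id = 1"
).fetchone()[0]
```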

The most important things for database design:

  • Thorough definition of the data needed (including correct datatypes) and the relationships between pieces of data (including correct normalization)
  • Data integrity
  • Performance
  • Security
  • Consistency (of datatypes, naming standards, etc.)

HLGEM answered Oct 02 '22 16:10


The best book I've read on the design of database systems is "An Introduction to Database Systems" by C. J. Date. Joe Celko's "SQL for Smarties" books are also worth reading. Assuming you're building an application and not just a database, and assuming you're using an object-oriented language, "Applying UML and Patterns" by Craig Larman has a good discussion of mapping databases to objects.

In terms of defining "good": in my experience, "maintainable" is probably top of the list. Maintainability in database design means many things, such as sticking to conventions - I often recommend http://justinsomnia.org/2003/04/essential-database-naming-conventions-and-style/. Normalization is another obvious maintainability strategy. I also recommend being generous with column types - it's hard to change an application once you find out that postal codes in other countries are longer than in the US. Finally, views are useful for abstracting complex data relations away for less experienced developers.
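Two of these points can be sketched together, assuming SQLite via Python's sqlite3 module and hypothetical table and view names. The postal code is stored as free-length text so a non-US code fits, and a view hides the join and aggregation so other developers can query one flat relation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id   INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL,
        postal_code   TEXT NOT NULL        -- text, not a fixed 5-digit number
    );
    CREATE TABLE customer_order (
        customer_order_id INTEGER PRIMARY KEY,
        customer_id       INTEGER NOT NULL REFERENCES customer (customer_id),
        order_total_cents INTEGER NOT NULL
    );
    -- The view hides the join and the aggregation behind one name.
    CREATE VIEW customer_order_summary AS
        SELECT c.customer_id,
               c.customer_name,
               c.postal_code,
               COUNT(o.customer_order_id)            AS order_count,
               COALESCE(SUM(o.order_total_cents), 0) AS total_spent_cents
        FROM customer AS c
        LEFT JOIN customer_order AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.customer_name, c.postal_code;

    INSERT INTO customer (customer_id, customer_name, postal_code) VALUES
        (1, 'Ann', '90210'),
        (2, 'Bob', 'SW1A 1AA');
    INSERT INTO customer_order (customer_id, order_total_cents) VALUES (1, 500), (1, 250);
""")

summary = conn.execute("""
    SELECT customer_name, postal_code, order_count, total_spent_cents
    FROM customer_order_summary
    ORDER BY customer_id
""").fetchall()
```

If the underlying tables are later restructured, the view definition can change while queries against customer_order_summary stay the same.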

A key thing with maintainability is the ability to test and deploy. It's worth reading up about Continuous Database Integration (http://www.codeproject.com/KB/architecture/Database_CI.aspx) - whilst not strictly associated with the design of the database schema, it's important context.
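One way to make a schema testable and deployable is to keep it as an ordered list of migration scripts that any environment can replay. The following is only a rough sketch of that idea, not the approach from the linked article. It assumes SQLite (whose user_version pragma stores an integer in the database file) and a hypothetical MIGRATIONS list.

```python
import sqlite3

# Hypothetical ordered migrations; in a real project these would live in version control.
MIGRATIONS = [
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL)",
    "ALTER TABLE customer ADD COLUMN postal_code TEXT",
]


def migrate(conn):
    """Apply any migrations not yet recorded in PRAGMA user_version."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # PRAGMA values cannot be bound as parameters; version is a trusted int here.
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()
    return conn.execute("PRAGMA user_version").fetchone()[0]


conn = sqlite3.connect(":memory:")
first_run = migrate(conn)    # builds the schema from scratch
second_run = migrate(conn)   # nothing left to apply, so this is a no-op
columns = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]
```

A build server can run the same function against an empty database on every commit, which catches a broken migration before it reaches production.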

As for performance - I believe you should design for maintainability first, and only design for performance if you know you have a problem. Sometimes you know in advance that performance will be a major problem, for example when designing a database for Facebook (or Stack Exchange), a database with huge amounts of data (terabytes and up), or a database with huge numbers of users. Most systems don't fall into that camp - so I recommend regular performance tests, with representative data, to find out whether you have a problem, and only tuning when you can prove you have to. Many performance optimizations come at the expense of maintainability - denormalization, for instance.
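A minimal sketch of "measure first, then tune", assuming SQLite via Python's sqlite3 module, a hypothetical customer_order table, and synthetic rows standing in for representative data. It times a lookup and inspects the query plan before and after adding an index, so the tuning step is backed by evidence.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_order (
        customer_order_id INTEGER PRIMARY KEY,
        customer_id       INTEGER NOT NULL,
        order_total_cents INTEGER NOT NULL
    )
""")
# Synthetic stand-in for representative data: 20,000 orders over 1,000 customers.
conn.executemany(
    "INSERT INTO customer_order (customer_id, order_total_cents) VALUES (?, ?)",
    ((i % 1000, i) for i in range(20_000)),
)

QUERY = "SELECT order_total_cents FROM customer_order WHERE customer_id = ?"


def query_plan(conn):
    """Return SQLite's plan description for QUERY as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the plan detail


def timed_lookups(conn, repeats=100):
    """Return the seconds taken to run QUERY for `repeats` different customers."""
    start = time.perf_counter()
    for customer_id in range(repeats):
        conn.execute(QUERY, (customer_id,)).fetchall()
    return time.perf_counter() - start


plan_before = query_plan(conn)
seconds_before = timed_lookups(conn)

conn.execute(
    "CREATE INDEX idx_customer_order_customer_id ON customer_order (customer_id)"
)

plan_after = query_plan(conn)
seconds_after = timed_lookups(conn)
```

The timings depend on the machine, so compare them relative to each other. The plan text is the more reliable signal: a full table scan before the index, an index search after it.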

Oh, and in general, avoid triggers and stored procedures if you can. That's just my opinion, though...

Neville Kuyt answered Oct 02 '22 16:10