
What are the best practices for database scripts under code control?

We are currently reviewing how we store our database scripts (tables, procs, functions, views, data fixes) in Subversion, and I was wondering whether there is any consensus on the best approach.

Some of the factors we'd need to consider include:

  • Should we check in 'Create' scripts, or check in incremental changes as 'Alter' scripts?
  • How do we keep track of the state of the database for a given release?
  • It should be easy to build a database from scratch for any given release version.
  • Should the database contain a table listing the scripts that have been run against it, the version of the database, etc.?

Obviously it's a pretty open-ended question, so I'm keen to hear what people's experience has taught them.

Brett Hannah asked Dec 04 '08 13:12



2 Answers

After a few iterations, the approach we took was roughly like this:

One file per table and per stored procedure. There are also separate files for other things, such as setting up database users and populating look-up tables with their data.

The file for a table starts with the CREATE command, followed by a succession of ALTER commands added as the schema evolves. Each of these commands is wrapped in a test for whether the table or column already exists. This means each script can be run against an up-to-date database and won't change anything. It also means that for any old database, the script updates it to the latest schema. And for an empty database, the CREATE command creates the table and the ALTER commands are all skipped.
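As a rough illustration of that guarded CREATE/ALTER pattern, here is a minimal sketch in Python using the standard `sqlite3` module. The `customer` table, its columns, and the helper names are hypothetical, and SQLite is only an assumption made so the example is runnable; the original scripts were plain SQL for an unspecified DBMS. Note that the CREATE statement describes the latest schema, which is why the ALTER is skipped on an empty database.

```python
import sqlite3


def table_exists(conn, table):
    """Return True if a table with this name exists (SQLite catalog lookup)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row is not None


def column_exists(conn, table, column):
    """Return True if the table has the column. `table` must be a trusted name."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)  # row[1] is the column name


def apply_customer_script(conn):
    """Hypothetical per-table script: one guarded CREATE, then guarded ALTERs."""
    ran = []
    # CREATE always reflects the latest schema, so a fresh database needs no ALTERs.
    if not table_exists(conn, "customer"):
        conn.execute(
            "CREATE TABLE customer ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
        )
        ran.append("create")
    # Added later as the schema evolved; only runs against an older database.
    if not column_exists(conn, "customer", "email"):
        conn.execute("ALTER TABLE customer ADD COLUMN email TEXT")
        ran.append("add email")
    conn.commit()
    return ran
```

Running `apply_customer_script` a second time against the same connection executes nothing, which is the property that makes the script safe to run against any database state.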

We also have a program (written in Python) that scans the directory full of scripts and assembles them into one big script. It parses the SQL just enough to deduce dependencies between tables (based on foreign-key references) and orders them appropriately. The result is a monster SQL script that gets the database up to spec in one go. The script-assembling program also calculates the MD5 hash of the input files, and uses that to update a version number that is written into a special table by the last script in the list.
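The actual program is not shown, but a much-simplified sketch of the idea might look like the following. Everything here is an assumption for illustration: one `.sql` file per table named after the table, a regex that only recognises `REFERENCES <table>`, a hypothetical `schema_version` table, and the MD5 digest used directly as the version value. It needs Python 3.9+ for `graphlib`.

```python
import hashlib
import re
from graphlib import TopologicalSorter
from pathlib import Path

# Just enough "parsing" to spot foreign-key targets such as: REFERENCES customer(id)
REFERENCES = re.compile(r"\bREFERENCES\s+([A-Za-z_][A-Za-z0-9_]*)", re.IGNORECASE)


def load_scripts(directory):
    """Map lower-cased file stem (assumed to be the table name) to SQL text."""
    return {
        path.stem.lower(): path.read_text(encoding="utf-8")
        for path in sorted(Path(directory).glob("*.sql"))
    }


def assemble(scripts):
    """Return (table order, version hash, combined SQL) for a dict of scripts."""
    graph = {}
    for table, sql in scripts.items():
        referenced = {name.lower() for name in REFERENCES.findall(sql)}
        # Keep only dependencies on other tables that we have scripts for.
        graph[table] = {dep for dep in referenced if dep in scripts and dep != table}

    # Referenced tables come first; a circular reference raises graphlib.CycleError.
    order = list(TopologicalSorter(graph).static_order())

    # Hash names and contents in sorted order so the result is reproducible.
    digest = hashlib.md5()
    for table in sorted(scripts):
        digest.update(table.encode("utf-8"))
        digest.update(scripts[table].encode("utf-8"))
    version = digest.hexdigest()

    # The final statements record the version in a (hypothetical) special table.
    version_sql = (
        "CREATE TABLE IF NOT EXISTS schema_version (version TEXT NOT NULL);\n"
        "DELETE FROM schema_version;\n"
        f"INSERT INTO schema_version (version) VALUES ('{version}');"
    )
    parts = [scripts[table].rstrip() for table in order] + [version_sql]
    return order, version, "\n\n".join(parts)
```

Something like `order, version, script = assemble(load_scripts("db/tables"))` would then produce the single combined script, where `db/tables` is a placeholder path.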

Barring accidents, the result is that the database script for a given version of the source code creates the schema this code was designed to interoperate with. It also means that there is a single (somewhat large) SQL script to give to the customer to build new databases or update existing ones. (This was important in this case because there would be many instances of the database, one for each of their customers.)

pdc answered Nov 09 '22 15:11


There is an interesting article at this link: https://blog.codinghorror.com/get-your-database-under-version-control/

It advocates a baseline 'create' script followed by checking in 'alter' scripts and keeping a version table in the database.
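A minimal sketch of that baseline-plus-alter approach, again in Python with `sqlite3` purely so it can be run as-is. The migration list, the `customer` table, and the `schema_version` table name are hypothetical and are not taken from the article.

```python
import sqlite3

# Script 1 is the baseline 'create'; later entries are checked-in 'alter' scripts.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def current_version(conn):
    """Return the highest script number recorded in the version table (0 if none)."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn, migrations=MIGRATIONS):
    """Run every script newer than the recorded version, in numeric order."""
    applied = []
    version = current_version(conn)
    for number, sql in sorted(migrations):
        if number > version:
            conn.execute(sql)
            conn.execute("INSERT INTO schema_version (version) VALUES (?)", (number,))
            applied.append(number)
    conn.commit()
    return applied
```

Building a database for a particular release then amounts to running the scripts up to the number that release shipped with, and the version table answers the question of which state an existing database is in.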

Brett Hannah answered Nov 09 '22 15:11