I am starting to build a new database for my project using PostgreSQL. (I am new to PostgreSQL and database by the way.)
I think my development workflow is very bad, and here is a part of it:
Here are some bads I can think of:
So my workflow is bad. I was wondering what other Postgres developers' workflow looks like.
Are there any good tools (free or cheap) for editing and saving scripts? good IDE maybe?
It would be great if I can create automated unit tests for the database.
Any tool for recreating the database? CI server tool?
Basically I am looking for any advice, good practice, or good tool for database development.
(Sorry, this question may not fit for the Q&A format, but I do not know where else to ask this question.)
Check out liquibase. We use it in the company I work at to setup our PostgreSQL database. It's open source, easy to use and the changelog file you end up with can be added to source control. Each changeset gets an id, so that each changeset is only run once. You end up with two extra tables for tracking the changes to the database when it's run.
While it's DB agnostic, you can use PostgreSQL SQL directly in each changeset and each changeset can have it's own comments.
The only caveat from having used it is that you have to caution yourself and others not to re-use a changeset once it's been applied to a database. Any changes to an already applied changeset result in a different checksum (even whitespace) which can cause liquibase to abort it's updates. This can end up in failed DB updates in the field, so each update to any of the changelogs should be tested locally first. Instead all changes, however minor should be inserted into a new changeset with a new id. They have a changeset sub-tag called "validCheckSum" to let you work around this, but I think it's better to try to enforce always making a new changeset.
Here are the doc links for creating a table and creating a view for example.
Well, your question is actually quite relevant to any database developer, and, if I understand it correctly, there is another way to get to your desired results.
One interesting thing to mention is that your idea of separating different changes into different files is the concept of migrations of Ruby On Rails. You might even be able to use the rake utility to keep track of a workflow like yours.
But now to what I think migh be your solution. PostgreSQL, and others to be sincere, have specific utilities to handle data and schemas like what you probably need.
The pg_dumpall command-line executable will dump the whole database into a file or the console, in a way that the psql utility can simply "reload" into the same, or into another (virgin) database.
So, if you want to keep only the current schema (no data!) of a running database cluster, you can, as the postgres-process owner user:
$ pg_dumpall --schema-only > schema.sql
Now the schema.sql will hold exactly the same users/databases/tables/triggers/etc, but not data. If you want to have a "full-backup" style dump (and that's one way to take a full backup of a database), just remove the "--schema-only" option from the command line.
You can reload the file into another (should be virgin, you might mess up a database with other data doing this):
$ psql -f schema.sql postgres
Now if you only want to dump one database, one table, etc. you should use the pg_dump utility.
$ pg_dump --schema-only <database> > database-schema.sql
And then, to reload the database into a running postgresql server:
$ psql <database> < database-schema.sql
As for version control, you can just keep the schema.sql file under it, and just dump the database again into the file before every vc commit. So at some particular version control state will you have the code and the working database schema that goes with it.
Oh, and all the tools I mentioned are free, and pg_dump and pg_dumpall come with the standard PostgreSQL installation.
Hope that helps,
Marco
You're not far off. I'm a Java developer, not a DBA, but building out the database as a project grows is an important task for the teams I've been on, and here's how I've seen it done best:
#!/bin/bash
export PGDATABASE=your_db export
export PGUSER=your_user export
export PGPASSWORD=your_password
for SQL_SCRIPT in $( find ./ -name "*.sql" -print | sort);
doecho "**** $SQL_SCRIPT ****" psql -q < $SQL_SCRIPT
done
As the project grows, use new alter scripts to change the table, don't redefine the table in the initial script.
All scripts are checked into source control. Each release is tagged so you can regenerate that version of the database in the future.
For unit testing and CI, most CI servers can run a script to drop and recreate a schema. One oft-cited framework for PostGresql unit testing is pgTAP
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With