 

How to automate source control with Oracle database

I work on an Oracle instance that has hundreds of schemas and multiple developers. We have a development instance where developers can integrate their work before it moves to test or production.

We want to have source control for all the DDL run in this integrated development database. Currently this is done with a Red Gate product, which we run manually after we make a change to the database. Red Gate finds the differences between what is in the schema and what was last checked into source control, generates a script of those differences, and commits it to source control.

The problem, of course, is that running Red Gate can take some time, so people run it infrequently or not at all for small changes. Red Gate also only looks at one schema at a time, and it would be VERY time consuming to run it manually against all schemas to guarantee that they are up to date. Yet if the source-controlled code cannot be relied upon, it becomes less useful...

What would seem ideal is software that could update source control (preferably GitHub, as this is used by other teams) from all the schemas, either periodically (even just once a day) or whenever triggered by DDL being run.

I cannot find any existing software that can simply be used to do this.

Is there a problem with doing this? (There is no need to address multiple developers overwriting each other's work on the same day, as we have this covered in a separate process.) Is anyone doing this? Can anyone recommend a way to do it?

asked Dec 11 '22 by Emu

2 Answers

We do this with the help of a PL/SQL function, a Python script, and a shell script:

  • The PL/SQL function generates the DDL of a whole schema and returns it as a CLOB
  • The Python script connects to the database, fetches the DDL, and stores it in files
  • The shell script runs the source control tool to commit the modifications (we use Bazaar here).
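
For illustration, the export step could look something like the sketch below. This is not the PasteBin code: the driver (python-oracledb), connection details, schema list, and file layout are all assumptions, and it leans on Oracle's DBMS_METADATA package to generate the DDL rather than a custom PL/SQL function:

import os
import oracledb  # python-oracledb; any Oracle driver with bind support would do

DSN = "devhost:1521/devdb"   # assumed development instance
SCHEMAS = ["HR", "SALES"]    # assumed list of schemas to export

def export_schema(conn, schema, base_dir="ddl"):
    cur = conn.cursor()
    # List the objects whose DDL we want to capture
    cur.execute("""
        SELECT object_name, object_type
          FROM all_objects
         WHERE owner = :owner
           AND object_type IN ('TABLE', 'VIEW', 'PROCEDURE', 'FUNCTION', 'PACKAGE')
    """, owner=schema)
    os.makedirs(os.path.join(base_dir, schema), exist_ok=True)
    for name, obj_type in cur.fetchall():
        ddl_cur = conn.cursor()
        # DBMS_METADATA.GET_DDL returns the object's DDL as a CLOB
        ddl_cur.execute("SELECT DBMS_METADATA.GET_DDL(:t, :n, :o) FROM dual",
                        t=obj_type, n=name, o=schema)
        with open(os.path.join(base_dir, schema, name + ".sql"), "w") as f:
            f.write(ddl_cur.fetchone()[0].read())

if __name__ == "__main__":
    with oracledb.connect(user="scott", password="tiger", dsn=DSN) as conn:
        for schema in SCHEMAS:
            export_schema(conn, schema)

Writing one file per object (rather than one big script per schema) keeps the diffs in version control small and readable.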

You can see the scripts on PasteBin:

  • The PL/SQL function is here: http://pastebin.com/AG2Fa9zL
  • The python program (schema_exporter.py): http://pastebin.com/nd8Lf0gK
  • The shell script is shown inline below.

The shell script:

#!/bin/sh
# Export the DDL of all schemas to files (schema_exporter.py above)
python schema_exporter.py
d=$(date +%Y-%m-%d__%H_%M_%S)
# Stage any new files, then commit only if something actually changed
bzr add
bzr st | grep -q -E 'added|modified' && bzr commit -m "Database objects on $d"
exit 0

This shell script is configured to run from cron every day.

answered Dec 23 '22 by guthy


Having been in the database version control space for 5 years (as director of product management at DBmaestro) and having worked as a DBA for over two decades, I can tell you that you cannot treat database objects the way you treat your Java, C#, or other files, saving the changes as simple DDL scripts.

There are many reasons; I'll name a few:

  • Files are stored locally on the developer's PC, and the changes s/he makes do not affect other developers. Likewise, the developer is not affected by changes made by colleagues. With a database this is (usually) not the case: developers share the same database environment, so any change committed to the database affects the others.
  • Publishing code changes is done using Check-In / Submit Changes / etc. (depending on which source control tool you use). At that point, the code from the developer's local directory is inserted into the source control repository, and a developer who wants the latest code needs to request it from the source control tool. In the database, the change already exists and impacts others even if it was never checked in to the repository.
  • During a file check-in, the source control tool performs a conflict check to see whether the same file was modified and checked in by another developer while you were modifying your local copy. There is no such check in the database. If you alter a procedure from your PC and at the same time I modify the same procedure from mine, we overwrite each other's changes.
  • A code build gets the label / latest version of the code into an empty directory and then performs a build and compile. The output is a set of binaries that we copy over the existing ones; we don't care what was there before. We cannot recreate a database this way, because we need to preserve the data! Instead, the deployment executes SQL scripts that were generated in the build process.
  • When executing those SQL scripts (with DDL, DCL, and DML (for static content) commands), you assume that the current structure of the environment matches the structure at the time you created the scripts. If not, your scripts can fail, for example when trying to add a new column that already exists (see the sketch after this list).
  • Treating SQL scripts as code and generating them manually leads to syntax errors, database dependency errors, and scripts that are not reusable, all of which complicate developing, maintaining, and testing those scripts. In addition, those scripts may run on an environment that differs from the one you thought they would run on.
  • Sometimes the script in the version control repository does not match the structure of the object that was tested, and then errors happen in production!
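
The column-already-exists failure mode is easy to demonstrate. Here is a minimal sketch (not from this answer; the table, column, and connection details are hypothetical) of a guarded DDL step that consults the data dictionary before altering, which is exactly the drift-awareness a plain generated script lacks:

import oracledb

def add_column_if_missing(conn, owner, table, column, definition):
    cur = conn.cursor()
    # Check the data dictionary first: does the column already exist?
    cur.execute("""
        SELECT COUNT(*)
          FROM all_tab_columns
         WHERE owner = :o AND table_name = :t AND column_name = :c
    """, o=owner, t=table, c=column)
    (already_there,) = cur.fetchone()
    if not already_there:
        # Identifiers cannot be bound as variables, so they are interpolated
        # here; in real code they must come from a trusted source only
        cur.execute(f"ALTER TABLE {owner}.{table} ADD ({column} {definition})")

with oracledb.connect(user="scott", password="tiger", dsn="devhost:1521/devdb") as conn:
    add_column_if_missing(conn, "HR", "EMPLOYEES", "MIDDLE_NAME", "VARCHAR2(50)")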

There are many more, but I think you get the picture.

What I found that works is the following:

  1. Use an enforced version control system that requires check-out/check-in operations on the database objects. This ensures the version control repository matches the code that was checked in, because the object's metadata is read during the check-in operation rather than in a separate manual step. It also allows several developers to work in parallel on the same database while preventing them from accidentally overriding each other's code.
  2. Use impact analysis that uses baselines as part of the comparison, both to identify conflicts and to decide whether a difference (when comparing the object's structure between the source control repository and the database) is a real change originating from development, or one that arrived via a different path, such as another branch or an emergency fix, and should therefore be skipped.
  3. Use a solution that can perform impact analysis on many schemas at once, through a UI or an API, so the build & deploy process can eventually be automated. (A simple drift check along these lines is sketched below.)
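
At its simplest, the repository-versus-database comparison behind such impact analysis can be illustrated with a diff of the stored DDL against the live DDL. This sketch is mine, not DBmaestro's; the file layout, object names, and connection details are assumptions:

import difflib
import oracledb

def live_ddl(conn, obj_type, name, owner):
    cur = conn.cursor()
    # Fetch the current DDL of the object straight from the database
    cur.execute("SELECT DBMS_METADATA.GET_DDL(:t, :n, :o) FROM dual",
                t=obj_type, n=name, o=owner)
    return cur.fetchone()[0].read()

def drift(repo_file, conn, obj_type, name, owner):
    with open(repo_file) as f:
        repo_lines = f.read().splitlines()
    db_lines = live_ddl(conn, obj_type, name, owner).splitlines()
    # Any output lines mean the repository no longer matches the database
    return list(difflib.unified_diff(repo_lines, db_lines,
                                     fromfile=repo_file,
                                     tofile=owner + "." + name))

with oracledb.connect(user="scott", password="tiger", dsn="devhost:1521/devdb") as conn:
    for line in drift("ddl/HR/EMPLOYEES.sql", conn, "TABLE", "EMPLOYEES", "HR"):
        print(line)

A real tool would of course normalize storage clauses and whitespace before diffing, but the baseline idea is the same.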

An article I wrote on this was published here; you are welcome to read it.

answered Dec 23 '22 by Uri