One thing I have continually found very confusing about using an object database like db4o is how you are supposed to handle complex migrations that would normally be handled by SQL/PL-SQL.
For example, imagine you had a table in a relational database called my_users. Originally you had a column named "full_name". Now that your software is in V2, you want to remove this column, split the full names on a blank space, and put the first part in a column named "first_name" and the second in a column named "last_name". In SQL I would simply populate the "first_name" and "last_name" columns, then remove the original "full_name" column.
How would I do this in something like db4o? Do I write a Java program that looks up all objects of User.class, sets first_name and last_name, and sets full_name to null? After my next svn commit there will be no field or bean property corresponding to full_name; would this be a problem? It seems that to use db4o in a production application where my "schema" changes, I would have to write a script that migrates data from version x to version x+1, and only in version x+2 actually remove the properties I want to get rid of, since I cannot write Java code that modifies properties that are no longer part of my type.
It seems that part of the problem is that an RDBMS resolves what object you are referring to based on a simple case-insensitive string name. In a language like Java, typing is more complicated than this: you cannot refer to a property if its getter, setter, or field is not a member of the class loaded at runtime. So you essentially need to have two versions of your code in the same script (hmm, custom classloaders sound like a pain), have the new version of your class live in another package (sounds messy), or use the x+1, x+2 strategy I mentioned (requires a lot more planning). Perhaps there is some obvious solution I never gleaned from the db4o documents.
Any ideas? Hopefully this makes some sense.
First, db4o handles the 'simple' scenarios, such as adding or removing a field, automatically. When you add a field, all existing objects get the default value for it. When you remove a field, the data of existing objects stays in the database and you can still access it. Renaming a field and similar changes are done through special 'refactoring' calls.
Now, for your scenario, you would do something like this:
Let's assume we have an 'Address' class from which the 'full_name' field has been removed. Now we want to copy its data to 'firstname' and 'surname'. It could go like this (Java):
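import com.db4o.ObjectSet;
import com.db4o.ext.StoredField;

// 'db' is an already opened ObjectContainer
ObjectSet<Address> addresses = db.query(Address.class);
// Metadata of the removed field; its data is still stored in the database
StoredField fullNameField = db.ext().storedClass(Address.class).storedField("full_name", String.class);
for (Address address : addresses) {
    String fullName = (String) fullNameField.get(address);
    if (fullName == null) continue; // nothing to migrate for this object
    // Split once, at the first space; the rest becomes the surname
    String[] splitName = fullName.trim().split(" ", 2);
    address.setFirstname(splitName[0]);
    address.setSurname(splitName.length > 1 ? splitName[1] : "");
    db.store(address);
}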
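Real name data tends to contain entries with no space, extra whitespace, or no value at all, so it can help to pull the split into a small helper that is testable without a database. This is only a minimal sketch: `NameSplitter` is a hypothetical class (not part of db4o), and treating everything after the first run of whitespace as the surname is my assumption, not a rule from the question.

```java
/** Hypothetical helper, not part of db4o: splits a full name into first name and surname. */
final class NameSplitter {
    private NameSplitter() {}

    /**
     * Returns a two-element array {firstName, surname}.
     * Assumption: everything after the first run of whitespace is the surname.
     * A null or blank input yields two empty strings.
     */
    static String[] split(String fullName) {
        if (fullName == null) {
            return new String[] {"", ""};
        }
        String trimmed = fullName.trim();
        if (trimmed.isEmpty()) {
            return new String[] {"", ""};
        }
        // Limit 2: split at most once, at the first run of whitespace.
        String[] parts = trimmed.split("\\s+", 2);
        String surname = parts.length > 1 ? parts[1] : "";
        return new String[] {parts[0], surname};
    }
}
```

In the migration loop, the two setters would then take `NameSplitter.split(fullName)[0]` and `[1]` instead of indexing the raw split result.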
As you suggested, you would write migration code for each version bump. If a field isn't part of your class anymore, you have to access it through the 'StoredField' API as above.
You can get a list of all 'stored' classes with ObjectContainer.ext().storedClasses(). With StoredClass.getStoredFields() you can get a list of all stored fields, no matter whether the field still exists in your class. If a class doesn't exist anymore, you can still get its objects and access them via the 'GenericObject' class.
Update: For more complex scenarios, where a database needs to be migrated over multiple version steps:
For example, in version v3 the address object looks completely different, so the 'migration script' for v1 to v2 no longer finds the fields it requires (firstname and surname in my example). I think there are multiple possibilities for handling this.
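One possibility, sketched here without any db4o calls: let every migration step work on a generic name-to-value view of the object instead of the current class, so that a step written for v1 to v2 still runs after v3 has changed the class again. Everything named here is made up for illustration (`MigrationChain`, the example steps, and the `display_name` field in v3); with db4o, the map would presumably be filled and written back through the StoredField API shown above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical sketch: runs migration steps in order on a generic field map. */
final class MigrationChain {
    // steps.get(i) upgrades data from version i+1 to version i+2
    private final List<UnaryOperator<Map<String, Object>>> steps = new ArrayList<>();

    MigrationChain addStep(UnaryOperator<Map<String, Object>> step) {
        steps.add(step);
        return this;
    }

    int latestVersion() {
        return steps.size() + 1;
    }

    /** Applies every step from fromVersion up to the latest version. */
    Map<String, Object> migrate(Map<String, Object> fields, int fromVersion) {
        if (fromVersion < 1 || fromVersion > latestVersion()) {
            throw new IllegalArgumentException("unknown version: " + fromVersion);
        }
        Map<String, Object> current = new HashMap<>(fields); // leave the caller's map untouched
        for (int v = fromVersion; v < latestVersion(); v++) {
            current = steps.get(v - 1).apply(current);
        }
        return current;
    }

    /** Example chain: v1 -> v2 splits full_name; v2 -> v3 (invented) merges into display_name. */
    static MigrationChain example() {
        return new MigrationChain()
            .addStep(f -> {
                Map<String, Object> out = new HashMap<>(f);
                String full = (String) out.remove("full_name");
                String[] parts = full == null ? new String[0] : full.trim().split(" ", 2);
                out.put("firstname", parts.length > 0 ? parts[0] : "");
                out.put("surname", parts.length > 1 ? parts[1] : "");
                return out;
            })
            .addStep(f -> {
                Map<String, Object> out = new HashMap<>(f);
                String first = (String) out.remove("firstname");
                String last = (String) out.remove("surname");
                out.put("display_name", last + ", " + first);
                return out;
            });
    }
}
```

The point of the design is that no step depends on the Java class of any particular version, only on field names, so old steps never have to be rewritten when the class changes later. A database stored at v1 runs both steps; one stored at v2 runs only the second.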
I'm taking a bit of a wild shot here, because I haven't refactored much data in my life.
You're making a strange comparison: if you wanted to 'hot-migrate' the db, you'd probably have to use the x+1, x+2 versioning approach you described, but I don't really know. I wouldn't know how to do this with SQL either, since I'm not a db expert.
If you're migrating 'cold', however, you could do it in one step: for each object in the store, instantiate a new object from the old data, store the new object, and delete the old one. See the db4o reference.
But honestly: the same process in an RDBMS is complicated, too, because you will have to deactivate constraint checks (and possibly triggers, etc.) to actually perform the operation. Perhaps not in the example you provided, but in most real-world cases. After all, the string split is so easy that there will be little gain.
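To make the 'cold' variant concrete, here is a rough sketch in which a plain in-memory list stands in for the object store. `OldUser`, `NewUser`, and `ColdMigration` are hypothetical names, and the old and new classes are assumed to coexist under different names while the migration runs. With db4o, the add and remove calls would roughly correspond to storing and deleting on the container.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical old shape of the data. */
final class OldUser {
    final String fullName;
    OldUser(String fullName) { this.fullName = fullName; }
}

/** Hypothetical new shape of the data. */
final class NewUser {
    final String firstName;
    final String lastName;
    NewUser(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

final class ColdMigration {
    private ColdMigration() {}

    /** Replaces every OldUser in the store with a NewUser built from its data. */
    static void migrate(List<Object> store) {
        // Iterate over a snapshot so the store can be modified inside the loop.
        List<Object> snapshot = new ArrayList<>(store);
        for (Object o : snapshot) {
            if (!(o instanceof OldUser)) {
                continue; // leave unrelated objects alone
            }
            OldUser old = (OldUser) o;
            String full = old.fullName == null ? "" : old.fullName.trim();
            String[] parts = full.split(" ", 2);
            store.add(new NewUser(parts[0], parts.length > 1 ? parts[1] : ""));
            store.remove(old); // identity match, since OldUser does not override equals
        }
    }
}
```

This only works safely while nothing else is writing to the store, which is what makes it a 'cold' migration.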
In SQL I would simply populate the "first_name" and "last_name" columns
Yes, with a simple string split operation, you can simply do that. But in a typical refactoring scenario, you're re-structuring objects based on large and complicated sets of rules that might not be easily expressed in SQL, might need complex calculation, or external data sources.
To do that, you'd have to write code, too.
After all, I don't see too much difference in the two processes. You will always have to be careful with live data, and you will certainly make a backup in both cases. Refactoring is fun, but persistence is tricky so synchronizing it is a challenge in any case.