
How to implement an append-only versioned model in SQLAlchemy

I would like to re-implement some of my existing SQLAlchemy models in an append-only datastore; append-only meaning that objects are only ever changed with INSERT statements, never with UPDATE or DELETE statements.

The UPDATE and DELETE statements would be replaced with another INSERT that increments the version. There would be an is_deleted flag and instead of DELETE, a new version with is_deleted=True would be created:

id  | version | is_deleted | name      | description ...
----+---------+------------+-----------+----------------
  1 |       1 |          F | Fo        | Text text text.
  1 |       2 |          F | Foo       | Text text text.
  2 |       1 |          F | Bar       | null
  1 |       3 |          T | Foo       | Text text text.

Additionally,

  • All SELECT statements will need to be rewritten to return only the row with the maximum version number for each id, as described in this question: PostgreSQL - fetch the row which has the Max value for a column
  • All (unique) indexes need to be rewritten to be unique by the "id" primary key, as each id may be present more than once.
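For reference, the "latest version per id" SELECT from the first point can be sketched in plain SQL. This is only a minimal illustration using Python's built-in sqlite3 module and the sample rows from the table above; the table name foo and the 0/1 encoding of is_deleted are assumptions, and other approaches to the same query may perform better on PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (
        id INTEGER NOT NULL,
        version INTEGER NOT NULL,
        is_deleted INTEGER NOT NULL DEFAULT 0,
        name TEXT,
        description TEXT,
        PRIMARY KEY (id, version)
    );
    INSERT INTO foo VALUES
        (1, 1, 0, 'Fo',  'Text text text.'),
        (1, 2, 0, 'Foo', 'Text text text.'),
        (2, 1, 0, 'Bar', NULL),
        (1, 3, 1, 'Foo', 'Text text text.');
""")

# Join every row against the highest version recorded for its id, then
# drop ids whose latest version is flagged as deleted.
LATEST_LIVE_ROWS = """
    SELECT f.id, f.version, f.name
    FROM foo AS f
    JOIN (SELECT id, MAX(version) AS version FROM foo GROUP BY id) AS latest
      ON latest.id = f.id AND latest.version = f.version
    WHERE f.is_deleted = 0
    ORDER BY f.id
"""
rows = conn.execute(LATEST_LIVE_ROWS).fetchall()
print(rows)  # [(2, 1, 'Bar')]
```

Id 1 is absent from the result because its latest version (3) has is_deleted set, which is the behaviour I would expect from every rewritten SELECT.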

I know how to solve most of these issues, but I am struggling with the event hooks in SQLAlchemy that would handle certain things that need to be done on update & delete.

The SQLAlchemy documentation already has some basic examples for versioning. The versioned rows example comes close to what I want, but it does not handle (1) deletion or (2) foreign key relationships.

(1) Deletion. I know there is a session.deleted attribute, and I would iterate over it in much the same way that session.dirty is iterated over in the versioned_rows.py example, but how would I remove the item from the to-be-deleted list and create a new item instead?
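To make (1) concrete, here is a rough sketch of the kind of hook I have in mind. It assumes that make_transient(), which expunges the object from the session, also takes it out of session.deleted. The Versioned mixin, the Foo model, and the function name are hypothetical, and I have only thought this through for plain column attributes, not for relationships.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine, event, inspect
from sqlalchemy.orm import Session, declarative_base, make_transient, sessionmaker

Base = declarative_base()


class Versioned:
    """Hypothetical marker mixin: rows of these models are never deleted."""


class Foo(Versioned, Base):
    __tablename__ = "foo"
    # Composite primary key, so one id can appear once per version.
    id = Column(Integer, primary_key=True, autoincrement=False)
    version = Column(Integer, primary_key=True, default=1)
    is_deleted = Column(Boolean, nullable=False, default=False)
    name = Column(String)


@event.listens_for(Session, "before_flush")
def insert_tombstone_instead_of_delete(session, flush_context, instances):
    for obj in list(session.deleted):
        if not isinstance(obj, Versioned):
            continue
        # Touch every column so expired values are reloaded while the
        # object is still attached to the session.
        for attr in inspect(obj).mapper.column_attrs:
            getattr(obj, attr.key)
        next_version = obj.version + 1
        # Expunging the object also drops it from session.deleted.
        make_transient(obj)
        obj.version = next_version
        obj.is_deleted = True
        # Re-added as a pending object, so the flush emits an INSERT.
        session.add(obj)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
SessionFactory = sessionmaker(bind=engine)

with SessionFactory() as session:
    foo = Foo(id=1, name="Foo")
    session.add(foo)
    session.commit()

    session.delete(foo)  # intercepted by the hook above
    session.commit()

    rows = [
        (f.id, f.version, f.is_deleted)
        for f in session.query(Foo).order_by(Foo.version)
    ]
    print(rows)
```

With this sketch the original row stays untouched and a second row with version 2 and is_deleted=True is appended, but I am not sure this is the intended way to cancel a pending delete.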

(2) The above-mentioned example only deals with a parent-child relationship, and the way it does so (expiring the relationship) seems to require custom code for each model. (2.1) Is there a way to make this more flexible? (2.2) Is it possible to configure SQLAlchemy's relationship() to return the object with max(version) for a given foreign key?

lyschoening asked Oct 02 '14 13:10





1 Answer

One helpful option that is ORM-agnostic might be "instead of" triggers. For instance, you could catch a before-update event and insert a new row with an incremented version number and the newly updated data.

For PostgreSQL, they are detailed here.

Of course, you would have to make model changes (on PKs, etc.).

Also, it would be worth studying the performance impact, as you would likely need a recursive query to fetch the "latest version" (through either a view layer or SQLAlchemy where clauses, etc.).
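As a rough sketch of the trigger idea, the snippet below uses SQLite through Python's built-in sqlite3 module as a stand-in. In PostgreSQL, INSTEAD OF triggers are likewise defined on views, but the trigger body has to live in a separate trigger function, so the syntax differs. The names foo_versions, foo, foo_update and foo_delete are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo_versions (
        id INTEGER NOT NULL,
        version INTEGER NOT NULL,
        is_deleted INTEGER NOT NULL DEFAULT 0,
        name TEXT,
        PRIMARY KEY (id, version)
    );

    -- View layer: the latest version of every id that is not deleted.
    CREATE VIEW foo AS
    SELECT v.id, v.version, v.name
    FROM foo_versions AS v
    WHERE v.version = (SELECT MAX(w.version) FROM foo_versions AS w WHERE w.id = v.id)
      AND v.is_deleted = 0;

    -- An UPDATE against the view appends a new version instead.
    CREATE TRIGGER foo_update INSTEAD OF UPDATE ON foo
    BEGIN
        INSERT INTO foo_versions (id, version, is_deleted, name)
        VALUES (OLD.id, OLD.version + 1, 0, NEW.name);
    END;

    -- A DELETE against the view appends a version flagged as deleted.
    CREATE TRIGGER foo_delete INSTEAD OF DELETE ON foo
    BEGIN
        INSERT INTO foo_versions (id, version, is_deleted, name)
        VALUES (OLD.id, OLD.version + 1, 1, OLD.name);
    END;

    INSERT INTO foo_versions (id, version, name) VALUES (1, 1, 'Fo'), (2, 1, 'Bar');
""")

# The application only ever talks to the view.
conn.execute("UPDATE foo SET name = 'Foo' WHERE id = 1")
conn.execute("DELETE FROM foo WHERE id = 2")

history = conn.execute(
    "SELECT id, version, is_deleted, name FROM foo_versions ORDER BY id, version"
).fetchall()
current = conn.execute("SELECT id, version, name FROM foo").fetchall()
print(history)
print(current)  # [(1, 2, 'Foo')]
```

The ORM could then be mapped against the view, with the base table only ever receiving INSERTs. The correlated subquery in the view is the part whose cost would be worth measuring on real data.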

joefromct answered Oct 18 '22 00:10
