I would like to know what the best practices are for using SQLAlchemy declarative models within business logic code. Perhaps stackexchange.codereview may have been a better place to ask this, but I'm not sure.
Here's some background.
Let's say I have a bunch of classes doing various things. Most of them have little or nothing to do with each other. Each such class has between a hundred and a thousand lines of code doing things that have precious little to do with the database. In fact, most of the classes aren't even database-aware so far. I've gotten away with storing the actual information in flat files (CSV, YAML, and so on), and only maintaining a serial number table and a serial number → document path mapping in the database. Each object retrieves the files it needs by getting the correct paths from the database (by serial number) and reconstructs itself from there. This has been exceedingly convenient so far, since my 'models' have been (and, admittedly, continue to be) more than a little fluid.
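For illustration only, the current pattern looks roughly like the sketch below; the mapping table and business class names here are hypothetical, not taken from the real code:

    import yaml                                   # assuming PyYAML for the YAML-backed files
    from sqlalchemy import Column, Integer, String
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class DocPath(Base):
        # Hypothetical mapping table: serial number -> flat file path.
        __tablename__ = 'doc_paths'
        serial_no = Column(Integer, primary_key=True)
        path = Column(String)

    def get_doc_path(serial_no, session):
        return session.query(DocPath).filter_by(serial_no=serial_no).one().path

    class Widget(object):
        # Hypothetical business-logic class; only the path lookup touches the
        # database, everything else is reconstructed from the flat file.
        def __init__(self, serial_no, session):
            with open(get_doc_path(serial_no, session)) as f:
                self._data = yaml.safe_load(f)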
As I expand the involvement of the database in the codebase I currently have, I seem to have settled on the following model, separating the database bits and the business logic into two completely separate parts, and joining them using specific function calls instead of inheritance or even composition. Here is a basic example of the kind of code I have now (pseudocode-quality):
module/db/models.py:

    class Example(Base):
        id = Column(...)
        some_var = Column(...)

module/db/controller.py:

    from sqlalchemy.orm.exc import NoResultFound

    from .models import Example

    def get_example_by_id(id, session):
        return session.query(Example).filter_by(id=id).one()

    def upsert_example(session, id=None, some_var=None):
        if id is not None:
            try:
                # Update the existing row if there is one.
                example_obj = get_example_by_id(id, session)
                example_obj.some_var = some_var
                return
            except NoResultFound:
                pass
        # Otherwise insert a new row.
        example_obj = Example(some_var=some_var)
        session.add(example_obj)
        session.flush()
module/example.py:

    from sqlalchemy.orm.exc import NoResultFound

    from .db import controller

    class Example(object):
        def __init__(self, id, session):
            self._id = id
            self._some_var = None
            try:
                self._load_from_db(session)
                self._defined = True
            except NoResultFound:
                self._defined = False

        def _load_from_db(self, session):
            db_obj = controller.get_example_by_id(self._id, session)
            self._some_var = db_obj.some_var

        def create(self, some_var, session):
            if self._defined is True:
                raise Exception
            self._some_var = some_var
            self._sync_to_db(session)

        def _sync_to_db(self, session):
            controller.upsert_example(session, id=self._id, some_var=self._some_var)

        @property
        def some_var(self):
            return self._some_var

        ...
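For context, using the wrapper class under this pattern would go roughly like this; the engine URL and table creation are assumptions, not part of the snippets above:

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    from module.example import Example

    # Engine URL and table setup are assumed to be handled elsewhere.
    engine = create_engine('sqlite:///example.db')
    session = sessionmaker(bind=engine)()

    ex = Example(5, session)                 # tries to load row 5; marks itself undefined if absent
    ex.create(some_var=3, session=session)   # raises if the row already existed
    session.commit()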
I'm not convinced this is the way to go.
I have a few models following this pattern, and many more that I should implement in time. The database is currently only used for persistence and archiving. Once something is in the database, it's more or less read only from there on in. However, querying on it is becoming important.
The reason I'm inclined to migrate from the flatfiles to the database is largely to improve scalability.
Thus far, if I wanted to find all instances (rows) of Example with some_var = 3, I'd have to construct all of the instances from the flat files and iterate through them. This seems like a waste of both processor time and memory. In many cases, some_var is actually a calculated property, computed by a fairly expensive process from source data contained in the flat file.
With the structure above, what I would do is query on Example, obtain a list of 'id's which satisfy my criterion, and then reconstruct just those module instances.
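Concretely, that would amount to something like the sketch below; the query helper is hypothetical and not part of the snippets above:

    # In module/db/controller.py: fetch only the matching primary keys.
    def get_example_ids_by_some_var(some_var, session):
        rows = session.query(Example.id).filter_by(some_var=some_var).all()
        return [row.id for row in rows]

    # In calling code: rebuild only the matching business-logic objects.
    from module.example import Example as BusinessExample

    ids = controller.get_example_ids_by_some_var(3, session)
    matches = [BusinessExample(i, session) for i in ids]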
The ORM approach, however, as I understand it, would use thick models, where the objects returned by the query are themselves the objects I would need. I'm wondering whether it makes sense to try to move to that kind of a structure.
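As a point of comparison, a thick-model version might look roughly like this, with the business logic hanging directly off the declarative class so that the query already returns the objects of interest (the table name, column types, and method body are made up for illustration):

    from sqlalchemy import Column, Integer
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Example(Base):
        __tablename__ = 'example'              # assumed table name
        id = Column(Integer, primary_key=True)
        some_var = Column(Integer)

        def recalculate(self):
            # Placeholder for the real, expensive computation.
            return self.some_var * 2

    # Given a configured session, the query hands back ready-to-use objects;
    # there is no separate reconstruction step.
    matches = session.query(Example).filter_by(some_var=3).all()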
To that end, I have the following 'questions' / thoughts:
My instinct is that the code snippets above are anti-patterns more than they are useful patterns. I can't put my finger on why, exactly, but I'm not very happy with it. Is there a real, tangible disadvantage to the structure as listed above? Would moving to a more ORM-ish design provide advantages in functionality / performance / maintainability over this approach?
I'm paranoid about tying myself down to a database schema. I'm also paranoid about regular DB migrations. The approach listed above gives me a certain peace of mind in knowing that if I do need to do some migration, it'll be limited to the _load_from_db and _sync_to_db functions, and lets me mess around willy-nilly with all the rest of the code.
If I wanted to integrate Example from module/db/models.py with Example from module/example.py in the example above, what would be the cleanest way to go about it? Alternatively, what is an accepted pattern for handling business-logic-heavy models with SQLAlchemy?
In the code above, note that the business logic class keeps all of its information in 'private' instance variables, while the Model class keeps all of its information in class variables. How would integrating these two approaches actually work? Theoretically, they should still 'just work' even if put together in a single class definition. In practice, does it?
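To make that question concrete, a merged class would presumably look something like the minimal sketch below (table name and column types are assumed). One known wrinkle is that SQLAlchemy does not call __init__ when it loads rows from the database, so instance-only state also has to be set up in a reconstructor hook:

    from sqlalchemy import Column, Integer
    from sqlalchemy.ext.declarative import declarative_base
    from sqlalchemy.orm import reconstructor

    Base = declarative_base()

    class Example(Base):
        __tablename__ = 'example'              # assumed
        id = Column(Integer, primary_key=True)
        some_var = Column(Integer)             # class-level Column attribute

        def __init__(self, some_var=None):
            # Runs only when the object is constructed directly, not on query load.
            self.some_var = some_var
            self._cache = None                 # 'private' instance state

        @reconstructor
        def _init_on_load(self):
            # Runs instead of __init__ when SQLAlchemy loads the row.
            self._cache = None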
(The actual codebase is on github, though it's not likely to be very readable)
I think it's natural (at least for me) to be critical of our own designs even as we are working on them. The structures you have here seem fine to me. The answer to whether they are a good fit depends on what you plan to do.
If you consolidate your code into thick models, then all of it will be in one place and your architecture will be simpler; however, it will also probably mean that your business logic will be tightly bound to the schema created in the database. Rethinking the database then means rethinking large portions of other areas of the app.
Following the code sample provided here means separating the concerns, which has negative side effects such as more lines of code in more places and increased complexity, but it also means that the coupling is looser. If you stay true to that separation, you should have significantly less trouble if you decide to change your database schema or move to an entirely different form of storage. Since your business logic class is a plain old object, it serves as a nice detached state container. If you move to something else, you would still have to redesign the model layer and possibly parts of the controllers, but your business logic and UI layers could remain largely unchanged.
I think the real test lies in asking how big this application is going to be and how long you plan to have it in service. If we're looking at a small application with a short life span, then the added complexity of loose coupling is a waste unless you are doing it for educational purposes. If the application is expected to grow quite large or be in service for a number of years, then the early investment in complexity should pay off in a lower cost of ownership over the long term, since making changes to the various components should be easier.
If it makes you feel any better, it's not uncommon to see POCOs and POJOs when working with ORMs such as Entity Framework and Hibernate, for the same reason.