Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLAlchemy: does the mapper change my objects?

I am trying to use SQLAlchemy version to store my objects in a database. I have a save(...) function for that purpose:

#!/usr/bin/env python
# encoding: utf-8

from sqlalchemy     import Column, Integer, MetaData, String, Table, create_engine
from sqlalchemy.orm import mapper, sessionmaker

class MyClass(object):
    def __init__(self, title):
        self.title = title
    def __str__(self):
        return '%s' % (self.title)

def save(object_list):
    metadata = MetaData()
    my_class_table = Table('my_class',
                            metadata,
                            Column('id',    Integer,     primary_key=True),
                            Column('title', String(255), nullable=False))

    # everything is OK here, output:
    # some title
    # another title
    # yet another title

    for obj in object_list:
        print obj

    mapper(MyClass, my_class_table)

    # on Linux / SQLAlchemy 0.6.8, this fails with
    # Traceback (most recent call last):
    #   File "./test.py", line 64, in <module>
    #     save(my_objects)
    #   File "./test.py", line 57, in save
    #     print obj
    #   File "./test.py", line 11, in __str__
    #     return '%s' % (self.title)
    #   File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 167, in __get__
    #     return self.impl.get(instance_state(instance),
    # AttributeError: 'NoneType' object has no attribute 'get'

    # on Mac OSX / SQLAlchemy 0.7.5, this fails with
    # Traceback (most recent call last):
    #   File "./test.py", line 64, in <module>
    #     save(my_objects)
    #   File "./test.py", line 57, in save
    #     print obj
    #   File "./test.py", line 11, in __str__
    #     return '%s' % (self.title)
    #   File "/Library/Python/2.7/site-packages/sqlalchemy/orm/attributes.py", line 165, in __get__
    #     if self._supports_population and self.key in dict_:
    #   File "/Library/Python/2.7/site-packages/sqlalchemy/orm/attributes.py", line 139, in __getattr__
    #     key)
    # AttributeError: Neither 'InstrumentedAttribute' object nor 'Comparator' object has an attribute '_supports_population'

    for obj in object_list:
        print obj

    # (more code to set up engine and session...)


if __name__ == '__main__':
    my_objects = [MyClass('some title'), MyClass('another title'), MyClass('yet another title')]
    save(my_objects)

It seems to me the mapper does something with my objects implicitly when I create the mapper and I can't use them anymore. From what I read in similar questions, this is not entirely unknown.

Is this behavior expected ?
Am I getting something basic wrong here ?
What is the proper way to map and store my objects ?


Additional info: I am using SQLAlchemy 0.7.5 with system-default Python 2.7.1 on Mac OSX 10.7.3 Lion and SQLAlchemy 0.6.8 with system-default Python 2.7.2+ on a Kubuntu 11.10 virtual machine.


Update: It seems the SQLAlchemy mapper is well known to change objects to 'SQLAlchemy needs'. The solution in the linked article is to create objects after mapper(...) was called.
I already have valid objects though - but I can't use them anymore...

How do I get SQLAlchemy to store my objects ?

Update 2: I am getting the impression I am misunderstanding something fundamental about ORMs:
I thought the SQLAlchemy mapper concept gives me a way to define my objects and work with them in my application decoupled from anything database - and only once I want to persist them, I bring in SQLAlchemy and have it do all the heavy work of mapping a class object to a database table. Also, I can see no architectural reason why SQLAlchemy should even be mentioned anywhere else in my code than in the persistency functions save(...) and load(...).

However, from looking at the first response below, I understand I must map class to database table before I make any use of any of my objects; this would be at the very beginning of my program. Maybe I'm getting something wrong here, but this seems a pretty drastic design restriction from the part of SQLAlchemy - all the loose coupling is gone. With that in mind, I also fail to see the benefit of having a mapper in the first place as the 'late coupling' I want does not seem technically possible with SQLAlchemy and I might as well just do this 'declarative style' and mix and mingle business logic with persistency code :-(

Update 3: I went back to square one and read up on SQLAlchemy again. It says right on the front page:

SQLAlchemy is most famous for its object-relational mapper (ORM), an optional component that provides the data mapper pattern, where classes can be mapped to the database in open ended, multiple ways - allowing the object model and database schema to develop in a cleanly decoupled way from the beginning.

However, based on my experience thus far, SQLAlchemy does not seem to deliver on that promise at all as it forces me to couple object model and database schema from the very beginning. After all I've heard about SQLAlchemy, I really have a hard time believing that this truly is the case and I would rather assume I have not yet understood something basic.

What am I doing wrong ?

Update 4:

For completeness sake, I would like to add the working sample code that does all the mapping before creating any business objects:

#!/usr/bin/env python
# encoding: utf-8

from sqlalchemy     import Column, Integer, MetaData, String, Table, create_engine
from sqlalchemy.orm import mapper, sessionmaker

class MyClass(object):
    def __init__(self, title):
        self.title = title
    def __str__(self):
        return '%s' % (self.title)

def save(object_list, metadata):

    engine  = create_engine('sqlite:///:memory:')
    metadata.create_all(engine)
    Session = sessionmaker(bind=engine)
    session = Session()

    for obj in object_list:
        try:
            session.add(obj)
            session.commit()
        except Exception as e:
            print e
            session.rollback()

    # query objects to test they're in there
    all_objs = session.query(MyClass).all()
    for obj in all_objs:
        print 'object id   :', obj.id
        print 'object title:', obj.title


if __name__ == '__main__':

    metadata = MetaData()
    my_class_table = Table('my_class',
                            metadata,
                            Column('id',    Integer,     primary_key=True),
                            Column('title', String(255), nullable=False))
    mapper(MyClass, my_class_table)

    my_objects = [MyClass('some title'), MyClass('another title'), MyClass('yet another title')]
    save(my_objects, metadata)

In my own (non-sample) code, I import MyClass from its own module and do the mapping in a separate 'object repository' class which creates MetaData in its __init__(...) method and stores it in a member, so the save(...) and load(...) persistency methods can access it as required. All the main application needs to do is create the repository object at the very beginning; this contains the SQLAlchemy impact on the design to one class, but also distributes business object and mapped table definitions to separate locations in the code. Not sure yet if I'll go with that long term, but it seems to work for the moment.

Final quick hint for SQLAlchemy noobs like myself: You must be working with one and the same metadata object throughout, otherwise you'll get exceptions like no such table or class not mapped.

like image 989
ssc Avatar asked Mar 02 '12 14:03

ssc


1 Answers

SQLAlchemy absolutely does modify the mapped class. SQLAlchemy calls this instrumentation.

Observe:

>>> print(MyClass.__dict__)
{'__module__': '__main__', '__str__': <function __str__ at 0x106d50398>, '__dict__':
<attribute '__dict__' of 'MyClass' objects>, '__weakref__': <attribute '__weakref__' of 
'MyClass' objects>, '__doc__': None, '__init__': <function __init__ at 0x106d4b848>}

>>> mapper(MyClass, my_class_table)
Mapper|MyClass|my_class
>>> print(MyClass.__dict__)
{'__module__': '__main__', '_sa_class_manager': <ClassManager of <class 
'__main__.MyClass'> at 7febea6a7f40>, 'title': 
<sqlalchemy.orm.attributes.InstrumentedAttribute object at 0x10fba4f90>, '__str__': 
<function __str__ at 0x10fbab398>, 'id': <sqlalchemy.orm.attributes.InstrumentedAttribute 
object at 0x10fba4e90>, '__dict__': <attribute '__dict__' of 'MyClass' objects>, 
'__weakref__': <attribute '__weakref__' of 'MyClass' objects>, '__doc__': None, '__init__':
 <function __init__ at 0x10fbab410>}

Likewise there is a difference between plain MyClass instances and instrumented MyClass instances:

>>> prem = MyClass('premapper')
>>> mapper(MyClass, my_class_table)
Mapper|MyClass|my_class
>>> postm = MyClass('postmapper')
>>> print prem
{'title': 'premapper'}
>>> print postm
{'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x10e5b6d50>, 'title': 'postmapper'}

Your NoneType error is because the now-instrumented MyClass.title (which was replaced with an InstrumentedAttribute descriptor) is attempting to get the _sa_instance_state property via instance_state(instance). _sa_instance_state is created by the instrumented MyClass on object creation, but not by the uninstrumented MyClass.

Instrumentation is essential. It is done so that attribute access, assignment and other important object state changes can be communicated to the mapper using descriptors. (E.g., how would deferred column loading or even collection access be possible without modifying the class to monitor attribute accesses in the object?) MyClass objects are not bound to the table, but they do need to be bound to the mapper. You can change the mapping options and the table without changing your domain objects' code, but not necessarily your domain objects themselves. (After all, you can connect a single object to multiple mappers and multiple tables.)

Think of instrumentation as transparently adding a "Subject" implementation to a mapped class for the "Observer" mapper in a subject-observer pattern. Obviously you cannot implement a subject-observer pattern on an arbitrary object--you need to conform the observed object to the Subject interface.

I suppose it may be possible to instrument an instance in-place by creating a _sa_instance_state object for it (I don't know how--I would have to read the mapper() code), but I don't see why this is necessary. If there is a possibility that your object will be persisted by SQLAlchemy then simply define the mapping and the table for that persistence before you create any instances of the object. You don't need to create an engine or session or have any database at all in order to define the mapping.

The only way you can even get away with using completely un-instrumented application objects is if your use case is extremely trivial, something like the equivalent of a pickle or unpickle of dicts. E.g., without instrumentation you can never load or save related objects in collection attributes. Do you really never intend to do this? If this is so, perhaps you will be happier with not using sqlalchemy.orm at all and just using the underlying Expression API to persist your instances?

like image 51
Francis Avila Avatar answered Oct 03 '22 21:10

Francis Avila