SQLAlchemy: secondary relationship update

I have two tables, say A and B. Both have a primary key id. They have a many-to-many relationship, SEC.

SEC = Table('sec', Base.metadata,
    Column('a_id', Integer, ForeignKey('A.id'), primary_key=True, nullable=False),
    Column('b_id', Integer, ForeignKey('B.id'), primary_key=True, nullable=False)
)

class A(Base):
    ...
    id = Column(Integer, primary_key=True)
    ...
    rels = relationship('B', secondary=SEC)

class B(Base):
    ...
    id = Column(Integer, primary_key=True)
    ...

Let's consider this piece of code.

a = A()
b1 = B()
b2 = B()
a.rels = [b1, b2]
...
#some place later
b3 = B()
a.rels = [b1, b3]  # errors sometimes

Sometimes, I get an error at the last line saying

duplicate key value violates unique constraint a_b_pkey

My understanding is that it tries to insert (a.id, b1.id) into the 'sec' table again, resulting in a unique constraint violation. Is that what is happening? If so, how can I avoid it? If not, why am I getting this error?

asked Oct 08 '12 by Sri




2 Answers

The problem is that you need to make sure the instances you create are unique. We can add an alternate constructor that checks a cache of existing uncommitted instances, or queries the database for an existing committed instance, before returning a new instance.

Here is a demonstration of such a method:

from sqlalchemy import Column, Integer, String, ForeignKey, Table
from sqlalchemy.engine import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker, relationship

engine = create_engine('sqlite:///:memory:', echo=True)
Session = sessionmaker(engine)
Base = declarative_base(engine)

session = Session()


class Role(Base):
    __tablename__ = 'role'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)

    @classmethod
    def get_unique(cls, name):
        # get the session cache, creating it if necessary
        cache = session._unique_cache = getattr(session, '_unique_cache', {})
        # create a key for memoizing
        key = (cls, name)
        # check the cache first
        o = cache.get(key)
        if o is None:
            # check the database if it's not in the cache
            o = session.query(cls).filter_by(name=name).first()
            if o is None:
                # create a new one if it's not in the database
                o = cls(name=name)
                session.add(o)
            # update the cache
            cache[key] = o
        return o


Base.metadata.create_all()

# demonstrate cache check
r1 = Role.get_unique('admin')  # this is new
r2 = Role.get_unique('admin')  # from cache
session.commit()  # doesn't fail

# demonstrate database check
r1 = Role.get_unique('mod')  # this is new
session.commit()
session._unique_cache.clear()  # empty cache
r2 = Role.get_unique('mod')  # from database
session.commit()  # nop

# show final state
print(session.query(Role).all())  # two unique instances from four create calls

The get_unique method was inspired by the example from the SQLAlchemy wiki. This version is much less convoluted, favoring simplicity over flexibility. I have used it in production systems with no problems.

There are obviously improvements that can be added; this is just a simple example. The get_unique method could be moved into a UniqueMixin so it can be shared by any number of models (a rough sketch follows below). More flexible memoization of the arguments could be implemented. This also sets aside the problem of multiple threads inserting conflicting data, mentioned by Ants Aasma; handling that is more complex, but it should be an obvious extension. I leave that to you.
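A minimal sketch of what such a mixin might look like, kept self-contained with its own engine and session; the UniqueMixin name, the Tag model, and the keyword-based cache key are illustrative choices, not anything provided by SQLAlchemy itself:

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite:///:memory:')
Session = sessionmaker(engine)
Base = declarative_base()

session = Session()


class UniqueMixin(object):
    """Illustrative mixin: memoize unique instances per session by keyword."""

    @classmethod
    def get_unique(cls, session, **kwargs):
        # per-session cache, keyed by class and the identifying kwargs
        cache = session._unique_cache = getattr(session, '_unique_cache', {})
        key = (cls, tuple(sorted(kwargs.items())))
        o = cache.get(key)
        if o is None:
            # fall back to the database, then to a brand-new pending instance
            o = session.query(cls).filter_by(**kwargs).first()
            if o is None:
                o = cls(**kwargs)
                session.add(o)
            cache[key] = o
        return o


class Tag(UniqueMixin, Base):
    __tablename__ = 'tag'

    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False, unique=True)


Base.metadata.create_all(engine)

t1 = Tag.get_unique(session, name='python')
t2 = Tag.get_unique(session, name='python')
assert t1 is t2
session.commit()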

answered by davidism


The error you mention is indeed from inserting a conflicting value into the sec table. To be sure that it comes from the operation you think it does, and not from some earlier change, turn on SQL logging and check what values it is trying to insert before erroring out.
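For reference, here are two common ways to turn on that logging; the connection URL is just a placeholder:

import logging

from sqlalchemy import create_engine

# either ask the engine to echo every statement it emits...
engine = create_engine('postgresql:///mydb', echo=True)

# ...or enable the standard sqlalchemy.engine logger
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)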

When overwriting a many-to-many collection value, SQLAlchemy compares the new contents of the collection with the state in the database and issues the corresponding delete and insert statements. Unless you are poking around in SQLAlchemy internals, there should be only two ways to encounter this error.

The first is concurrent modification: process 1 fetches the value of a.rels and sees that it is empty; meanwhile process 2 also fetches a.rels, sets it to [b1, b2] and commits, flushing the (a, b1) and (a, b2) tuples; process 1 then sets a.rels to [b1, b3], still believing the previous contents to be empty, and when it tries to flush the sec tuple (a, b1) it gets a duplicate key error. The correct response in such cases is usually to retry the transaction from the top. You can use serializable transaction isolation so that this case instead raises a serialization error, which is distinct from a duplicate key error caused by a business logic bug.
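A rough sketch of that retry pattern, assuming the question's A and B models, a PostgreSQL database (as the a_b_pkey error message suggests), and psycopg2 as the driver; the replace_rels helper and the connection URL are made up for illustration:

from sqlalchemy import create_engine
from sqlalchemy.exc import DBAPIError
from sqlalchemy.orm import sessionmaker

# SERIALIZABLE isolation turns the race into a serialization failure
engine = create_engine('postgresql:///mydb', isolation_level='SERIALIZABLE')
Session = sessionmaker(engine)


def replace_rels(a_id, b_ids, retries=3):
    """Illustrative helper: rewrite a.rels inside a retried transaction."""
    for attempt in range(retries):
        session = Session()
        try:
            a = session.query(A).get(a_id)
            a.rels = session.query(B).filter(B.id.in_(b_ids)).all()
            session.commit()
            return a
        except DBAPIError as exc:
            session.rollback()
            # 40001 is PostgreSQL's serialization_failure SQLSTATE; anything
            # else is a real error and should propagate
            if getattr(exc.orig, 'pgcode', None) != '40001':
                raise
        finally:
            session.close()
    raise RuntimeError('gave up after %d serialization failures' % retries)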

The second case happens when you have managed to convince SQLAlchemy that it does not need to know the database state by setting the loading strategy of the rels attribute to noload. This can be done when defining the relationship, by adding the lazy='noload' parameter, or when querying, by calling .options(noload(A.rels)) on the query. SQLAlchemy will assume the sec table has no matching rows for objects loaded with this strategy in effect.
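For completeness, both spellings look roughly like this, reusing the A, B and session from the question; some_id, b1 and b3 are placeholders:

from sqlalchemy.orm import noload

# option 1: set the strategy when defining the relationship on A
#     rels = relationship('B', secondary=SEC, lazy='noload')

# option 2: set it per query, for a single load of A
a = session.query(A).options(noload(A.rels)).get(some_id)
a.rels = [b1, b3]  # the flush only INSERTs into sec, so duplicates are possible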

answered by Ants Aasma