Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQLalchemy setting constraints on relationships in many-to-many

Suppose I have a set of users and each user has access to a collection of tools. The same tool might have many users with access so this is a many-to-many relationship:

class User(db.Model):
    __tablename__ = 'user'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=True)
    tools = db.relationship("Tool", secondary=user_tool_assoc_table,
                            back_populates='users')

class Tool(db.Model):
    __tablename__ = 'tool'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=False)

user_tool_assoc_table = db.Table('user_tool', db.Model.metadata,
    db.Column('user', db.Integer, db.ForeignKey('user.id')),
    db.Column('tool', db.Integer, db.ForeignKey('tool.id')))

Observe that user names are unique, but tool names are not. So User.name:Mike1 and User.name:Mike2 might have access to Tool.name:Hammer, and, separately, User.name:John1 and User.name:John2 might have access to Tool.name:Hammer by the same name but each with different Tool.ids.

I want to make a constraint that within the User.tools collection there can never be a tool with the same name as another, i.e.

  • A user cannot create a new Tool as part of his collection if one with that name already exists. Mike1 cannot create a new tool called Hammer that forms part of his tools collection.
  • A Tool that exists in the database cannot be appended to the tools collection of a user if one with the same name already exists in the set, i.e. John1's Hammer cannot be shared with Mike1 since Mike1 already has his own Hammer.
  • James, however, can create a new Hammer since he does not already have a hammer. There will then be 3 tools in the database called Hammer each with a distinct set of Users.
  • Note in my specific case a Tool will only exist if it has at least one User, but I also don't know how to ensure this natively in my database.

Is this possible natively with SQLalchemy to automatically configure my database to maintain integrity? I don't want to write my own validator rules since I will likely miss something and end up with a database which breaks my rules.

like image 798
Attack68 Avatar asked Jan 01 '23 19:01

Attack68


2 Answers

The problem is how to express the predicate "A user identified by ID has only one tool with the name NAME". This would of course be easy to express with a simple table such as:

db.Table('user_toolname',
         db.Column('user', db.Integer, db.ForeignKey('user.id'), primary_key=True),
         db.Column('toolname', db.String, primary_key=True))

It is also very clear that this alone is not nearly enough to uphold integrity, as there is no connection between the fact about user's toolnames and the actual tools. Your database could state that a user both has a hammer and doesn't have a hammer.

It would be nice to enforce this in your user_tool_assoc_table or something equivalent, but since Tool.name is not a part of the primary key of Tool, you cannot reference it. On the other hand since you do want to allow multiple tools with the same name to co-exist, the subset { id, name } is in fact the proper key for Tool:

class Tool(db.Model):
    __tablename__ = 'tool'
    id = db.Column(db.Integer, primary_key=True, autoincrement=True)
    name = db.Column(db.String, primary_key=True)

The id now acts as a "discriminator" of sorts between the tools having the same name. Note that id need not be globally unique in this model, but locally to name. It's handy to have it auto increment still, but the default setting of autoincrement='auto' only treats a single-column integer primary key as having auto incrementing behavior by default, so it must be explicitly set.

It is now possible to define user_tool_assoc_table in terms of tool_name as well, with the additional constraint that a user can only have a single tool with a given name:

user_tool_assoc_table = db.Table(
    'user_tool',
    db.Column('user', db.Integer, db.ForeignKey('user.id')),
    db.Column('tool', db.Integer),
    db.Column('name', db.String),
    db.ForeignKeyConstraint(['tool', 'name'],
                            ['tool.id', 'tool.name']),
    db.UniqueConstraint('user', 'name'))

With this model and the following setup:

john = User(name='John')
mark = User(name='Mark')
db.session.add_all([john, mark])
hammer1 = Tool(name='Hammer')
hammer2 = Tool(name='Hammer')
db.session.add_all([hammer1, hammer2])
db.session.commit()

This will succeed:

john.tools.append(hammer1)
hammer2.users.append(mark)
db.session.commit()

And this will fail after the above, since it violates the unique constraint:

john.tools.append(hammer2)
db.session.commit()
like image 155
Ilja Everilä Avatar answered Jan 10 '23 08:01

Ilja Everilä


If you want to model the domain by allowing tool names to be non-unique, then there is no easy way to accomplish this.

You can try adding a validator to the User model which will check the User.tools list during every append and make sure that it obeys a certain condition

from sqlalchemy.orm import validates
class User(db.Model):
  __tablename__ = 'user'
  id = db.Column(db.Integer, primary_key=True)
  name = db.Column(db.String, unique=True)
  tools = db.relationship("Tool", secondary=user_tool_assoc_table,
                        back_populates='users')

  @validates('tools')
  def validate_tool(self, key, tool):
    assert tool.name not in [t.name for t in self.tools]
    return tool

  def __repr__(self):
    return self.name

The above approach will make sure that if you add a new tool which has the same name as an existing tools in user.tools list it will throw an exception. But the problem is that you can still directly assign a new list with duplicate tools directly like this

mike.tools = [hammer1, hammer2, knife1]

This will work because validates works only during append operation. Not during assignment. If we want a solution that works even during assignment, then we will have to figure out a solution where user_id and tool_name will be in the same table.

We can do this by making the secondary association table have 3 columns user_id, tool_id and tool_name. We can then make tool_id and tool_name to behave as a Composite Foreign Key together (Refer https://docs.sqlalchemy.org/en/latest/core/constraints.html#defining-foreign-keys)

By this approach, the association table will have a standard foreign key to user_id and then a composite foreign key constraint which combines tool_id and tool_name. Now that both keys are there in the association table, we can then proceed to define an UniqueConstraint on the table which will make sure that user_id and tool_name will have to be an unique combination

Here is the code

from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
from sqlalchemy.orm import validates
from sqlalchemy.schema import ForeignKeyConstraint, UniqueConstraint

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///:memory:'
db = SQLAlchemy(app)

user_tool_assoc_table = db.Table('user_tool', db.Model.metadata,
    db.Column('user_id', db.Integer, db.ForeignKey('user.id')),
    db.Column('tool_id', db.Integer),
    db.Column('tool_name', db.Integer),
    ForeignKeyConstraint(['tool_id', 'tool_name'], ['tool.id', 'tool.name']),
    UniqueConstraint('user_id', 'tool_name', name='unique_user_toolname')
)

class User(db.Model):
    __tablename__ = 'user'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=True)
    tools = db.relationship("Tool", secondary=user_tool_assoc_table,
                            back_populates='users')


    def __repr__(self):
        return self.name


class Tool(db.Model):
    __tablename__ = 'tool'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=False)
    users = db.relationship("User", secondary=user_tool_assoc_table,
                            back_populates='tools')

    def __repr__(self):
        return "{0} - ID: {1}".format(self.name, self.id)

db.create_all()

mike=User(name="Mike")
pete=User(name="Pete")
bob=User(name="Bob")

db.session.add_all([mike, pete, bob])
db.session.commit()

hammer1 = Tool(name="hammer")
hammer2 = Tool(name="hammer")

knife1 = Tool(name="knife")
knife2 = Tool(name="knife")

db.session.add_all([hammer1, hammer2, knife1, knife2])
db.session.commit()

Now let's try playing around

In [2]: users = db.session.query(User).all()

In [3]: tools = db.session.query(Tool).all()

In [4]: users
Out[4]: [Mike, Pete, Bob]

In [5]: tools
Out[5]: [hammer - ID: 1, hammer - ID: 2, knife - ID: 3, knife - ID: 4]

In [6]: users[0].tools = [tools[0], tools[2]]

In [7]: db.session.commit()

In [9]: users[0].tools.append(tools[1])

In [10]: db.session.commit()
---------------------------------------------------------------------------
IntegrityError                            Traceback (most recent call last)
<ipython-input-10-a8e4ec8c4c52> in <module>()
----> 1 db.session.commit()

/home/surya/Envs/inkmonk/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.pyc in do(self, *args, **kwargs)
    151 def instrument(name):
    152     def do(self, *args, **kwargs):
--> 153         return getattr(self.registry(), name)(*args, **kwargs)
    154     return do

So appending a tool of the same name throws exception.

Now let's try assigning a list with duplicate tool names

In [14]: tools
Out[14]: [hammer - ID: 1, hammer - ID: 2, knife - ID: 3, knife - ID: 4]

In [15]: users[0].tools = [tools[0], tools[1]]

In [16]: db.session.commit()
---------------------------------------------------------------------------
IntegrityError                            Traceback (most recent call last)
<ipython-input-16-a8e4ec8c4c52> in <module>()
----> 1 db.session.commit()

/home/surya/Envs/inkmonk/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.pyc in do(self, *args, **kwargs)
    151 def instrument(name):
    152     def do(self, *args, **kwargs):
--> 153         return getattr(self.registry(), name)(*args, **kwargs)
    154     return do

This throws an exception as well. So we have made sure at db level that your requirement is solved.

But in my opinion, taking such a convoluted approach usually indicates that we are needlessly complicating the design. If you are ok with changing the table design, please consider the following suggestion for a simpler approach.

In my opinion, it is better to have a set of unique tools and a set of unique users and then model a M2M relationship between them. Any property which is specific to Mike's hammer, but not present in James' hammer should be a property of that association between them.

If you take that approach, you have a set of users like this

Mike, James, John, George

and a set of tools like this

Hammer, Screwdriver, Wedge, Scissors, Knife

And you can still model a many to many relation between them. In this scenario, the only change you have to do is to set unique=True on the Tool.name column, so that there is only one hammer globally which can have that name.

If you need Mike's hammer to have some unique properties distinct from James's Hammer, then you can just add some extra columns in the association table. To access user.tools and tool.users, you can use an association_proxy.

from sqlalchemy.ext.associationproxy import association_proxy

class User(db.Model):
    __tablename__ = 'user'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=True)
    associated_tools = db.relationship("UserToolAssociation")

    tools = association_proxy("associated_tools", "tool")

class Tool(db.Model):
    __tablename__ = 'tool'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, unique=True)
    associated_users = db.relationship("UserToolAssociation")

    users = association_proxy("associated_users", "user")



class UserToolAssociation(db.Model):
    __tablename__ = 'user_tool_association'

    id = db.Column(db.Integer, primary_key=True)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
    tool_id = db.Column(db.Integer, db.ForeignKey('tool.id'))
    property1_specific_to_this_user_tool = db.Column(db.String(20))
    property2_specific_to_this_user_tool = db.Column(db.String(20))

    user = db.relationship("User")
    tool = db.relationship("Tool")

The above approach is better because of the proper separation of concerns. In future when you need to do something that will affect all hammers, you can just modify the hammer instance in the Tools table. If you keep all hammers as separate instances without any link between them, it will become cumbersome to do any modification on them as a whole in the future.

like image 42
suryasankar Avatar answered Jan 10 '23 06:01

suryasankar