Suppose I have a set of users and each user has access to a collection of tools. The same tool might have many users with access so this is a many-to-many relationship:
class User(db.Model):
__tablename__ = 'user'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=True)
tools = db.relationship("Tool", secondary=user_tool_assoc_table,
back_populates='users')
class Tool(db.Model):
__tablename__ = 'tool'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=False)
user_tool_assoc_table = db.Table('user_tool', db.Model.metadata,
db.Column('user', db.Integer, db.ForeignKey('user.id')),
db.Column('tool', db.Integer, db.ForeignKey('tool.id')))
Observe that user names are unique, but tool names are not. So User.name:Mike1
and User.name:Mike2
might have access to Tool.name:Hammer
, and, separately, User.name:John1
and User.name:John2
might have access to Tool.name:Hammer
by the same name but each with different Tool.ids
.
I want to make a constraint that within the User.tools
collection there can never be a tool with the same name as another, i.e.
Tool
as part of his collection if one with that name already exists. Mike1
cannot create a new tool called Hammer
that forms part of his tools
collection.Tool
that exists in the database cannot be appended to the tools
collection of a user if one with the same name already exists in the set, i.e. John1's Hammer
cannot be shared with Mike1 since Mike1 already has his own Hammer
.James
, however, can create a new Hammer
since he does not already have a hammer. There will then be 3 tools in the database called Hammer
each with a distinct set of Users
.Tool
will only exist if it has at least one User
, but I also don't know how to ensure this natively in my database.Is this possible natively with SQLalchemy to automatically configure my database to maintain integrity? I don't want to write my own validator rules since I will likely miss something and end up with a database which breaks my rules.
The problem is how to express the predicate "A user identified by ID has only one tool with the name NAME". This would of course be easy to express with a simple table such as:
db.Table('user_toolname',
db.Column('user', db.Integer, db.ForeignKey('user.id'), primary_key=True),
db.Column('toolname', db.String, primary_key=True))
It is also very clear that this alone is not nearly enough to uphold integrity, as there is no connection between the fact about user's toolnames and the actual tools. Your database could state that a user both has a hammer and doesn't have a hammer.
It would be nice to enforce this in your user_tool_assoc_table
or something equivalent, but since Tool.name
is not a part of the primary key of Tool
, you cannot reference it. On the other hand since you do want to allow multiple tools with the same name to co-exist, the subset { id, name } is in fact the proper key for Tool
:
class Tool(db.Model):
__tablename__ = 'tool'
id = db.Column(db.Integer, primary_key=True, autoincrement=True)
name = db.Column(db.String, primary_key=True)
The id
now acts as a "discriminator" of sorts between the tools having the same name. Note that id
need not be globally unique in this model, but locally to name
. It's handy to have it auto increment still, but the default setting of autoincrement='auto'
only treats a single-column integer primary key as having auto incrementing behavior by default, so it must be explicitly set.
It is now possible to define user_tool_assoc_table
in terms of tool_name
as well, with the additional constraint that a user can only have a single tool with a given name:
user_tool_assoc_table = db.Table(
'user_tool',
db.Column('user', db.Integer, db.ForeignKey('user.id')),
db.Column('tool', db.Integer),
db.Column('name', db.String),
db.ForeignKeyConstraint(['tool', 'name'],
['tool.id', 'tool.name']),
db.UniqueConstraint('user', 'name'))
With this model and the following setup:
john = User(name='John')
mark = User(name='Mark')
db.session.add_all([john, mark])
hammer1 = Tool(name='Hammer')
hammer2 = Tool(name='Hammer')
db.session.add_all([hammer1, hammer2])
db.session.commit()
This will succeed:
john.tools.append(hammer1)
hammer2.users.append(mark)
db.session.commit()
And this will fail after the above, since it violates the unique constraint:
john.tools.append(hammer2)
db.session.commit()
If you want to model the domain by allowing tool names to be non-unique, then there is no easy way to accomplish this.
You can try adding a validator to the User model which will check the User.tools
list during every append and make sure that it obeys a certain condition
from sqlalchemy.orm import validates
class User(db.Model):
__tablename__ = 'user'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=True)
tools = db.relationship("Tool", secondary=user_tool_assoc_table,
back_populates='users')
@validates('tools')
def validate_tool(self, key, tool):
assert tool.name not in [t.name for t in self.tools]
return tool
def __repr__(self):
return self.name
The above approach will make sure that if you add a new tool which has the same name as an existing tools in user.tools
list it will throw an exception. But the problem is that you can still directly assign a new list with duplicate tools directly like this
mike.tools = [hammer1, hammer2, knife1]
This will work because validates
works only during append operation. Not during assignment. If we want a solution that works even during assignment, then we will have to figure out a solution where user_id
and tool_name
will be in the same table.
We can do this by making the secondary association table have 3 columns user_id
, tool_id
and tool_name
. We can then make tool_id
and tool_name
to behave as a Composite Foreign Key
together (Refer https://docs.sqlalchemy.org/en/latest/core/constraints.html#defining-foreign-keys)
By this approach, the association table will have a standard foreign key to user_id
and then a composite foreign key constraint which combines tool_id
and tool_name
. Now that both keys are there in the association table, we can then proceed to define an UniqueConstraint
on the table which will make sure that user_id
and tool_name
will have to be an unique combination
Here is the code
from flask import Flask
from flask.ext.sqlalchemy import SQLAlchemy
from sqlalchemy.orm import validates
from sqlalchemy.schema import ForeignKeyConstraint, UniqueConstraint
app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///:memory:'
db = SQLAlchemy(app)
user_tool_assoc_table = db.Table('user_tool', db.Model.metadata,
db.Column('user_id', db.Integer, db.ForeignKey('user.id')),
db.Column('tool_id', db.Integer),
db.Column('tool_name', db.Integer),
ForeignKeyConstraint(['tool_id', 'tool_name'], ['tool.id', 'tool.name']),
UniqueConstraint('user_id', 'tool_name', name='unique_user_toolname')
)
class User(db.Model):
__tablename__ = 'user'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=True)
tools = db.relationship("Tool", secondary=user_tool_assoc_table,
back_populates='users')
def __repr__(self):
return self.name
class Tool(db.Model):
__tablename__ = 'tool'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=False)
users = db.relationship("User", secondary=user_tool_assoc_table,
back_populates='tools')
def __repr__(self):
return "{0} - ID: {1}".format(self.name, self.id)
db.create_all()
mike=User(name="Mike")
pete=User(name="Pete")
bob=User(name="Bob")
db.session.add_all([mike, pete, bob])
db.session.commit()
hammer1 = Tool(name="hammer")
hammer2 = Tool(name="hammer")
knife1 = Tool(name="knife")
knife2 = Tool(name="knife")
db.session.add_all([hammer1, hammer2, knife1, knife2])
db.session.commit()
Now let's try playing around
In [2]: users = db.session.query(User).all()
In [3]: tools = db.session.query(Tool).all()
In [4]: users
Out[4]: [Mike, Pete, Bob]
In [5]: tools
Out[5]: [hammer - ID: 1, hammer - ID: 2, knife - ID: 3, knife - ID: 4]
In [6]: users[0].tools = [tools[0], tools[2]]
In [7]: db.session.commit()
In [9]: users[0].tools.append(tools[1])
In [10]: db.session.commit()
---------------------------------------------------------------------------
IntegrityError Traceback (most recent call last)
<ipython-input-10-a8e4ec8c4c52> in <module>()
----> 1 db.session.commit()
/home/surya/Envs/inkmonk/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.pyc in do(self, *args, **kwargs)
151 def instrument(name):
152 def do(self, *args, **kwargs):
--> 153 return getattr(self.registry(), name)(*args, **kwargs)
154 return do
So appending a tool of the same name throws exception.
Now let's try assigning a list with duplicate tool names
In [14]: tools
Out[14]: [hammer - ID: 1, hammer - ID: 2, knife - ID: 3, knife - ID: 4]
In [15]: users[0].tools = [tools[0], tools[1]]
In [16]: db.session.commit()
---------------------------------------------------------------------------
IntegrityError Traceback (most recent call last)
<ipython-input-16-a8e4ec8c4c52> in <module>()
----> 1 db.session.commit()
/home/surya/Envs/inkmonk/local/lib/python2.7/site-packages/sqlalchemy/orm/scoping.pyc in do(self, *args, **kwargs)
151 def instrument(name):
152 def do(self, *args, **kwargs):
--> 153 return getattr(self.registry(), name)(*args, **kwargs)
154 return do
This throws an exception as well. So we have made sure at db level that your requirement is solved.
But in my opinion, taking such a convoluted approach usually indicates that we are needlessly complicating the design. If you are ok with changing the table design, please consider the following suggestion for a simpler approach.
In my opinion, it is better to have a set of unique tools and a set of unique users and then model a M2M relationship between them. Any property which is specific to Mike's hammer, but not present in James' hammer should be a property of that association between them.
If you take that approach, you have a set of users like this
Mike, James, John, George
and a set of tools like this
Hammer, Screwdriver, Wedge, Scissors, Knife
And you can still model a many to many relation between them. In this scenario, the only change you have to do is to set unique=True
on the Tool.name
column, so that there is only one hammer globally which can have that name.
If you need Mike's hammer to have some unique properties distinct from James's Hammer, then you can just add some extra columns in the association table. To access user.tools and tool.users, you can use an association_proxy.
from sqlalchemy.ext.associationproxy import association_proxy
class User(db.Model):
__tablename__ = 'user'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=True)
associated_tools = db.relationship("UserToolAssociation")
tools = association_proxy("associated_tools", "tool")
class Tool(db.Model):
__tablename__ = 'tool'
id = db.Column(db.Integer, primary_key=True)
name = db.Column(db.String, unique=True)
associated_users = db.relationship("UserToolAssociation")
users = association_proxy("associated_users", "user")
class UserToolAssociation(db.Model):
__tablename__ = 'user_tool_association'
id = db.Column(db.Integer, primary_key=True)
user_id = db.Column(db.Integer, db.ForeignKey('user.id'))
tool_id = db.Column(db.Integer, db.ForeignKey('tool.id'))
property1_specific_to_this_user_tool = db.Column(db.String(20))
property2_specific_to_this_user_tool = db.Column(db.String(20))
user = db.relationship("User")
tool = db.relationship("Tool")
The above approach is better because of the proper separation of concerns. In future when you need to do something that will affect all hammers, you can just modify the hammer instance in the Tools table. If you keep all hammers as separate instances without any link between them, it will become cumbersome to do any modification on them as a whole in the future.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With