Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to avoid adding duplicates in a many-to-many relationship table in SQLAlchemy - python?

I am dealing with a many-to-many relationship with sqlalchemy. My question is how to avoid adding duplicate pair values in a many-to-many relational table.

To make things clearer, I will use the example from the official SQLAlchemy documentation.

Base = declarative_base()

Parents2children = Table('parents2children', Base.metadata,                                                                                                                                                                                                     
  Column('parents_id', Integer, ForeignKey('parents.id')),                                                                                                                                                                                                       
  Column('children_id', Integer, ForeignKey('children.id'))
)

class Parent(Base):
    __tablename__ = 'parents'
    id = Column(Integer, primary_key=True)
    parent_name = Column(String(45))
    child_rel = relationship("Child", secondary=Parents2children, backref= "parents_backref") 

    def __init__(self, parent_name=""):
        self.parent_name=parent_name
    def __repr__(self):
        return "<parents(id:'%i', parent_name:'%s')>" % (self.id, self.parent_name)

class Child(Base):
    __tablename__ = 'children'
    id = Column(Integer, primary_key=True)
    child_name = Column(String(45))

    def __init__(self, child_name=""):
        self.child_name= child_name
    def __repr__(self):
        return "<experiments(id:'%i', child_name:'%s')>" % (self.id, self.child_name)

###########################################

def setUp():
    global Session
    engine=create_engine('mysql://root:root@localhost/db_name?charset=utf8', pool_recycle=3600,echo=False)
    Session=sessionmaker(bind=engine)

def add_data():
    session=Session()
    name_father1=Parent(parent_name="Richard")
    name_mother1=Parent(parent_name="Kate")
    name_daughter1=Child(child_name="Helen")
    name_son1=Child(child_name="John")

    session.add(name_father1)
    session.add(name_mother1)

    name_father1.child_rel.append(name_son1)
    name_daughter1.parents_backref.append(name_father1)
    name_son1.parents_backref.append(name_father1)

    session.commit()
    session.close()


setUp()
add_data()
session.close()

With this code, the data inserted in the tables is the following:

Parents table:

+----+-------------+
| id | parent_name |
+----+-------------+
|  1 | Richard     |
|  2 | Kate        |
+----+-------------+

Children table:

+----+------------+
| id | child_name |
+----+------------+
|  1 | Helen      |
|  2 | John       |
+----+------------+

Parents2children table

+------------+-------------+
| parents_id | children_id |
+------------+-------------+
|          1 |           1 |
|          1 |           2 |
|          1 |           1 |
+------------+-------------+

As you can see, there's a duplicate in the last table... how could I prevent SQLAlchemy from adding these duplicates?

I've tried to put relationship("Child", secondary=..., collection_class=set) but this error is displayed:

AttributeError: 'InstrumentedSet' object has no attribute 'append'
like image 446
Àngel Ba Avatar asked Apr 16 '13 10:04

Àngel Ba


People also ask

How do you avoid duplicates in a table?

In the Navigation Pane, right-click the table that contains the field, and then click Design View. Select the field that you want to make sure has unique values. In the Field Properties pane at the bottom of the table design view, on the General tab, set the Indexed property to Yes (No duplicates).

How make SQLAlchemy faster?

Instead of returning entire model instances, SQLAlchemy can fetch only the columns we're interested in. This not only reduces the amount of data sent, but also avoids the need to instantiate entire objects. Working with tuples of column data instead of models can be quite a bit faster.

What is lazy dynamic SQLAlchemy?

lazy = 'dynamic': When querying with lazy = 'dynamic', however, a separate query gets generated for the related object. If you use the same query as 'select', it will return: You can see that it returns a sqlalchemy object instead of the city objects.


1 Answers

Add a PrimaryKeyConstraint (or a UniqueConstraint) to your relationship table:

Parents2children = Table('parents2children', Base.metadata,                                                                                                                                                                                                     
  Column('parents_id', Integer, ForeignKey('parents.id')),                                                                                                                                                                                                       
  Column('children_id', Integer, ForeignKey('children.id')),
  PrimaryKeyConstraint('parents_id', 'children_id'),
)

and your code will generate an error when you try to commit the relationship added from both sides. This is very recommended to do.

In order to not even generate an error, just check first:

if not(name_father1 in name_son1.parents_backref):
    name_son1.parents_backref.append(name_father1)
like image 159
van Avatar answered Oct 20 '22 12:10

van