Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using SQLAlchemy's on_conflict_do_update() with unique constraints

I'm using SQLAlchemy 1.3.4 and PostgreSQL 11.3.

I have the following (simplified) table definition:

class MyModel(Base):
    __tablename__ = 'mymodel'

    id = Column(Integer, primary_key=True)
    col1 = Column(Unicode, nullable=False)
    col2 = Column(Unicode, nullable=False)
    col3 = Column(Unicode, nullable=False)
    col4 = Column(Boolean)

    created_at = Column(DateTime(timezone=True), nullable=False)
    updated_at = Column(DateTime(timezone=True), nullable=False)

    __table_args__ = (
        Index('uq_mymodel_col1_col2_col3_col4',
              col1, col2, col3, col4,
              unique=True, postgresql_where=col4.isnot(None)),
        Index('uq_mymodel_col1_col2_col3',
              col1, col2, col3,
              unique=True, postgresql_where=col4.is_(None)),
    )

(I had to create 2 unique index rather than a UniqueConstraint because a UniqueConstraint would allow multiple rows with the same (col1, col2, col3) is col4 is null, which I do not want)

I'm trying to do the following query:

INSERT INTO mymodel (col1, col2, col3, col4, created_at, updated_at)
VALUES (%(col1)s, %(col2)s, %(col3)s, %(col4)s, %(created_at)s, %(updated_at)s)
ON CONFLICT DO UPDATE SET updated_at = %(param_1)s
RETURNING mymodel.id

I can't figure out how to properly use SQLAlchemy's on_conflict_do_update() though. :-/

Here is what I tried:

values = {…}

stmt = insert(MyModel.__table__).values(**values)
stmt = stmt.returning(MyModel.__table__.c.id)
stmt = stmt.on_conflict_do_update(set_={'updated_at': values['updated_at']})
result = dbsession.connection().execute(stmt)

However SQLAlchemy complains: Either constraint or index_elements, but not both, must be specified unless DO NOTHING

I find it very unclear how to use constraint or index_elements.

I tried a few things, to no avail. For example:

values = {…}

stmt = insert(MyModel.__table__).values(**values)
stmt = stmt.returning(MyModel.__table__.c.id)
stmt = stmt.on_conflict_do_update(constraint='uq_mymodel_col1_col2_col3_col4'
                                  set_={'updated_at': values['updated_at']})
result = dbsession.connection().execute(stmt)

But then this doesn't work either: constraint "uq_mymodel_col1_col2_col3_col4" for table "mymodel" does not exist. But it does exist. (I even copy-pasted from pgsql to make sure I hadn't made a typo)

In any case, I have two unique constraints which can raise a conflict, but on_conflict_do_update() seems to only take one. So I also tried specifying both like this:

values = {…}

stmt = insert(MyModel.__table__).values(**values)
stmt = stmt.returning(MyModel.__table__.c.id)
stmt = stmt.on_conflict_do_update(constraint='uq_mymodel_col1_col2_col3_col4'
                                  set_={'updated_at': values['updated_at']})
stmt = stmt.on_conflict_do_update(constraint='uq_mymodel_col1_col2_col3'
                                  set_={'updated_at': values['updated_at']})
result = dbsession.connection().execute(stmt)

But I get the same error, that the uq_mymodel_col1_col2_col3_col4 does not exist.

At this point I just can't figure out how to do the above query, and would really appreciate some help.

like image 939
Mathieu Bridon Avatar asked Oct 17 '25 22:10

Mathieu Bridon


1 Answers

Ok, I think I figured it out. So the problem didn't come from SQLAlchemy after all, I was actually misusing PostgreSQL.

First, the SQL query I pasted above didn't work, because like SQLAlchemy, PostgreSQL requires specifying either the index columns or a constraint name.

And when I specified one of my constraints, PostgreSQL gave me the same error as SQLAlchemy. And that's because my constraints weren't actually constraints, but unique indexes. It seems it really has to be a unique constraint, not a unique index. (even though that index would have the same effect as a unique constraint)

So I rewrote the model as follows:

# Feel free to use the following code under the MIT license


class NullableBoolean(TypeDecorator):
    """A three-states boolean, which allows working with UNIQUE constraints

    In PostgreSQL, when making a composite UNIQUE constraint where one of the
    columns is a nullable boolean, then null values for that column are counted
    as always different.

    So if you have:

        class MyModel(Base):
            __tablename__ = 'mymodel'

            id = Column(Integer, primary_key=True)
            col1 = Column(Unicode, nullable=False)
            col2 = Column(Unicode, nullable=False)
            col3 = Column(Boolean)

            __table_args__ = (
                UniqueConstraint(col1, col2, col3,
                                 name='uq_mymodel_col1_col2_col3'),
            }

    Then you could INSERT multiple records which have the same (col1, col2)
    when col3 is None.

    If you want None to be considered a "proper" value that triggers the
    unicity constraint, then use this type instead of a nullable Boolean.
    """
    impl = Enum

    def __init__(self, **kwargs):
        kwargs['name'] = 'nullable_boolean_enum'

        super().__init__('true', 'false', 'unknown', **kwargs)

    def process_bind_param(self, value, dialect):
        """Convert the Python values into the SQL ones"""
        return {
            True: 'true',
            False: 'false',
            None: 'unknown',
        }[value]

    def process_result_value(self, value, dialect):
        """Convert the SQL values into the Python ones"""
        return {
            'true': True,
            'false': False,
            'unknown': None,
        }[value]


class MyModel(Base):
    __tablename__ = 'mymodel'

    id = Column(Integer, primary_key=True)
    col1 = Column(Unicode, nullable=False)
    col2 = Column(Unicode, nullable=False)
    col3 = Column(Unicode, nullable=False)
    col4 = Column(Boolean)

    created_at = Column(DateTime(timezone=True), nullable=False)
    updated_at = Column(DateTime(timezone=True), nullable=False)

    __table_args__ = (
        UniqueConstraint(col1, col2, col3, col4,
                         name='uq_mymodel_col1_col2_col3_col4')
    )

And now it seems to be working as expected.

Hope that helps someone in the future. If anybody has a better idea though, I'm interested. :)

like image 175
Mathieu Bridon Avatar answered Oct 21 '25 07:10

Mathieu Bridon