
SQLAlchemy Joined Inheritance fast bulk deletion of Child objects

Consider the following SQLAlchemy mappings using joined inheritance:

import sqlalchemy as sa

class Location(Base):
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String)
    type_ = sa.Column(sa.String)  # discriminator column

    __tablename__ = 'location'
    __mapper_args__ = {
        'polymorphic_identity': 'location',
        'polymorphic_on': type_,
    }

class Field(Location):
    id = sa.Column(sa.Integer, primary_key=True)
    size = sa.Column(sa.Float)

    __tablename__ = 'field'
    __mapper_args__ = {
        'polymorphic_identity': 'field',
    }
    __table_args__ = (
        sa.ForeignKeyConstraint(['id'], ['location.id']),
    )


session.query(Field).filter(Field.size < 5).delete()

Here Base is an appropriate declarative base and session is an appropriate session object. The query above deletes the Field rows but leaves the parent Location rows in place (as the docs explain clearly, query.delete() does not support joined inheritance). I can get around this by calling session.delete(obj) on each object, which uses the ORM to delete up the inheritance chain. However, this executes n SQL DELETE statements on the database (where n is the number of objects to delete). I have a case where I may be deleting on the order of 100,000 child objects at a time, so this operation is horribly slow. (Assume for now that I cannot stop using the ORM with joined inheritance; I am too deep in to change this.)

Is there any construct within SQLAlchemy or a reasonable alternative which would allow me to pass a query object that queries objects of type Field and appropriately deletes the items in the Location table as well without making n SQL delete statements?

Note, I am currently using PostgreSQL, but would like to keep the solution db-agnostic.

Edit: Added table metadata and more information regarding environment as per request.

Coxy asked Nov 10 '17 06:11


1 Answer

After trying for an hour or two, I found a solution that does not need much code. Here is how I got there.

1. I checked the documentation of delete(). It contains these two sentences:

This method does not work for joined inheritance mappings, since the multiple table deletes are not supported by SQL as well as that the join condition of an inheritance mapper is not automatically rendered

and

However the above SQL will not delete from the Engineer table, unless an ON DELETE CASCADE rule is established in the database to handle it.

Short story, do not use this method for joined inheritance mappings unless you have taken the additional steps to make this feasible.

So a foreign key constraint with ON DELETE CASCADE is necessary, like this:

from sqlalchemy import Column, ForeignKey, INTEGER, VARCHAR, DECIMAL

class Location(Base):
    __tablename__ = 'location'
    id = Column(INTEGER, primary_key=True)
    name = Column(VARCHAR(30))
    type = Column(VARCHAR(30))

    __mapper_args__ = {
        'polymorphic_identity': 'location',
        'polymorphic_on'      : type,
    }

class Field(Location):
    __tablename__ = 'field'
    id = Column(INTEGER, ForeignKey('location.id', ondelete='cascade'), primary_key=True)
    size = Column(DECIMAL(20, 2))

    __mapper_args__ = {
        'polymorphic_identity': 'field',
    }

2. Now, if we delete a Location, the matching row in Field is deleted as well.

session.query(Location).filter(Location.id == 1).delete()

3. But the poster wants to delete by Field, not by Location.

session.query(Field).filter(Field.size < 5).delete()

This deletes only the row in Field and leaves the row in Location. Field is the child table that holds the foreign key, so a delete on it cannot cascade up to the parent table.

So what we need to do is delete from Location according to Field.size < 5.
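The direction of the cascade can be checked outside SQLAlchemy. The following is a minimal sketch using Python's built-in sqlite3 module; the hand-written schema and the sample rows are assumptions meant to mirror the mappings above, not something taken from the poster's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys (and cascades) when this pragma is on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE location (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
    CREATE TABLE field (
        id INTEGER PRIMARY KEY REFERENCES location(id) ON DELETE CASCADE,
        size REAL
    );
    INSERT INTO location VALUES (1, 'a', 'field'), (2, 'b', 'field');
    INSERT INTO field VALUES (1, 1.0), (2, 2.0);
""")


def count(table):
    # Table names here are fixed literals from this sketch, not user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Deleting the child row leaves its parent row behind.
conn.execute("DELETE FROM field WHERE id = 1")
after_child_delete = (count("location"), count("field"))

# Deleting the parent row cascades down to the child row.
conn.execute("DELETE FROM location WHERE id = 2")
after_parent_delete = (count("location"), count("field"))

print(after_child_delete, after_parent_delete)
```

After the first delete, both location rows are still there; only the second delete removes rows from both tables.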

I have tried

session.query(Location).filter(Field.size < 5).delete()

and

session.query(Location).outerjoin(Field, Location.id == Field.id).filter(Field.size < 5).delete()

Both of these throw an exception.

After many tries, the solution I found is this:

from sqlalchemy import delete

statement = delete(Field, prefixes=[Location.__tablename__]).where(Field.size < 5)
session.execute(statement)

The generated SQL is: DELETE location FROM location JOIN field ON location.id = field.id WHERE field.size < 5
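Note that DELETE location FROM location JOIN ... is MySQL's multi-table delete syntax, and PostgreSQL does not accept it. Since the poster asked for something db-agnostic, here is a sketch of one possible alternative: a single DELETE on the parent table, filtered by a subquery on the child table, relying on the ON DELETE CASCADE foreign key defined earlier. It assumes SQLAlchemy 1.4 or newer. The helper name delete_small_fields, the in-memory SQLite engine and the sample rows are made up for the demo.

```python
import sqlalchemy as sa
from sqlalchemy import event
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class Location(Base):
    __tablename__ = "location"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String(30))
    type = sa.Column(sa.String(30))
    __mapper_args__ = {"polymorphic_identity": "location", "polymorphic_on": type}


class Field(Location):
    __tablename__ = "field"
    id = sa.Column(
        sa.Integer,
        sa.ForeignKey("location.id", ondelete="CASCADE"),
        primary_key=True,
    )
    size = sa.Column(sa.Float)
    __mapper_args__ = {"polymorphic_identity": "field"}


def delete_small_fields(session, max_size):
    """Hypothetical helper: delete every Field below max_size, parent row included."""
    field_t, location_t = Field.__table__, Location.__table__
    small_ids = sa.select(field_t.c.id).where(field_t.c.size < max_size)
    # One DELETE on the parent table; the database cascade removes the child rows.
    session.execute(sa.delete(location_t).where(location_t.c.id.in_(small_ids)))


engine = sa.create_engine("sqlite://")  # in-memory database, for the demo only


@event.listens_for(engine, "connect")
def _enable_sqlite_fks(dbapi_connection, connection_record):
    # SQLite ignores ON DELETE CASCADE unless foreign keys are switched on.
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()


Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add_all([
        Field(name="small", size=1.0),
        Field(name="large", size=10.0),
        Location(name="plain"),
    ])
    session.commit()

    delete_small_fields(session, 5)
    session.commit()

    remaining_locations = session.execute(
        sa.select(sa.func.count()).select_from(Location.__table__)
    ).scalar_one()
    remaining_fields = session.execute(
        sa.select(sa.func.count()).select_from(Field.__table__)
    ).scalar_one()

print(remaining_locations, remaining_fields)
```

The statement bypasses the ORM, so objects already loaded in the session are not updated; expiring or discarding them after the bulk delete is left to the caller. It also depends on the database actually enforcing the cascade, which is why the SQLite pragma is needed in this demo.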

pktangyue answered Nov 04 '22 08:11