I have 6 tables in my SQLite database, each table with 6 columns(Date, user, NormalA, specialA, contact, remarks
) and 1000+ rows.
How can I use sqlalchemy to sort through the Date column to look for duplicate dates, and delete that row?
Assuming this is your model:
class MyTable(Base):
__tablename__ = 'my_table'
id = Column(Integer, primary_key=True)
date = Column(DateTime)
user = Column(String)
# do not really care of columns other than `id` and `date`
# important here is the fact that `id` is a PK
following are two ways to delete you data:
For both of them a helper sub-query will be used:
# helper subquery: find first row (by primary key) for each unique date
subq = (
session.query(MyTable.date, func.min(MyTable.id).label("min_id"))
.group_by(MyTable.date)
) .subquery('date_min_id')
Option-1: Find duplicates, mark them for deletion and commit the transaction
# query to find all duplicates
q_duplicates = (
session
.query(MyTable)
.join(subq, and_(
MyTable.date == subq.c.date,
MyTable.id != subq.c.min_id)
)
)
for x in q_duplicates:
print("Will delete %s" % x)
session.delete(x)
session.commit()
Option-2: Create a single SQL query which will perform deletion on the database directly
sq = (
session
.query(MyTable.id)
.join(subq, and_(
MyTable.date == subq.c.date,
MyTable.id != subq.c.min_id)
)
).subquery("subq")
dq = (
session
.query(MyTable)
.filter(MyTable.id.in_(sq))
).delete(synchronize_session=False)
Inspired by the Find duplicate values in SQL table this might help you to select duplicate dates:
query = session.query(
MyTable
).\
having(func.count(MyTable.date) > 1).\
group_by(MyTable.date).all()
If you only want to show unique dates; distinct on
is what you might need
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With