I have a huge table (36 million rows) in SQLite3. This table has two columns:
- hash (text)
- d (real)
Some of the rows are duplicates. That is, both hash and d have the same values. If two hashes are identical, then so are the values of d. However, two identical values of d do not imply two identical hashes.
I want to delete the duplicate rows. I don't have a primary key column.
What's the fastest way to do this?
You need a way to distinguish the rows. Based on your comment, you could use the special rowid column for that.
To delete duplicates by keeping the lowest rowid per (hash, d):

    delete from YourTable
    where rowid not in (
        select min(rowid)
        from YourTable
        group by hash, d
    )
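To try the statement before running it against 36 million rows, here is a minimal sketch using Python's built-in sqlite3 module on an in-memory database. The table name YourTable is the placeholder from the answer above, and the sample rows are made up for illustration. It assumes an ordinary rowid table (not one created WITHOUT ROWID).

```python
import sqlite3

# In-memory database; "YourTable" and the sample rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (hash TEXT, d REAL)")
conn.executemany(
    "INSERT INTO YourTable (hash, d) VALUES (?, ?)",
    [("a", 1.0), ("a", 1.0), ("b", 1.0), ("c", 2.5), ("c", 2.5), ("c", 2.5)],
)

# Keep the row with the lowest rowid in each (hash, d) group, delete the rest.
cur = conn.execute(
    """
    DELETE FROM YourTable
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM YourTable GROUP BY hash, d
    )
    """
)
deleted = cur.rowcount  # number of rows removed by the DELETE
conn.commit()

# One row per distinct (hash, d) pair should remain.
remaining = conn.execute(
    "SELECT hash, d FROM YourTable ORDER BY rowid"
).fetchall()
print(deleted, remaining)
```

Note that ("a", 1.0) and ("b", 1.0) both survive: they share d but not hash, so they are not duplicates. On the real table it is probably worth making a copy of the database file first, since the delete cannot be undone once committed.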