I’m trying to do a, potentially, large scale delete operation on a single table. (think 100,000 rows on a 1m row table)
I’m using PostgreSQL and EntityFrameworkCore.
Details: The application code has a predicate to match and knows nothing about how many rows potentially match the predicate. It could be 0 row/s or a very large amount.
Research indicates EF Core is incapable of handling this efficiently. (i.e. the following code produces a Delete statement for each row!)
Using (var db = new DbContext)
var queryable = db.Table.AsQueryable()
.Where(o => o.ForeignKey == fKey)
.Where(o => o.OtherColumn == false);
db.Table.RemoveRange(queryable);
await db.SaveChangesAsync();
So here is the SQL I would prefer to run in a sort of batched operation:
delete from Table
where ForeignKey = 1234
and OtherColumn = false
and PK in (
select PK
from Table
where ForeignKey = 1234
and OtherColumn = false
limit 500
)
There are extension libraries out there, but I’ve yet to find an active one that supports Postgres. I’m currently executing the raw sql above through EF Core.
This leads to a couple questions:
I think you are trying to do something you should not use EntityFrameworkCore for. The object of EntityFrameworkCore is to have a nice way to move data between a .Net-Core application and a database. The typical useway is single or a small amount of objects. For bulk operations there are some nuget-packages. There is this package for inserting and updating with postgres.This article by the creator explains how it uses temporary tables and the postgres COPY command to do bulk operations. This shows us a way to delete rows in bulk by id:
var toDelete = GetIdsToDelete();
using (var conn = new NpgsqlConnection(connectionString))
{
conn.Open();
using ( var cmd = conn.CreateCommand())
{
cmd.CommandText =("CREATE TEMP TABLE temp_ids_to_delete (id int NOT NULL) ON COMMIT DROP ");
cmd.Prepare();
cmd.ExecuteNonQuery();
}
using (var writer = conn.BeginBinaryImport($"COPY temp_ids_to_delete (id) FROM STDIN (FORMAT BINARY)"))
{
foreach (var id in toDelete)
{
writer .StartRow();
writer .Write(id);
}
writer .Complete();
}
using (var cmd = conn.CreateCommand())
{
cmd.CommandText = "delete from myTable where id in(select id from temp_ids_to_delete)";
cmd.Prepare();
cmd.ExecuteNonQuery();
}
conn.Close();
With some small changes this can be more generalized.
But you want to do something different. You dont want to move data or information between the application and the database. You want to use efcore to create a slq-procedure on the fly and run that on the server. The problem is that ef core is not realy build to do that. But maybe there are ways around that. One way i could think of is to use ef-core to build a query, get the query string and then insert that string into another sql-string to run on the server. Getting the query string is currently not easy but apparently it will be with EF Core 5.0. Then you could do this:
var queryable = db.Table.AsQueryable()
.Where(o => o.ForeignKey == fKey)
.Where(o => o.OtherColumn == false);
var queryString=queryable.ToQueryString();
db.Database.ExecuteSqlRaw("delete from Table where PK in("+queryString+")" )
And yes that is terribly hacky and i would not recommend that. I would recommend to write procedures and functions on the databaseServer because this is not something ef-core should be used for. And then you can still run those functions from ef-core and pass parameters.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With