Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I use more than 2100 values in an IN clause using Dapper?

I have a List containing ids that I want to insert into a temp table using Dapper in order to avoid the SQL limit on parameters in the 'IN' clause.

So currently my code looks like this:

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
    using (var db = new SqlConnection(this.connectionString))
    {
        return db.Query<int>(
            @"SELECT a.animalID        
            FROM
            dbo.animalTypes [at]
            INNER JOIN animals [a] on a.animalTypeId = at.animalTypeId
            INNER JOIN edibleAnimals e on e.animalID = a.animalID
            WHERE
            at.animalId in @animalIds", new { animalIds }).ToList();
    }
}

The problem I need to solve is that when there are more than 2100 ids in the animalIds list then I get a SQL error "The incoming request has too many parameters. The server supports a maximum of 2100 parameters".

So now I would like to create a temp table populated with the animalIds passed into the method. Then I can join the animals table on the temp table and avoid having a huge "IN" clause.

I have tried various combinations of syntax but not got anywhere. This is where I am now:

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
    using (var db = new SqlConnection(this.connectionString))
    {
        db.Execute(@"SELECT INTO #tempAnmialIds @animalIds");

        return db.Query<int>(
            @"SELECT a.animalID        
            FROM
            dbo.animalTypes [at]
            INNER JOIN animals [a] on a.animalTypeId = at.animalTypeId
            INNER JOIN edibleAnimals e on e.animalID = a.animalID
            INNER JOIN #tempAnmialIds tmp on tmp.animalID = a.animalID).ToList();
    }
}

I can't get the SELECT INTO working with the list of IDs. Am I going about this the wrong way maybe there is a better way to avoid the "IN" clause limit.

I do have a backup solution in that I can split the incoming list of animalIDs into blocks of 1000 but I've read that the large "IN" clause sufferes a performance hit and joining a temp table will be more efficient and it also means I don;t need extra 'splitting' code to batch up the ids in to blocks of 1000.

like image 998
Chris B Avatar asked Sep 20 '16 11:09

Chris B


1 Answers

Ok, here's the version you want. I'm adding this as a separate answer, as my first answer using SP/TVP utilizes a different concept.

public IList<int> LoadAnimalTypeIdsFromAnimalIds(IList<int> animalIds)
{
  using (var db = new SqlConnection(this.connectionString))
  {
    // This Open() call is vital! If you don't open the connection, Dapper will
    // open/close it automagically, which means that you'll loose the created
    // temp table directly after the statement completes.
    db.Open();

    // This temp table is created having a primary key. So make sure you don't pass
    // any duplicate IDs
    db.Execute("CREATE TABLE #tempAnimalIds(animalId int not null primary key);");
    while (animalIds.Any())
    {
      // Build the statements to insert the Ids. For this, we need to split animalIDs
      // into chunks of 1000, as this flavour of INSERT INTO is limited to 1000 values
      // at a time.
      var ids2Insert = animalIds.Take(1000);
      animalIds = animalIds.Skip(1000).ToList();

      StringBuilder stmt = new StringBuilder("INSERT INTO #tempAnimalIds VALUES (");
      stmt.Append(string.Join("),(", ids2Insert));
      stmt.Append(");");

      db.Execute(stmt.ToString());
    }

    return db.Query<int>(@"SELECT animalID FROM #tempAnimalIds").ToList();
  }
}

To test:

var ids = LoadAnimalTypeIdsFromAnimalIds(Enumerable.Range(1, 2500).ToList());

You just need to amend your select statement to what it originally was. As I don't have all your tables in my environment, I just selected from the created temp table to prove it works the way it should.

Pitfalls, see comments:

  • Open the connection at the beginning, otherwise the temp table will be gone after dapper automatically closes the connection right after creating the table.
  • This particular flavour of INSERT INTO is limited to 1000 values at a time, so the passed IDs need to be split into chunks accordingly.
  • Don't pass duplicate keys, as the primary key on the temp table will not allow that.

Edit

It seems Dapper supports a set-based operation which will make this work too:

public IList<int> LoadAnimalTypeIdsFromAnimalIdsV2(IList<int> animalIds)
{
  // This creates an IEnumerable of an anonymous type containing an Id property. This seems
  // to be necessary to be able to grab the Id by it's name via Dapper.
  var namedIDs = animalIds.Select(i => new {Id = i});
  using (var db = new SqlConnection(this.connectionString))
  {
    // This is vital! If you don't open the connection, Dapper will open/close it
    // automagically, which means that you'll loose the created temp table directly
    // after the statement completes.
    db.Open();

    // This temp table is created having a primary key. So make sure you don't pass
    // any duplicate IDs
    db.Execute("CREATE TABLE #tempAnimalIds(animalId int not null primary key);");

    // Using one of Dapper's convenient features, the INSERT becomes:
    db.Execute("INSERT INTO #tempAnimalIds VALUES(@Id);", namedIDs);

    return db.Query<int>(@"SELECT animalID FROM #tempAnimalIds").ToList();
  }
}

I don't know how well this will perform compared to the previous version (ie. 2500 single inserts instead of three inserts with 1000, 1000, 500 values each). But the doc suggests that it performs better if used together with async, MARS and Pipelining.

like image 199
takrl Avatar answered Sep 18 '22 08:09

takrl