Efficient way to compare data tables

I have the C# method below, which compares two DataTables and returns the mismatched records.

public DataTable GetTableDiff(DataTable dt1, DataTable dt2, string columnName)
{
    var StartTime = DateTime.Now;
    dt1.PrimaryKey = new DataColumn[] { dt1.Columns["N"] };
    dt2.PrimaryKey = new DataColumn[] { dt2.Columns["N"] };

    DataTable dtDifference = null;
    //Get the difference of two datatables
    var dr = from r in dt1.AsEnumerable()
             where !dt2.AsEnumerable().Any(r2 => r["N"].ToString().Trim().ToLower() == r2["N"].ToString().Trim().ToLower()
                 && r[columnName].ToString().Trim().ToLower() == r2[columnName].ToString().Trim().ToLower())
             select r;

    if (dr.Any())
    {
        dtDifference = dr.CopyToDataTable();
    }
    return dtDifference;
}

This code works, but it takes 1.24 minutes to compare 10,000 records in the DataTable. Is there any way to make this faster?

N is the primary key and columnName is the column to compare.

Thanks.

asked May 07 '20 04:05

user1447718

1 Answer

First, have you tried this in a simple for/foreach loop instead and compared the performance?

At the moment you are creating a new Enumerable and then copying to a datatable. If you use a for/foreach loop then you can compare and copy in the same iteration.

You should also look at the string comparison. At the moment you trim each string and then convert it to lowercase. Because strings are immutable, each of these operations allocates a new string. So in your where clause you are doing this (up to) 8 times per iteration.

Do you really need Trim()? Is it likely that one DataTable will have a space at the front of a string and the other will not, and should such values still count as equal? Don't trim strings unless you really need to.

Then you should use a case-insensitive string comparison rather than converting with ToLower(). This will be quicker. According to Microsoft, StringComparison.OrdinalIgnoreCase performs better.

Make these changes, then measure again and see how much difference they make.
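As a small illustration of this point (the class name, method names and sample values are made up for the example), string.Equals with StringComparison.OrdinalIgnoreCase compares the two strings in place, while the ToLower() approach may create a lowercased copy of each string before comparing:

```csharp
using System;

public static class CaseInsensitiveCompareDemo
{
    // May allocate a temporary lowercase copy of each string on every call.
    public static bool EqualsViaToLower(string a, string b)
    {
        return a.ToLower() == b.ToLower();
    }

    // Compares the original strings directly; no temporary copies are needed.
    public static bool EqualsOrdinalIgnoreCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        Console.WriteLine(EqualsViaToLower("Widget", "WIDGET"));        // True
        Console.WriteLine(EqualsOrdinalIgnoreCase("Widget", "WIDGET")); // True
        Console.WriteLine(EqualsOrdinalIgnoreCase("Widget", "Gadget")); // False
    }
}
```

Note that ToLower() uses the current culture while OrdinalIgnoreCase does not, so the two can disagree for some characters in some cultures. Check that an ordinal comparison is what your data needs.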

See also: https://docs.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings

Update:

This intrigued me, so I went back and did some tests. I generated 10,000 rows of random(ish) data in two DataTables where every second row would match, and ran your comparison against a simplified for loop that uses a string comparison like this:

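  // Assumes both tables have the same number of rows in the same order,
  // so row i of dt1 is compared against row i of dt2.
  DataTable dtDifference = dt1.Clone(); // empty table with dt1's schema
  for (int i = 0; i < dt1.Rows.Count; i++)
  {
      bool sameKey = dt1.Rows[i]["N"].ToString().Equals(dt2.Rows[i]["N"].ToString(), StringComparison.OrdinalIgnoreCase);
      bool sameValue = dt1.Rows[i][columnName].ToString().Equals(dt2.Rows[i][columnName].ToString(), StringComparison.OrdinalIgnoreCase);
      if (!(sameKey && sameValue))
      {
          // Copy the mismatched row in the same pass.
          dtDifference.Rows.Add(dt1.Rows[i].ItemArray);
      }
  }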

Your code = 66,000 ms to 75,000 ms

For loop code = 12 ms to 20 ms

A significant difference!
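For anyone who wants to repeat a measurement like this, here is a rough sketch of a timing harness. It is not the code used for the numbers above: the names DiffTiming and BuildTable, the "Value" column and the deterministic test data are all assumptions made for the example. It uses Stopwatch rather than DateTime.Now, since Stopwatch is intended for measuring elapsed time.

```csharp
using System;
using System.Data;
using System.Diagnostics;

public static class DiffTiming
{
    // Builds a two-column table. When mismatchOddRows is true, every second
    // row gets a different value, so only half of the rows match.
    public static DataTable BuildTable(int rowCount, bool mismatchOddRows)
    {
        var table = new DataTable();
        table.Columns.Add("N", typeof(string));
        table.Columns.Add("Value", typeof(string));
        for (int i = 0; i < rowCount; i++)
        {
            string value = (mismatchOddRows && i % 2 == 1) ? "changed" + i : "value" + i;
            table.Rows.Add(i.ToString(), value);
        }
        return table;
    }

    // Counts rows whose "Value" differs at the same position in both tables.
    public static int CountMismatches(DataTable dt1, DataTable dt2)
    {
        int mismatches = 0;
        for (int i = 0; i < dt1.Rows.Count; i++)
        {
            if (!dt1.Rows[i]["Value"].ToString().Equals(
                    dt2.Rows[i]["Value"].ToString(), StringComparison.OrdinalIgnoreCase))
            {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void Main()
    {
        DataTable dt1 = BuildTable(10000, false);
        DataTable dt2 = BuildTable(10000, true);

        var timer = Stopwatch.StartNew();
        int mismatches = CountMismatches(dt1, dt2);
        timer.Stop();

        Console.WriteLine(mismatches + " mismatches in " + timer.ElapsedMilliseconds + " ms");
    }
}
```

To compare approaches, time each one the same way on the same two tables and run each several times, since the first run includes JIT compilation.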

Then I compared the two string comparison approaches (mine vs. yours), both inside the for loop. I had to test on 1 million rows to get a measurable difference.

The difference was between 200 ms and 800 ms.

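So it seems that in this case the string comparison is not a major factor.

Instead, the LINQ query is what takes the majority of the time, not the comparison of the values themselves. Note that the query calls dt2.AsEnumerable().Any(...) for every row of dt1, so it can scan all of dt2 for each of those rows (up to 10,000 x 10,000 row comparisons), whereas the for loop compares each row only once.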

So switch to using the for loop, and all will be well in the world again!
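If the two tables are not guaranteed to hold the same rows in the same order, comparing them position by position will not give the right answer. One possible alternative is to index dt2 by key once and then look up each dt1 row in that index. The sketch below assumes that N is unique in dt2 and that keys and values should be compared ignoring case and surrounding whitespace, as in the question. The names TableDiff and GetTableDiffByKey and the sample data are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class TableDiff
{
    // Returns the rows of dt1 that have no row in dt2 with the same "N" key
    // and the same value in columnName (trimmed, case-insensitive).
    public static DataTable GetTableDiffByKey(DataTable dt1, DataTable dt2, string columnName)
    {
        // Build the lookup once: key -> value to compare. One pass over dt2.
        var lookup = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (DataRow r2 in dt2.Rows)
        {
            lookup[r2["N"].ToString().Trim()] = r2[columnName].ToString().Trim();
        }

        DataTable diff = dt1.Clone(); // empty table with the same schema as dt1
        foreach (DataRow r1 in dt1.Rows)
        {
            string key = r1["N"].ToString().Trim();
            string value = r1[columnName].ToString().Trim();

            bool matches = lookup.TryGetValue(key, out string other)
                && string.Equals(value, other, StringComparison.OrdinalIgnoreCase);

            if (!matches)
            {
                diff.ImportRow(r1); // compare and copy in the same pass
            }
        }
        return diff;
    }

    public static void Main()
    {
        var dt1 = new DataTable();
        dt1.Columns.Add("N", typeof(string));
        dt1.Columns.Add("Value", typeof(string));
        dt1.Rows.Add("1", "apple");
        dt1.Rows.Add("2", "banana");
        dt1.Rows.Add("3", "cherry");

        DataTable dt2 = dt1.Clone();
        dt2.Rows.Add("3", "CHERRY ");  // equal after trimming and ignoring case
        dt2.Rows.Add("1", "apricot");  // same key, different value

        DataTable diff = GetTableDiffByKey(dt1, dt2, "Value");
        foreach (DataRow row in diff.Rows)
        {
            Console.WriteLine(row["N"] + ": " + row["Value"]);
        }
        // Rows 1 (value differs) and 2 (missing from dt2) are reported.
    }
}
```

Unlike the method in the question, this sketch returns an empty table rather than null when there are no differences, so callers should check diff.Rows.Count.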

answered Sep 23 '22 18:09

jason.kaisersmith

Tags:

performance

c#

linq
