 

SqlDataReader Performance: List<string[]> or List<object[]>?

I have been experimenting with ways to read data from SQL Server as quickly as possible, and I came across an interesting discovery. If I read the data into a List<object[]> instead of a List<string[]>, performance more than doubles.

I suspect this is because ToString() no longer has to be called on each field, but I always thought that using object had a negative impact on performance.

Is there any reason not to use a list of object arrays instead of string arrays?

EDIT: One thought I just had: will storing the data in object arrays take more memory than storing it as strings?

Here is my test code:

    private void executeSqlObject()
    {
        List<object[]> list = new List<object[]>();

        using (SqlConnection cnn = new SqlConnection(_cnnString))
        {
            cnn.Open();
            SqlCommand cmd = new SqlCommand("select * from test_table", cnn);

            SqlDataReader reader = cmd.ExecuteReader();

            int fieldCount = reader.FieldCount;

            while (reader.Read())
            {
                object[] row = new object[fieldCount];

                for (int i = 0; i < fieldCount; i++)
                {
                    row[i] = reader[i];
                }
                list.Add(row);
            }
        }
    }

    private void executeSqlString()
    {
        List<string[]> list = new List<string[]>();

        using (SqlConnection cnn = new SqlConnection(_cnnString))
        {
            cnn.Open();
            SqlCommand cmd = new SqlCommand("select * from test_table", cnn);

            SqlDataReader reader = cmd.ExecuteReader();

            int fieldCount = reader.FieldCount;

            while (reader.Read())
            {
                string[] row = new string[fieldCount];

                for (int i = 0; i < fieldCount; i++)
                {
                    row[i] = reader[i].ToString();
                }
                list.Add(row);
            }
        }
    }

    private void runTests()
    {
        Stopwatch watch = new Stopwatch();
        for (int i = 0; i < 10; i++)
        {
            watch.Start();
            executeSqlObject();
            Debug.WriteLine("Object Time: " + watch.ElapsedMilliseconds.ToString());
            watch.Reset();
        }
        for (int i = 0; i < 10; i++)
        {
            watch.Start();
            executeSqlString();
            Debug.WriteLine("String Time: " + watch.ElapsedMilliseconds.ToString());
            watch.Reset();
        }
    }

And the results:

Object Time: 879
Object Time: 812
Object Time: 825
Object Time: 882
Object Time: 880
Object Time: 905
Object Time: 815
Object Time: 799
Object Time: 823
Object Time: 817
Average: 844

String Time: 1819
String Time: 1790
String Time: 1787
String Time: 1856
String Time: 1795
String Time: 1731
String Time: 1792
String Time: 1799
String Time: 1762
String Time: 1869
Average: 1800
asked Nov 06 '11 by ChandlerPelhams



1 Answer

object only adds overhead if you are causing additional boxing, and even then that impact is fairly minimal. In your case, reader[i] always returns object - you already have it as object, no matter whether that is a reference to a string, a boxed int, etc. Of course calling .ToString() adds overhead; in most cases (int, DateTime, etc.) this involves both formatting code and the allocation of one (or more) extra string. By changing to string you are changing the data (for the worse, IMO - for example, you can no longer sort dates correctly) and adding overhead. The edge case here is if all the columns are already actually strings, in which case you just add a few virtual method calls (but no extra real work).
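
To stay with object[] while also skipping the per-field indexer calls, SqlDataReader.GetValues can copy a whole row into an array in one call. Below is a minimal sketch along those lines, reusing the question's _cnnString and test_table; the method name is just for illustration.

    private List<object[]> executeSqlGetValues()
    {
        List<object[]> list = new List<object[]>();

        using (SqlConnection cnn = new SqlConnection(_cnnString))
        using (SqlCommand cmd = new SqlCommand("select * from test_table", cnn))
        {
            cnn.Open();

            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                int fieldCount = reader.FieldCount;

                while (reader.Read())
                {
                    // GetValues copies every column of the current row into the
                    // array in a single call; values keep their native types.
                    object[] row = new object[fieldCount];
                    reader.GetValues(row);
                    list.Add(row);
                }
            }
        }

        return list;
    }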

For info, if you are after raw performance, I thoroughly recommend looking at micro-ORMs such as dapper. They are heavily optimised, but avoid the weight of "full" ORMs. For example, in dapper:

var myData = connection.Query<TypedObject>("select * from test_table").ToList();

will, I expect, perform very comparably while giving you strongly typed object data.
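
For reference, a typed version with Dapper might look like the sketch below. The TypedObject properties are assumptions - they would have to match the actual columns of test_table - and Dapper maps columns to properties by name.

    // requires the Dapper NuGet package and "using Dapper;" at the top of the file

    // Hypothetical POCO: these property names are assumptions and must match
    // the column names that "select * from test_table" actually returns.
    public class TypedObject
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public DateTime CreatedOn { get; set; }
    }

    private List<TypedObject> executeSqlDapper()
    {
        using (SqlConnection cnn = new SqlConnection(_cnnString))
        {
            // Dapper opens the connection if it is closed, executes the query and
            // maps each row to a TypedObject by matching column names to properties.
            return cnn.Query<TypedObject>("select * from test_table").ToList();
        }
    }

Because the values stay in their native types, operations such as sorting by date still work correctly.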

answered by Marc Gravell