I saw many questions about this and Marc Gravell's explanations of why he doesn't want to continue playing with the multiple grid on dapper.
But I just want to understand something
var grid = context.QueryMultiple(string.Join(" ; ", selectCommands));
return queries.Select(q => grid.Read<T>()).AsList();
In this case the QueryMultiple takes around 3 seconds to execute (which is the pretty much the execution time on the SQL) and the read of the grid which has 2 queries takes 3 seconds at first and then 9 seconds on the second.
There are around 50k rows on each and only 5 columns which 2 are int, 2 double(float on sql) and a datetime. I tried to turn off buffering and it didn't help.
Is this because Dapper first Queries the database just to get the queries executed. Then connects to the database again upon request for the read and gets all the data for the specific read?
Would it be better to just create a few async tasks that run together with a single query each?
I saw many questions about this and Marc Gravell's explanations of why he doesn't want to continue playing with the multiple grid on dapper
Most probable reason is, though we execute the query which yields multiple results using MARS - Multiple Active Result Sets
, but still you cannot access all the result sets in one go. They can be read in a sequential manner, as it internally executes by using the NextResult
feature of a DataReader
. For the application it is still 1 call, when ideally same thing using multiple Async calls will always fetch all the result sets in parallel (ideal system / resource usage)
Regarding your query:
queries.Select(q => grid.Read<T>()).AsList();
I assume all your result sets are of same type T
and final result you are fetching is of type List<IEnumerable<T>>
.
Now options that you have is to modify your queries using Async-Await
as follows:
var grid = await context.QueryMultipleAsync(string.Join(";",selectCommands));
return queries.Select(q => await grid.ReadAsync<T>()).AsList();
Would this help ?
To a limited extent, by making operations Asynchronous, but benefit would be limited, since still the Read would be sequential using MARS.
Now main part of your query,
In this case the QueryMultiple takes around 3 seconds to execute (which is the pretty much the execution time on the SQL) and the read of the grid which has 2 queries takes 3 seconds at first and then 9 seconds on the second.
This is surprising, Read is the only operation done here, query has already executed. We are using DataReader
, which is already a connected by, which is same for Dapper GridReader
, which is a wrapper over the DataReader
. If only reading is taking so much time that too for just 50K records, you might want to figure out other impacting parameters like Network IO speed, is it hitting a memory / RAM limit, which are the external factors with huge impact.
Is this because Dapper first Queries the database just to get the queries executed. Then connects to the database again upon request for the read and gets all the data for the specific read?
Always a better strategy due to following reason:
Having said this I am still surprised that just reading 50K records, is so slow, it has to be external factors. Toggling the buffer settings will not have a major impact for just 50K records, which they are compressed well using binary serialization, normally buffer on is a preferred setting, as its optimization
First thing that you must try is running the same test case on relatively high end hardware, higher RAM, more processor, better Network IO, just to understand whether change in hardware bring in the improvement, since Dapper is already optimized even if you use QueryMultiple
and Read via GridReader
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With