I have about 100,000 rows of generic data. The columns/properties of this data are user definable and are of the usual data types (string, int, double, date). There will be about 50 columns/properties.
I have 2 needs: filtering the data and calculating columns from user-defined expressions. I am weighing two approaches:

DataTable:
Pros: DataColumn.Expression is built in.
Cons: RowFilter and coding it in C# are not as "nice" as LINQ; DataColumn.Expression does not support callbacks(?). A workaround could be to fetch and substitute the external value when creating the calculated column.

Generic list:
Pros: LINQ syntax; NCalc supports callbacks.
Cons: Having to implement NCalc / a generic calculation engine.

(A rough sketch of both expression mechanisms is below.)
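For context, here is a minimal sketch of the two expression mechanisms being compared. The column names (Price, Qty), the values, and the ExternalDiscount parameter resolved through a callback are made-up placeholders, not part of the real schema:

    using System;
    using System.Data;
    using NCalc;

    class CalcSketch
    {
        static void Main()
        {
            // DataTable approach: the expression is built in, but it is a fixed
            // string with no callback hook, so any external value would have to
            // be substituted into the expression text up front.
            var table = new DataTable();
            table.Columns.Add("Price", typeof(double));
            table.Columns.Add("Qty", typeof(int));
            table.Columns.Add("Total", typeof(double), "Price * Qty");
            table.Rows.Add(9.5, 3);
            Console.WriteLine(table.Rows[0]["Total"]);   // 28.5

            // NCalc approach: unknown parameters can be resolved through a
            // callback, so external values can be supplied at evaluation time.
            var expr = new Expression("Price * Qty * (1 - ExternalDiscount)");
            expr.EvaluateParameter += (name, args) =>
            {
                if (name == "ExternalDiscount")
                    args.Result = 0.1;   // e.g. fetched from some external source
            };
            expr.Parameters["Price"] = 9.5;
            expr.Parameters["Qty"] = 3;
            Console.WriteLine(expr.Evaluate());          // 25.65
        }
    }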
Based on the above I would think the generic list approach would win, but one thing I have not factored in is performance, which for some reason I suspect would be better with a DataTable.
Does anyone have a gut feeling / experience with LINQ vs. DataTable performance?
How about NCalc?
As I said, there are about 100,000 rows of data with 50 columns, of which maybe 20 are calculated.
About 50 rules will be run against the data, so in total there will be 5 million row/object scans.
Would really appreciate any insights. Thx.
P.S. Of course, using a database with SQL, views, etc. would be the easiest solution, but for various reasons it can't be implemented here.
Well, using a DataTable does not preclude the use of LINQ:
table.Rows.Cast<DataRow>() //IEnumerable<DataRow>, linq it to death
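For example, a filter-and-project over a DataTable could look like the sketch below; the Name and Price columns are hypothetical, and Field&lt;T&gt; comes from the System.Data.DataSetExtensions assembly on classic .NET Framework:

    using System;
    using System.Data;
    using System.Linq;

    class LinqOverDataTable
    {
        static void Main()
        {
            var table = new DataTable();
            table.Columns.Add("Name", typeof(string));
            table.Columns.Add("Price", typeof(double));
            table.Rows.Add("A", 12.0);
            table.Rows.Add("B", 7.5);

            // Cast<DataRow>() (or table.AsEnumerable()) yields IEnumerable<DataRow>,
            // so the usual LINQ operators apply to the rows.
            var expensive = table.Rows.Cast<DataRow>()
                                 .Where(r => r.Field<double>("Price") > 10)
                                 .Select(r => r.Field<string>("Name"));

            foreach (var name in expensive)
                Console.WriteLine(name);   // prints "A"
        }
    }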
This guy makes some arguments about Hashtable vs. DataTable, and this guy finds Dictionary better than DataTable, but not by much (once the Dictionary creation cost is factored in).
Note: if the columns are known beforehand (that is, a user may select some of the columns from a predefined set of (name, type) pairs), I would go with strongly typed classes, since data["property"] does not get Intellisense support the way data.Property does.
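To illustrate, here is a sketch of a strongly typed row used with a generic list and LINQ; the Row class and its properties are invented for the example:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    // Hypothetical strongly typed row: the properties are known at compile time,
    // so Intellisense and the compiler catch mistakes that data["Price"] would not.
    class Row
    {
        public string Name  { get; set; }
        public double Price { get; set; }
    }

    class TypedListSketch
    {
        static void Main()
        {
            var rows = new List<Row>
            {
                new Row { Name = "A", Price = 12.0 },
                new Row { Name = "B", Price = 7.5 },
            };

            // r.Price is checked at compile time; row["Price"] on a DataRow
            // is only checked at run time.
            var expensive = rows.Where(r => r.Price > 10).Select(r => r.Name);

            foreach (var name in expensive)
                Console.WriteLine(name);   // prints "A"
        }
    }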