I want to have EF Core translate .Select(x=>x.property).Distinct().Count()
into something like
SELECT COUNT(DISTINCT property)
Let's take an example. Say I have a DB table with the columns PersonID (long), VisitStart (datetime2) and VisitEnd (datetime2). If I want to get the number of distinct days on which a person has visited, I could write SQL like
SELECT COUNT(DISTINCT CONVERT(date, VisitStart)) FROM myTable GROUP BY PersonID
But with EF Core, this query
MyTable
    .GroupBy(x => x.PersonID)
    .Select(x => new
    {
        Count = x.Select(y => y.VisitStart.Date).Distinct().Count()
    })
which gives the right results, translates into this SQL
SELECT [x].[PersonID], [x].[VisitStart], [x].[VisitEnd]
FROM [myTable] as [x]
ORDER BY [x].[PersonID]
There is no GROUP BY, DISTINCT or COUNT anywhere, so the grouping must be done in memory. That is not ideal for a table with millions of records that potentially have to be pulled from the DB.
So, does anyone know how to get EF Core to translate a .Select(...).Distinct().Count()
into SELECT COUNT(DISTINCT ...)?
To count the number of different values stored in a given column, designate the column you pass to the COUNT function as DISTINCT. When given a column, COUNT returns the number of non-NULL values in that column. Combining this with DISTINCT returns only the number of unique (and non-NULL) values.
Yes, you can use COUNT() and DISTINCT together to display the count of only distinct values: SELECT COUNT(DISTINCT yourColumnName) AS anyVariableName FROM yourTableName;
The COUNT DISTINCT function returns the number of unique values in the column or expression, as the following example shows. SELECT COUNT (DISTINCT item_num) FROM items; If the COUNT DISTINCT function encounters NULL values, it ignores them. If every value in the specified column is NULL, it returns zero.
COUNT(column_name) vs COUNT(DISTINCT column_name): COUNT(column_name) includes duplicate values when counting. In contrast, COUNT(DISTINCT column_name) counts only the distinct (unique) values in the given column.
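To make the difference between the three counts concrete, here is a small LINQ to Objects sketch that mimics them in memory. The itemNum array is made-up sample data, and this runs against a plain array, not through EF Core, so it only illustrates the semantics and says nothing about the SQL that would be generated.

```csharp
using System;
using System.Linq;

// Hypothetical sample column containing duplicates and NULLs.
int?[] itemNum = { 1, 1, 2, null, 3, 3, null };

// COUNT(*): every row, NULLs included.
int countAll = itemNum.Length;

// COUNT(item_num): non-NULL values, duplicates included.
int count = itemNum.Count(v => v.HasValue);

// COUNT(DISTINCT item_num): unique non-NULL values.
// The Where is needed because LINQ's Distinct() would otherwise
// keep null as one more distinct value, unlike SQL.
int countDistinct = itemNum.Where(v => v.HasValue).Distinct().Count();

Console.WriteLine($"COUNT(*) = {countAll}");
Console.WriteLine($"COUNT(item_num) = {count}");
Console.WriteLine($"COUNT(DISTINCT item_num) = {countDistinct}");
```

With this sample data the three counts are 7, 5 and 3 respectively.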
I wanted to share an idea I had for solving my count distinct problem.
Another way of doing a count distinct within a group-by is to nest group-by operations (assuming your data can be aggregated in stages).
Here is an example of what I used; it seems to work.
Apologies for the cryptic acronyms; I am using them to keep my JSON as small as possible.
var myData = _context.ActivityItems
    .GroupBy(a => new { ndt = EF.Property<DateTime>(a, "dt").Date, ntn = a.tn })
    .Select(g => new
    {
        g.Key.ndt,
        g.Key.ntn,
        dpv = g.Sum(o => o.pv),
        dlv = g.Sum(o => o.lv),
        cnt = g.Count(),
    })
    .GroupBy(a => new { ntn = a.ntn })
    .Select(g => new
    {
        g.Key.ntn,
        sd = g.Min(o => o.ndt),
        ld = g.Max(o => o.ndt),
        pSum = g.Sum(o => o.dpv),
        pMin = g.Min(o => o.dpv),
        pMax = g.Max(o => o.dpv),
        pAvg = g.Average(o => o.dpv),
        lSum = g.Sum(o => o.dlv),
        lMin = g.Min(o => o.dlv),
        lMax = g.Max(o => o.dlv),
        lAvg = g.Average(o => o.dlv),
        n10s = g.Sum(o => o.cnt),
        ndays = g.Count()
    });
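Applied to the PersonID/VisitStart example from the question, the same nested group-by idea might look like the sketch below. It uses LINQ to Objects on made-up in-memory data (the visits array is hypothetical), so it only shows that the two query shapes give the same counts. Whether EF Core turns the nested shape into two server-side GROUP BYs depends on the EF Core version and provider, which I have not checked here.

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for myTable.
var visits = new[]
{
    (PersonID: 1L, VisitStart: new DateTime(2020, 1, 1, 9, 0, 0)),
    (PersonID: 1L, VisitStart: new DateTime(2020, 1, 1, 15, 0, 0)),
    (PersonID: 1L, VisitStart: new DateTime(2020, 1, 2, 10, 0, 0)),
    (PersonID: 2L, VisitStart: new DateTime(2020, 1, 5, 8, 0, 0)),
};

// Shape from the question: Distinct().Count() inside a single GroupBy.
var direct = visits
    .GroupBy(v => v.PersonID)
    .ToDictionary(
        g => g.Key,
        g => g.Select(v => v.VisitStart.Date).Distinct().Count());

// Nested shape: first collapse to one row per (person, day),
// then group by person and count the remaining rows.
var nested = visits
    .GroupBy(v => new { v.PersonID, Day = v.VisitStart.Date })
    .Select(g => new { g.Key.PersonID, g.Key.Day })
    .GroupBy(x => x.PersonID)
    .ToDictionary(g => g.Key, g => g.Count());

foreach (var personId in direct.Keys.OrderBy(id => id))
{
    Console.WriteLine(
        $"Person {personId}: direct = {direct[personId]}, nested = {nested[personId]}");
}
```

The inner grouping removes duplicate days per person, so the outer Count() plays the role of COUNT(DISTINCT ...) without needing Distinct() inside the aggregate.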