I have a table like this
UserID Year EffectiveDate Type SpecialExpiryDate
1 2015 7/1/2014 A
1 2016 7/1/2015 B 10/1/2015
there is no ExpriyDate in the table because it is only valid for one year, so the expiry date can be calculated from the effective date by adding a year.
The result I want to get is like this (the current year's effective date and the next year's expiry date)
UserID EffectiveDate ExpiryDate
1 7/1/2014 7/1/2016
And If the user's type is B, then there will be a special expiry date, so for this person, the result will be
UserID EffectiveDate ExpiryDate
1 7/1/2014 10/1/2015
Here is the code I wrote
var result = db.Table1
.Where(x => x.Year>= 2015 && (x.Type == "A" || x.Type == "B"))
.GroupBy(y => y.UserID)
.OrderByDescending(x => x.FirstOrDefault().Year)
.Select(t => new
{
ID = t.Key,
Type = t.FirstOrDefault().Type,
EffectiveDate = t.FirstOrDefault().EffectiveDate,
ExpiryDate = t.FirstOrDefault().SpecialExpiryDate != null ? t.FirstOrDefault().SpecialExpiryDate : (t.Count() >= 2 ? NextExpiryDate : CurrentExpiryDate)
}
);
The code can get the result I need, but the problem is that in the result set there are about 10000 records which took about 5 to 6 seconds. The project is for a web search API, so I want to speed it up, is there a better way to do the query?
Edit
Sorry I made a mistake, in the select clause it should be
EffectiveDate = t.LastOrDefault().EffectiveDate
but in the Linq of C#, it didn't support this LastOrDefault function transfered to sql, and it cause the new problem, what is the easiest way to get the second item of the group?
You could generate the calculated data on the fly, using a View in your database.
Something like this (pseudocode):
Create View vwUsers AS
Select
UserID,
Year,
EffectiveDate,
EffectiveData + 1 as ExpiryDate, // <--
Type,
SpecialExpiryDate
From
tblUsers
And just connect your LINQ query to that.
Try this:
var result =
db
.Table1
.Where(x => x.Year>= 2015 && (x.Type == "A" || x.Type == "B"))
.GroupBy(y => y.UserID)
.SelectMany(y => y.Take(1), (y, z) => new
{
ID = y.Key,
z.Type,
z.EffectiveDate,
ExpiryDate = z.SpecialExpiryDate != null
? z.SpecialExpiryDate
: (t.Count() >= 2 ? NextExpiryDate : CurrentExpiryDate),
z.Year,
})
.OrderByDescending(x => x.Year);
The .SelectMany(y => y.Take(1) effectively does the .FirstOrDefault() part of your code. By doing this once rather than for many properties you may improve the speed immensely.
In a test I performed using a similarly structured query I got these sub-queries being run when using your approach:
SELECT t0.increment_id
FROM sales_flat_order AS t0
GROUP BY t0.increment_id
SELECT t0.hidden_tax_amount
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000001]
SELECT t0.customer_email
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000001]
SELECT t0.hidden_tax_amount
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000002]
SELECT t0.customer_email
FROM sales_flat_order AS t0
WHERE ((t0.increment_id IS NULL AND @n0 IS NULL) OR (t0.increment_id = @n0))
LIMIT 0, 1
-- n0 = [100000002]
(This continued on for two sub-queries per record number.)
If I ran my approach I got this single query:
SELECT t0.increment_id, t1.hidden_tax_amount, t1.customer_email
FROM (
SELECT t2.increment_id
FROM sales_flat_order AS t2
GROUP BY t2.increment_id
) AS t0
CROSS APPLY (
SELECT t3.customer_email, t3.hidden_tax_amount
FROM sales_flat_order AS t3
WHERE ((t3.increment_id IS NULL AND t0.increment_id IS NULL) OR (t3.increment_id = t0.increment_id))
LIMIT 0, 1
) AS t1
My approach should be much faster.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With