 

How to optimize Linq query with large number of records?

Please help me optimize the code below. I have tried different approaches but I am not getting a significant performance improvement. There are around 30k entries in the database and it takes around 1 minute to load locally.

var alarms = from healthIssue in _context.HealthIssues.AsNoTracking()
             join asset in _context.Assets.AsNoTracking() on healthIssue.AssetNumber equals asset.SerialNumber into joinedTable
             from data in joinedTable.DefaultIfEmpty()
             select new
             {
                 ID = healthIssue.ID,
                 AssetNumber = healthIssue.AssetNumber,
                 AlarmName = healthIssue.AlarmName,
                 // the severity and last-updated fields are used further down, so they are projected here too
                 AlarmSeverityLevel = healthIssue.AlarmSeverityLevel,
                 AlarmLastUpdatedTime = healthIssue.AlarmLastUpdatedTime,
                 Crew = data.Crew,
             };
//alarmsViewModelList count is 30k  
var alarmsViewModelList = await alarms.ToListAsync();
//groupedData count = 12k 
var groupedData = alarmsViewModelList.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct().ToList();
// filteralarms count = 20k
var filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).ToList();
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    //The line is actually slowing the code.
    var alarmlist = filteralarms.AsEnumerable().Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber
                            ).Select
                            (c => new
                            {
                                HealthIssueID = c.ID,
                                AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
                                AlarmSeverityLevel = c.AlarmSeverityLevel,
                                
                            }).OrderByDescending(c =>c.AlarmLastUpdateDateTime).ToList();
    int alarmCount = alarmlist.Count;
    if (alarmCount > 1)
    {
        businessLogicFunction(alarmlist); 
    }

}
asked Sep 20 '25 by StarLord


2 Answers

You basically group the data by AlarmName + AssetNumber, filter out alarms with severity level Unknown, and then run the business function on the grouped batches (after a minor adjustment). A more efficient approach would be something like this:

var grouped = alarmsViewModelList
    // throw away unknown, you are not using them anywhere
    .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    // group by AssetNumber + AlarmName
    .GroupBy(c => new { c.AssetNumber, c.AlarmName })
    .Select(gr => new
    {
        gr.Key.AlarmName,
        gr.Key.AssetNumber,
        // convert batch of this group to the desired form
        Items = gr.Select(c => new
        {
            HealthIssueID = c.ID,
            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
            AlarmSeverityLevel = c.AlarmSeverityLevel,
        }).OrderByDescending(c => c.AlarmLastUpdateDateTime).ToList()
    });

foreach (var data in grouped) {
    if (data.Items.Count > 1) {
        businessLogicFunction(data.Items);
    }
}
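
A further tweak along the same lines (just a sketch, not tested against the original schema): since the Unknown rows are thrown away anyway, the severity filter and the extra columns can be pushed into the database query itself, so EF Core materializes fewer rows. This assumes AlarmSeverityLevel and AlarmLastUpdatedTime live on the HealthIssue entity, as the later code in the question suggests:

// Same left join as in the question, but filtered server-side,
// so only the ~20k non-Unknown rows are loaded into memory.
var alarms = from healthIssue in _context.HealthIssues.AsNoTracking()
             where healthIssue.AlarmSeverityLevel != AlarmSeverityLevel.Unknown
             join asset in _context.Assets.AsNoTracking()
                 on healthIssue.AssetNumber equals asset.SerialNumber into joinedTable
             from data in joinedTable.DefaultIfEmpty()
             select new
             {
                 healthIssue.ID,
                 healthIssue.AssetNumber,
                 healthIssue.AlarmName,
                 healthIssue.AlarmSeverityLevel,
                 healthIssue.AlarmLastUpdatedTime,
                 Crew = data.Crew,
             };

var alarmsViewModelList = await alarms.ToListAsync();
// the GroupBy above then runs over the already-filtered list, and the
// .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown) step can be dropped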
answered Sep 23 '25 by Evk


This is what I can do with LINQ.

  //alarmsViewModelList count is 30k  
var alarmsViewModelList = await alarms.ToListAsync();
//groupedData is almost 12k 
var groupedData = alarmsViewModelList.Select(c => new { c.AssetNumber,c.AlarmName}).Distinct().ToList();
// filteralarms count is almost 20k (materialized so it can be reassigned below)
var filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).OrderByDescending(c => DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime)).ToList();
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    //The line is actually slowing the code.
    var alarmlist = filteralarms.Where(c => c.AlarmName == alarm.AlarmName && c.AssetNumber == alarm.AssetNumber);
    
    if (alarmlist.Count() > 1)
    {
        businessLogicFunction(alarmlist.Select
                            (c => new
                            {
                                HealthIssueID = c.ID,
                                AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
                                AlarmSeverityLevel = c.AlarmSeverityLevel,
                                
                            }).ToList()); 
    }
    filteralarms = filteralarms.Where(c => c.AlarmName != alarm.AlarmName || c.AssetNumber != alarm.AssetNumber).ToList();

}

The code above is O(2n), I think. And if you can, you can make it even faster by removing the ToList() in the businessLogicFunction call, like this:

businessLogicFunction(alarmlist.Select
                        (c => new
                        {
                            HealthIssueID = c.ID,
                            AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
                            AlarmSeverityLevel = c.AlarmSeverityLevel,

                        })); 

Changed it so it doesn't use Skip but an index instead, which is way faster. An even faster approach is to order both lists the same way and walk through them, skipping over what has already been processed, like this:

//alarmsViewModelList count is 30k
var alarmsViewModelList = alarms.ToList();
// here the groupedData list looks like {(1,1),(2,1),(3,1),(4,1),(5,1),(6,1)}, because the list is ordered by AssetNumber and then by AlarmName
var groupedData = alarmsViewModelList.Select(c => new { c.AssetNumber, c.AlarmName }).Distinct().OrderBy(c => c.AssetNumber).ThenBy(c => c.AlarmName).ToList();
// here the filteralarms list looks like {(1,1),(1,1),(1,1),(2,1),(2,1),(3,1),(3,1),(3,1),(4,1)...}
// materialized to a List so it can be indexed below
var filteralarms = alarmsViewModelList.Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown).OrderBy(c => c.AssetNumber).ThenBy(c => c.AlarmName).ToList();
int k = 0;
for (int j = 0; j < groupedData.Count; j++)
{
    var alarm = groupedData[j];
    // walk filteralarms in step with groupedData; both lists are sorted the same way
    // (this assumes the query projects to a named Alarm type rather than an anonymous type)
    var alarmlist = new List<Alarm>();
    for (; k < filteralarms.Count; k++)
    {
        if (filteralarms[k].AlarmName == alarm.AlarmName && filteralarms[k].AssetNumber == alarm.AssetNumber)
        {
            alarmlist.Add(filteralarms[k]);
        }
        else
        {
            break;
        }
    }
    if (alarmlist.Count > 1)
    {
        businessLogicFunction(alarmlist.Select
                            (c => new
                            {
                                HealthIssueID = c.ID,
                                AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
                                AlarmSeverityLevel = c.AlarmSeverityLevel,
                            }).OrderByDescending(c => c.AlarmLastUpdateDateTime).ToList());
    }
}

The code above is O(n), I think.
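
If sorting both lists feels fragile, a lookup gives the same single pass over the data without the two sorts. This is just a sketch using the property names from the question, and it ends up close to the GroupBy version in the other answer, so it is mostly a matter of taste:

// Build the groups in one hash-based pass; no ordering of the source is needed.
var alarmLookup = alarmsViewModelList
    .Where(c => c.AlarmSeverityLevel != AlarmSeverityLevel.Unknown)
    .ToLookup(c => new { c.AssetNumber, c.AlarmName });

foreach (var alarmGroup in alarmLookup)
{
    // skip groups with a single entry, as in the original loop
    if (alarmGroup.Count() > 1)
    {
        businessLogicFunction(alarmGroup
            .Select(c => new
            {
                HealthIssueID = c.ID,
                AlarmLastUpdateDateTime = DateTimeHelpers.FromEpochSecondsUTC(c.AlarmLastUpdatedTime),
                AlarmSeverityLevel = c.AlarmSeverityLevel,
            })
            .OrderByDescending(c => c.AlarmLastUpdateDateTime)
            .ToList());
    }
}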

answered Sep 23 '25 by lork6