Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimizing an algorithm and/or structure in C#

I'm working on an application where you're able to subscribe to a newsletter and choose which categories you want to subscribe to. There's two different sets of categories: cities and categories.

Upon sending emails (which is a scheduled tast), I have to look at which cities and which categories a subscriber has subscribed to, before sending the email. I.e., if I have subscribed to "London" and "Manchester" as my cities of choice and have chosen "Food", "Cloth" and "Electronics" as my categories, I will only get the newsletters that relates to these.

The structure is as follows:

On every newsitem in Umbraco CMS there is a commaseparated string of cities and categories (effectively, these are stored as node ids since cities and categories are nodes in Umbraco aswell) When I subscribe to one or more city and one or more category, I store the city and category nodeids in the database in custom tables. My relational mapping looks like this:

enter image description here

And the whole structure looks like this:

enter image description here

To me, this seems like two sets of 1 - 1..* relations (one subscriber to one or many cities and one subscriber to one or many categories)

To find which emails to send who which subscriber, my code looks like this:

private bool shouldBeAdded = false;

// Dictionary containing the subscribers e-mail address and a list of news nodes which should be sent
Dictionary<string, List<Node>> result = new Dictionary<string, List<Node>>();

foreach(var subscriber in subscribers)
{
    // List of Umbraco CMS nodes to store which nodes the html should come from
    List<Node> nodesToSend = new List<Node> nodesToSend();

    // Loop through the news
    foreach(var newsItem in news)
    {
        // The news item has a comma-separated string of related cities
        foreach (string cityNodeId in newsItem.GetProperty("cities").Value.Split(','))
        {
            // Check if the subscriber has subscribed to the city
            if(subscriber.CityNodeIds.Contains(Convert.ToInt32(cityNodeId)))
            {
                 shouldBeAdded = true;
            }
        }

        // The news item has a comma-separated string of related categories
        foreach (string categoryNodeId in newsItem.GetProperty("categories").Value.Split(','))
        {
            // Check if the subscriber has subscribed to the category
            if(subscriber.CategoryNodeIds.Contains(Convert.ToInt32(categoryNodeId)))
            {
                shouldBeAdded = true;
            }
        }
    }

    // Store in list
    if (shouldBeAdded)
    {
        nodesToSend.Add(newsItem);
    }

    // Add it to the dictionary
    if (nodesToSend.Count > 0)
    {
        result.Add(subscriber.Email, nodesToSend);
    }
}

// Ensure that we process the request only if there are any subscribers to send mails to
if (result.Count > 0)
{
    foreach (var res in result)
    {
        // Finally, create/merge the markup for the newsletter and send it as an email.
    } 
}

While this works, I'm a bit concerned about performance when a certain amount of subscribers is reached since we're into three nested foreach loops. Also, remembering my old teachers preaches: "for every for loop there is a better structure"

So, I would like your oppinion on the above solution, is there anything that can be improved here with the given structure? And will it cause performance problems over time?

Any help/hint is greatly appreciated! :-)

Thanks in advance.

Solution

So after a few good hours of debugging and fumblin' around I finally came up with something that works (initially, it looked like my original code worked, but it didn't)

Sadly, I couldn't get it to work with any LINQ queries I tried, so I went back to the "ol' school' way of iterating ;-) The final algorithm looks like this:

private bool shouldBeAdded = false;

// Dictionary containing the subscribers e-mail address and a list of news nodes which should be sent
Dictionary<string, List<Node>> result = new Dictionary<string, List<Node>>();

foreach(var subscriber in subscribers)
{
    // List of Umbraco CMS nodes to store which nodes the html should come from
    List<Node> nodesToSend = new List<Node> nodesToSend();

    // Loop through the news
    foreach(var newsItem in news)
    {
         foreach (string cityNodeId in newsItem.GetProperty("cities").Value.Split(','))
         {
             // Check if the subscriber has subscribed to the city
             if (subscriber.CityNodeIds.Contains(Convert.ToInt32(cityNodeId)))
             {
                 // If a city matches, we have a base case
                 nodesToSend.Add(newsItem);
             }
         }

         foreach (string categoryNodeId in newsItem.GetProperty("categories").Value.Split(','))
         {
             // Check if the subscriber has subscribed to the category
             if (subscriber.CategoryNodeIds.Contains(Convert.ToInt32(categoryNodeId)))
             {
                 shouldBeAdded = true;

                 // News item matched and will be sent. Stop the loop.
                 break;
             }
             else
             {
                 shouldBeAdded = false;
             }
         }

         if (!shouldBeAdded)
         {
             // The news item did not match both a city and a category and should not be sent
             nodesToSend.Remove(newsItem);
         }
    }

    if (nodesToSend.Count > 0)
    {
        result.Add(subscriber.Email, nodesToSend);
    }  
}

// Ensure that we process the request only if there are any subscribers to send mails to
if (result.Count > 0)
{
    foreach (var res in result)
    { 
        // StringBuilder to build markup for newsletter
        StringBuilder sb = new StringBuilder();

        // Build markup
        foreach (var newsItem in res.Value)
        {
            // build the markup here
        }

        // Email logic here
    }
}
like image 448
bomortensen Avatar asked Mar 08 '12 09:03

bomortensen


People also ask

What is optimization in C?

Optimization is a program transformation technique, which tries to improve the code by making it consume less resources (i.e. CPU, Memory) and deliver high speed. In optimization, high-level general programming constructs are replaced by very efficient low-level programming codes.

What is meant by optimization algorithm?

An optimization algorithm is a procedure which is executed iteratively by comparing various solutions till an optimum or a satisfactory solution is found. With the advent of computers, optimization has become a part of computer-aided design activities.


1 Answers

First you can break you inner foreach as soon as shouldBeAdde = true.

You also could use LINQ, but I'm not sure if it will be faster (but you could use .AsParallel to make it multithreaded easily):

var nodesToSend = from n in news
                 where n.GetProperties("cities").Value.Split(',')
                     .Any(c => subscriber.CityNodeIds.Contains(Convert.ToInt32(c)) &&
                 n.GetProperties("categories").Value.Split(',')
                     .Any(c => subscriber.CategoryNodeIds.Contains(Convert.ToInt32(c))
                 select n;

The complete think would then come down to (incl. parallel):

Dictionary<string, IEnumerable<Node>> result = new Dictionary<string, IEnumerable<Node>>();
foreach(var subscriber in subscribers)
{
    var nodesToSend = from n in news.AsParallel()
        where n.GetProperties("cities").Value.Split(',')
                .Any(c => subscriber.CityNodeIds.Contains(Convert.ToInt32(c)) &&
            n.GetProperties("categories").Value.Split(',')
                .Any(c => subscriber.CategoryNodeIds.Contains(Convert.ToInt32(c))
        select n;

    if (nodesToSend.Count > 0)
        result.Add(subscriber.Email, nodesToSend);
}

if (result.Count > 0)
{
    foreach (var res in result)
    {
        // Finally, create/merge the markup for the newsletter and send it as an email.
    } 
}
like image 88
Christoph Fink Avatar answered Sep 22 '22 04:09

Christoph Fink