How can I optimize this ActionResult for better performance? I need to put a timer on when to "GET" XML data from a URL

I have an ActionResult that I think is pretty heavy, so I wonder how I can optimize it for better performance. This web application will be used by 100,000+ users at the same time.

Right now my ActionResult does the following things:

  • Retrieves an XML file from an internet URL
  • Fills the XML data into my DB
  • Fills my ViewModel from the DB data
  • Returns the model to the view

These four steps trigger every time a user visits the view, which is why I think this ActionResult is badly designed.

How can I add the following things to my ActionResult?

Add a timer for retrieving the XML file and filling the DB with its data, say every 10 minutes, so that work doesn't trigger every time a user visits the view. The only thing that needs to run on every visit is the ViewModel binding and returning the model. How can I accomplish this?

Note:

  • The XML file gets updated with new data every 10 minutes or so.
  • I have around 50 ActionResults that do the same thing (get XML data and add it to the database), but with 50 different XML files.
  • If the XML URL is offline, it should skip the whole XML retrieval and DB insert and just do the model binding.

This is my ActionResult:

public ActionResult Index()
{
    // Get data from the XML url (this is the code that should not run every time a user visits the view)
    var url = "http://www.interneturl.com/file.xml";
    XNamespace dcM = "http://search.yahoo.com/mrss/";
    var xdoc = XDocument.Load(url);
    var items = xdoc.Descendants("item")
        .Select(item => new
        {
            Title = item.Element("title").Value,
            Description = item.Element("description").Value,
            Link = item.Element("link").Value,
            PubDate = item.Element("pubDate").Value,
            MyImage = (string)item.Elements(dcM + "thumbnail")
                .Where(i => i.Attribute("width").Value == "144" && i.Attribute("height").Value == "81")
                .Select(i => i.Attribute("url").Value)
                .SingleOrDefault()
        })
        .ToList();

    // Fill my DB entities with the XML data (this is the code that should not run every time a user visits the view)
    foreach (var item in items)
    {
        var date = DateTime.Parse(item.PubDate);
        if (!item.Title.Contains(":") && !(date <= DateTime.Now.AddDays(-1)))
        {
            News NewsItem = new News();
            Category Category = new Category();
            var CategoryID = 2;

            var WorldCategoryID = re.GetByCategoryID(CategoryID);
            NewsItem.Category = WorldCategoryID;

            NewsItem.Description = item.Description;
            NewsItem.Title = item.Title.Replace("'", "");
            NewsItem.Image = item.MyImage;

            NewsItem.Link = item.Link;
            NewsItem.Date = DateTime.Parse(item.PubDate);
            re.AddNews(NewsItem);
            re.save();
        }
    }

    // All code below this comment needs to run every time a user visits the view
    var GetAllItems = re.GetAllWorldNewsByID();

    foreach (var newsitemz in GetAllItems)
    {
        if (newsitemz.Date <= DateTime.Now.AddDays(-1))
        {
            re.DeleteNews(newsitemz);
            re.save();
        }
    }

    var model = new ItemViewModel()
    {
        NewsList = new List<NewsViewModel>()
    };

    foreach (var NewsItems in GetAllItems)
    {
        FillProductToModel(model, NewsItems);
    }

    return View(model);
}

Right now, every time a user visits the Index view, it will get the XML data and add it to the DB, so the bad fix I've done in my repository is the following in AddNews:

public void AddNews(News news)
{
    var exists = db.News.Any(x => x.Title == news.Title);

    if (exists == false)
    {
        db.News.AddObject(news);
    }
    else
    {
        db.News.DeleteObject(news);
    }
}

Any kind of solution and info is highly appreciated!

Asked by Obsivus
1 Answer

There are a great many things that could be done here: does the file have to be XML (which is very verbose compared to JSON)? Does it have to be saved to the DB every time?

However, assuming that you have to do every step, you have two bottlenecks:

  1. Waiting for the XML file to download/parse
  2. Saving all the XML data to the DB

There are a couple of ways you can speed this up:

Set up a polling interval

If you're happy not seeing updates immediately then you can do something like this:

  • Check the DB for the last update.
  • If (and only if) the last update is more than 10 mins old:
    • Retrieve the XML file from the internet URL
    • Fill the XML data into the DB
  • Fill the ViewModel from the DB data
  • Return the model to the view

This means that your data may be up to 10 mins out of date, but the vast majority of requests will only have to populate the model.
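
For example (a minimal sketch, not the asker's actual code: GetLastXmlUpdate, SetLastXmlUpdate and ImportXmlToDb are hypothetical names for a timestamp lookup and for the existing download/DB-fill code moved out into its own method):

private static readonly TimeSpan PollInterval = TimeSpan.FromMinutes(10);

public ActionResult Index()
{
    // Hypothetical repository call that reads a "last imported" timestamp
    DateTime lastUpdate = re.GetLastXmlUpdate();

    if (DateTime.UtcNow - lastUpdate > PollInterval)
    {
        try
        {
            ImportXmlToDb();                      // the existing XML download + DB fill
            re.SetLastXmlUpdate(DateTime.UtcNow); // hypothetical timestamp write
        }
        catch (System.Net.WebException)
        {
            // URL offline: skip the import and fall through to the model
            // binding, as the question requires
        }
    }

    // ... model binding and return View(model), exactly as before ...
}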

Depending on how you're using this, you could make it even simpler - just add an OutputCache attribute:

[OutputCache(Duration=600)]
public ActionResult Index() { ...

This will tell ASP.NET to serve a cached copy and only rebuild the page every 10 mins. You can also set the Location property to control whether the page is cached just by the browser or on the server for everyone.
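
For example, to keep the cached copy on the server and share it between all users (using the standard System.Web.UI.OutputCacheLocation enum):

[OutputCache(Duration = 600, Location = System.Web.UI.OutputCacheLocation.Server)]
public ActionResult Index() { ...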

Make the XML retrieval async

During the download of the XML file your code is basically just waiting for the URL to be loaded - using the async keyword introduced in C# 5 you don't need to tie up a thread while you wait.

public async Task<ActionResult> Index()
{
    // Get data from xml url
    string url = "http://www.interneturl.com/file.xml";
    XNamespace dcM = "http://search.yahoo.com/mrss/";

    // The await keyword releases the thread until the slow download completes,
    // then resumes here
    var xdoc = await LoadRemoteXmlAsync(url);

    // This won't run until LoadRemoteXmlAsync has finished
    var items = xdoc.Descendants("item")

There's a lot more to using async than I can practically cover here, but if you're on the latest C# and MVC it could be fairly simple to start using it.
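
LoadRemoteXmlAsync isn't a framework method - a minimal sketch of it, assuming .NET 4.5's HttpClient (System.Net.Http), could be:

private static readonly HttpClient http = new HttpClient();

private static async Task<XDocument> LoadRemoteXmlAsync(string url)
{
    // The request thread is released while the download is in flight
    string xml = await http.GetStringAsync(url);
    return XDocument.Parse(xml);
}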

Make only 1 DB call

Your current DB save action is very sub-optimal:

  • Your code suffers from something commonly called the N+1 problem.
  • Each time you add an item you first check the title and sometimes delete a record. This is a very slow way to do the update and will make it very difficult to use any indexes to optimise it.
  • You're looping through all of your news articles every time and deleting the old ones one by one. That's much slower than a single delete from News where ... query.

Based on this I'd try the following changes (in rough order of how easy they should be):

  1. Change your AddNews method - if the data is not new, don't make any DB changes for that item (see the sketch after this list).

  2. Change your deletion loop to a single delete from News where Date <= @yesterday statement (also sketched below).

  3. Look at indexes on the news item title and date; these appear to be the fields you query most.

  4. Look at replacing your AddNews method with something that does an upsert/merge.

  5. Does re.GetByCategoryID hit your DB? If so, consider splitting that out and either building it into the update query or populating a dictionary to look it up more quickly.
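
Since the question's repository appears to use EF's ObjectContext API (db.News.AddObject), sketches of points 1 and 2 might look like this - ExecuteStoreCommand assumes db is an ObjectContext, and DeleteOldNews is a hypothetical method name:

public void AddNews(News news)
{
    // Insert only when the title isn't already present; never delete here
    if (!db.News.Any(x => x.Title == news.Title))
    {
        db.News.AddObject(news);
    }
}

public void DeleteOldNews()
{
    // One parameterised DELETE instead of a SELECT plus N row-by-row deletes
    var yesterday = DateTime.Now.AddDays(-1);
    db.ExecuteStoreCommand("DELETE FROM News WHERE Date <= {0}", yesterday);
}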

Basically you should have (at most) 1 DB operation per new news article and 1 DB operation to delete the old ones. You currently have 3 per article less than a day old (re.GetByCategoryID + db.News.Any + db.News.AddObject|DeleteObject), another 1 (re.GetAllWorldNewsByID), and then yet another 1 per article to delete (re.DeleteNews).

Add Profiling

You can add profiling to MVC projects using MiniProfiler; it will tell you exactly how long each step is taking and help you work out how to optimise them. It's used on Stack Overflow and I've used it a lot myself - it shows which steps are slowing you down and which aren't worth micro-optimising.
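
A minimal sketch of timing the two slow steps with MiniProfiler (assuming the MiniProfiler NuGet package is installed and started in Global.asax):

using StackExchange.Profiling;

var profiler = MiniProfiler.Current; // Step() below is null-safe when profiling is off
XDocument xdoc;
using (profiler.Step("Download XML"))
{
    xdoc = XDocument.Load(url);
}
using (profiler.Step("Save news to DB"))
{
    // ... the existing DB insert loop ...
}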

If you don't want to use that, there are optimisation tools in Visual Studio as well as third-party ones like RedGate ANTS.

Answered by Keith