I have been presented a task that I'm struggling with, in terms of efficiency. I have a database that can have hundreds of thousands of transactions & people. My goal is to find people who commonly have transactions near one another (Person X has had a transaction within 10 minutes of person Y, on 5 separate occasions).
I'm struggling to wrap my head around an efficient way to approach this. The simplest approach is :
foreach(var doc in db.Transactions.OrderBy(d => d.TransactionID))
{
foreach(var doc2 in db.Transactions.Where(d => d.TransactionID > doc.TransactionID))
{
if(doc2.DateCreated.IsBetween(doc.DateCreated,minutes))
{
// hit found
}
}
}
(TransactionID is a bigint identity). Once I have my list of hits
, it would be easy to count occurrences. But this is obviously pretty poor. The run time is
which will be very slow at 1M+ transactions. I've researched some algorithms, but I can't find any that apply to my situation. Can anyone offer guidance on where to begin speeding this up?
Few tips:
10
minutes (assuming 10 minutes is your detection threshold). Then for each bucket you'd only need to check the adjacent ones, which should reduce the volume of comparison operations.In addition to decPL's tips, you may want to set up a Data Warehouse with transaction data, that can then be analyzed, for instance, at night. This would mean you store data about your data in a separate database, and that database is then scanned for patterns using known algorithms. This is the way services such as Amazon come up with those "People who bought this, also bought..." suggestions.
The data in the warehouse can be optimized for fast processing, so it doesn't need to follow the same format as your actual ("source") database. The output of the analytical process (Reporting) can also be in a format that is easy to process afterward, possibly using LINQ like you do in your question.
For more info see http://en.wikipedia.org/wiki/Data_warehouse and http://www.1keydata.com/datawarehousing/datawarehouse.html.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With