I have a class Product
to hold a specific instance of a given product.
This class has a list of related products that are similar to main product.
class Product
{
public string Name;
public double Rating;
public List<Product> RelatedProducts;
//...
public List<Product> GetTopRelatedProducts(int N)
{
// How to implement this method
// What I did so far(Bad code)
// 1- INFINITE RECURSION
// 2- Can not remember visited objects
var myList = new List<Product>();
foreach(Product prod in RelatedProducts)
{
myList.AddRange(prod.GetTopRelatedProducts(N));
}
return myList.Distinct().OrderByDescending(x => x.Rating).Take(N).ToList();
}
}
I want to define a method within Product
class to get top N
(best rated) related products. This method should take into consideration that elements in RelatedProducts
list are of type Product
and they also have their own RelatedProducts
list. So I should keep navigating nested object until all related products were reached and after that, take the top N product.
I mean, the solution would not be simply this.RelatedProducts.OrderByDescending(x => x.Rating).Take(N);
One more thing to keep in mind: Two products can be mutually related. Which means a product A can belong to RelatedProducts
list of a product B and B also can belong to RelatedProducts
list of product A.
Any suggestion how to solve this issue in optimal way?
Imagine, I have millions of products to maintain. How can I navigate all related products recursively and recognize already visited ones?
I tagged this as both C# and Java, as the same logic could be applied to both languages
Imagine, I have millions of products to maintain. How can I navigate all related products recursively and recognize already visited ones?
It doesn't need to be recursive. Explicit Stack
or Queue
can serve the navigating part. For collecting the result a HashSet
can be used instead of List
. It would serve two purposes - to allow you to skip the already visited elements and also eliminate the need of Distinct
at the end.
Here is a sample Queue
based implementation:
public List<Product> GetTopRelatedProducts(int N)
{
var relatedSet = new HashSet<Product>();
var relatedListQueue = new Queue<List<Product>>();
if (RelatedProducts != null && RelatedProducts.Count > 0)
relatedListQueue.Enqueue(RelatedProducts);
while (relatedListQueue.Count > 0)
{
var relatedList = relatedListQueue.Dequeue();
foreach (var product in relatedList)
{
if (product != this && relatedSet.Add(product) && product.RelatedProducts != null && product.RelatedProducts.Count > 0)
relatedListQueue.Enqueue(product.RelatedProducts);
}
}
return relatedSet.OrderByDescending(x => x.Rating).Take(N).ToList();
}
Update: For completeness, here are the other possible implementations of the related set collecting part:
With explicit Stack
:
public List<Product> GetTopRelatedProducts(int N)
{
if (RelatedProducts == null || RelatedProducts.Count == 0)
return new List<Product>();
var relatedSet = new HashSet<Product>();
var pendingStack = new Stack<List<Product>.Enumerator>();
var relatedList = RelatedProducts.GetEnumerator();
while (true)
{
while (relatedList.MoveNext())
{
var product = relatedList.Current;
if (product != this && relatedSet.Add(product) && product.RelatedProducts != null && product.RelatedProducts.Count > 0)
{
pendingStack.Push(relatedList);
relatedList = product.RelatedProducts.GetEnumerator();
}
}
if (pendingStack.Count == 0) break;
relatedList = pendingStack.Pop();
}
return relatedSet.OrderByDescending(x => x.Rating).Take(N).ToList();
}
Although a bit more verbose than the explicit Queue
based implementation, this method has less space requirements - O(height) where height
is the maximum depth.
The benefit of both iterative implementations is that of course they can handle much bigger depth than the recursive solutions which can lead to StackOverflowExpection
. But if the depth is not expected to be so big and you prefer recursion, then here are a couple recursive implementations (all they need to have access to the relatedSet
and this
):
With classic private recursive method:
public List<Product> GetTopRelatedProducts(int N)
{
var relatedSet = new HashSet<Product>();
GetRelatedProducts(this, relatedSet);
return relatedSet.OrderByDescending(x => x.Rating).Take(N).ToList();
}
private void GetRelatedProducts(Product product, HashSet<Product> relatedSet)
{
if (product.RelatedProducts == null) return;
foreach (var item in product.RelatedProducts)
if (item != this && relatedSet.Add(item))
GetRelatedProducts(item, relatedSet);
}
With recursive lambda:
public List<Product> GetTopRelatedProductsD(int N)
{
var relatedSet = new HashSet<Product>();
Action<Product> GetRelatedProducts = null;
GetRelatedProducts = product =>
{
if (product.RelatedProducts == null) return;
foreach (var item in product.RelatedProducts)
if (item != this && relatedSet.Add(item))
GetRelatedProducts(item);
};
GetRelatedProducts(this);
return relatedSet.OrderByDescending(x => x.Rating).Take(N).ToList();
}
Last, but not least, with the latest C# 7.0 addition - recursive local function:
public List<Product> GetTopRelatedProducts(int N)
{
var relatedSet = new HashSet<Product>();
GetRelatedProducts(this);
return relatedSet.OrderByDescending(x => x.Rating).Take(N).ToList();
void GetRelatedProducts(Product product)
{
if (product.RelatedProducts == null) return;
foreach (var item in product.RelatedProducts)
if (item != this && relatedSet.Add(item))
GetRelatedProducts(item);
}
}
All these methods handle (IMO) optimally the collecting part. The top N part of course is not optimal - O(N*log(N)) and can be optimized as mentioned in @Amit Kumar's answer, but it would require implementing a missing standard data structure, which is outside the scope of a SO answer.
I would recommend using a priority queue(min heap) of fixed size N. Build the priority queue the time you build the list, so after initial build operation, the priority queue will have top N rated products. Subsequent addition/deletion can be done by checking the checking the top element in the priority queue in O(log(N))
.
Pseudo Code: New element to be Added E
while PQ.size < N
PQ.enqueue(E)
if PQ.size == N
Etop = PQ.top() < Min heap element >
if E.rating > Etop.rating
PQ.dequeu()
PQ.enqueue(E)
To get the top N elements , just iterate through the PQ.
My solution :
public List<Product> GetTopRelatedProducts(int N)
{
List<Product> visitedProducts = new List<Product>();
Queue<Product> ProductsQueue = new Queue<Product>();
visitedProducts.add(this);
foreach (product prod in relatedProducts)
if(prod != this) //if a product can't be related to itself then remove this if statement
ProductsQueue.Enqueue(prod);
//for now visitedproducts contains only our main product and ProductsQueue contains the product related to it.
while (ProductsQueue.count > 0)
{
Product p = ProductsQueue.Dequeue();
visitedProducts.add(p);
foreach (product prod in p.relatedProducts)
{
if( ! visitedProduct.contains(prod) && !ProductsQueue.contains(prod))//if we haven't visited the product already or if it is not in the queue so we are going to visit it.
ProductsQueue.Enqueue(prod);
}
}
//now visitedProducts contains all the products that are related (somehow :P) to your first product
visitedProducts.remove(this);// to remove the main product from the results
//all what is left to do is to take the top N products.
return visitedProducts.OrderByDescending(x => x.Rating).Take(N).ToList();
}
I tried to make it as simple as possible ;)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With