Combining heuristics when ranking social network news feed items

We have a news feed and want to surface items to the user based on several criteria. Some items will be surfaced because of factor A, others because of factor B, and others because of factor C. We can write an individual heuristic for each factor, but we then need to combine these heuristics so that the feed promotes the best content with respect to each factor while still showing a mix of content from all of them.

Our naive approach is to load the top n items for each factor, take the first item from each list as the first 3 entries of the feed, then take the second item from each list as the next 3, and so on. Ideally, we would rank feed items more intelligently. Our first thought was to sum the three heuristics and pull the top items by the combined score, but there is no guarantee that the heuristics are on the same scale (or on the same scale for a particular user), so one factor could dominate the others in the feed. Is there a better way of ranking these news feed items (akin to what Facebook does in its pseudo-chronological news feed)?
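For reference, the naive round-robin approach described above could be sketched roughly as follows. The function name `interleave_top_n` and the sample item ids are made up for illustration, and dropping items that already appeared under another factor is an assumption the question does not state.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel used to pad per-factor lists of unequal length


def interleave_top_n(ranked_lists, n):
    """Round-robin merge of the top-n items from each per-factor ranking.

    ranked_lists: lists of item ids, each already sorted best-first by one factor.
    Assumption: an item ranked by several factors is kept only at its first position.
    """
    seen = set()
    feed = []
    for tier in zip_longest(*(lst[:n] for lst in ranked_lists), fillvalue=_MISSING):
        for item in tier:
            if item is not _MISSING and item not in seen:
                seen.add(item)
                feed.append(item)
    return feed


# Hypothetical rankings for factors A, B and C
by_a = ["p1", "p2", "p3"]
by_b = ["p4", "p1", "p5"]
by_c = ["p6", "p7"]
feed = interleave_top_n([by_a, by_b, by_c], n=3)
print(feed)
```

This guarantees a mix of factors but ignores how strong each item's score is, which is the weakness the question is asking about.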

asked Jun 08 '17 12:06

Lenny

2 Answers

If your final combined heuristic does not need to be admissible, there is no harm in using a sum of the original heuristics as your final heuristic. The problem is that the original heuristics are probably not on the same scale; for instance, A might range from 0 to 100 while B ranges from -1 to +1. I suggest the following formula for the combined heuristic of an item, which rescales each heuristic to the range 0 to 1 so that their original scales no longer matter:

H = (A - min(A))/(max(A) - min(A)) + (B - min(B))/(max(B) - min(B)) + (C - min(C))/(max(C) - min(C))

Of course, to find the min and max values for each heuristic, you need to understand what each individual heuristic means. I am not sure this solves your problem, but I hope it does.
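A minimal sketch of the normalized sum, assuming the min and max of each heuristic are taken over the current candidate items rather than from known theoretical bounds (the answer leaves that choice open). The helper names `min_max` and `combined_scores` and the sample values are illustrative only.

```python
def min_max(values):
    """Rescale a list of raw scores to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a heuristic that does not discriminate contributes nothing.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(a, b, c):
    """H = norm(A) + norm(B) + norm(C), computed per item."""
    return [x + y + z for x, y, z in zip(min_max(a), min_max(b), min_max(c))]


# Three candidate items scored by heuristics with very different ranges
a = [0, 50, 100]       # A ranges over 0..100
b = [-1.0, 1.0, 0.0]   # B ranges over -1..+1
c = [3, 3, 3]          # C is constant for these items
scores = combined_scores(a, b, c)
print(scores)
```

Using observed bounds makes the score depend on which items are in the candidate set, so fixed bounds may be preferable when they are known.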

answered Oct 19 '22 11:10

Arne Van Den Kerchove

I want to add to the point Arne Van Den Kerchove made about normalization.

I would suggest another layer that does two things:

1. Defines the direction of the new heuristic. The optimal values of A, B and C may differ in direction, e.g. optimal A is low but optimal B is high. The combined heuristic is the positive square root of the weighted sum of the squares of the normalized factors, so higher is better.
2. Incorporates user response through the amount of attention (weight) the user assigns to each metric.

Here is how I imagine it:

H = sqrt(
        alpha * ((A - min(A))/(max(A) - min(A)))^2
      + beta  * ((B - min(B))/(max(B) - min(B)))^2
      + gamma * ((C - min(C))/(max(C) - min(C)))^2
)

Alpha, beta and gamma are weights. They start as [1,1,1] unless you know that one of the metrics should be preferred.
These weights change with each user response.
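The weighted formula could be sketched like this, assuming the factors have already been min-max normalized to [0, 1] and the weights stay non-negative. The `lower_is_better` flag, which flips a factor to `1 - x` before squaring, is one possible way to handle differing directions and is an assumption, not something the formula itself specifies.

```python
import math


def combined_heuristic(normalized, weights=(1.0, 1.0, 1.0),
                       lower_is_better=(False, False, False)):
    """sqrt(alpha * a^2 + beta * b^2 + gamma * c^2) over normalized factors.

    normalized: factor values already rescaled to [0, 1].
    lower_is_better: hypothetical flags; a flagged factor is flipped so that
    a higher combined value is always better.
    """
    total = 0.0
    for x, w, flip in zip(normalized, weights, lower_is_better):
        if flip:
            x = 1.0 - x  # assumption: invert factors whose optimum is low
        total += w * x * x
    return math.sqrt(total)


h_equal = combined_heuristic([0.21, 0.1234, 0.2])
h_flipped = combined_heuristic([0.21, 0.1234, 0.2],
                               lower_is_better=(True, False, False))
print(h_equal, h_flipped)
```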

For example:

Suppose a user chooses an item that ranks as follows (all minima are 0):

A: 21 out of max(A) = 100, relative value 0.21
B: 1234 out of max(B) = 10,000, relative value 0.1234
C: 0.2 out of max(C) = 1, relative value 0.2

You can then add a fraction of each relative value's deviation from the average of the relative values to alpha, beta and gamma respectively. This way you will have a dynamic rating that not only calculates the factors as you already do, but also adjusts to what the user cares about.

For the example above, if we add the full difference, the new alpha, beta and gamma will be [1.0322, 0.9456, 1.0222] respectively: subtract the average (0.1778) from the relative values [0.21, 0.1234, 0.2] to get [0.0322, -0.0544, 0.0222], and add the result to the initial set [1, 1, 1].

This way the new relevant item set will be dictated by the user's cumulative choices.
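The weight update from the worked example might look like the sketch below. The function name `update_weights` and the `rate` parameter (the "fraction" of the difference to add, 1.0 meaning the full difference) are illustrative choices.

```python
def update_weights(weights, relative_values, rate=1.0):
    """Shift each weight by the chosen item's deviation from its mean relative value.

    rate: fraction of the deviation to apply (assumed parameter; 1.0 = full difference).
    """
    mean = sum(relative_values) / len(relative_values)
    return [w + rate * (r - mean) for w, r in zip(weights, relative_values)]


# Reproduces the example: relative values of the chosen item in A, B and C
new_weights = update_weights([1.0, 1.0, 1.0], [0.21, 0.1234, 0.2])
print([round(w, 4) for w in new_weights])  # approximately [1.0322, 0.9456, 1.0222]
```

Because the deviations sum to zero, the total weight stays constant, but repeated updates can push an individual weight below zero, so a real implementation would probably want a smaller `rate` or a lower bound.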

answered Oct 19 '22 13:10

AChervony

Tags: artificial-intelligence, machine-learning, ranking, data-science