We have a news feed, and we want to surface items to the user based on a number of criteria. Some items will be surfaced because of factor A, others because of factor B, and still others because of factor C. We can create an individual heuristic for each factor, but we then need to combine these heuristics so that the feed promotes the best content under each factor while still showing a mix of content from all of them.
Our naive approach is to load the top n items from each factor, take the first of each, and make those the first 3 items of the feed. Then take the 2nd from each factor and make those the next 3, and so on. Ideally, we would have some algorithm for ranking these feed items more intelligently. Our first thought was to simply sum the three heuristics and pull the top items by the resulting combined score, but there is no guarantee that the heuristics are evenly scaled (or evenly scaled for that particular user), which could result in one factor dominating the others in the feed. Is there a more intelligent way of ranking these news feed items (akin to what Facebook does in its pseudo-chronological news feed)?
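For reference, the naive interleaving described above can be sketched roughly as follows. This is only an illustration: the function name `interleave_top_n` and the sample lists are hypothetical, and skipping an item that already appeared under another factor is an added assumption, not something stated in the question.

```python
from itertools import zip_longest

def interleave_top_n(ranked_lists, n):
    """Round-robin merge of the top n items from each factor's ranked list."""
    feed = []
    seen = set()  # assumption: an item ranked by several factors is shown once
    # Each "tier" holds the k-th best item of every factor (None if a list is short).
    for tier in zip_longest(*(items[:n] for items in ranked_lists)):
        for item in tier:
            if item is not None and item not in seen:
                seen.add(item)
                feed.append(item)
    return feed

# Hypothetical item ids, each list already sorted best-first by one factor.
by_a = ["a1", "a2", "a3"]
by_b = ["b1", "a1", "b2"]
by_c = ["c1", "c2", "c3"]
feed = interleave_top_n([by_a, by_b, by_c], n=2)
```

With these inputs the second tier drops the repeated `a1`, so the feed ends up shorter than 2 x 3 items.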
Besides looking at quantitative signals such as likes, comments, and shares, Facebook surveys thousands of people every day to understand whether the News Feed algorithm is showing people the posts they want to see, and uses the results to improve the News Feed ranking.
If your final combined heuristic does not need to be admissible, there is no harm in using a sum of the original heuristics as your final heuristic. The problem here is that the original heuristics are probably not on the same scale: for instance, A has values ranging from 0 to 100 and B has values from -1 to +1. I suggest using the following formula, which rescales each heuristic to the range 0 to 1 so that the original scales no longer matter, to calculate the combined heuristic for an item:
H = (A - min(A))/(max(A) - min(A)) + (B - min(B))/(max(B) - min(B)) + (C - min(C))/(max(C) - min(C))
Of course, to find the min and max values for each heuristic, you need to understand the meaning of each individual heuristic. I am not sure this solves your problem, but I hope it does.
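The normalized sum above could look something like this in Python. It is a minimal sketch: `normalize`, `combined_score`, the `bounds` table and the sample items are illustrative names and values, and the range used for C is made up, since the answer only gives example ranges for A and B.

```python
def normalize(value, lo, hi):
    """Min-max scale a raw heuristic value into [0, 1]."""
    if hi == lo:  # degenerate range: the factor carries no information
        return 0.0
    return (value - lo) / (hi - lo)

def combined_score(item, bounds):
    """Sum of the normalized heuristics, i.e. the H formula above."""
    return sum(normalize(item[name], lo, hi) for name, (lo, hi) in bounds.items())

# Known (min, max) of each heuristic; the range for C is an assumption.
bounds = {"A": (0, 100), "B": (-1, 1), "C": (0, 1)}

items = [
    {"id": "y", "A": 20, "B": 0.5, "C": 0.5},
    {"id": "x", "A": 80, "B": -0.5, "C": 0.5},
]
ranked = sorted(items, key=lambda it: combined_score(it, bounds), reverse=True)
```

Passing the bounds in explicitly reflects the point that min and max come from knowing what each heuristic means, rather than from whatever values happen to be in the current batch.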
I want to add to the point made by Arne Van Den Kerchove - Normalization.
I would suggest another layer that:
Defines the direction of the new heuristic:
The optimal A, B and C may differ in direction, e.g. the optimal A is low but the optimal B is high. This heuristic is the positive square root of the weighted sum of the squares of the normalized factors, so higher is better.
Allows you to incorporate user response, based on the amount of attention (weight) the user gives to each metric.
Here is how I imagine it:
H = sqrt(
    alpha * ((A - min(A))/(max(A) - min(A)))^2
  + beta  * ((B - min(B))/(max(B) - min(B)))^2
  + gamma * ((C - min(C))/(max(C) - min(C)))^2
)
Alpha, beta and gamma are weights and will start as [1,1,1] unless you have knowledge that one of the metrics is preferred.
These weights shall change with each user response.
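A rough Python sketch of this weighted formula is below. The names `normalize` and `weighted_score` are illustrative, and the `higher_is_better` flag, which flips a lower-is-better factor with 1 - x before squaring, is my own assumption about how to make all factors point the same way.

```python
import math

def normalize(value, lo, hi, higher_is_better=True):
    """Min-max scale into [0, 1]; optionally flip so that higher means better."""
    x = 0.0 if hi == lo else (value - lo) / (hi - lo)
    # Assumption: a factor whose optimum is low is inverted before weighting.
    return x if higher_is_better else 1.0 - x

def weighted_score(normalized, weights=(1.0, 1.0, 1.0)):
    """sqrt(alpha*a^2 + beta*b^2 + gamma*c^2) over already-normalized factors."""
    # Weights are assumed to stay non-negative so the square root is defined.
    return math.sqrt(sum(w * x * x for w, x in zip(weights, normalized)))

# Hypothetical raw values: A is "lower is better", B and C are "higher is better".
factors = (
    normalize(40, 0, 100, higher_is_better=False),
    normalize(0.6, -1, 1),
    normalize(0.0, 0, 1),
)
score = weighted_score(factors)                              # initial weights [1, 1, 1]
score_b_heavy = weighted_score(factors, weights=(1.0, 2.0, 1.0))
```

Raising beta increases the score of any item with a non-zero B component, which is how the weights let one metric count for more.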
If a user chooses something that ranks as follows:
Max(A)= 100 : 21 out of 100 in A - relative value is 0.21
Max(B)= 10,000 : 1234 out of 10,000 in B - relative value is 0.1234
Max(C)= 1 : 0.2 out of 1 in C - relative value is 0.2
Where all minima are 0.
You can add a fraction of the difference between each relative value and the average of the relative values to alpha, beta and gamma respectively. This way you will have a dynamic rating that not only calculates the factors as you already do, but also adjusts to what the user cares about.
For the example above, if we add the full difference, the new alpha, beta and gamma will be [1.0322,0.9456,1.0222] respectively.
(Subtract the average (0.1778) from the relative values [0.21,0.1234,0.2] and add the result to the initial set [1,1,1])
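The update above can be reproduced with a few lines of Python. This is a sketch only: `update_weights` and its `rate` parameter are illustrative names, with `rate=1.0` corresponding to adding the full difference and a smaller value to adding only a fraction of it.

```python
def update_weights(weights, relative_values, rate=1.0):
    """Shift each weight by rate * (relative value - mean of the relative values)."""
    mean = sum(relative_values) / len(relative_values)
    return [w + rate * (r - mean) for w, r in zip(weights, relative_values)]

# The item the user chose in the example: 21/100, 1234/10,000 and 0.2/1.
relative = [0.21, 0.1234, 0.2]
new_weights = update_weights([1.0, 1.0, 1.0], relative)
print([round(w, 4) for w in new_weights])  # [1.0322, 0.9456, 1.0222]
```

Because the mean is subtracted first, the adjustments cancel out and the weights keep the same total (here 3) after every update.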