So I've got a list of weighted items, and I'd like to pick 4 non-duplicate items from this list.
Item Weight
Apple 5
Banana 7
Cherry 12
...
Orange 8
Pineapple 50
What is the most efficient way to do this? My initial foray was to just reroll for the subsequent picks if an already picked item came up... but for a small list this can result in a ton of rerolls.
Edit for clarification: For the above example, and ignoring fruits D through N, there is a total weight of 82. So the chances of being picked first are: A ~6% B ~8.5% C ~14.6% O ~9.8% P ~61% Once an items is picked the probabilities would (should!) change.
In your comment you say that unique means:
I don't want to pick the same item twice.
.. and that the weights determine a likelihood of being picked.
All you need to do to make sure that you don't pick duplicates, is simply remove the last picked item from the list before you pick the next one. Yes, this will change your weights slightly, but that is the correct statistical change to make if you do want unique results.
In addition, I'm not sure how you are using the weights to determine the candidates, but I came up with this algorithm that should do this with a minimal number of loops (and without the need to fill an array according to weights, which could result in extremely large arrays, requires int weights, etc.)
I've used JavaScript here, simply so it's easy to see the output in a browser without a server. It should be trivial to port to PHP since it's not doing anything complicated.
var FRUITS = [
{name : "Apple", weight: 8 },
{name : "Orange", weight: 4 },
{name : "Banana", weight: 4 },
{name : "Nectarine", weight: 3 },
{name : "Kiwi", weight: 1 }
];
var PICKS = 3;
function getNewFruitsAvailable(fruits, removeFruit) {
var newFruits = [];
for (var idx in fruits) {
if (fruits[idx].name != removeFruit) {
newFruits.push(fruits[idx]);
}
}
return newFruits;
}
var results = [];
var candidateFruits = FRUITS;
for (var i=0; i < PICKS; i++) {
// CALCULATE TOTAL WEIGHT OF AVAILABLE FRUITS
var totalweight = 0;
for (var idx in candidateFruits) {
totalweight += candidateFruits[idx].weight;
}
console.log("Total weight: " + totalweight);
var rand = Math.random();
console.log("Random: " + rand);
// ITERATE THROUGH FRUITS AND PICK THE ONE THAT MATCHES THE RANDOM
var weightinc = 0;
for (idx in candidateFruits) {
// INCREMENT THE WEIGHT BY THE NEXT FRUIT'S WEIGHT
var candidate = candidateFruits[idx];
weightinc += candidate.weight;
// IF rand IS BETWEEN LAST WEIGHT AND NEXT WEIGHT, PICK THIS FRUIT
if (rand < weightinc/totalweight) {
results.push(candidate.name);
console.log("Pick: " + candidate.name);
// GET NEXT SET OF FRUITS (REMOVING PICKED FRUIT)
candidateFruits = getNewFruitsAvailable(candidateFruits, candidate.name);
break;
}
}
console.log("CandidateFruits: " + candidateFruits.length);
};
for (var i=0; i < results.length; i++) {
document.write(results[i] + "<br/>");
}
The basic strategy is to allocate each fruit a portion of the total range [0,1)
. In the first loop, you'd have this:
The script iterates over each item in the list, and progresses a weight counter. When it reaches the range that contains the first random, it picks that item, removes it from the list, then recalculates the ranges based on the new total weight and runs again.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With