I'm looking for a weighted random selection algorithm with similar characteristics to the alias method, except where items are removed after they are selected.
For example, I have a bag that produces:
I sample the bag and get a red marble. Red is now removed, so I imagine the bag would now produce:
I believe you could precompute a tree of every possible probability table, so that sample are still O(1)
. Are the any more clever algorithms for constant time weighted selection with removal?
Actually, it seems I have misread the question. I was not aware of the alias method, and the answer below is not an analogous algorithm. I'll leave my answer here because it's still informative.
I'm not aware of an O(1) algorithm, but it's easy to do in log(N)2 search and log(N) update. This can likely be improved upon with a more specific algorithm.
Put your elements in a Fenwick tree, with their probabilities as their values. Also when changing elements keep track of the total cumulative probability.
Now we can do every better than just removing elements! We can arbitrarily change the probability of items, but setting an item's probability to 0 is equivalent of removing it. It is then possible to query in log(N) the cumulative probability of the nth element. This logically extends to a log(N)2 binary search of the first element with a cumulative probability larger than p.
Now, to do weighted random selection we generate a number p between 0 and P where P is the total cumulative probability. Then we do the binary search described above to find and select the first element with cumulative probability larger than p.
I've improved on the above, because using a Fenwick tree it's easy to do a log(N) search for the first element with cumulative probality larger than or equal to p. I can strongly suggest to read this explanation of Fenwick trees.
Simply put, to find the element, do a regular binary search on the Fenwick tree as you would on any other tree, with the exception that you keep a current cumulative sum (starting at 0). Whenever you choose the right hand child in the binary search increment the current cumulative sum by the value of the current node. Then when comparing the value of the current node to the cumulative sum we're looking for add the current node's value to the sum so far before comparing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With