Assuming we have a set U of vectors, a target vector c, a weight vector w, and a number S:
I need an algorithm that selects S vectors from U into a set R while minimizing the function cost(R):
cost(R) = sum(abs(c-sumVectors(R))*w)
(sumVectors sums vectors element-wise, e.g. sumVectors({<1,2>; <3,4>}) = <4,6>, whereas sum(<1,2,3>) returns the scalar 6.)
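In Python terms (with numpy arrays for the vectors, c, and w), the cost I have in mind looks roughly like this sketch:

```python
import numpy as np

def cost(R, c, w):
    # R is the chosen subset of vectors; c and w have the same dimension as each vector.
    total = np.sum(R, axis=0)                     # sumVectors: element-wise sum of the chosen vectors
    return float(np.sum(np.abs(c - total) * w))   # weighted L1 distance to the target c
```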
The solution does not have to be optimal; I just need the best guess I can get within a preset amount of time.
Any idea where to start? (Preferably something faster/smarter than genetic algorithms)
This is an optimization problem. Since you don't need the optimal solution, you can try a stochastic optimization method, e.g. hill climbing, in which you start with a random solution (a random subset of U as your initial R) and look at the set of neighboring solutions (those obtained by adding or removing one component of the current solution) for one that is better with respect to the cost function.
To get a better solution, you can also add simulated annealing to your hill-climbing search. The idea is that in some cases it is necessary to move to a worse solution first in order to arrive at a better one later. Simulated annealing works better because it allows moves to worse solutions near the beginning of the process, and becomes less and less likely to accept a worse solution as the process goes on.
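Concretely, the standard (Metropolis-style) acceptance rule can be sketched like this; the variable names and the geometric cooling schedule are just illustrative choices:

```python
import math
import random

def accept(current_cost, new_cost, temperature):
    """Always accept improvements; accept a worse neighbor with
    probability exp(-(new_cost - current_cost) / temperature)."""
    if new_cost <= current_cost:
        return True
    return random.random() < math.exp((current_cost - new_cost) / temperature)

# A common schedule is geometric cooling, e.g. temperature *= 0.95 after each
# step, so worse moves become increasingly unlikely as the search proceeds.
```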
I've pasted some sample hill-climbing Python code that solves your problem here: https://gist.github.com/921f398d61ad351ac3d6
In my sample code, R always holds a list of indices into U, and I use Euclidean distance to compare the similarity between neighbors. You can certainly use another distance function that fits your needs. Also note that in the code I compute neighbors on the fly; if you have a large pool of vectors in U, you might want to cache the pre-computed neighbors, or even consider locality-sensitive hashing, to avoid the O(n^2) comparison. Simulated annealing can be added on top of the code above.
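If the gist is unreachable, the loop it describes looks roughly like the sketch below. This is illustrative, not the gist verbatim: R is a list of indices into U as described, but the neighborhood here is simply every single-index swap, without the k-nearest-neighbor restriction mentioned further down.

```python
import numpy as np

def cost(R_idx, U, c, w):
    # Weighted L1 distance between the target c and the sum of the chosen vectors.
    return float(np.sum(np.abs(c - U[R_idx].sum(axis=0)) * w))

def hill_climb(U, c, w, S, max_steps=1000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(U)
    R = list(rng.choice(n, size=S, replace=False))   # random initial solution (indices into U)
    best = cost(R, U, c, w)
    for _ in range(max_steps):
        best_neighbor, best_neighbor_cost = None, best
        outside = [i for i in range(n) if i not in R]
        # Neighbors: every solution reachable by swapping one chosen index
        # for one currently unchosen index.
        for pos in range(S):
            for j in outside:
                cand = R[:pos] + [j] + R[pos + 1:]
                cand_cost = cost(cand, U, c, w)
                if cand_cost < best_neighbor_cost:
                    best_neighbor, best_neighbor_cost = cand, cand_cost
        if best_neighbor is None:
            break                                    # local minimum: no single swap improves the cost
        R, best = best_neighbor, best_neighbor_cost
    return R, best
```

A full pass evaluates S*(n-S) candidates; restricting the swaps to the k nearest neighbors of the removed vector (as the gist appears to do, judging by the k-nearest-neighbor replacement mentioned below) cuts that down considerably.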
The result of one random run is shown below. I used only 20 vectors in U and S = 10, so that I could compare the result with an optimal solution.
The hill-climbing process stops at the 4th step, when there is no better solution to move to by replacing a single element with one of its k nearest neighbors.
I also ran an exhaustive search that iterates over all possible combinations. You can see that the hill-climbing result is pretty good compared with the exhaustive approach: it takes only 4 steps to reach a relatively small cost (a local minimum, though), which the exhaustive search needs more than 82K steps to beat.
initial R [1, 3, 4, 5, 6, 11, 13, 14, 15, 17]
hill-climbing cost at step 1: 91784
hill-climbing cost at step 2: 89574
hill-climbing cost at step 3: 88664
hill-climbing cost at step 4: 88503
exhaustive search cost at step 1: 94165
exhaustive search cost at step 2: 93888
exhaustive search cost at step 4: 93656
exhaustive search cost at step 5: 93274
exhaustive search cost at step 10: 92318
exhaustive search cost at step 44: 92089
exhaustive search cost at step 50: 91707
exhaustive search cost at step 84: 91561
exhaustive search cost at step 99: 91329
exhaustive search cost at step 105: 90947
exhaustive search cost at step 235: 90718
exhaustive search cost at step 255: 90357
exhaustive search cost at step 8657: 90271
exhaustive search cost at step 8691: 90129
exhaustive search cost at step 8694: 90048
exhaustive search cost at step 19637: 90021
exhaustive search cost at step 19733: 89854
exhaustive search cost at step 19782: 89622
exhaustive search cost at step 19802: 89261
exhaustive search cost at step 20097: 89032
exhaustive search cost at step 20131: 88890
exhaustive search cost at step 20134: 88809
exhaustive search cost at step 32122: 88804
exhaustive search cost at step 32125: 88723
exhaustive search cost at step 32156: 88581
exhaustive search cost at step 69336: 88506
exhaustive search cost at step 82628: 88420
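For reference, the exhaustive baseline is just a loop over all C(n, S) index combinations; a sketch, reusing the cost helper from the hill-climbing sketch above:

```python
from itertools import combinations

def exhaustive(U, c, w, S):
    best_R, best_cost = None, float("inf")
    for R in combinations(range(len(U)), S):      # all C(n, S) index subsets
        r_cost = cost(list(R), U, c, w)
        if r_cost < best_cost:
            best_R, best_cost = list(R), r_cost
    return best_R, best_cost
```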
You're going to need to check the costs of all possible sets R and take the minimum. If you choose vectors in a stepwise fashion, minimising the cost at each addition, you may not find the set with the minimum cost. If the set U of vectors is very large and the computation is too slow, you may be forced to use a stepwise method.
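A stepwise (greedy) method could be sketched like this, assuming numpy arrays for U, c, and w: at each of the S steps, add the vector whose inclusion lowers the cost the most.

```python
import numpy as np

def greedy_select(U, c, w, S):
    """Greedy sketch: grow R one index at a time, always taking the vector
    whose addition gives the lowest cost. Fast, but may miss the optimum."""
    R = []
    for _ in range(S):
        remaining = [i for i in range(len(U)) if i not in R]
        # Pick the index whose addition minimises the weighted L1 cost.
        best_i = min(remaining,
                     key=lambda i: float(np.sum(np.abs(c - U[R + [i]].sum(axis=0)) * w)))
        R.append(best_i)
    return R
```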