I'm looking for the best algorithm to optimise the decisions made in a simultaion to find a fast result in a reasonable amount of time. The simultaion does a number of "ticks" and occasionaly needs to make a decision. Eventually a goal state is reached. ( It would be possible to never reach a goal state if you make very bad decisions )
There are many many goal states. I want to find the goal state with the least number of ticks ( a tick equates roughly to a second in real life." I basically want to decide which decisions to make to get to the goal in as few seconds as possible,
Some points about the problem domain:
Algorithms:
At the moment I am thinking that I will start off with the known non optimal path to a goal and then perhaps use Simulated Anealing and attempt to improve it over 10 seconds.
What would be a good algorithm to research to try to solve this sort of problem?
Have a look at limited discrepancy search, repeating with increasingly loose limits on the maximum discrepancy search, or beam search.
If you have a good heuristic you should be able to use it to compare individual choices - for the limited discrepancy search, and compare partial solutions, for the beam search.
See if you can place an upper bound on how good any extension of a partial solution is. Then you can prune out partial solutions that can't possibly be extended to beat the result from the heuristic, or the best result found so far in a series of iterative searches with increasing depth.
Let's get a few facts out.
1) The only way to know for sure which decision is the best is to test every possible decision and evaluate the outcome based on some criteria.
2) We are highly unlikely to have the time to decide to go through every possible decision, so we have to limit how far in the future we will evaluate the decision.
3) We are highly unlikely to make the best move ~ever~. Not just often, but ever. Unless you have only a couple of decisions, chances are every time you make a decision, there was a better one you didn't get to.
4) We can use how our previous decisions worked out to our advantage.
Put all this together... Let's say when we have a decision, we evaluate what happens 30 ticks into the future, in 30 ticks we can check to see if what actually happened matches what we simulated 30 ticks ago. If it was, we know that decision leads to predictable outcomes and we should use that decision less. If we didn't, or if it turns out better than we hoped, we should use that decision more.
Ideally, you would use your logic in a ... simulation of your simulation ... for purposes of evaluating it. Then when you get to the 'real' simulation, you have a better chance at picking your better decisions earlier. Of course, give a higher weight to the results of your actual simulation results as opposed to your simulated simulation results.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With