I was wondering when one would decide to apply Reinforcement Learning to problems that have previously been tackled by mathematical optimisation methods, such as the Traveling Salesman Problem, Job Scheduling, or Taxi Sharing problems.
Since Reinforcement Learning aims to minimise or maximise a cost or reward function, much as Operational Research aims to optimise a cost function, I would assume that problems solvable by one could also be tackled by the other. However, is this the case? Are there tradeoffs between the two? I haven't seen much research on RL for the problems stated above, but I may be mistaken.
If anyone has any insights at all, they would be highly appreciated!!
The point to note here is that machine learning models are mostly concerned with a single task, prediction, whereas operations research covers a large collection of distinct methods for specific classes of problems.
Reinforcement learning is the training of machine learning models to make a sequence of decisions. The agent learns to achieve a goal in an uncertain, potentially complex environment. In reinforcement learning, an artificial intelligence faces a game-like situation.
Operations Research may be referred to as part of Artificial Intelligence (at least to the extent that both make use of data to provide support to decision making processes), which naturally includes Machine (& Deep) Learning (ML).
When applied to optimization problems, reinforcement learning can be seen as a learning, heuristic search strategy. After training on a set of problems, a reinforcement learning policy can efficiently generate solutions for similar, unseen problems.
Here are my two cents. I think that although both approaches have a common goal (optimal decision making), their fundamental working principles are different. In essence, Reinforcement Learning is a data-driven approach, where the optimization is achieved through agent-environment interaction (i.e., data). Operations Research, on the other hand, uses methods that require deeper knowledge of the problem and/or impose more assumptions.
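To make the contrast concrete, here is a minimal sketch on a made-up shortest-route problem. The graph `EDGES`, the function names, and the hyperparameters are all assumptions chosen for illustration, not anything from a particular paper or library. `solve_with_model` reads the whole graph directly (the OR-style route), while `solve_by_interaction` is plain tabular Q-learning that only ever sees sampled transitions through `step` and `actions`.

```python
import random

# Hypothetical toy problem: cheapest route from node 0 to node 3.
# EDGES[node] maps each reachable neighbour to the cost of moving there.
EDGES = {
    0: {1: 1.0, 2: 4.0},
    1: {2: 1.0, 3: 5.0},
    2: {3: 1.0},
    3: {},
}
GOAL = 3


def solve_with_model(edges, goal):
    """OR-style: exploit full knowledge of the graph (Bellman-Ford-like sweeps)."""
    cost = {node: float("inf") for node in edges}
    cost[goal] = 0.0
    for _ in range(len(edges)):
        for node, neighbours in edges.items():
            for nxt, c in neighbours.items():
                cost[node] = min(cost[node], c + cost[nxt])
    return cost


def solve_by_interaction(step, actions, goal, start=0,
                         episodes=2000, alpha=0.5, eps=0.2, seed=0):
    """RL-style: tabular Q-learning that learns only from sampled transitions."""
    rng = random.Random(seed)
    q = {}  # (state, action) -> estimated cost-to-go; unseen pairs count as 0
    for _ in range(episodes):
        s = start
        while s != goal:
            acts = actions(s)
            if rng.random() < eps:
                a = rng.choice(acts)  # explore
            else:
                a = min(acts, key=lambda x: q.get((s, x), 0.0))  # exploit
            s2, c = step(s, a)
            future = 0.0 if s2 == goal else min(
                q.get((s2, b), 0.0) for b in actions(s2))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (c + future - old)
            s = s2
    return q


if __name__ == "__main__":
    exact = solve_with_model(EDGES, GOAL)
    learned = solve_by_interaction(
        step=lambda s, a: (a, EDGES[s][a]),  # the "environment"
        actions=lambda s: list(EDGES[s]),
        goal=GOAL,
    )
    print("model-based cost from 0:", exact[0])
    print("learned cost from 0:", min(learned[(0, a)] for a in EDGES[0]))
```

On a problem this small both routes should agree. The point is only that the first function needs `EDGES` handed to it, whereas the second would work unchanged if `step` were a simulator or a real system whose internals you cannot write down.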
There are many problems, especially academic or toy problems, where both approaches, RL and OR, can be applied. In real-world applications, I guess that if you can meet all the assumptions required by OR, RL won't achieve better results. Unfortunately, this is not always the case, and RL is more useful when those assumptions fail.
Notice, however, that there are methods for which the difference between RL and OR is not clear.
Pablo provided a great explanation. My research is actually on reinforcement learning vs. model predictive control (MPC), a control approach based on trajectory optimization. Reinforcement learning is just a data-driven optimization algorithm and can be used for your examples above. Here is a paper for the traveling salesman problem using RL.
The biggest differences are really these:
Reinforcement Learning Method
Optimization Approaches
Performance is dependent on the model. If the model is bad, the optimization will be terrible.
Because performance is based on the model, identifying a "perfect" model is extremely expensive. In the energy industry, such a model for one plant costs millions, especially because the operating conditions change over time.
GUARANTEES optimality. There are many published papers with proofs that these approaches guarantee robustness, feasibility, and stability.
Easy to interpret. Controls and decisions from an optimization approach are easy to interpret because you can go into the model and work out why a certain action was taken. In the RL case, the policy is usually a neural network and completely a black box. Therefore, for safety-sensitive problems, RL is currently RARELY used.
Very expensive online calculation depending on prediction horizon, because at each time step, we have to optimize the trajectory given the current states.
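The online cost mentioned in the last point can be seen in a rough receding-horizon sketch. Everything here is an assumption for illustration: the integrator `model`, the three discrete `ACTIONS`, the quadratic cost, and the brute-force search in `plan`, which stands in for a real trajectory optimizer. What it shows is that `plan` is called again at every time step and that its work grows with the horizon.

```python
from itertools import product

ACTIONS = (-1.0, 0.0, 1.0)  # hypothetical discrete control inputs


def model(x, u):
    """Assumed plant model: a simple integrator."""
    return x + u


def plan(x0, horizon):
    """Score every action sequence over the horizon and return the cheapest."""
    best_cost, best_seq = float("inf"), None
    # len(ACTIONS) ** horizon candidate sequences are evaluated here.
    for seq in product(ACTIONS, repeat=horizon):
        x, cost = x0, 0.0
        for u in seq:
            x = model(x, u)
            cost += x * x + 0.1 * u * u  # penalise distance from 0 and effort
        if cost < best_cost:
            best_cost, best_seq = cost, seq
    return best_seq


def run_mpc(x0, horizon, steps):
    """Receding horizon: re-plan at every step, apply only the first action."""
    x, trajectory = x0, [x0]
    for _ in range(steps):
        u = plan(x, horizon)[0]
        x = model(x, u)
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    print(run_mpc(x0=3.0, horizon=3, steps=5))
```

With three actions, a horizon of 3 means 27 candidate sequences per step and a horizon of 10 means 59,049. Real solvers are far smarter than enumeration, but the planning problem still has to be solved online at each step, whereas a trained RL policy only needs a forward evaluation at run time.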