What is the difference between reinforcement learning and deep RL?

What is the difference between deep reinforcement learning and reinforcement learning? I basically know what reinforcement learning is about, but what does the concrete term deep stand for in this context?

972

asked Jun 22 '16 16:06

Christopher Klaus

1 Answer

Reinforcement Learning

In reinforcement learning, an agent tries to come up with the best action given a state.

For example, in the video game Pac-Man, the state would be the 2D game world you are in together with the surrounding items (pac-dots, enemies, walls, etc.), and the actions would be moves through that 2D space (up/down/left/right).

So, given the state of the game world, the agent needs to pick the best action to maximise rewards. Through trial and error, it accumulates "knowledge" about (state, action) pairs: it learns whether taking a given action in a given state tends to lead to positive or negative reward. Let's call this value Q(state, action).

A rudimentary way to store this knowledge would be a table like the one below:

state | action | Q(state, action)
---------------------------------
  ... |   ...  |   ...
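As a rough sketch of what such a table could look like in code, here is a dictionary keyed by (state, action). The update rule is the standard Q-learning one, which the answer itself doesn't spell out; the action names, the learning rate, the discount factor and the example transition are all assumptions made for illustration.

```python
from collections import defaultdict

# Q-table: maps (state, action) -> estimated value; unseen pairs default to 0.0.
Q = defaultdict(float)

ACTIONS = ["up", "down", "left", "right"]
ALPHA = 0.5  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)

def best_action(state):
    """Return the action with the highest stored Q value for this state."""
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """Apply one Q-learning update after observing a single transition."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])

# Hypothetical transition: moving right from cell (0, 0) eats a pac-dot (+1).
update(state=(0, 0), action="right", reward=1.0, next_state=(0, 1))
print(Q[((0, 0), "right")])  # 0.5
print(best_action((0, 0)))   # right
```

Every distinct state needs its own entries here, which is exactly what stops scaling once the game gets complicated.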

The (state, action) space can be very big

However, when the game gets complicated, the knowledge space can become huge, and it is no longer feasible to store all (state, action) pairs. If you think about it in raw terms, even a slightly different state is still a distinct state (e.g. a different position of the enemy coming through the same corridor). Instead of storing and looking up every little distinct state, you want something that can generalize the knowledge.
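To get a feel for the numbers, here is a back-of-the-envelope count for a made-up maze. The grid size, ghost count and dot count are assumptions, not real Pac-Man figures, and the result is only an upper bound because it ignores walls and overlapping positions.

```python
# Hypothetical maze: every figure below is an assumption for illustration.
cells = 10 * 10   # positions on a 10x10 grid
n_ghosts = 2      # each ghost can be on any cell
n_dots = 20       # each pac-dot is either eaten or not
n_actions = 4     # up/down/left/right

# Pac-Man position * ghost positions * eaten/not-eaten pattern of the dots.
n_states = cells * cells**n_ghosts * 2**n_dots
n_pairs = n_states * n_actions

print(f"{n_states:,} states")                # 1,048,576,000,000 states
print(f"{n_pairs:,} (state, action) pairs")  # 4,194,304,000,000 (state, action) pairs
```

Even this toy setup gives trillions of table rows, most of which the agent would never visit during training.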

So, what you can do is create a neural network that, for example, predicts the reward for an input (state, action), or picks the best action given a state, however you like to look at it.

Approximating the Q value with a Neural Network

So, what you effectively have is a NN that predicts the Q value, based on the input (state, action). This is way more tractable than storing every possible value like we did in the table above.

Q = neural_network.predict(state, action)
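A minimal sketch of what `neural_network.predict(state, action)` could look like, written in plain Python so it runs without any library. The class name `TinyQNetwork`, the one-hot action encoding, the hidden size and the example state features are assumptions, and the network is untrained, so its outputs are meaningless until it is fitted to observed rewards.

```python
import math
import random

ACTIONS = ["up", "down", "left", "right"]

class TinyQNetwork:
    """One-hidden-layer network mapping (state, action) to a single Q value."""

    def __init__(self, n_state_features, n_hidden=8, seed=0):
        rng = random.Random(seed)
        n_inputs = n_state_features + len(ACTIONS)
        # Small random weights; biases start at zero.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def predict(self, state, action):
        # Encode the action as a one-hot vector and append it to the state.
        one_hot = [1.0 if a == action else 0.0 for a in ACTIONS]
        x = list(state) + one_hot
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hidden)) + self.b2

neural_network = TinyQNetwork(n_state_features=2)
state = (0.1, 0.7)  # hypothetical state features, e.g. a normalized (x, y)

# Score every action for this state and pick the highest-valued one.
q_values = {a: neural_network.predict(state, a) for a in ACTIONS}
best = max(q_values, key=q_values.get)
print(q_values, best)
```

The number of weights is fixed by the layer sizes, not by how many distinct states exist, and nearby states produce nearby Q values.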

Deep Reinforcement Learning

Deep Neural Networks

To be able to do that for complicated games, the NN may need to be "deep": a few hidden layers may not suffice to capture all the intricate details of that knowledge, hence the use of deep NNs (with lots of hidden layers).

The extra hidden layers allow the network to internally come up with features that help it learn and generalize on complex problems that may have been impossible for a shallow network.
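In code, "deep" versus "shallow" comes down to how many hidden layers sit between input and output. The sketch below builds both kinds from a list of layer sizes; the helper names `make_network` and `forward`, the sizes and the input vector are all assumptions, and the weights are random rather than trained.

```python
import math
import random

def make_network(layer_sizes, seed=0):
    """Create random weights for a fully connected network with these sizes."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Run the input through every layer; tanh on all layers but the last."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x

# 6 inputs (e.g. 2 state features + 4 one-hot action slots), 1 output (Q).
shallow = make_network([6, 16, 1])            # 1 hidden layer
deep = make_network([6, 16, 16, 16, 16, 1])   # 4 hidden layers

x = [0.1, 0.7, 0.0, 1.0, 0.0, 0.0]  # hypothetical encoded (state, action)
print(forward(shallow, x), forward(deep, x))
```

Both networks expose the same interface; the deep one simply composes more transformations, which is where the intermediate features come from.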

Closing words

In short, the deep neural network allows reinforcement learning to be applied to larger problems. You can use any function approximator instead of an NN to approximate Q, and if you do choose NNs, it doesn't absolutely have to be a deep one. It's just that researchers have had great success using them recently.
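To illustrate the "any function approximator" point, here is a sketch that approximates Q with a plain linear model instead of a neural network. The feature layout, the learning rate and the training target are assumptions chosen for illustration only.

```python
ACTIONS = ["up", "down", "left", "right"]

def features(state, action):
    """Bias term + raw state features + one-hot encoding of the action."""
    one_hot = [1.0 if a == action else 0.0 for a in ACTIONS]
    return [1.0] + list(state) + one_hot

# One weight per feature: 1 bias + 2 state features + 4 action slots.
weights = [0.0] * 7

def q_value(state, action):
    """Linear approximation: Q is the dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features(state, action)))

def update(state, action, target, lr=0.1):
    """Move the prediction for (state, action) a small step toward target."""
    error = target - q_value(state, action)
    for i, f in enumerate(features(state, action)):
        weights[i] += lr * error * f

state = (0.1, 0.7)  # hypothetical state features
for _ in range(50):
    update(state, "right", target=1.0)  # pretend the observed return was 1.0

print(round(q_value(state, "right"), 3))  # 1.0
```

A linear model like this is cheap and easy to inspect, but it cannot represent interactions between state and action features that a neural network could learn.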

137

answered Oct 14 '22 13:10

bakkal

Tags:

machine-learning

reinforcement-learning

q-learning