 

What are the uses of recurrent neural networks when using them with Reinforcement Learning?

I know that feedforward multi-layer neural networks with backprop are used with Reinforcement Learning to help the agent generalize the actions it takes. That is, if we have a big state space, we can take some actions, and the network will help us generalize what we learn from them over the whole state space.

What do recurrent neural networks do instead? What tasks are they used for, in general?

asked Nov 23 '09 by devoured elysium


People also ask

How is RNN used in reinforcement learning?

As a first step towards reinforcement learning, it has been shown that RNNs can map and reconstruct (partially observable) Markov decision processes well. In doing so, the resulting inner state of the network can be used as a basis for standard RL algorithms.

How are neural networks used in reinforcement learning?

Neural networks are function approximators, which are particularly useful in reinforcement learning when the state space or action space is too large to be completely known. A neural network can be used to approximate a value function or a policy function.
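To make "approximate a value function" concrete, here is a minimal sketch in plain numpy of a one-hidden-layer network mapping a state vector to one Q-value per action. The state dimension, hidden size, action count, and random weights are all made-up assumptions for illustration; a real agent would also train these weights against Q-learning targets.

```python
import numpy as np

# Minimal sketch (not a full RL algorithm): a one-hidden-layer network
# approximating Q(s, a) for a hypothetical problem with 4-dimensional
# states and 2 actions. All sizes and weights are illustrative assumptions.
rng = np.random.default_rng(0)
STATE_DIM, HIDDEN, N_ACTIONS = 4, 16, 2

W1 = rng.normal(0, 0.1, (HIDDEN, STATE_DIM))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0, 0.1, (N_ACTIONS, HIDDEN))
b2 = np.zeros(N_ACTIONS)

def q_values(state):
    """Forward pass: state vector -> one Q-value per action."""
    h = np.tanh(W1 @ state + b1)
    return W2 @ h + b2

state = rng.normal(size=STATE_DIM)
print(q_values(state))             # estimated value of each action
print(np.argmax(q_values(state)))  # greedy action choice
```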

What is the use of recurrent neural network?

A recurrent neural network is a type of artificial neural network commonly used in speech recognition and natural language processing. Recurrent neural networks recognize data's sequential characteristics and use patterns to predict the next likely scenario.

What are the uses of using RNN in NLP?

RNNs are a widely used neural network architecture for NLP. They have proven comparatively accurate and efficient for building language models and for speech recognition tasks.
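As an illustration of how an RNN exploits that sequential structure, here is a minimal sketch of an Elman-style recurrence scoring the next token of a sequence. The vocabulary size, dimensions, and untrained random weights are assumptions for illustration only.

```python
import numpy as np

# Minimal sketch of the recurrence behind sequence prediction: the hidden
# state h carries a summary of everything seen so far. Vocabulary size and
# dimensions are illustrative assumptions; the weights are untrained.
rng = np.random.default_rng(1)
VOCAB, HIDDEN = 5, 8

Wxh = rng.normal(0, 0.1, (HIDDEN, VOCAB))   # input -> hidden
Whh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))  # hidden -> hidden (the feedback)
Why = rng.normal(0, 0.1, (VOCAB, HIDDEN))   # hidden -> next-token scores

def step(token_id, h):
    x = np.eye(VOCAB)[token_id]         # one-hot encode the current token
    h = np.tanh(Wxh @ x + Whh @ h)      # new state depends on the old state
    return Why @ h, h                   # scores for the next token, new state

h = np.zeros(HIDDEN)
for tok in [0, 3, 1, 4]:                # an arbitrary token sequence
    scores, h = step(tok, h)
print(scores)  # unnormalized predictions for the token after the sequence
```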


1 Answer

Recurrent Neural Networks, RNN for short (although beware that RNN is often used in the literature to designate Random Neural Networks, which are effectively a special case of Recurrent NNs), come in very different "flavors", which causes them to exhibit various behaviors and characteristics. In general, however, these many shades of behavior and characteristics are rooted in the availability of [feedback] input to individual neurons. Such feedback comes from other parts of the network, be it local or distant, from the same layer (including, in some cases, "self"), or even from different layers (*). Feedback information is treated as "normal" input to the neuron and can then influence, at least in part, its output.
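To make that concrete, here is a minimal sketch of a single neuron whose own previous output is fed back and summed exactly like any other input. The weights and inputs are arbitrary numbers chosen for illustration.

```python
import numpy as np

# Minimal sketch of "feedback treated as normal input": a single neuron
# whose previous output y is fed back alongside the external input x.
# Weight values are arbitrary illustrative choices.
w_in, w_fb, bias = 1.5, 0.8, -0.2   # external weight, feedback weight, bias

def fire(x, y_prev):
    # The feedback term w_fb * y_prev enters the sum like any other input.
    return np.tanh(w_in * x + w_fb * y_prev + bias)

y = 0.0
for x in [0.2, 0.2, 0.2, 0.2]:      # constant external input...
    y = fire(x, y)
    print(round(y, 3))               # ...yet the output keeps evolving
```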

Unlike back propagation, which is used during the learning phase of a feed-forward network for the purpose of fine-tuning the relative weights of the various [feed-forward-only] connections, feedback in RNNs constitutes a true input to the neurons it connects to.

One of the uses of feedback is to make the network more resilient to noise and other imperfections in the input (i.e. input to the network as a whole). The reason for this is that in addition to inputs "directly" pertaining to the network input (the types of input that would have been present in a feed-forward network), neurons have information about what other neurons are "thinking". This extra info then enables Hebbian learning, i.e. the idea that neurons that [usually] fire together should "encourage" each other to fire. In practical terms, this extra input from "like-firing" neighbor neurons (or not-so-near neurons) may prompt a neuron to fire even though its non-feedback inputs were such that it would not have fired (or would have fired less strongly, depending on the type of network).
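Here is a minimal sketch of the Hebbian idea in isolation, assuming a bare two-unit setup with a made-up learning rate: the weight between co-active units simply grows with the product of their activities.

```python
# Minimal sketch of a Hebbian update ("neurons that fire together wire
# together"). The learning rate and activities are illustrative assumptions.
eta = 0.1                  # learning rate (arbitrary)
w = 0.0                    # connection strength between two units
pre, post = 1.0, 1.0       # both units active on these steps

for _ in range(5):
    w += eta * pre * post  # correlated firing strengthens the connection

print(w)  # 0.5 -- the feedback link now "encourages" the neuron to fire
```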

An example of this resilience to input imperfections is associative memory, a common use of RNNs. The idea is to use the feedback info to "fill in the blanks".
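Here is a minimal sketch of that "fill in the blanks" behavior using a small Hopfield-style network: one pattern is stored with the outer-product (Hebbian) rule and then recovered from a copy with two flipped bits. The pattern and sizes are made up for illustration.

```python
import numpy as np

# Minimal sketch of a Hopfield-style associative memory: store one +/-1
# pattern, then recall it from a corrupted probe. Pattern is illustrative.
pattern = np.array([1, -1, 1, 1, -1, -1, 1, -1])

W = np.outer(pattern, pattern).astype(float)  # outer-product (Hebbian) rule
np.fill_diagonal(W, 0)                        # no self-connections

probe = pattern.copy()
probe[0] *= -1                                # flip two bits: "noisy" input
probe[3] *= -1

state = probe.astype(float)
for _ in range(5):                            # synchronous updates
    state = np.sign(W @ state)

print(np.array_equal(state, pattern))         # True: blanks filled in
```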

Another related but distinct use of feedback is with inhibitory signals, whereby a given neuron may learn that while all its other inputs would prompt it to fire, a particular feedback input from some other part of the network typically indicates that somehow the other inputs are not to be trusted (in this particular context).

Another extremely important use of feedback is that in some architectures it can introduce a temporal element to the system. A particular [feedback] input may not so much instruct the neuron about what it "thinks" [now], but instead "remind" the neuron that, say, two cycles ago (whatever cycles may represent), the network's state (or one of its sub-states) was "X". Such ability to "remember" the [typically] recent past is another factor of resilience to noise in the input, but its main interest may be in introducing "prediction" into the learning process. These time-delayed inputs may be seen as predictions from other parts of the network: "I've heard footsteps in the hallway, expect to hear the doorbell [or keys shuffling]".
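As a toy illustration of this temporal element, here is a hand-wired (not learned) two-unit delay line: the recurrent state carries an input event forward so that a "prediction" unit can react two cycles later, in the spirit of the footsteps-then-doorbell example. The wiring and the event sequence are assumptions for illustration.

```python
# Minimal sketch of feedback as memory: a hand-crafted two-unit delay line.
# The state at time t holds the inputs from t-1 and t-2, so a later unit
# can react two cycles after an event. Nothing here is learned.
inputs = [0, 1, 0, 0, 0]   # a single "footsteps" event at t = 1
h1 = h2 = 0                # h1 remembers t-1, h2 remembers t-2

for t, x in enumerate(inputs):
    predict = h2                    # fires two cycles after the event
    print(t, x, predict)            # at t = 3 the prediction unit fires
    h1, h2 = x, h1                  # shift the memory one step back
```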

(*) BTW, such broad freedom in the "rules" that dictate the allowed connections, whether feedback or feed-forward, explains why there are so many different RNN architectures and variations thereof. Another reason for these many different architectures is that one of the characteristics of RNNs is that they are not as readily tractable, mathematically or otherwise, as the feed-forward model. As a result, driven by mathematical insight or a plain trial-and-error approach, many different possibilities are being tried.

This is not to say that feedback networks are total black boxes; in fact some RNNs, such as Hopfield Networks, are rather well understood. It's just that the math is typically more complicated (at least to me ;-) )

I think the above, generally (too generally!), addresses devoured elysium's (the OP's) questions of "what do RNNs do instead" and "the general tasks they are used for". To complement this information, here's an incomplete and informal survey of applications of RNNs. The difficulties in gathering such a list are multiple:

  • the overlap of applications between feed-forward networks and RNNs (which hides the specificity of RNNs)
  • the often highly specialized nature of applications (we either stick with concepts that are too broad, such as "classification", or we dive into "Prediction of Carbon shifts in series of saturated benzenes" ;-) )
  • the hype often associated with neural networks, when described in popularization texts

Anyway, here's the list:

  • modeling, in particular the learning of [oft' non-linear] dynamic systems
  • Classification (now, FF nets are also used for that...)
  • Combinatorial optimization

Also, there are lots of applications associated with the temporal dimension of RNNs (another area where FF networks would typically not be found):

  • Motion detection
  • load forecasting (as with utilities or services: predicting the load in the short term)
  • signal processing: filtering and control
answered Oct 26 '22 by mjv