Recently I've been reading a lot about Q-learning with neural networks and thought about updating an existing, old optimization system in a power plant boiler. It consists of a simple feed-forward neural network that approximates an output from many sensory inputs. That output feeds a linear model-based controller, which in turn outputs an optimal action, so the whole system can converge to a desired goal.
Identifying linear models is a time-consuming task. I thought about refurbishing the whole thing into model-free Q-learning with a neural network approximation of the Q-function. I drew a diagram to ask you whether I'm on the right track or not.
My question: if you think I understood the concept well, should my training set be composed of state feature vectors on one side and Q_target - Q_current on the other (here I'm assuming an increasing reward), in order to push the whole model towards the target, or am I missing something?
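For instance, a minimal sketch of what I have in mind (the feature values, reward and hyperparameters are made up for illustration, not taken from the plant):

```python
# Sketch of how one Q-learning training example could be formed:
# input = state feature vector, learning signal = Q_target - Q_current.
import numpy as np

gamma = 0.99     # discount factor (placeholder)

def td_error(q_current, reward, q_next_max):
    """TD error (Q_target - Q_current) for one observed transition."""
    q_target = reward + gamma * q_next_max
    return q_target - q_current

# One transition (s, a, r, s'); all values are hypothetical:
state_features = np.array([0.7, 0.2, 0.9])   # sensory inputs for state s
q_current = 1.5                               # Q(s, a) predicted by the network
reward = 0.3                                  # reward observed after action a
q_next_max = 1.8                              # max_a' Q(s', a') from the network

# The pair (state_features, delta) would be one training example:
delta = td_error(q_current, reward, q_next_max)
print(state_features, delta)                  # approx. 0.582
```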
Note: The diagram shows a comparison between the old system in the upper part and my proposed change in the lower part.
EDIT: Does a State Neural Network guarantee Experience Replay?
These networks have the same architecture but different weights. Every N steps, the weights from the main network are copied to the target network. Using both of these networks leads to more stability in the learning process and helps the algorithm learn more effectively.
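A minimal sketch of that periodic weight copy in PyTorch (the layer sizes and the interval N are placeholders, not tied to your boiler system):

```python
import copy
import torch.nn as nn

# Hypothetical Q-network: state features in, one Q value per action out.
q_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))
target_net = copy.deepcopy(q_net)          # same architecture, separate weights

N = 1000                                   # copy interval (placeholder)
for step in range(10_000):
    # ... train q_net on sampled transitions here ...
    if step % N == 0:
        # Every N steps, copy the main network's weights into the target network.
        target_net.load_state_dict(q_net.state_dict())
```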
In deep Q-learning, we utilize a neural network to approximate the Q value function. The network receives the state as input (whether it is the frame of the current state or a single value) and outputs the Q values for all possible actions. The action with the largest Q value is our next action.
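For example, a small sketch of that forward pass and greedy action selection (the state size and number of actions are assumptions):

```python
import torch
import torch.nn as nn

n_features, n_actions = 8, 4               # assumed sizes
q_net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                      nn.Linear(64, n_actions))

state = torch.rand(1, n_features)          # one state as input
q_values = q_net(state)                    # Q values for all possible actions
action = torch.argmax(q_values, dim=1)     # greedy: pick the largest Q value
```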
In the case of deep reinforcement learning, the neural network generalizes over the agent's experiences (which are typically stored in a replay buffer and sampled during training) and thus improves the way the task is performed.
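As a sketch of that replay idea (a plain Python deque; the buffer size and batch size are just placeholders):

```python
import random
from collections import deque

replay_buffer = deque(maxlen=100_000)      # stores past transitions, not the network

def store(state, action, reward, next_state, done):
    replay_buffer.append((state, action, reward, next_state, done))

def sample(batch_size=32):
    # Random minibatch of past experiences to train the Q-network on.
    return random.sample(replay_buffer, batch_size)
```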
You might just use the Q values of all the actions in the current state as the output layer of your network. A poorly drawn diagram is here
You can therefore take advantage of the NN's ability to output multiple Q values at a time. Then, just backprop using the loss derived from

Q(s, a) <- Q(s, a) + alpha * (reward + discount * max(Q(s', a')) - Q(s, a)),

where max(Q(s', a')) can be easily computed from the output layer.
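A minimal sketch of that backprop step in PyTorch, where alpha is absorbed into the optimizer's learning rate (the network, minibatch and hyperparameters are placeholders):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_features, n_actions = 8, 4
q_net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, n_actions))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
gamma = 0.99                                          # discount factor

# One hypothetical minibatch of transitions (s, a, r, s'):
states = torch.rand(32, n_features)
actions = torch.randint(0, n_actions, (32,))
rewards = torch.rand(32)
next_states = torch.rand(32, n_features)

q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)    # Q(s, a)
with torch.no_grad():
    # max(Q(s', a')) taken directly from the output layer
    # (a separate target network could be used here for stability).
    max_q_next = q_net(next_states).max(dim=1).values
target = rewards + gamma * max_q_next                              # TD target

loss = F.mse_loss(q_sa, target)      # drives Q(s, a) toward the target
optimizer.zero_grad()
loss.backward()
optimizer.step()
```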
Please let me know if you have further questions.