
Is Q-Learning Algorithm's implementation recursive?

I am trying to implement Q-learning. The general algorithm, from here, is shown below:

[image: Q-learning pseudo-code]

In the statement

[image: the Q-value update statement from the pseudo-code]

I don't understand whether I should implement this statement from the original pseudo-code recursively, over all the next states that the current state/action can lead to, taking the max each time,

OR simply take the maximum value for the next state, reached with the current action, from the action-state Q-value table?

Thanks in advance.

dariush asked May 27 '26 22:05


1 Answer

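All the formula says is that at step t+1 you update the state-action value using its value from step t and the maximum of the stored values over all the actions available in the state you have just reached. That maximum is read straight from the Q-value table, so nothing needs to be expanded recursively.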
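To make that concrete, here is a minimal sketch of one tabular update in Python. It assumes the standard Q-learning rule with a learning rate `alpha` and a discount factor `gamma`. The function name `q_update`, the dictionary-based table, and the state and action labels are all made up for illustration and are not taken from the linked pseudo-code.

```python
from collections import defaultdict


def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update; Q maps (state, action) -> value."""
    # Best stored value for the next state: a plain table read over the
    # available actions, not a recursive call into future states.
    best_next = max(Q[(next_state, a)] for a in actions)
    # Target built from the observed reward and the discounted lookup.
    td_target = reward + gamma * best_next
    # Move the old estimate a fraction alpha towards the target.
    Q[(state, action)] += alpha * (td_target - Q[(state, action)])
    return Q[(state, action)]


# Hypothetical example: two actions, unseen entries default to 0.0.
Q = defaultdict(float)
actions = ["left", "right"]
Q[("s1", "right")] = 2.0

new_value = q_update(Q, "s0", "right", reward=1.0, next_state="s1", actions=actions)
print(new_value)
```

With these made-up numbers the entry for ("s0", "right") moves from 0 towards 1 + 0.9 × 2 = 2.8 by a factor of 0.1, i.e. to about 0.28. Only the existing table entries for "s1" are consulted, which is why no recursion is involved.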

Don Reba answered May 31 '26 08:05


