dLife Home Page

dlife.rl
Class RLearningPolicy

java.lang.Object
  extended by dlife.rl.QUpdatePolicy
      extended by dlife.rl.RLearningPolicy
All Implemented Interfaces:
Serializable

public class RLearningPolicy
extends QUpdatePolicy

Implementation of the R-Learning update policy. This class computes new Q-Table values using the R-Learning update described by Sutton and Barto in their text Reinforcement Learning: An Introduction.

R-Learning is appropriate for continuing, undiscounted tasks where the objective is to maximize the average reward per time step. This contrasts with Q-learning, where the objective is to learn a sequence of actions that leads to a reward.
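
The update itself can be sketched as follows. This is a minimal illustration of the R-Learning rule as it is commonly stated, not the dLife source: the class RLearningSketch, the int-indexed states and actions, and the array-backed table are all assumptions made for the example. Here alpha and beta play the same roles as the constructor parameters of this class, and rho is the running estimate of the average reward per time step.

```java
// Hypothetical stand-in for illustration only; not part of dlife.rl.
class RLearningSketch {
    final double[][] q;   // q[state][action], initialized to 0
    final double alpha;   // learning rate for Q values
    final double beta;    // step size for the average-reward estimate
    double rho = 0.0;     // estimated average reward per time step

    RLearningSketch(int states, int actions, double alpha, double beta) {
        this.q = new double[states][actions];
        this.alpha = alpha;
        this.beta = beta;
    }

    // Largest value in one row of the table.
    static double max(double[] row) {
        double best = row[0];
        for (double v : row) {
            best = Math.max(best, v);
        }
        return best;
    }

    // One step: action a was taken in state s, reward r was received,
    // and the agent arrived in state sNext.
    void update(int s, int a, double r, int sNext) {
        double maxNext = max(q[sNext]);

        // The reward is measured relative to the average reward rho;
        // there is no discount factor.
        q[s][a] += alpha * (r - rho + maxNext - q[s][a]);

        // rho is adjusted only when a is greedy in s after the update,
        // so exploratory actions do not distort the estimate.
        double maxCur = max(q[s]);
        if (q[s][a] == maxCur) {
            rho += beta * (r - rho + maxNext - maxCur);
        }
    }
}
```

With alpha = 0.5 and beta = 0.1 on an all-zero table, a single step with reward 1.0 moves the updated Q value to 0.5 and rho to 0.05.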

Version:
Apr 27, 2011
Author:
Grant Braught, Dickinson College
See Also:
Serialized Form

Field Summary
 
Fields inherited from class dlife.rl.QUpdatePolicy
qTable
 
Constructor Summary
RLearningPolicy(QTable q, double alpha, double beta)
          Construct a new RLearningPolicy.
 
Method Summary
 void updateQValue(State s, Action a, double r, ArrayList<Action> actions)
          Update the Q-Table using R-Learning.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RLearningPolicy

public RLearningPolicy(QTable q,
                       double alpha,
                       double beta)
Construct a new RLearningPolicy.

Parameters:
q - the QTable that will be updated.
alpha - the learning rate to be used for adjusting Q values.
beta - the step-size parameter for learning the average reward per time step.
Method Detail

updateQValue

public void updateQValue(State s,
                         Action a,
                         double r,
                         ArrayList<Action> actions)
Update the Q-Table using R-Learning. Each time this method is invoked, it saves the provided state for use as the previous state (sp) on the next call. Note that the Q-Table is not updated on the first call, because no previous state has been saved yet. On each subsequent call, the Q-Table is updated using the saved previous state and the parameters.

Specified by:
updateQValue in class QUpdatePolicy
Parameters:
s - the current state.
a - the action that led to the current state.
r - the reward received for performing the action.
actions - the list of possible actions from the current state.
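
The calling pattern described above, where the first call only records the state and each later call updates the entry for the previously saved state, can be sketched as follows. This is an assumption-laden illustration rather than the dLife implementation: DeferredRLearningSketch is a hypothetical class, states and actions are plain int indices instead of State and Action objects, and the sketch also saves the previous action list (which the documentation above does not mention) so it can test whether the action was greedy. Each action list is assumed to be non-empty.

```java
import java.util.List;

// Hypothetical stand-in for illustration only; not part of dlife.rl.
class DeferredRLearningSketch {
    final double[][] q;          // q[state][action], initialized to 0
    final double alpha;          // learning rate for Q values
    final double beta;           // step size for the average-reward estimate
    double rho = 0.0;            // estimated average reward per time step
    private int prevState = -1;  // -1 means no state has been saved yet
    private List<Integer> prevActions = null;

    DeferredRLearningSketch(int states, int actions, double alpha, double beta) {
        this.q = new double[states][actions];
        this.alpha = alpha;
        this.beta = beta;
    }

    // Largest Q value in a state over the given (non-empty) action list.
    private double maxQ(int state, List<Integer> actions) {
        double best = Double.NEGATIVE_INFINITY;
        for (int act : actions) {
            best = Math.max(best, q[state][act]);
        }
        return best;
    }

    // s: the current state; a: the action that led to s;
    // r: the reward for that action; actions: actions available from s.
    void update(int s, int a, double r, List<Integer> actions) {
        if (prevState >= 0) {
            // The entry being updated belongs to the saved previous state.
            double maxHere = maxQ(s, actions);
            q[prevState][a] += alpha * (r - rho + maxHere - q[prevState][a]);

            double maxPrev = maxQ(prevState, prevActions);
            if (q[prevState][a] == maxPrev) {
                rho += beta * (r - rho + maxHere - maxPrev);
            }
        }
        // Save the current state for the next call; on the very first
        // call this is the only thing that happens.
        prevState = s;
        prevActions = actions;
    }
}
```

Deferring the update this way lets the caller report each transition with a single call once the resulting state is known, at the cost of one call that changes nothing at the start of a run.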
