dlife.rl
Class RLearningPolicy
java.lang.Object
dlife.rl.QUpdatePolicy
dlife.rl.RLearningPolicy
- All Implemented Interfaces:
- Serializable
public class RLearningPolicy
- extends QUpdatePolicy
Implementation of the R-Learning update policy. This class will return a new
Q-Table value using the R-Learning update described by Sutton and Barto in
their text: Reinforcement Learning: An Introduction.
R-Learning is appropriate for continuing undiscounted tasks where the
objective is to maximize the average reward per time step. This is contrasted
with Q-learning where the objective is to learn a sequence of actions that
leads to a reward.
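For reference, the R-Learning update in Sutton and Barto's text is commonly written as shown below. Here s and a are the previous state and the action taken from it, r is the reward received, s' is the resulting state, and \rho is the running estimate of the average reward per time step. \alpha and \beta play the roles of the two step sizes passed to this class's constructor. Whether this class applies the \rho update under exactly the same greedy-action condition is an assumption, since its source is not shown here.

```latex
\begin{align*}
Q(s,a) &\leftarrow Q(s,a) + \alpha\left[\, r - \rho + \max_{a'} Q(s',a') - Q(s,a) \,\right] \\
\rho &\leftarrow \rho + \beta\left[\, r - \rho + \max_{a'} Q(s',a') - \max_{a} Q(s,a) \,\right]
\quad \text{if } Q(s,a) = \max_{a} Q(s,a)
\end{align*}
```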
- Version:
- Apr 27, 2011
- Author:
- Grant Braught, Dickinson College
- See Also:
- Serialized Form
Constructor Summary
RLearningPolicy(QTable q,
double alpha,
double beta)
Construct a new RLearningPolicy.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RLearningPolicy
public RLearningPolicy(QTable q,
double alpha,
double beta)
- Construct a new RLearningPolicy.
- Parameters:
q - the QTable that will be updated.
alpha - the learning rate to be used for adjusting Q values.
beta - the step-size parameter for learning the average reward per time step.
updateQValue
public void updateQValue(State s,
Action a,
double r,
ArrayList<Action> actions)
- Update the Q-Table using R-Learning. Each time this method is invoked,
it saves the provided state to be used as sp on the next call.
Note that the Q-Table is not updated on the first call. On each
subsequent call, the Q-Table is updated using the saved previous state
and the parameters.
- Specified by:
updateQValue in class QUpdatePolicy
- Parameters:
s - the current state.
a - the action that led to the current state.
r - the reward received for performing the action.
actions - the list of possible actions from the current state.
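The deferred-update behavior described above (save the state, skip the first call, update on later calls) can be illustrated with a minimal, self-contained sketch. This is not the dlife implementation: the class `RLearningSketch`, its string-keyed table, and the accessors `getQ` and `getRho` are hypothetical stand-ins for `QTable`, `State`, and `Action`. It also assumes that the action list from the previous call describes the actions available in the saved state, and that the average-reward estimate is only adjusted when the action taken was greedy.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of R-Learning with a deferred update; not the dlife class. */
class RLearningSketch {
    // Q values keyed by "state|action"; missing entries are treated as 0.0 (assumption).
    private final Map<String, Double> q = new HashMap<>();
    private final double alpha;  // step size for Q values
    private final double beta;   // step size for the average reward estimate
    private double rho = 0.0;    // running estimate of average reward per time step
    private String prevState = null;         // state saved by the previous call
    private List<String> prevActions = null; // actions that were available in prevState

    RLearningSketch(double alpha, double beta) {
        this.alpha = alpha;
        this.beta = beta;
    }

    double getQ(String s, String a) {
        return q.getOrDefault(s + "|" + a, 0.0);
    }

    double getRho() {
        return rho;
    }

    // Largest Q value over the given actions in state s.
    private double maxQ(String s, List<String> actions) {
        double best = Double.NEGATIVE_INFINITY;
        for (String a : actions) {
            best = Math.max(best, getQ(s, a));
        }
        return best;
    }

    /**
     * s is the current state, a the action that led to it, r the reward for
     * that action, and actions the actions available in s.
     */
    void updateQValue(String s, String a, double r, List<String> actions) {
        if (prevState != null) { // no update on the first call: nothing saved yet
            double oldQ = getQ(prevState, a);
            double newQ = oldQ + alpha * (r - rho + maxQ(s, actions) - oldQ);
            q.put(prevState + "|" + a, newQ);

            // Adjust rho only if the action taken is greedy in the saved state.
            double maxPrev = maxQ(prevState, prevActions);
            if (newQ == maxPrev) {
                rho += beta * (r - rho + maxQ(s, actions) - maxPrev);
            }
        }
        // Save the current state for use as the previous state on the next call.
        prevState = s;
        prevActions = actions;
    }
}
```

With `alpha = 0.5` and `beta = 0.1`, a first call such as `updateQValue("s0", "L", 0.0, actions)` only records `s0`. A second call `updateQValue("s1", "L", 1.0, actions)` then moves the value for (`s0`, `L`) from 0 to 0.5 and the average reward estimate from 0 to 0.05.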