dlife.rl
Class TDLearner
java.lang.Object
dlife.sys.SerializationBase
dlife.rl.TDLearner
- All Implemented Interfaces:
- Serializable
public class TDLearner
- extends SerializationBase
Base class for temporal-difference (TD) reinforcement learning agents (e.g.
Q-learning or SARSA). This class delegates action selection and Q-value
updates to policy objects, allowing a variety of TD-based reinforcement
learning methods to be implemented.
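For orientation, the two methods named above differ only in the value used to bootstrap the update target. The sketch below shows the standard textbook update rules and is not code from dlife: the class name `TdUpdateSketch`, its method names, and the `alpha` (learning rate) and `gamma` (discount factor) parameters are assumptions introduced here.

```java
// Hypothetical illustration; none of these names come from dlife.rl.
class TdUpdateSketch {

    // Q-learning (off-policy): bootstrap from the best Q-value available in the next state.
    static double qLearning(double oldQ, double reward, double maxNextQ,
                            double alpha, double gamma) {
        return oldQ + alpha * (reward + gamma * maxNextQ - oldQ);
    }

    // SARSA (on-policy): bootstrap from the Q-value of the action actually chosen next.
    static double sarsa(double oldQ, double reward, double chosenNextQ,
                        double alpha, double gamma) {
        return oldQ + alpha * (reward + gamma * chosenNextQ - oldQ);
    }

    public static void main(String[] args) {
        System.out.println(qLearning(0.0, 1.0, 2.0, 0.5, 0.5));
        System.out.println(sarsa(0.0, 1.0, 0.0, 0.5, 0.5));
    }
}
```

In TDLearner this choice would presumably live in the QUpdatePolicy object passed to the constructor, which is why the same base class can serve both methods.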
- Version:
- Apr 26, 2011
- Author:
- Grant Braught, Dickinson College
- See Also:
- Serialized Form
Method Summary
- Action getNextAction(State s, double reward)
  Select the next action to be performed and update the Q-Table based on
  the reward for the previous action.
- QTable getQTable()
  Get the QTable that is being used by this TDLearner.
Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
TDLearner
public TDLearner(ArrayList<Action> actions,
QTable qTable,
ActionSelectionPolicy selector,
QUpdatePolicy updater)
- Construct a new TD learning agent. The actions available to the agent, how
it learns and how it selects actions are dictated by the parameters.
Note: References to each of the parameter objects are stored in fields.
Thus, changes made to those objects after construction (e.g. modifying the
list of available actions) will be reflected the next time the
getNextAction method is invoked.
- Parameters:
- actions - the actions available to the agent.
- qTable - the Q-table to be used.
- selector - the policy object that selects actions for the agent.
- updater - the policy object that updates the agent's Q-Table.
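The note about stored references can be illustrated with a small stand-in class. This is a sketch under stated assumptions: `RefHoldingAgent` is hypothetical and uses `String` in place of the library's Action type. It only mirrors the documented behaviour of keeping the caller's list rather than copying it.

```java
import java.util.ArrayList;

// Hypothetical stand-in for illustration only; not part of dlife.rl.
class RefHoldingAgent {
    private final ArrayList<String> actions;  // the same list object the caller passed in

    RefHoldingAgent(ArrayList<String> actions) {
        this.actions = actions;  // reference stored, no defensive copy
    }

    int availableActionCount() {
        return actions.size();
    }

    public static void main(String[] args) {
        ArrayList<String> actions = new ArrayList<>();
        actions.add("forward");
        actions.add("turn");
        RefHoldingAgent agent = new RefHoldingAgent(actions);

        actions.add("wait");  // list modified after construction
        System.out.println(agent.availableActionCount());  // the agent sees the added action
    }
}
```

If an agent should be isolated from later changes, one option is to pass a copy such as `new ArrayList<>(actions)` to the constructor.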
getNextAction
public Action getNextAction(State s,
double reward)
- Select the next action to be performed and update the Q-Table based on
the reward for the previous action. This method does the following:
- Updates the Q-Table for the previous state and the last action using
the provided reward value.
- Chooses the next action to be taken using the updated Q-Table.
- Updates the N value for the (State,Action) pair given by the current
state and the chosen action.
- Parameters:
- s - the current state.
- reward - the reward received for the action that led to this state.
- Returns:
- the next Action to be taken.
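The three steps listed above can be sketched with a self-contained stand-in. Everything here is an assumption made for illustration: `GreedyTdSketch` is not the dlife class, states and actions are plain strings, selection is purely greedy, the update uses a Q-learning style target, the N value is treated as a visit count, and the first call skips the update because there is no previous action yet. The real class delegates selection and updating to its policy objects.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; NOT dlife.rl.TDLearner.
class GreedyTdSketch {
    private final List<String> actions;
    private final Map<String, Double> qValues = new HashMap<>();      // Q per (state, action)
    private final Map<String, Integer> visitCounts = new HashMap<>(); // N per (state, action), assumed to be a visit count
    private final double alpha;  // learning rate (assumed parameter)
    private final double gamma;  // discount factor (assumed parameter)
    private String prevState;    // null until the first call
    private String prevAction;

    GreedyTdSketch(List<String> actions, double alpha, double gamma) {
        this.actions = actions;
        this.alpha = alpha;
        this.gamma = gamma;
    }

    private static String key(String state, String action) {
        return state + "|" + action;
    }

    double qValue(String state, String action) {
        return qValues.getOrDefault(key(state, action), 0.0);
    }

    int visits(String state, String action) {
        return visitCounts.getOrDefault(key(state, action), 0);
    }

    // Highest-valued action in the given state; ties go to the earliest action in the list.
    private String greedyAction(String state) {
        String best = actions.get(0);
        for (String a : actions) {
            if (qValue(state, a) > qValue(state, best)) {
                best = a;
            }
        }
        return best;
    }

    String getNextAction(String state, double reward) {
        // Step 1: update Q for the previous state and last action (skipped on the first call).
        if (prevState != null) {
            double oldQ = qValue(prevState, prevAction);
            double target = reward + gamma * qValue(state, greedyAction(state)); // Q-learning style target
            qValues.put(key(prevState, prevAction), oldQ + alpha * (target - oldQ));
        }
        // Step 2: choose the next action from the updated table.
        String chosen = greedyAction(state);
        // Step 3: update N for the (current state, chosen action) pair.
        visitCounts.merge(key(state, chosen), 1, Integer::sum);

        prevState = state;
        prevAction = chosen;
        return chosen;
    }

    public static void main(String[] args) {
        GreedyTdSketch learner = new GreedyTdSketch(Arrays.asList("left", "right"), 0.5, 0.9);
        System.out.println(learner.getNextAction("s0", 0.0));
        System.out.println(learner.getNextAction("s1", 1.0));
        System.out.println(learner.qValue("s0", "left"));
    }
}
```

Note that the reward passed on each call is credited to the action returned by the previous call, which matches the parameter description above.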
getQTable
public QTable getQTable()
- Get the QTable that is being used by this TDLearner. Retrieving the QTable
can be useful for transplanting a trained QTable from one TDLearner to another
TDLearner that may use a different action selection policy.
- Returns:
- the QTable used by this TDLearner.
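The transplant described above can be sketched as follows. With the documented constructor it would amount to passing `trained.getQTable()` as the `qTable` argument of a new TDLearner. Because the remaining dlife types are not shown here, the block uses hypothetical stand-ins (`SketchQTable`, `SketchSelector`, `SketchAgent`, `TransplantDemo`) with string states and actions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; not the dlife.rl classes.
class SketchQTable {
    private final Map<String, Double> values = new HashMap<>();

    void set(String state, String action, double q) {
        values.put(state + "|" + action, q);
    }

    double get(String state, String action) {
        return values.getOrDefault(state + "|" + action, 0.0);
    }
}

// Selection policy: picks an action for a state, given the table.
interface SketchSelector {
    String select(SketchQTable table, String state, String[] actions);
}

class SketchAgent {
    private final String[] actions;
    private final SketchQTable table;
    private final SketchSelector selector;

    SketchAgent(String[] actions, SketchQTable table, SketchSelector selector) {
        this.actions = actions;
        this.table = table;
        this.selector = selector;
    }

    SketchQTable getQTable() {
        return table;
    }

    String act(String state) {
        return selector.select(table, state, actions);
    }
}

class TransplantDemo {
    // Greedy policy: always the action with the highest Q-value.
    static final SketchSelector GREEDY = (table, state, actions) -> {
        String best = actions[0];
        for (String a : actions) {
            if (table.get(state, a) > table.get(state, best)) {
                best = a;
            }
        }
        return best;
    };

    // Placeholder "training" policy: always tries the first action, whatever the table says.
    static final SketchSelector FIRST = (table, state, actions) -> actions[0];

    public static void main(String[] args) {
        String[] actions = {"left", "right"};
        SketchAgent trainer = new SketchAgent(actions, new SketchQTable(), FIRST);
        trainer.getQTable().set("s0", "right", 1.0);  // stands in for values learned during training

        // Transplant: the second agent reuses the trained table with a different policy.
        SketchAgent exploiter = new SketchAgent(actions, trainer.getQTable(), GREEDY);
        System.out.println(exploiter.act("s0"));
    }
}
```

In this sketch both agents share one table object, so further learning by either agent would be visible to the other.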