k-Armed Bandit 1.0.0
A collection of k-armed bandits and associated agents for reinforcement learning
agent.tests.test_base_agent.FakeAgent Class Reference

A fake child class to allow testing of BaseAgent.
Public Member Functions

    None __init__ (self, int k, float start_value=0.0)
        Construct the agent.

    int act (self)
        Use a specific algorithm to determine which action to take.

    None update (self, int action, float reward)
        Update the Q-Table.

Public Member Functions inherited from agent.base_agent.BaseAgent

    int exploit (self)
        Select the best action.

    int explore (self)
        Explore a new action.

    numpy.ndarray table (self)
        Return the Q-Table.

Additional Inherited Members

Protected Attributes inherited from agent.base_agent.BaseAgent

    _table
A fake child class to allow testing of BaseAgent.
BaseAgent is an abstract class, so it can't be instantiated directly. This FakeAgent class implements the bare minimum needed to test the concrete parts of the base class.
Definition at line 5 of file test_base_agent.py.
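The sketch below shows what such a bare-minimum subclass might look like. It is illustrative only: the actual method bodies of FakeAgent live in test_base_agent.py and may differ, and the super().__init__ call assumes BaseAgent's constructor takes the same (k, start_value) arguments as the override documented here.

# Illustrative sketch only -- the real FakeAgent is defined in
# agent/tests/test_base_agent.py and its method bodies may differ.
from agent.base_agent import BaseAgent


class MinimalFake(BaseAgent):
    """Bare-minimum concrete subclass so the abstract BaseAgent can be exercised."""

    def __init__(self, k: int, start_value: float = 0.0) -> None:
        # Assumes BaseAgent's constructor accepts the same (k, start_value) pair.
        super().__init__(k, start_value)

    def act(self) -> int:
        # A trivial policy is enough for testing: always take the best-known arm.
        return self.exploit()

    def update(self, action: int, reward: float) -> None:
        # No learning is needed in a fake; leave the Q-table untouched.
        pass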
None agent.tests.test_base_agent.FakeAgent.__init__ (self, int k, float start_value = 0.0)
Construct the agent.
Parameters
    k            The number of possible actions the agent can pick from at any given time. Must be an int greater than zero.
    start_value  An initial value to use for each possible action. This assumes that each action is equally likely at the start, so all values in the Q-table are set to this value.

Exceptions
    ValueError   if k is not an integer greater than 0.
Reimplemented from agent.base_agent.BaseAgent.
Definition at line 13 of file test_base_agent.py.
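A hypothetical construction example, assuming the import path shown in the signature above and pytest as the test runner:

import pytest

from agent.tests.test_base_agent import FakeAgent

# Constructing with a valid k fills the Q-table with start_value.
agent = FakeAgent(k=5, start_value=0.5)

# k must be an integer greater than zero; anything else raises ValueError.
with pytest.raises(ValueError):
    FakeAgent(k=0)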
int agent.tests.test_base_agent.FakeAgent.act (self)
Use a specific algorithm to determine which action to take.
This method should define exactly how the agent selects an action. It is free to use explore and exploit as needed.
Reimplemented from agent.base_agent.BaseAgent.
Definition at line 16 of file test_base_agent.py.
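For illustration, a non-fake subclass might implement act as an epsilon-greedy policy built from the inherited explore and exploit helpers. The class name, the epsilon attribute, and the constructor chaining below are assumptions, not part of the documented interface:

import random

from agent.base_agent import BaseAgent


class EpsilonGreedyAgent(BaseAgent):
    """Hypothetical concrete agent illustrating one possible act() strategy."""

    def __init__(self, k: int, start_value: float = 0.0, epsilon: float = 0.1) -> None:
        super().__init__(k, start_value)  # assumed BaseAgent constructor signature
        self.epsilon = epsilon            # exploration rate (assumption)

    def act(self) -> int:
        # With probability epsilon try a random arm; otherwise take the
        # arm with the highest current Q-value.
        if random.random() < self.epsilon:
            return self.explore()
        return self.exploit()

    def update(self, action: int, reward: float) -> None:
        # Left empty here; see the update() documentation below for one option.
        pass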
None agent.tests.test_base_agent.FakeAgent.update (self, int action, float reward)
Update the Q-Table.
This takes the previously selected action and the resulting reward and should update the Q-Table accordingly. How the update is performed will depend on the specific implementation.
Parameters
    action  An int representing which action was taken. This should be in the range [0, k).
    reward  A float representing the reward obtained from the selected action.
Reimplemented from agent.base_agent.BaseAgent.
Definition at line 19 of file test_base_agent.py.
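One common concrete choice is the incremental sample-average rule Q <- Q + (reward - Q) / n. The sketch below assumes the inherited _table attribute holds the Q-values as a numpy array and adds a per-arm counter that is not part of the documented interface:

import numpy as np

from agent.base_agent import BaseAgent


class SampleAverageAgent(BaseAgent):
    """Hypothetical agent using the incremental sample-average update."""

    def __init__(self, k: int, start_value: float = 0.0) -> None:
        super().__init__(k, start_value)       # assumed BaseAgent constructor
        self._counts = np.zeros(k, dtype=int)  # per-arm pull counter (assumption)

    def act(self) -> int:
        return self.exploit()

    def update(self, action: int, reward: float) -> None:
        self._counts[action] += 1
        # Incremental sample average: Q <- Q + (reward - Q) / n, which converges
        # to the mean observed reward for each arm.
        self._table[action] += (reward - self._table[action]) / self._counts[action]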