k-Armed Bandit 1.0.0
A collection of k-armed bandits and associated agents for reinforcement learning
agent.epsilon_greedy.EpsilonGreedy Class Reference

A greedy agent that occasionally explores.

Inheritance diagram for agent.epsilon_greedy.EpsilonGreedy:
agent.base_agent.BaseAgent

Public Member Functions

None __init__ (self, int k, float epsilon, float start_value=0.0)
 Construct the agent.
 
int act (self)
 Determine which action to take.
 
float epsilon (self)
 
None epsilon (self, float value)
 
None update (self, int action, float reward)
 Update the Q-table based on the last action.
 
- Public Member Functions inherited from agent.base_agent.BaseAgent
int exploit (self)
 Select the best action.
 
int explore (self)
 Explore a new action.
 
numpy.ndarray table (self)
 Return the Q-Table.
 

Public Attributes

 epsilon
 

Protected Attributes

 _n
 
 _rng
 
 _epsilon
 
- Protected Attributes inherited from agent.base_agent.BaseAgent
 _table
 

Detailed Description

A greedy agent that occasionally explores.

This agent will primarily exploit when deciding its actions. However, it will occasionally choose to explore at a rate of epsilon, which is provided at initialization. This gives it a chance to see if other actions are better options.

Definition at line 5 of file epsilon_greedy.py.

Constructor & Destructor Documentation

◆ __init__()

None agent.epsilon_greedy.EpsilonGreedy.__init__ (   self,
int  k,
float  epsilon,
float   start_value = 0.0 
)

Construct the agent.

Parameters
k  The number of actions to consider. This must be an int greater than zero.
epsilon  The probability with which the agent explores a random action instead of exploiting. As this is a probability, it should be between 0 and 1.
start_value  The initial value to use in the table. All actions start with the same value.
Exceptions
ValueError  if epsilon is not a valid probability (between 0 and 1).

Reimplemented from agent.base_agent.BaseAgent.

Definition at line 14 of file epsilon_greedy.py.

Member Function Documentation

◆ act()

int agent.epsilon_greedy.EpsilonGreedy.act (   self)

Determine which action to take.

With probability epsilon, this explores by selecting randomly over the actions; otherwise, with probability (1.0 - epsilon), it exploits based on the table values.

Returns
The index of the selected action to take. Guaranteed to be an int in the range [0, k).
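
The selection rule described above can be sketched as a standalone function. This is a minimal illustration, not the code in epsilon_greedy.py: the name select_action is hypothetical, and it assumes that exploring means a uniform random choice over all actions and that exploiting means taking the argmax of the table.

```python
import numpy as np


def select_action(table: np.ndarray, epsilon: float, rng: np.random.Generator) -> int:
    """Hypothetical stand-in for act(): explore with probability epsilon."""
    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random (assumed behavior).
        return int(rng.integers(len(table)))
    # Exploit: pick the action with the highest estimated value (assumed behavior).
    return int(np.argmax(table))


rng = np.random.default_rng(0)
table = np.array([0.1, 0.9, 0.3])

# With epsilon = 0 the draw from [0, 1) is never below epsilon, so the agent always exploits.
greedy_choice = select_action(table, 0.0, rng)

# With epsilon = 1 the draw is always below epsilon, so the agent always explores.
random_choices = [select_action(table, 1.0, rng) for _ in range(100)]
```

Either branch returns an index in [0, k), which matches the documented return range.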

Reimplemented from agent.base_agent.BaseAgent.

Definition at line 31 of file epsilon_greedy.py.

◆ epsilon() [1/2]

float agent.epsilon_greedy.EpsilonGreedy.epsilon (   self)

Definition at line 49 of file epsilon_greedy.py.

◆ epsilon() [2/2]

None agent.epsilon_greedy.EpsilonGreedy.epsilon (   self,
float  value 
)

Definition at line 53 of file epsilon_greedy.py.
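
The two epsilon() overloads are undocumented here. One plausible reading is that they form a property getter and setter pair, with the probability check performed in the setter so that the constructor and later assignments share it. The sketch below illustrates that pattern under those assumptions: the class name EpsilonHolder is hypothetical, and treating 0 and 1 as valid (inclusive bounds) is a guess.

```python
class EpsilonHolder:
    """Hypothetical class showing a validated epsilon property."""

    def __init__(self, epsilon: float) -> None:
        # Assigning through the property reuses the setter's validation.
        self.epsilon = epsilon

    @property
    def epsilon(self) -> float:
        return self._epsilon

    @epsilon.setter
    def epsilon(self, value: float) -> None:
        # Assumption: the bounds are inclusive, so 0.0 and 1.0 are accepted.
        if not 0.0 <= value <= 1.0:
            raise ValueError("epsilon must be a probability between 0 and 1")
        self._epsilon = value


holder = EpsilonHolder(0.1)
holder.epsilon = 0.25  # valid update through the setter
```

With this layout, an out-of-range value is rejected both at construction and on any later assignment.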

◆ update()

None agent.epsilon_greedy.EpsilonGreedy.update (   self,
int  action,
float  reward 
)

Update the Q-table based on the last action.

The table values are the mean of all rewards obtained so far, computed with an incremental update rather than by storing every reward.

Parameters
action  An index representing which action on the table was selected. It must be in the range [0, k).
reward  The reward obtained from this action.
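
The incremental mean mentioned above is commonly written as Q ← Q + (R - Q) / n, where n is the number of rewards seen so far. The sketch below shows that update in isolation. The function update_mean and the counts array are hypothetical names, and keeping one count per action is an assumption, not something this page confirms.

```python
import numpy as np


def update_mean(table: np.ndarray, counts: np.ndarray, action: int, reward: float) -> None:
    """Hypothetical incremental-mean update for a single action."""
    # Count this reward first so that n includes it.
    counts[action] += 1
    # Move the estimate toward the new reward by a step of 1/n.
    table[action] += (reward - table[action]) / counts[action]


table = np.zeros(3)
counts = np.zeros(3, dtype=int)

# Three rewards for action 1; the running estimate goes 1.0, 1.5, 2.0.
for reward in (1.0, 2.0, 3.0):
    update_mean(table, counts, 1, reward)
```

Under this formulation the first update for an action sets its estimate to the reward itself (the step size is 1), so the initial table value only matters until that action is first tried.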

Reimplemented from agent.base_agent.BaseAgent.

Definition at line 59 of file epsilon_greedy.py.

Member Data Documentation

◆ _epsilon

agent.epsilon_greedy.EpsilonGreedy._epsilon
protected

Definition at line 57 of file epsilon_greedy.py.

◆ _n

agent.epsilon_greedy.EpsilonGreedy._n
protected

Definition at line 27 of file epsilon_greedy.py.

◆ _rng

agent.epsilon_greedy.EpsilonGreedy._rng
protected

Definition at line 29 of file epsilon_greedy.py.

◆ epsilon

agent.epsilon_greedy.EpsilonGreedy.epsilon

Definition at line 25 of file epsilon_greedy.py.


The documentation for this class was generated from the following file: