k-Armed Bandit 1.0.0
A collection of k-armed bandits and associated agents for reinforcement learning
agent.greedy.Greedy Class Reference

An agent that always exploits, never explores.

Inheritance: agent.greedy.Greedy derives from agent.base_agent.BaseAgent.

Public Member Functions

None __init__ (self, int k, float start_value=0.0)
 Construct the agent.
 
int act (self)
 Select an action to take from the available ones.
 
None update (self, int action, float reward)
 Update the table values based on the last action.
 
- Public Member Functions inherited from agent.base_agent.BaseAgent
int exploit (self)
 Select the best action.
 
int explore (self)
 Explore a new action.
 
numpy.ndarray table (self)
 Return the Q-Table.
 

Protected Attributes

 _n
 
- Protected Attributes inherited from agent.base_agent.BaseAgent
 _table
 

Detailed Description

An agent that always exploits, never explores.

It always picks the action with the highest value in the Q-table. Although these values are updated after each action, the agent never explores, so it will likely converge quickly on a single action.
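
The lock-in behaviour described above can be illustrated with a minimal sketch. It assumes fixed, noise-free rewards per arm, ties broken toward the lowest index, and a sample-average update. The function run_greedy and its parameters are hypothetical stand-ins and are not part of this package.

```python
import numpy as np


def run_greedy(rewards, steps, start_value=0.0):
    """Hypothetical greedy loop over arms with fixed rewards (not the package's code)."""
    q = np.full(len(rewards), start_value, dtype=float)  # value estimate per arm
    n = np.zeros(len(rewards), dtype=int)                # pull count per arm
    history = []
    for _ in range(steps):
        # Assumption: np.argmax resolves ties by returning the lowest index.
        action = int(np.argmax(q))
        reward = rewards[action]
        n[action] += 1
        # Incremental sample average of the rewards seen for this arm.
        q[action] += (reward - q[action]) / n[action]
        history.append(action)
    return q, n, history


q, n, history = run_greedy([0.2, 0.8, 0.5], steps=10)
print(history)     # [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
print(n.tolist())  # [10, 0, 0]
```

Under these assumptions the first pull of arm 0 lifts its estimate above the untouched arms, so arms 1 and 2 are never tried even though arm 1 pays more.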

Definition at line 4 of file greedy.py.

Constructor & Destructor Documentation

◆ __init__()

None agent.greedy.Greedy.__init__ (   self,
int  k,
float   start_value = 0.0 
)

Construct the agent.

Parameters
k: The number of arms to select from. Should be an int greater than zero.
start_value: The starting reward to use for each arm. All arms assume the same value at the start.

Reimplemented from agent.base_agent.BaseAgent.

Definition at line 12 of file greedy.py.

Member Function Documentation

◆ act()

int agent.greedy.Greedy.act (   self)

Select an action to take from the available ones.

Greedy always exploits, so this will always be one of the actions with the highest table value.

Returns
An int representing the selected action. It will be in the half-open interval [0, k).
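
Several actions can share the highest table value, and this page does not say how such ties are resolved. The sketch below shows two plausible choices, as an illustration only: taking the first tied index, or picking uniformly among the tied indices. The function greedy_action and its rng parameter are hypothetical names.

```python
import numpy as np


def greedy_action(table, rng=None):
    """Return an index holding the maximum of `table` (hypothetical helper)."""
    table = np.asarray(table, dtype=float)
    # Indices of every action tied for the highest value.
    best = np.flatnonzero(table == table.max())
    if rng is None:
        # Deterministic choice: lowest tied index.
        return int(best[0])
    # Randomised choice: uniform over the tied indices.
    return int(rng.choice(best))


values = [0.1, 0.7, 0.7, 0.3]
first = greedy_action(values)
sampled = greedy_action(values, rng=np.random.default_rng(0))
```

Either way the result is a valid index in [0, k), where k is the length of the table.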

Reimplemented from agent.base_agent.BaseAgent.

Definition at line 23 of file greedy.py.

◆ update()

None agent.greedy.Greedy.update (   self,
int  action,
float  reward 
)

Update the table values based on the last action.

This uses an incremental (iterative) form of the running average to update table values, so past rewards do not need to be stored.

Parameters
action: The index corresponding to the action that was taken.
reward: The resulting reward that was earned.
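
One common way to write an incremental running average is Q <- Q + (reward - Q) / n, where n is the number of times the action has been taken. The sketch below assumes that form and a separate per-action counter (possibly what the _n attribute holds, though this page does not confirm it). The function incremental_update and its arguments are hypothetical.

```python
import numpy as np


def incremental_update(table, counts, action, reward):
    """Fold one reward into the running mean for `action` (hypothetical helper)."""
    counts[action] += 1
    # Move the estimate toward the new reward by a step of 1 / count.
    table[action] += (reward - table[action]) / counts[action]


table = np.zeros(3)
counts = np.zeros(3, dtype=int)
rewards = [1.0, 0.0, 2.0, 5.0]
for r in rewards:
    incremental_update(table, counts, action=1, reward=r)

print(table[1])          # 2.0
print(np.mean(rewards))  # 2.0
```

In this form the first update for an action replaces its starting value entirely, because the step size is 1 when the count is 1.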

Reimplemented from agent.base_agent.BaseAgent.

Definition at line 32 of file greedy.py.

Member Data Documentation

◆ _n

agent.greedy.Greedy._n
protected

Definition at line 21 of file greedy.py.


The documentation for this class was generated from the following file: