k-Armed Bandit 1.0.0
A collection of k-armed bandits and associated agents for reinforcement learning

agent.epsilon_greedy.EpsilonGreedy Class Reference

A greedy agent that occasionally explores.
Public Member Functions

    None __init__(self, int k, float epsilon, float start_value=0.0)
        Construct the agent.
    int act(self)
        Determine which action to take.
    float epsilon(self)
    None epsilon(self, float value)
    None update(self, int action, float reward)
        Update the Q-table based on the last action.

Public Member Functions inherited from agent.base_agent.BaseAgent

    int exploit(self)
        Select the best action.
    int explore(self)
        Explore a new action.
    numpy.ndarray table(self)
        Return the Q-Table.

Public Attributes

    epsilon

Protected Attributes

    _n
    _rng
    _epsilon

Protected Attributes inherited from agent.base_agent.BaseAgent

    _table

Detailed Description
A greedy agent that occasionally explores.
This agent primarily exploits when deciding its actions. However, it occasionally chooses to explore, at a rate of epsilon provided at initialization. This gives it a chance to discover whether other actions are better options.
Definition at line 5 of file epsilon_greedy.py.
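For orientation, here is a minimal usage sketch. The import path is inferred from the definition file, and the 10-armed Gaussian testbed is a stand-in for whichever bandit classes this package actually provides:

    import numpy as np

    from agent.epsilon_greedy import EpsilonGreedy

    # Hypothetical 10-armed testbed: each arm pays out a unit-variance
    # Gaussian reward around a fixed, randomly drawn mean.
    rng = np.random.default_rng(0)
    true_means = rng.normal(size=10)

    agent = EpsilonGreedy(k=10, epsilon=0.1)
    for _ in range(1000):
        action = agent.act()                         # explore or exploit
        reward = rng.normal(loc=true_means[action])  # pull the chosen arm
        agent.update(action, reward)                 # fold reward into the Q-table

    print(agent.table())  # estimated action values after 1000 steps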
None agent.epsilon_greedy.EpsilonGreedy.__init__(self, int k, float epsilon, float start_value=0.0)
Construct the agent.
Parameters
    k: The number of actions to consider. This must be an int greater than zero.
    epsilon: The rate at which the agent should randomly explore. As this is a probability, it must be between 0 and 1.
    start_value: The initial value to use in the table. All actions start with the same value.

Raises
    ValueError: if epsilon is not a valid probability (between 0 and 1).
Reimplemented from agent.base_agent.BaseAgent.
Definition at line 14 of file epsilon_greedy.py.
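A construction sketch following the documented parameters (the exact error message raised for an invalid epsilon is not specified here):

    from agent.epsilon_greedy import EpsilonGreedy

    # Optimistic initial values encourage early exploration of every arm.
    agent = EpsilonGreedy(k=5, epsilon=0.05, start_value=1.0)

    # An epsilon outside [0, 1] is documented to be rejected.
    try:
        EpsilonGreedy(k=5, epsilon=1.5)
    except ValueError as err:
        print(err)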
int agent.epsilon_greedy.EpsilonGreedy.act(self)
Determine which action to take.
This will explore randomly over the actions at a rate of epsilon and will otherwise exploit based on table values, i.e. at a rate of (1.0 - epsilon).
Reimplemented from agent.base_agent.BaseAgent.
Definition at line 31 of file epsilon_greedy.py.
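A sketch of the documented selection rule, not necessarily the package's exact implementation (which presumably delegates to the inherited explore() and exploit()):

    import numpy as np

    def act_sketch(table: np.ndarray, epsilon: float,
                   rng: np.random.Generator) -> int:
        # With probability epsilon, explore a uniformly random arm;
        # otherwise exploit the arm with the highest estimated value.
        if rng.random() < epsilon:
            return int(rng.integers(len(table)))
        return int(np.argmax(table))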
float agent.epsilon_greedy.EpsilonGreedy.epsilon(self)
Definition at line 49 of file epsilon_greedy.py.
None agent.epsilon_greedy.EpsilonGreedy.epsilon(self, float value)
Definition at line 53 of file epsilon_greedy.py.
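Because epsilon is exposed as a property with a setter, the exploration rate can be annealed during training. A hypothetical schedule, not part of the package:

    from agent.epsilon_greedy import EpsilonGreedy

    agent = EpsilonGreedy(k=10, epsilon=1.0)

    # Decay from pure exploration toward mostly-greedy behavior,
    # clamped at a 1% exploration floor.
    for step in range(1, 1001):
        agent.epsilon = max(0.01, 1.0 / step)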
None agent.epsilon_greedy.EpsilonGreedy.update(self, int action, float reward)
Update the Q-table based on the last action.
This will use an incremental formulation of the mean of all rewards obtained so far as the values of the table.
Parameters
    action: An index representing which action on the table was selected. It must lie within [0, k).
    reward: The reward obtained from this action.
Reimplemented from agent.base_agent.BaseAgent.
Definition at line 59 of file epsilon_greedy.py.
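A sketch of the documented incremental sample-average update: the running mean Q_{n+1} = Q_n + (R_n - Q_n) / n avoids storing every past reward. The per-action counts array is an assumption about the internal state (likely what the protected _n attribute tracks):

    import numpy as np

    def update_sketch(table: np.ndarray, counts: np.ndarray,
                      action: int, reward: float) -> None:
        # Incremental mean: nudge the estimate toward the new reward
        # by a step size of 1/n, where n counts pulls of this arm.
        counts[action] += 1
        table[action] += (reward - table[action]) / counts[action]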
agent.epsilon_greedy.EpsilonGreedy._epsilon (protected)

Definition at line 57 of file epsilon_greedy.py.

agent.epsilon_greedy.EpsilonGreedy._n (protected)

Definition at line 27 of file epsilon_greedy.py.

agent.epsilon_greedy.EpsilonGreedy._rng (protected)

Definition at line 29 of file epsilon_greedy.py.

agent.epsilon_greedy.EpsilonGreedy.epsilon

Definition at line 25 of file epsilon_greedy.py.