k-Armed Bandit 1.0.0
A collection of k-armed bandits and associated agents for reinforcement learning
A base class used to create a variety of bandit-solving agents.
Public Member Functions

    None __init__(self, int k, float start_value=0.0)
        Construct the agent.
    int act(self)
        Use a specific algorithm to determine which action to take.
    int exploit(self)
        Select the best action.
    int explore(self)
        Explore a new action.
    numpy.ndarray table(self)
        Return the Q-Table.
    None update(self, int action, float reward)
        Update the Q-Table.

Protected Attributes

    _table
A base class used to create a variety of bandit-solving agents.
This class provides a table that can be used to store reward estimates. It also defines the interface that any concrete agent must implement, ensuring a consistent API across all agent types.
Definition at line 5 of file base_agent.py.
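To show how the pieces documented below fit together, here is a small usage sketch. The GreedyishAgent subclass, its update rule, and the normally distributed stand-in reward are illustrative assumptions only; they are not taken from this package.

    import numpy as np
    from agent.base_agent import BaseAgent

    class GreedyishAgent(BaseAgent):            # hypothetical subclass, for illustration
        def act(self) -> int:
            return self.exploit()               # always take the current best arm

        def update(self, action: int, reward: float) -> None:
            # One possible rule: nudge the estimate toward the observed reward.
            self._table[action] += 0.1 * (reward - self._table[action])

    agent = GreedyishAgent(k=10, start_value=1.0)
    for _ in range(1000):
        action = agent.act()
        reward = float(np.random.normal(loc=action / 10.0))   # stand-in for a real bandit pull
        agent.update(action, reward)
    print(agent.table())                        # learned reward estimates per arm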
None agent.base_agent.BaseAgent.__init__(self, int k, float start_value=0.0)
Construct the agent.
Parameters
    k            The number of possible actions the agent can pick from at any given time. Must be an int greater than zero.
    start_value  An initial value to use for each possible action. This assumes each action is equally likely at the start, so all values in the Q-table are set to this value.
Exceptions
    ValueError   Raised if k is not an integer greater than 0.
Reimplemented in agent.epsilon_greedy.EpsilonGreedy, agent.greedy.Greedy, and agent.tests.test_base_agent.FakeAgent.
Definition at line 13 of file base_agent.py.
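A short construction sketch, assuming BaseAgent can be instantiated on its own (subclasses reimplement __init__ as noted above); the printed values follow from the documented behaviour rather than from captured program output.

    from agent.base_agent import BaseAgent

    # Optimistic initial values: every untried arm starts out looking attractive.
    agent = BaseAgent(k=5, start_value=2.0)
    print(agent.table())       # expected: [2. 2. 2. 2. 2.]

    try:
        BaseAgent(k=0)         # violates the "k > 0" requirement
    except ValueError as err:
        print(err)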
int agent.base_agent.BaseAgent.act(self)
Use a specific algorithm to determine which action to take.
This method should define how exactly the agent selects an action. It is free to use explore and exploit as needed.
Reimplemented in agent.epsilon_greedy.EpsilonGreedy, agent.greedy.Greedy, and agent.tests.test_base_agent.FakeAgent.
Definition at line 30 of file base_agent.py.
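As one illustration of how a subclass might combine explore and exploit inside act, here is an epsilon-greedy style sketch; the MyEpsilonAgent class and its epsilon parameter are assumptions made for this example, not the package's EpsilonGreedy implementation.

    import numpy as np
    from agent.base_agent import BaseAgent

    class MyEpsilonAgent(BaseAgent):            # hypothetical subclass
        def __init__(self, k: int, epsilon: float = 0.1, start_value: float = 0.0):
            super().__init__(k, start_value)
            self._epsilon = epsilon             # probability of exploring

        def act(self) -> int:
            # Explore with probability epsilon, otherwise exploit the best estimate.
            if np.random.random() < self._epsilon:
                return self.explore()
            return self.exploit()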
int agent.base_agent.BaseAgent.exploit(self)
Select the best action.
This uses the Q-table to select the action with the highest estimated reward. Ties are broken arbitrarily.
Definition at line 39 of file base_agent.py.
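The description above only promises that ties are broken arbitrarily; a plain numpy argmax always returns the first maximal arm, so an implementation that instead breaks ties uniformly at random might look like this sketch (shown as a standalone function for brevity).

    import numpy as np

    def exploit_with_random_ties(q_table: np.ndarray) -> int:
        best = np.flatnonzero(q_table == q_table.max())   # all arms tied for the best estimate
        return int(np.random.choice(best))                # pick one of them at random

    # e.g. exploit_with_random_ties(agent.table())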
int agent.base_agent.BaseAgent.explore(self)
Explore a new action.
This selects a random action from the Q-table in order to explore the decision space.
Definition at line 56 of file base_agent.py.
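A minimal sketch of the behaviour described, assuming a uniform choice over all k arms.

    import numpy as np

    def explore_uniform(q_table: np.ndarray) -> int:
        # Any index into the Q-table is a valid action; pick one uniformly at random.
        return int(np.random.randint(q_table.size))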
numpy.ndarray agent.base_agent.BaseAgent.table(self)
Return the Q-Table.
Definition at line 68 of file base_agent.py.
None agent.base_agent.BaseAgent.update(self, int action, float reward)
Update the Q-Table.
This takes the previous action and the reward it produced, and should update the Q-table accordingly. How the update is performed depends on the specific implementation.
Parameters
    action   An int representing which action was taken. This should be in the range [0, k), i.e. one of the k arm indices.
    reward   A float representing the reward obtained from the selected action.
Reimplemented in agent.epsilon_greedy.EpsilonGreedy, agent.greedy.Greedy, and agent.tests.test_base_agent.FakeAgent.
Definition at line 76 of file base_agent.py.
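One update rule that fits this interface is the incremental sample average, Q_{n+1} = Q_n + (R_n - Q_n) / n. The SampleAverageAgent class and its _counts array below are assumptions added for this sketch; the base class itself only stores the estimates.

    import numpy as np
    from agent.base_agent import BaseAgent

    class SampleAverageAgent(BaseAgent):        # hypothetical subclass
        def __init__(self, k: int, start_value: float = 0.0):
            super().__init__(k, start_value)
            self._counts = np.zeros(k, dtype=int)   # pulls per arm (not part of BaseAgent)

        def act(self) -> int:
            return self.exploit()

        def update(self, action: int, reward: float) -> None:
            self._counts[action] += 1
            n = self._counts[action]
            # Move the estimate toward the new reward by 1/n of the error.
            self._table[action] += (reward - self._table[action]) / n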
agent.base_agent.BaseAgent._table
protected
The Q-table used to store the reward estimate for each action.
Definition at line 27 of file base_agent.py.