lib.utils package

Submodules

lib.utils.ReplayBuffer module

class lib.utils.ReplayBuffer.SimpleReplayBuffer(maxDataTuples)[source]

Bases: object

The replay buffer

Save data for the replay buffer

Parameters:maxDataTuples ({int}) – The size of the deque that is used for storing the data tuples. This assumes that the data tuples are present in the form: (state, action, reward, next_state, done, cumRewards). This means that we assume that the data will have some form of cumulative reward pints associated with each tuple.
append(result)[source]

append a single tuple to the current replay buffer

This function allows someone to add a single tuple to the replay buffer.

Parameters:result ({tuple}) – The tuple that should be added into the memory buffer.
appendAllAgentResults(allResults)[source]

append all data from all agents into the same buffer

This is useful when there is only one agent or when all the agents represent the same exact larning characteristics. In this case, multiple agents can be simulated by the same function.

Parameters:{list} -- List of list tuples to be entered into the buffer. (allResults) –
appendMany(results)[source]

append multiple tuples to the memory buffer

Most often we will not be insterested in inserting a single data point into the replay buffer, but rather a whole list of these. This function just iterates over this list and inserts each tuple one by one.

Parameters:results ({list}) – List of tuples that are to be inserted into the replay buffer.
clear()[source]
len

returns the length of the emory buffer

Remember that this is a property and there is no need to call it as a function.

Returns:int – the length of the currrent memory buffer
load(folder, name)[source]

load the data from a particular file

Data saved with the previous command can be reloaded into this new buffer.

Parameters:
  • folder ({str}) – Path to the folder where the data is saved
  • name ({str}) – Name of the agent associated whose data is to be extracted.
sample(nSamples)[source]

sample from the replay beffer

This function samples form the memory buffer, and returns the number of samples required. This does sampling in an intelligent manner. Since we are saving the cumulative rewards, we selectively select values that provide us greater

Parameters:nSamples ({int}) – The number of memory elements to return
Returns:A list of samples that can be used for sampling the data.
Return type:list
save(folder, name)[source]

save the replay buffer

This function is going to save the data within the replay buffer into a pickle file. This will allow us to reload the buffer to a state where it has already been saved.

Parameters:
  • folder ({str}) – path to the folder where the data is to be saved
  • name ({str}) – Name associated with the buffer. Since this program has two agents acting in tandum, we need to provide a name that will identify which agent’s buffer we are saving.
shape

the shape of the buffer

This is the shape of the memory buffers. This returns a tuple that contains the length of the buffer for the first element of the tuple, and the length of each element as the second element of the tuple. If there is nothing within the memory, this is going to return a None

Returns:tuple – the shape of the data within the memory buffer

Module contents

[summary]

[description]