The class C_Memory is the C++ backend for the memory buffer used by algorithms that store transitions in a buffer. This class contains optimized routines that support the Python front end, the rlpack._C.memory.Memory class. More...
Data Structures | |
| struct | C_MemoryData |
| The class C_MemoryData keeps the references to data that is associated with C_Memory. This class implements the functions necessary to retrieve the data by de-referencing the data associated with C_Memory. More... | |
Public Member Functions | |
| C_Memory () | |
| C_Memory (int64_t bufferSize, const std::string &device, const int32_t &prioritizationStrategyCode, const int32_t &batchSize) | |
| void | clear () |
| void | delete_item (int64_t index) |
| std::map< std::string, torch::Tensor > | get_item (int64_t index) |
| void | initialize (C_MemoryData &viewC_Memory) |
| void | insert (torch::Tensor &stateCurrent, torch::Tensor &stateNext, torch::Tensor &reward, torch::Tensor &action, torch::Tensor &done, torch::Tensor &priority, torch::Tensor &probability, torch::Tensor &weight, bool isTerminalState) |
| int64_t | num_terminal_states () |
| std::map< std::string, torch::Tensor > | sample (float_t forceTerminalStateProbability, int64_t parallelismSizeThreshold, float_t alpha=0.0, float_t beta=0.0, int64_t numSegments=0) |
| void | set_item (int64_t index, torch::Tensor &stateCurrent, torch::Tensor &stateNext, torch::Tensor &reward, torch::Tensor &action, torch::Tensor &done, torch::Tensor &priority, torch::Tensor &probability, torch::Tensor &weight, bool isTerminalState) |
| size_t | size () |
| int64_t | tree_height () |
| void | update_priorities (torch::Tensor &randomIndices, torch::Tensor &newPriorities) |
| C_MemoryData | view () const |
| ~C_Memory () | |
Data Fields | |
| std::shared_ptr< C_MemoryData > | cMemoryData |
| Shared Pointer to C_Memory::C_MemoryData. More... | |
Static Private Member Functions | |
| static torch::Tensor | compute_important_sampling_weights (torch::Tensor &probabilities, int64_t currentSize, float_t beta) |
| static torch::Tensor | compute_probabilities (torch::Tensor &priorities, float_t alpha) |
Private Attributes | |
| std::deque< torch::Tensor > | actions_ |
| Deque of torch tensors for actions. More... | |
| int32_t | batchSize_ = 32 |
| The batch size that is set during class initialisation. Number of samples equivalent to this are selected during sampling. More... | |
| int64_t | bufferSize_ = 32768 |
| Buffer size passed during the class initialisation. Defaults to 32768. More... | |
| torch::Device | device_ = torch::kCPU |
| Torch device passed during class initialisation. Defaults to CPU. More... | |
| std::map< std::string, torch::DeviceType > | deviceMap_ |
| The map between std::string and torch::DeviceType; mapping the device name in string to DeviceType. More... | |
| std::deque< torch::Tensor > | dones_ |
| Deque of torch tensors for dones. More... | |
| std::vector< int64_t > | loadedIndices_ |
| Vector of loaded indices. This indicates the indices that have been loaded out of total capacity of the memory. More... | |
| std::vector< int64_t > | loadedIndicesSlice_ |
| The loaded indices slice; the slice of indices that is sampled during the sampling process. In each sampling cycle its size is equal to C_Memory::batchSize_. More... | |
| Offload< float_t > * | offloadFloat_ |
| Offload class initialised with float template. More... | |
| Offload< int64_t > * | offloadInt64_ |
| Offload class initialised with int64 template. More... | |
| std::deque< torch::Tensor > | priorities_ |
| Deque of torch tensors for priorities. More... | |
| std::deque< float_t > | prioritiesFloat_ |
| Deque of float indicating the priorities in C++ float. Values are obtained from C_Memory::priorities_. More... | |
| int32_t | prioritizationStrategyCode_ = 0 |
| The prioritization strategy code that is being used. This determines the sampling technique that is employed. Refer to rlpack.dqn.dqn.Dqn.get_prioritization_code. More... | |
| std::deque< torch::Tensor > | probabilities_ |
| Deque of torch tensors for probabilities. More... | |
| std::deque< torch::Tensor > | rewards_ |
| Deque of torch tensors for rewards. More... | |
| std::vector< torch::Tensor > | sampledActions_ |
| The sampled action tensors from C_Memory::actions_. More... | |
| std::vector< torch::Tensor > | sampledDones_ |
| The sampled done tensors from C_Memory::dones_. More... | |
| std::vector< torch::Tensor > | sampledIndices_ |
| The sampled indices as tensors from C_Memory::loadedIndices_. More... | |
| std::vector< torch::Tensor > | sampledPriorities_ |
| The sampled priority tensors from C_Memory::priorities_. More... | |
| std::vector< torch::Tensor > | sampledRewards_ |
| The sampled reward tensors from C_Memory::rewards_. More... | |
| std::vector< torch::Tensor > | sampledStateCurrent_ |
| The sampled current state tensors from C_Memory::statesCurrent_. More... | |
| std::vector< torch::Tensor > | sampledStateNext_ |
| The sampled next state tensors from C_Memory::statesNext_. More... | |
| std::vector< float_t > | seedValues_ |
| The seed values generated during each sampling cycle for proportional based prioritization. More... | |
| std::vector< int64_t > | segmentQuantileIndices_ |
| The Quantile segment indices sampled when rank-based prioritization is used. More... | |
| std::deque< torch::Tensor > | statesCurrent_ |
| Deque of torch tensors for current states. More... | |
| std::deque< torch::Tensor > | statesNext_ |
| Deque of torch tensors for next states. More... | |
| int64_t | stepCounter_ = 0 |
| The counter variable that tracks the loaded indices in sync with total timesteps. Once memory reaches the buffer size, this will not update. More... | |
| std::shared_ptr< SumTree > | sumTreeSharedPtr_ |
| Shared Pointer to SumTree class object. More... | |
| std::deque< int64_t > | terminalStateIndices_ |
| Deque of integers indicating the indices of terminal states. More... | |
| std::deque< torch::Tensor > | weights_ |
| Deque of torch tensors for weights. More... | |
The class C_Memory is the C++ backend for the memory buffer used by algorithms that store transitions in a buffer. This class contains optimized routines that support the Python front end, the rlpack._C.memory.Memory class.
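A memory index refers to an index that yields a transition from C_Memory. This works by indexing the per-field transition deques (C_Memory::statesCurrent_, C_Memory::statesNext_, C_Memory::rewards_, C_Memory::actions_, C_Memory::dones_, C_Memory::priorities_, C_Memory::probabilities_ and C_Memory::weights_) at the same position and grouping the results together.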
| C_Memory::C_Memory | ( | ) |
The default non-parameterised constructor. This constructor allocates memory as per default initialised variables. This initialises the rlpack._C.memory.Memory.c_memory and is equivalent to rlpack._C.memory.Memory.__init__.
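| C_Memory::C_Memory | ( | int64_t | bufferSize, |
| const std::string & | device, | ||
| const int32_t & | prioritizationStrategyCode, | ||
| const int32_t & | batchSize | ||
| ) |
| explicit |
The class constructor for C_Memory. This constructor initialises the C_Memory class and allocates the required memory as per the input arguments. This initialises the rlpack._C.memory.Memory.c_memory and is equivalent to rlpack._C.memory.Memory.__init__.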
| bufferSize | : The buffer size to be used and allocated for the memory. |
| device | : The device to transfer relevant tensors to. |
| prioritizationStrategyCode | : The prioritization strategy code. Refer to rlpack.dqn.dqn.Dqn.get_prioritization_code. |
| batchSize | : The batch size to be used for sampling. |
| C_Memory::~C_Memory | ( | ) |
The destructor for C_Memory.
| void C_Memory::clear | ( | ) |
Clears the data in C_Memory. This will NOT free the memory, since it does not perform any memory de-allocation. This is the C++ backend of the rlpack._C.memory.Memory.clear method.
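| static torch::Tensor C_Memory::compute_important_sampling_weights | ( | torch::Tensor & | probabilities, |
| int64_t | currentSize, | ||
| float_t | beta | ||
| ) |
| static private |
Method to compute the importance sampling weights for each probability.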
| probabilities | : The input probabilities for which IS weights are to be computed. |
| currentSize | : The current size of the C_Memory (see C_Memory::size) |
| beta | : The beta value for prioritization. Refer to C_Memory::sample for more information. |
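The page does not show how the weights are computed. As a rough illustration only, the sketch below assumes the common prioritized-replay form w_i = (N * P(i))^(-beta), rescaled so that the largest weight is 1. The function name importanceSamplingWeights is hypothetical, and std::vector<double> stands in for torch::Tensor; the actual C_Memory routine may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for C_Memory::compute_important_sampling_weights.
// Assumption: w_i = (N * P(i))^(-beta), divided by the largest weight.
// Probabilities are assumed to be strictly positive.
std::vector<double> importanceSamplingWeights(const std::vector<double> &probabilities,
                                              std::int64_t currentSize,
                                              double beta) {
    assert(!probabilities.empty() && currentSize > 0);
    std::vector<double> weights;
    weights.reserve(probabilities.size());
    for (double p : probabilities) {
        // Rarely sampled transitions (small P(i)) receive larger raw weights.
        weights.push_back(std::pow(static_cast<double>(currentSize) * p, -beta));
    }
    // Rescale so that weights only ever scale updates downwards.
    const double maxWeight = *std::max_element(weights.begin(), weights.end());
    for (double &w : weights) {
        w /= maxWeight;
    }
    return weights;
}
```

With beta = 0 every weight in this sketch equals 1, so no bias correction is applied; larger beta values widen the gap between frequently and rarely sampled transitions.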
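| static torch::Tensor C_Memory::compute_probabilities | ( | torch::Tensor & | priorities, |
| float_t | alpha | ||
| ) |
| static private |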
Method to compute probabilities when not using uniform prioritization strategy.
| priorities | : The sampled priorities for which probabilities are to be computed. |
| alpha | : The alpha value for prioritization. Refer to C_Memory::sample for more information. |
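The formula behind this method is not given here. A minimal sketch, assuming the usual prioritized-replay definition P(i) = p_i^alpha / sum_k p_k^alpha, is shown below. The name priorityProbabilities is hypothetical, and plain std::vector<double> replaces torch::Tensor.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for C_Memory::compute_probabilities.
// Assumption: P(i) = p_i^alpha / sum_k p_k^alpha with non-negative priorities.
std::vector<double> priorityProbabilities(const std::vector<double> &priorities,
                                          double alpha) {
    assert(!priorities.empty());
    std::vector<double> probabilities;
    probabilities.reserve(priorities.size());
    double total = 0.0;
    for (double p : priorities) {
        const double scaled = std::pow(p, alpha);  // alpha = 0 flattens to uniform
        probabilities.push_back(scaled);
        total += scaled;
    }
    assert(total > 0.0);
    for (double &q : probabilities) {
        q /= total;  // normalise so the probabilities sum to 1
    }
    return probabilities;
}
```

In this sketch alpha = 0 yields a uniform distribution, while alpha = 1 makes each probability directly proportional to its priority.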
| void C_Memory::delete_item | ( | int64_t | index | ) |
Deletion method for C_Memory. This is the C++ backend of rlpack._C.memory.Memory.__delitem__, so it can be accessed by a simple indexing operation (with operator []; del memory[index]) from the Python side.
Deletion is fast if index refers to the first or last element; otherwise it takes O(n) time to relocate the items after index.
| index | : The index of the transition we want to remove. |
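The cost difference described above follows from how std::deque behaves. The sketch below uses a hypothetical helper, eraseTransition, on a single deque; it is not the C_Memory implementation, which would have to repeat the operation for every per-field deque.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical helper illustrating deletion cost on a std::deque.
template <typename T>
void eraseTransition(std::deque<T> &buffer, std::size_t index) {
    assert(index < buffer.size());
    if (index == 0) {
        buffer.pop_front();  // constant time
    } else if (index + 1 == buffer.size()) {
        buffer.pop_back();   // constant time
    } else {
        // Linear time: the elements on one side of index are moved.
        buffer.erase(buffer.begin() + static_cast<std::ptrdiff_t>(index));
    }
}
```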
| std::map< std::string, torch::Tensor > C_Memory::get_item | ( | int64_t | index | ) |
Getter method for C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.__getitem__ method, so it can be accessed by a simple indexing operation (with operator []; item = memory[index]) from the Python side.
| index | : The index from which we want to obtain the transition. |
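To illustrate how a single index can yield a whole transition, the sketch below reads parallel deques at the same position and groups the values in a map. Everything in it is a simplifying assumption: TinyMemory is a hypothetical type, double stands in for torch::Tensor, only three fields are kept, and the key names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <map>
#include <string>

// Hypothetical, reduced buffer; not the real C_Memory layout.
struct TinyMemory {
    std::deque<double> statesCurrent;
    std::deque<double> rewards;
    std::deque<double> dones;

    // Reads every deque at the same index and groups the values by name.
    std::map<std::string, double> getItem(std::size_t index) const {
        assert(index < statesCurrent.size());
        return {
            {"state_current", statesCurrent[index]},  // key names are illustrative
            {"reward", rewards[index]},
            {"done", dones[index]},
        };
    }
};
```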
| void C_Memory::initialize | ( | C_Memory::C_MemoryData & | viewC_MemoryData | ) |
Initialize method for C_Memory. It initializes all the data from an object of C_Memory::C_MemoryData. This is the C++ backend of the rlpack._C.memory.Memory.initialize method.
| viewC_MemoryData | : An object of C_Memory::C_MemoryData. |
| void C_Memory::insert | ( | torch::Tensor & | stateCurrent, |
| torch::Tensor & | stateNext, | ||
| torch::Tensor & | reward, | ||
| torch::Tensor & | action, | ||
| torch::Tensor & | done, | ||
| torch::Tensor & | priority, | ||
| torch::Tensor & | probability, | ||
| torch::Tensor & | weight, | ||
| bool | isTerminalState | ||
| ) |
Insertion method for C_Memory. This is the C++ backend of rlpack._C.memory.Memory.insert method.
| stateCurrent | : Current state from transition. |
| stateNext | : Next state from transition. |
| reward | : Reward obtained during transition. |
| action | : Action taken during transition. |
| done | : Flag indicating whether the next state is terminal, packaged in a PyTorch tensor. |
| priority | : Priority value associated with the transition. |
| probability | : Probability value associated with the transition. |
| weight | : Weight value associated with the transition. |
| isTerminalState | : Flag indicating if next state is terminal. |
| int64_t C_Memory::num_terminal_states | ( | ) |
Method to obtain the number of terminal states currently in C_Memory. This is the C++ backend of rlpack._C.memory.Memory.num_terminal_states method.
| std::map< std::string, torch::Tensor > C_Memory::sample | ( | float_t | forceTerminalStateProbability, |
| int64_t | parallelismSizeThreshold, | ||
| float_t | alpha = 0.0, | ||
| float_t | beta = 0.0, | ||
| int64_t | numSegments = 0 | ||
| ) |
The sampling method for C_Memory. This is the C++ backend of rlpack._C.memory.Memory.sample. Sampling is done as per the prioritization strategy specified during initialisation of C_Memory.
| forceTerminalStateProbability | : The probability of forcing a terminal state into the final sample. |
| parallelismSizeThreshold | : The threshold size of the buffer (from the C_Memory::size method) beyond which OpenMP parallelized routines are used for sampling. |
| alpha | : The alpha value for prioritization. This is used to compute probabilities, where higher alpha indicates more aggressive prioritization. |
| beta | : The beta value for prioritization. This is used to compute importance sampling weights, where higher beta indicates more aggressive bias correction. |
| numSegments | : The number of segments to be used for rank-based prioritization (in accordance with Zipf's law) |
Returns a map from string keys to tensors of shape (batchSize, ...).
| void C_Memory::set_item | ( | int64_t | index, |
| torch::Tensor & | stateCurrent, | ||
| torch::Tensor & | stateNext, | ||
| torch::Tensor & | reward, | ||
| torch::Tensor & | action, | ||
| torch::Tensor & | done, | ||
| torch::Tensor & | priority, | ||
| torch::Tensor & | probability, | ||
| torch::Tensor & | weight, | ||
| bool | isTerminalState | ||
| ) |
Setter method for C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.__setitem__ method, so it can be accessed by a simple indexing operation (with operator []; memory[index] = item) from the Python side. This method modifies the item at the given index.
| index | : The index at which we want to set the transition. |
| stateCurrent | : Current state from transition. |
| stateNext | : Next state from transition. |
| reward | : Reward obtained during transition. |
| action | : Action taken during transition. |
| done | : Flag indicating whether the next state is terminal, packaged in a PyTorch tensor. |
| priority | : Priority value associated with the transition. |
| probability | : Probability value associated with the transition. |
| weight | : Weight value associated with the transition. |
| isTerminalState | : Flag indicating if next state is terminal. |
| size_t C_Memory::size | ( | ) |
This method obtains the current size of C_Memory. This is the C++ backend of the rlpack._C.memory.Memory.__len__ method, so the length can be obtained with the built-in Python function len(memory).
| int64_t C_Memory::tree_height | ( | ) |
Method to obtain the tree height of the sum tree if using a proportional prioritization strategy. This is the C++ backend of rlpack._C.memory.Memory.tree_height. If not using proportional prioritization strategy, calling this method will throw an error.
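For readers unfamiliar with sum trees, the sketch below shows one common array-backed layout and what a tree height can mean for it. TinySumTree is a hypothetical class written for this illustration; it is not the SumTree class referenced by C_Memory, and its height definition (number of levels from root to leaves) is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical array-backed sum tree: leaves hold priorities and each
// internal node holds the sum of its two children.
class TinySumTree {
public:
    // capacity is assumed to be a power of two so that the tree is perfect.
    explicit TinySumTree(std::size_t capacity)
        : capacity_(capacity), nodes_(capacity > 0 ? 2 * capacity - 1 : 0, 0.0) {
        assert(capacity > 0 && (capacity & (capacity - 1)) == 0);
    }

    // Sets one leaf priority and refreshes the sums on the path to the root.
    void update(std::size_t leaf, double priority) {
        assert(leaf < capacity_ && priority >= 0.0);
        std::size_t node = leaf + capacity_ - 1;
        nodes_[node] = priority;
        while (node > 0) {
            node = (node - 1) / 2;
            nodes_[node] = nodes_[2 * node + 1] + nodes_[2 * node + 2];
        }
    }

    // Returns the leaf whose cumulative-priority interval contains value,
    // for value in [0, total()).
    std::size_t find(double value) const {
        std::size_t node = 0;
        while (node < capacity_ - 1) {  // still an internal node
            const std::size_t left = 2 * node + 1;
            if (value < nodes_[left]) {
                node = left;
            } else {
                value -= nodes_[left];
                node = left + 1;
            }
        }
        return node - (capacity_ - 1);
    }

    double total() const { return nodes_[0]; }

    // Number of levels from the root down to the leaves.
    std::size_t height() const {
        std::size_t levels = 1;
        for (std::size_t width = capacity_; width > 1; width /= 2) {
            ++levels;
        }
        return levels;
    }

private:
    std::size_t capacity_;
    std::vector<double> nodes_;
};
```

Drawing a uniform value in [0, total()) and passing it to find selects a leaf with probability proportional to its priority, and both update and find touch only one node per level, which is why the height matters for proportional prioritization.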
| void C_Memory::update_priorities | ( | torch::Tensor & | randomIndices, |
| torch::Tensor & | newPriorities | ||
| ) |
The method to update priorities with the new values computed by the agent according to the prioritization strategy. This is the C++ backend of the rlpack._C.memory.Memory.update_priorities method.
| randomIndices | : The random indices whose priorities are to be updated. C_Memory::sample provides these indices. |
| newPriorities | : The new priorities computed by the agent as per the prioritization strategy. |
| C_Memory::C_MemoryData C_Memory::view | ( | ) | const |
Returns the C_Memory::C_MemoryData object. It contains references to the data in C_Memory and provides an easy data view. This is the C++ backend of the rlpack._C.memory.Memory.view method.
| std::deque<torch::Tensor> C_Memory::actions_ | private |
Deque of torch tensors for actions.
| int32_t C_Memory::batchSize_ = 32 | private |
The batch size that is set during class initialisation. Number of samples equivalent to this are selected during sampling.
| int64_t C_Memory::bufferSize_ = 32768 | private |
Buffer size passed during the class initialisation. Defaults to 32768.
| std::shared_ptr<C_MemoryData> C_Memory::cMemoryData |
Shared Pointer to C_Memory::C_MemoryData.
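| torch::Device C_Memory::device_ = torch::kCPU | private |
Torch device passed during class initialisation. Defaults to CPU.
| std::map<std::string, torch::DeviceType> C_Memory::deviceMap_ | private |
The map between std::string and torch::DeviceType; mapping the device name in string to DeviceType.
| std::deque<torch::Tensor> C_Memory::dones_ | private |
Deque of torch tensors for dones.
| std::vector<int64_t> C_Memory::loadedIndices_ | private |
Vector of loaded indices. This indicates the indices that have been loaded out of total capacity of the memory.
| std::vector<int64_t> C_Memory::loadedIndicesSlice_ | private |
The loaded indices slice; the slice of indices that is sampled during the sampling process. In each sampling cycle its size is equal to C_Memory::batchSize_.
| std::deque<torch::Tensor> C_Memory::priorities_ | private |
Deque of torch tensors for priorities.
| std::deque<float_t> C_Memory::prioritiesFloat_ | private |
Deque of float indicating the priorities in C++ float. Values are obtained from C_Memory::priorities_.
| int32_t C_Memory::prioritizationStrategyCode_ = 0 | private |
The prioritization strategy code that is being used. This determines the sampling technique that is employed. Refer to rlpack.dqn.dqn.Dqn.get_prioritization_code.
| std::deque<torch::Tensor> C_Memory::probabilities_ | private |
Deque of torch tensors for probabilities.
| std::deque<torch::Tensor> C_Memory::rewards_ | private |
Deque of torch tensors for rewards.
| std::vector<torch::Tensor> C_Memory::sampledActions_ | private |
The sampled action tensors from C_Memory::actions_.
| std::vector<torch::Tensor> C_Memory::sampledDones_ | private |
The sampled done tensors from C_Memory::dones_.
| std::vector<torch::Tensor> C_Memory::sampledIndices_ | private |
The sampled indices as tensors from C_Memory::loadedIndices_.
| std::vector<torch::Tensor> C_Memory::sampledPriorities_ | private |
The sampled priority tensors from C_Memory::priorities_.
| std::vector<torch::Tensor> C_Memory::sampledRewards_ | private |
The sampled reward tensors from C_Memory::rewards_.
| std::vector<torch::Tensor> C_Memory::sampledStateCurrent_ | private |
The sampled current state tensors from C_Memory::statesCurrent_.
| std::vector<torch::Tensor> C_Memory::sampledStateNext_ | private |
The sampled next state tensors from C_Memory::statesNext_.
| std::vector<float_t> C_Memory::seedValues_ | private |
The seed values generated during each sampling cycle for proportional based prioritization.
| std::vector<int64_t> C_Memory::segmentQuantileIndices_ | private |
The Quantile segment indices sampled when rank-based prioritization is used.
| std::deque<torch::Tensor> C_Memory::statesCurrent_ | private |
Deque of torch tensors for current states.
| std::deque<torch::Tensor> C_Memory::statesNext_ | private |
Deque of torch tensors for next states.
| int64_t C_Memory::stepCounter_ = 0 | private |
The counter variable that tracks the loaded indices in sync with total timesteps. Once memory reaches the buffer size, this will not update.
| std::shared_ptr<SumTree> C_Memory::sumTreeSharedPtr_ | private |
Shared Pointer to SumTree class object.
| std::deque<int64_t> C_Memory::terminalStateIndices_ | private |
Deque of integers indicating the indices of terminal states.
| std::deque<torch::Tensor> C_Memory::weights_ | private |
Deque of torch tensors for weights.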