PDDL memory representation with regard to usability and multi-threaded applications #11
I don't know enough about the details to give concrete advice, but I'd say for a parser the main goal is not speed but legibility of code and ease of maintenance. You'll have a hard time finding tasks that take more than a second to parse. So I'd use simple, flat data structures wherever possible.
We don't want to use Loki just as a parser. The final representation is supposed to be usable in a planner or other application. The basic idea is to write proxies for each PDDL object with potentially extended functionality. There will be no need to implement low-level details again.

```cpp
namespace mimir {
class SomeType {
private:
    const loki::pddl::SomeType* external_;
    // add attributes for additional representation changes if needed
public:
    explicit SomeType(const loki::pddl::SomeType* external) : external_(external) {}
    // add additional functionality or forwarding; all must be const for multi-threading purposes
};
}
```

Currently, `loki::pddl::SomeType` is a non-owning raw pointer. In other words, it is a type that provides operator overloads for …
A little more clarification. The main issue with the above is that dereferencing a …

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

template<typename T, size_t N>
class SegmentedPersistentVector {
private:
    // Blocks are heap-allocated individually so that growing the outer
    // vector never moves a block itself; this is what keeps pointers
    // returned by push_back valid ("persistent").
    std::vector<std::unique_ptr<std::array<T, N>>> m_data;
    int m_block_index;
    int m_index_in_block;
    size_t m_size;
    size_t m_capacity;

    void increase_capacity() {
        // Allocate an additional block with capacity N
        m_data.push_back(std::make_unique<std::array<T, N>>());
        // Move to the next free block
        ++m_block_index;
        // Set index to next free position in block
        m_index_in_block = 0;
        // Increase total capacity
        m_capacity += N;
    }

public:
    SegmentedPersistentVector() : m_block_index(-1), m_index_in_block(0), m_size(0), m_capacity(0) { }

    T* push_back(T value) {
        // Increase capacity if necessary
        if (m_size >= m_capacity) {
            increase_capacity();
        }
        auto& block = *m_data[m_block_index];
        // Take ownership of memory
        block[m_index_in_block] = std::move(value);
        // Fetch return value
        T* return_value = &block[m_index_in_block];
        // Set index to next free position in block
        ++m_index_in_block;
        ++m_size;
        return return_value;
    }

    size_t size() const {
        return m_size;
    }

    size_t capacity() const {
        return m_capacity;
    }
};
```
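For illustration, a small usage sketch: the defining property of this container is that, unlike `std::vector`, pointers returned by `push_back` remain valid across later insertions, because filled blocks are never moved.

```cpp
#include <cassert>
#include <string>

int main() {
    SegmentedPersistentVector<std::string, 1024> vec;
    std::string* first = vec.push_back("first");
    // Force several capacity increases.
    for (int i = 0; i < 10000; ++i) {
        vec.push_back("element " + std::to_string(i));
    }
    // The early pointer is still valid because no block has been
    // moved or reallocated in the meantime.
    assert(*first == "first");
    assert(vec.size() == 10001);
}
```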
Just some words of caution: the Segmented* data structures have been designed to store large amounts of memory, i.e., hundreds of MB. For Loki it seems that a simple non-owning pointer design or using plain references is the easiest and most maintainable solution. Optimizing the Loki data structures for cache locality sounds like premature optimization. This only has an effect if all the other operations in a planner/validator/etc. have negligible runtime. I think it makes sense to develop the library in tandem with a use case, such as a plan validator. Then you'll directly see which API to offer and where optimizations are needed (if at all).
Yes, I fully agree. Now, I understand that we can get pretty close to value semantics through pointers in the API. I do not plan to add these optimizations now. I am just exploring a bit and trying to better understand the boundaries of the current design and API, so that these kinds of low-level optimizations remain possible later without rewriting everything from scratch or changing the API. My main objective is to tackle multi-threaded applications for learning and planning in the future, which is my highest priority to optimize for.
Multi-threaded learning and planning indeed sounds exciting!
The current approach won't scale well in multi-threaded applications because of heap contention. Ideally, we want memory allocations and de-allocations to occur very rarely. The main programming paradigm to address this issue is to use pre-allocated buffers and to allocate or de-allocate memory only rarely. I plan to start using Google's FlatBuffers library, which allows compact serialization into buffers and cheap unpacking from a pointer into the buffer without any heap allocations at all. Another advantage is that we obtain better data locality, which can decrease the number of cache misses and further improve performance.
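To make the paradigm concrete, independent of any particular serialization library, here is a minimal sketch of a reusable per-thread buffer; `ScratchBuffer` and everything in it are illustrative names, not part of Loki or FlatBuffers:

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch of the "pre-allocated buffer" paradigm: each
// worker thread owns one buffer that is allocated once and reused
// for every task, so the shared heap is not touched in the hot loop.
class ScratchBuffer {
private:
    std::vector<std::byte> m_data;  // allocated once, then reused
    std::size_t m_offset = 0;       // next free byte

public:
    explicit ScratchBuffer(std::size_t capacity) : m_data(capacity) {}

    // Hands out raw memory from the pre-allocated block; no heap
    // allocation happens as long as the capacity is not exceeded.
    void* allocate(std::size_t num_bytes) {
        if (m_offset + num_bytes > m_data.size()) {
            return nullptr;  // caller must handle exhaustion
        }
        void* result = m_data.data() + m_offset;
        m_offset += num_bytes;
        return result;
    }

    // "Frees" everything at once; the memory stays owned by the
    // buffer and is reused for the next task.
    void clear() { m_offset = 0; }
};
```

FlatBuffers supports the same pattern: a `flatbuffers::FlatBufferBuilder` can be reset with `Clear()` and reused, keeping its internal buffer alive across serializations.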
I checked out other libraries that are capable of serialization and zero-copy deserialization with zero heap allocations. An alternative is Cap'n Proto. The APIs are quite similar. I tend towards FlatBuffers because it offers a little more flexibility. It will be more complicated for users of Loki to work with this data structure, but it will be much more efficient. Zero allocations is what we need in multi-threaded environments to scale well with the number of cores. I will add a separate namespace …
There are good reasons to move towards value semantics as opposed to pointer semantics for storing PDDL objects:

1. We don't need virtual base classes, but we currently use them, e.g., for conditions and effects.
2. Cache locality: with a combination of handles and views, we can get better cache locality.

A PDDL object will then be composed of `View`s to make it easy to access the data members. A `Handle` could be implemented as follows:
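A minimal sketch, assuming a handle is nothing more than a typed index into a container (all names below are illustrative):

```cpp
#include <cstdint>

// A Handle is a small, cheap-to-copy typed index. The template
// parameter Tag only exists so that handles to different PDDL types
// cannot be mixed up by accident.
template<typename Tag>
class Handle {
private:
    uint32_t m_index;

public:
    explicit Handle(uint32_t index) : m_index(index) {}

    uint32_t get_index() const { return m_index; }

    bool operator==(const Handle& other) const { return m_index == other.m_index; }
    bool operator!=(const Handle& other) const { return m_index != other.m_index; }
};
```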
A `View` could be implemented as follows:
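Again only a sketch, assuming a view is a non-owning window onto the element that a handle refers to, exposing const access only:

```cpp
// A View wraps a raw pointer into the owning container. It never
// owns or frees the data and only grants const access, which keeps
// it safe to pass around in multi-threaded read-only contexts.
template<typename T>
class View {
private:
    const T* m_data;  // non-owning pointer into the container

public:
    explicit View(const T* data) : m_data(data) {}

    const T& operator*() const { return *m_data; }
    const T* operator->() const { return m_data; }
};
```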
A `VectorByHandle` could be implemented as follows:
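A sketch building on the `Handle` and `View` sketches above, assuming the container stores elements contiguously and resolves handles to views on demand:

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Stores elements by value in contiguous memory (good cache
// locality) and is indexed by the typed Handle defined above.
template<typename T>
class VectorByHandle {
private:
    std::vector<T> m_elements;

public:
    Handle<T> push_back(T element) {
        m_elements.push_back(std::move(element));
        return Handle<T>(static_cast<uint32_t>(m_elements.size() - 1));
    }

    View<T> get(Handle<T> handle) const {
        return View<T>(&m_elements[handle.get_index()]);
    }

    std::size_t size() const { return m_elements.size(); }
};
```

One caveat with this sketch: a `View` obtained from `get()` must not be held across `push_back` calls, because the underlying `std::vector` may reallocate; that is exactly the invalidation problem the `SegmentedPersistentVector` above is designed to avoid.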