Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adopt FairMQMessage backed memory resource collection from AliceO2 #93

Merged
merged 8 commits into from
Oct 31, 2018

Conversation

mkrzewic
Copy link
Contributor

As discussed in #54 opening a PR for review. This contains the port of the O2 pmr prototype + basic tools + basic integration into FairMQ + some unit tests (essentially the same as in O2).
Next step in the short term could be additional integration steps (e.g. adding a pointer to factory in FairMQMessage) followed by some simplification of the tools.
In the longer term we will see...

@dennisklein dennisklein added this to the v1.3 milestone Sep 20, 2018
@dennisklein dennisklein added feature New feature or request in progress labels Sep 20, 2018
@dennisklein dennisklein self-requested a review September 20, 2018 15:57
@codecov
Copy link

codecov bot commented Oct 8, 2018

Codecov Report

Merging #93 into dev will increase coverage by 0.05%.
The diff coverage is 86.95%.

@@            Coverage Diff             @@
##              dev      #93      +/-   ##
==========================================
+ Coverage   71.43%   71.49%   +0.05%     
==========================================
  Files          69       71       +2     
  Lines        4071     4076       +5     
==========================================
+ Hits         2908     2914       +6     
+ Misses       1163     1162       -1
Impacted Files Coverage Δ
fairmq/shmem/FairMQTransportFactorySHM.h 100% <ø> (ø) ⬆️
fairmq/zeromq/FairMQTransportFactoryZMQ.h 100% <ø> (ø) ⬆️
fairmq/FairMQTransportFactory.h 0% <ø> (ø) ⬆️
fairmq/nanomsg/FairMQTransportFactoryNN.h 100% <ø> (ø) ⬆️
fairmq/MemoryResources.h 0% <0%> (ø)
fairmq/FairMQChannel.h 0% <0%> (ø) ⬆️
fairmq/shmem/FairMQTransportFactorySHM.cxx 82.5% <100%> (+2.5%) ⬆️
fairmq/zeromq/FairMQTransportFactoryZMQ.cxx 75% <100%> (ø) ⬆️
fairmq/nanomsg/FairMQTransportFactoryNN.cxx 93.93% <100%> (ø) ⬆️
fairmq/FairMQDevice.cxx 76.29% <100%> (ø) ⬆️
... and 14 more

@mkrzewic
Copy link
Contributor Author

the ci errors seem unrelated? it fails some unrelated test it looks like

@dennisklein
Copy link
Member

Hm, ya, sry, I also wanted to come back to this anyways. But this week, no chance, will have a look next week. Sry.

@dennisklein
Copy link
Member

Some thoughts on the APIs:

Currently, we have on sender side:

auto pmr = channel.Transport()->GetMemoryResource();
std::vector<MyData, std::pmr::polymorphic_allocator<MyData>> v(
   std::pmr::polymorphic_allocator<MyData>{pmr});
auto message = fair::mq::getMessage(std::move(&v[0]));
channel.Send(message);

here, I would like to present something like this to the user:

auto v = fair::mq::MakeVectorFor(channel);
channel.Send(std::move(v));

For receiving side, we could have just something like this:

template<typename T> gsl::span<T> FairMQMessage::GetSpan();

Not sure about adopting buffers with STL containers. Do you know, if this is "legal" usage?

What are your ideas how the API should look like?

@mkrzewic
Copy link
Contributor Author

mkrzewic commented Oct 26, 2018

Hi @dennisklein,
about the sender side, sure, what you propose just wraps the longer version... There is however a bit of inconsistency about how transport factories are treated - in practice they are singletons and a lot of fairmq code seems to assume that - this is however not guaranteed by the API - if it were I could simplify even further (e.g. the pointer from message to factory could become static).

as for your question about whether the standard allows buffer adoption - it does not forbid it:) See MemoryResourceTools.h for some ideas on how to do this for vector (the adoptVector family of functions, maybe should be called adoptBuffer or sth) - (zero-copy) adoption of buffers in a std::vector (or any other container) is in principle possible, albeit not with the standard allocators. In principle there is no reason to create a std::vector and receive a span, we can present a vector. This is a topic for a longer discussion i guess which we can open any time:)
Anyway, the adoptVector functions are just examples, in the end there should be only one (the one that takes ownership of the message).

@dennisklein
Copy link
Member

In principle there is no reason to create a std::vector and receive a span, we can present a vector. This is a topic for a longer discussion i guess which we can open any time:)

I am still not sure about buffer adoption with e.g. std::vector. If we implement construct with a noop, wont we run into UB?

@mkrzewic
Copy link
Contributor Author

@dennisklein I don't think so, construct/destroy is part of the public interface of allocator_traits and the user can do whatever (in principle), I could not find any "legal" problem with this approach - it is perfectly legal to operate on unitialized memory, standard allows that explicitly afaik. Now doing this might produce unexpected behaviour if the buffer was produced on a machine with a different layout/endianness etc. so the user must make sure this is OK of course, but that is up to a higher level layer (user or lib). anyway this kind of implementation can always be made in the user layer, fairmq does not have to implement/expose it directly.

@mattkretz
Copy link
Contributor

mattkretz commented Oct 29, 2018

construct/destroy is part of the public interface of allocator_traits and the user can do whatever (in principle), I could not find any "legal" problem with this approach

E.g. https://wg21.link/containers#container.requirements.general-3 states "objects stored in these components shall be constructed using the function allocator_­traits<allocator_­type>​::​rebind_­traits<U>​::​​construct". I.e. having those objects constructed before construct is called and having construct not construct those objects invokes undefined behavior.

@mkrzewic
Copy link
Contributor Author

mkrzewic commented Oct 29, 2018

@mattkretz well, so is (in general) using reinterpret_cast - or span - which essentially all boil down to the same thing. In this case we are not talking about a general case but about vectors of numbers: contiguous sequences of simple elements, simple as in construction/destruction has no side effects, data is standard layout... So we either do the proper thing and copy the data from buffer to container loosing oodles of time and resources or we reinterpret cast (or use span, which is just reinterpret_cast with some sugar coating). I don't see how this is UB - I'm not constructing before construction - I'm just defining a special construction strategy. Is it cheating? Perhaps:)

@mattkretz
Copy link
Contributor

You can certainly use reinterpret_cast without invoking UB. The existence of incorrect usage of reinterpret_cast doesn't make your UB any more correct. ;-)

AFAIU your specific hack, I believe you may get away with it - as long as users don't start using the full std::vector interface and expect it to behave as documented. But since it's UB, anything could happen in the future. An new stdlib implementation might suddenly break your code; a UBsan implementation might start flagging your hack; a change in the next language revision might exploit that no such code can exist and break yours. Never rely on "but it works". UB is a maintenance liability.

The reason why I believe this hack isn't worth the trouble, is that you need a different type than std::vector<T> anyway (i.e. with a different allocator type). So why not just use span that doesn't have to invoke UB but still implements the necessary API? (And copying a span implementation with a liberal license and adjusting it for use in C++11/14 code isn't too hard.)

@mkrzewic
Copy link
Contributor Author

:) sure, I wasn't actually pushing for this specific hack to make it in the library (at this level ;) ), it is not the reason for this PR.
I also don't like the fact that the output allocator is possibly different than what goes in. Anyway, you are of course right that it might be a maintenance issue.
I'm still not convinced this actually is UB (because of e.g. layout guarantees std::vector has for the backing store) but i'll take your word for it for now...
Shall I remove the getVector functions from MemoryResourceTools.h?

@mkrzewic
Copy link
Contributor Author

mkrzewic commented Oct 29, 2018

oh, btw, we do need a type different than just std::vector<T>, it definitely needs to be std::vector<T,polymorphic_allocator<byte>> - that is the whole meat of this story.

@mattkretz
Copy link
Contributor

Since the std::vector<T> seems to store a standard layout T you might get away with construct doing nothing, without invoking UB. Because T x; also doesn't do anything.

size_t nelem,
SpectatorMessageResource *resource)
{
return std::vector<const ElemT, SpectatorAllocator<const ElemT>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the conversion from std::vector<const ElemT, SpectatorAllocator<const ElemT>> to std::vector<const ElemT, boost::container::pmr::polymorphic_allocator<const ElemT>> possible? At least https://en.cppreference.com/w/cpp/container/vector/vector doesn't list any ctor that would allow it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is because SpectatorAllocator derives from polymorphic_allocator i guess...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't help. std::vector still doesn't have a ctor that takes a std::vector with different allocator type: https://godbolt.org/z/1bGCWi

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, somehow here it works... actually, in the original implementation in O2 the return type is auto, I had to invent something here for c++11. It is removed from this PR anyway, it lives on in #110 - but it's days are probably numbered as it is a bit hard to use - the owning version is better (or less bad ;) )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i think I know why it worked here: it never did - the template is never instantiated as the function is never used :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:-) I suspected as much. Normally, going from one allocator type to the next implies a reallocating and moving the data. That's why I believe this conversion ctor is not part of the std::vector interface. ...and why polymorphic_allocator was needed so badly.

@dennisklein
Copy link
Member

dennisklein commented Oct 29, 2018

Shall I remove the getVector functions from MemoryResourceTools.h?

Not yet. But I triage this PR to the 1.4 milestone, since I want to make a stable 1.3.x release this week, and hurrying here does not help anyone. (This will also delay the ChannelResource and getMessage() features, but I think that's fine ?!)

Also, can you please rebase on current dev, I hope this will make the CI checks green.
edit: nvm, there was a button for it.

@mkrzewic
Copy link
Contributor Author

fine, unless someone in O2 would like to see the core functionality already in 1.3 - @ktf ? I would possibly like to make one more change: we moved O2 to C++17 and we are cleaning up some "hacks" like we would e.g. like to start using std::byte . I defined fair::mq::byte to alias unsigned char, it would be nice to maybe make the definition dependent on the language standard used to compile fairmq with? Would that be possible/doable?

@dennisklein
Copy link
Member

fine, unless someone in O2 would like to see the core functionality already in 1.3 - @ktf ?

Also fine with me, from my point we can merge the ChannelResource and getMessage asap and then we leave only the adoption topic for 1.4.

std::byte

There is a good chance FairMQ will also move to C++17 with 1.4. We are currently testing. Will know more in one or two weeks.

Add a pmr interface to FairMQTransportFactory

refactor

Port the unit tests for MemoryResources

clang format
the patch seems big but most of it is just propagating the new notion of
constness of the factory - since it keeps track of created messages with
the internal allocator it no longer is const
@mkrzewic
Copy link
Contributor Author

mkrzewic commented Oct 30, 2018

for now I removed the adoption related bits then. Having some problems building it locally after rebase, let's see what the ci says.

local build and tests succesful

so the eventual switch to std::pmr becomes easier
@dennisklein dennisklein changed the title [WIP] Adopt FairMQMessage backed memory resource collection from AliceO2 Adopt FairMQMessage backed memory resource collection from AliceO2 Oct 30, 2018
@dennisklein
Copy link
Member

The failing test on Linux is probably unrelated to your changes, we are looking at it.

Apple Clang finds this: https://cdash.gsi.de/viewBuildError.php?type=1&buildid=190840

@dennisklein dennisklein requested a review from rbx October 30, 2018 16:27
@dennisklein dennisklein modified the milestones: v1.4, v1.3 Oct 30, 2018
@dennisklein dennisklein self-assigned this Oct 30, 2018
@mkrzewic
Copy link
Contributor Author

@dennisklein not sure if changing this won't break C++11 builds - copy elision is only required from C++17 on, returning a move only thing by value might break the build on a compiler that does not have this as an optimisation for c++11... difficult to test on my machine, since I always compile with -std=c++17 :)

@dennisklein
Copy link
Member

returning a move only thing by value might break the build on a compiler that does not have this as an optimisation for c++11...

The warning means that std::move will prevent RVO in this case (even on C++17). And for C++11/14 gcc and clang all implement RVO. C++17 just required it and removed the required presence of a copy ctor.

difficult to test on my machine, since I always compile with -std=c++17 :)

You can pass -DCMAKE_CXX_STANDARD=11 to FairMQ, if you don't pass anything, 11 will be set by default. The warnings from the CI are produced with -std=c++11.

@ktf
Copy link
Contributor

ktf commented Oct 30, 2018

Fine with me if we need to move to some new tag..

@dennisklein dennisklein merged commit cbab764 into FairRootGroup:dev Oct 31, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants