Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Gaudi::Functional, add examples and adapt PodioInput to use Gaudi::Functional #129

Merged
merged 68 commits into from
Oct 4, 2023

Conversation

jmcarcell
Copy link
Member

@jmcarcell jmcarcell commented Jul 27, 2023

BEGINRELEASENOTES

  • Make changes in DataWrapper.h so that algorithms using Gaudi::Functional can work
  • Add examples and tests of a consumer, producer and transformer and their versions with multiple inputs and/or outputs
  • Adapt PodioInput to use Gaudi::Functional

ENDRELEASENOTES

This PR keeps the old tests (that still work) that we can remove or adapt later. There should be no breaking changes and so far the changes to PodioInput seem to work fine.

@jmcarcell jmcarcell changed the title Use Gaudi::Functional, add examples and move PodioInput Use Gaudi::Functional, add examples and adapt PodioInput to use Gaudi::Functional Jul 27, 2023
@Zehvogel
Copy link
Contributor

Zehvogel commented Aug 2, 2023

Nice that the tests work now! :)

I am still a bit confused about the semantics, specifically using a consumer for reading.
My Impression was that this would be done by a producer in the functional approach, but this is completely side-stepped by the DataSvc? PodioInput now behaves like a constant (if there is such a thing in Gaudi functional) as you are not taking any inputs and don't produce any outputs (directly).

// Which type of collection we are reading
// this will always be podio::CollectionBase
// Has to be wrapped in DataWrapper
using colltype = DataWrapper<podio::CollectionBase>;
Copy link
Contributor

@hegner hegner Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As just discussed in person you can avoid these using statements and the user side visibility of handles and wrappers by changing BaseClass_t.
We are using our special wrappers because of what is needed for PODIO here.
To make Gaudi Functional use these, we have to make them visible to [Consumer/Producer/Transformer]. which internally use DataHandleMixin. The latter uses this template magic to know which Handles to use. To override the default we need to subclass of Gaudi::Algorithm that adds something like

template<typename T>
using InputHandle_t = DataObjectReadHandle<DataWrapper<T>>;

and a similar one for the writing part.

Doing this we will not be able to use the legacy read/write any more though. If we want to preserve that, we have to use the key4hep specific data handle which we wanted to deprecate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was able to do this for the producers. When trying to do the same thing for the consumers (or transformers since they also have inputs) I get the following error message:

ExampleFunction...  ERROR Wrong DataObjectType : The type expected for /Event/MCParticles is AnyDataWrapper<edm4hep::MCParticleCollection> and is different from the one of the object in the store which is DataWrapper<podio::CollectionBase>.

that is coming from Gaudi performing a check with a dynamic_cast: https://gitlab.cern.ch/gaudi/Gaudi/-/blob/master/GaudiKernel/include/GaudiKernel/DataObjectHandle.h#L56. Options:

  • Templated registering of objects, not trivial to do
  • Using a wrapper around Gaudi::Functional::Consumer (and transformer) that retrieves the collection as podio::CollectionBase and then casts to the right type, but I like that our algorithms and the ones from Gaudi look the same
  • Maybe it's also possible to have a wrapper that we can use for registering and then wrappers of concrete types can inherit from so that the dynamic_cast check would pass.

Ideally I would like it to be like it is in the example producers right now, where we inherit from the corresponding Gaudi::Functional and we just have to specify the type of collection that we want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally I would like it to be like it is in the example producers right now, where we inherit from the corresponding Gaudi::Functional and we just have to specify the type of collection that we want.

I agree with this. In case it is not possible the option I would like the second most after this one is to have a k4FWCore::Functional (naming up for discussion) that is effectively a template specialization of the corresponding Gaudi::Functional that injects the necessary types for us. I don't know if this is something that is possible, or if would always need a wrapper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do I understand correctly that the whole maybeRead machinery is necessary to work around the AnyDataWrapper vs DataWrapper issue on the consuming side?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The maybeRead is there to simplify the interface and not have to use podio::CollectionBase as type and then do a dynamic cast by pushing directly, for example, an edm4ep::MCParticleCollection to the TES. There is maybe an option to create a custom Functional base class we can inherit from in k4FWCore that somehow gets around this so that it is possible to retrieve a podio::CollectionBase and do the relevant dynamic casting inside the class and not have the maybeRead stuff, but I don't think this is trivial to do

@jmcarcell jmcarcell force-pushed the functional branch 2 times, most recently from f0ab0a2 to 3914c0b Compare September 11, 2023 11:51
@@ -21,21 +21,172 @@

#include "k4FWCore/PodioDataSvc.h"

#include "edm4hep/MCParticleCollection.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMHO doing all this explicitly is the wrong approach and not really composable. Since the Input is always on top of the scheduling graph - can't we just push things into the event store with AnyData handles?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that we have to register objects with the type we want to get, otherwise there is the check I have mentioned in one of the comments above that will try to dynamic_cast whatever is in the store (DataWrapper<collectionType>) to what we are asking for. Pushing a DataWrapper<podio::CollectionBase> like before without templating the registration and getting back a AnyDataWrapper<class> doesn't work since you can't dynamic_cast one into the other

@jmcarcell jmcarcell force-pushed the functional branch 2 times, most recently from 0ddc479 to 3b09d32 Compare September 11, 2023 12:17
@jmcarcell
Copy link
Member Author

jmcarcell commented Sep 11, 2023

The latest commit 15e9202 (#129) simplifies the input interfaces allowing not to have to do dynamic casts and not having to use DataWrappers, the price to pay is having to template the registering of objects and having to have explicitly the EDM4hep classes. For classes that are not in EDM4hep they are registered as podio::CollectionBase so using DataWrapper and dynamic casts will be needed. This is one way of doing this, there are others possibly but this is probably from the user interface the least invasive being it almost identical to a non-k4FWCore Gaudi functional algorithm.

Copy link
Contributor

@hegner hegner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Very nice examples you put. I left a few more comments to address

README.md Outdated
cmake ..
make install
```


## Implementing algorithms
k4FWCore uses (or will use) `Gaudi::Functional` for executing algorithms. In
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd remove the "or will use". And then: "There are three different types of algorithms, depending on your use case:" And then the enumeration... if you open a ticket on me I can make the following text a bit more explicit - the complicated part is not in the 'operator' part, but in the templating. Which I would add.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a few more comments and a link to a more complete list that I found in the LHCb starterkit. If you want to change it I think you should be able to push to this branch

README.md Outdated
outputs (like `ExampleFunctionalProducerMultiple`) that can be used as a
template for the more typical case when working with multiple inputs or outputs.

`GaudiAlg` is deprecated and will be removed in future versions of Gaudi.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emphasize it visually

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean add an example here? Or at least the structure with the class definition and template arguments and operator()?


// Base class used for the Traits template argument of the
// Gaudi::Functional algorithms
struct BaseClass_t {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a badly chosen name - and not even name spaced. Do you have something more prescriptive what the purpose is?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the idea is that the Gaudi examples always use a BaseClass_t that is defined to be typically Gaudi::Algorithm so by having this the changes are minimal. See for example https://gitlab.cern.ch/gaudi/Gaudi/-/blob/master/GaudiExamples/src/FunctionalAlgorithms/MakeAndConsume.cpp#L60 and a few lines below; In k4FWCore we don't need line 60 but we need instead #include "BaseClass.h" and the rest is exactly the same as in the Gaudi examples. We could also name it differently and have it namespaced but I thought for consistency it would be simpler this way.

k4FWCore/src/PodioDataSvc.cpp Outdated Show resolved Hide resolved
@jmcarcell
Copy link
Member Author

As discussed in a private meeting, and for the sake of time for a coming conference and tutorial, this PR will be merged by the end of today if there aren't any complains. pre-commit and the builds with the nightlies both are happy, so it should be ready to go

Copy link
Contributor

@tmadlener tmadlener left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a brief look at the examples and I do have some "style" comments.

#include <string>

// Which type of collection we are reading
using colltype = edm4hep::MCParticleCollection;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this typedef? It's effectively used twice (as far as I can see), and it reduces readability IMHO. Additionally, there are other examples with multiple collections that show the usage with a typedef which are much clearer, I think.

If not removing it, at least we should give it a more descriptive name.

#include <string>

// Which collection we are producing
using colltype = edm4hep::MCParticleCollection;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a similar fashion as above, I would remove this typedef.

Comment on lines 33 to 37
using Float_t = podio::UserDataCollection<float>;
using Particle_t = edm4hep::MCParticleCollection;
using SimTrackerHit_t = edm4hep::SimTrackerHitCollection;
using TrackerHit_t = edm4hep::TrackerHitCollection;
using Track_t = edm4hep::TrackCollection;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicking here: These typedefs use _t, but the example above doesn't. I think at least in the examples we should have consistency.

Comment on lines 35 to 43
using FloatColl = podio::UserDataCollection<float>;
using ParticleColl = edm4hep::MCParticleCollection;
using SimTrackerHitColl = edm4hep::SimTrackerHitCollection;
using TrackerHitColl = edm4hep::TrackerHitCollection;
using TrackColl = edm4hep::TrackCollection;

// As a simple example, we'll write an integer and a collection of MCParticles
using Counter_t = podio::UserDataCollection<int>;
using Particle_t = edm4hep::MCParticleCollection;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here as above, unless making the in- and outputs follow different conventions was a conscious choice?

@jmcarcell
Copy link
Member Author

I had a brief look at the examples and I do have some "style" comments.

All fixed. They have been changed through many iterations and at some point I became so used to them that I stopped noticing they were around

@jmcarcell jmcarcell merged commit 4eb5cd8 into main Oct 4, 2023
4 of 9 checks passed
@jmcarcell jmcarcell deleted the functional branch October 4, 2023 18:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants