Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat!: link dRICH PID with Reconstructed*ChargedParticles #829

Merged
merged 16 commits into from
Sep 2, 2023

Conversation

c-dilks
Copy link
Member

@c-dilks c-dilks commented Aug 4, 2023

Briefly, what does this PR introduce?

  • Rename algorithm+factory+config */tracking/ParticlesWithTruthPID -> */pid/ParticlesWithPID
    • this is more a PID linking algorithm and less a tracking algorithm, since it links PID to tracks
    • it now links dRICH PID, in addition to the MC PDG, using the same proximity matching procedure the original ParticlesWithTruthPID had
    • some flexibility is left open for other PID detectors to add their links
    • the relevant part of the plugin is moved from tracking.cc to pid.cc
    • documentation (mermaid) updated
  • produce new collections Reconstructed*ChargedParticleIDs, to hold the final edm4hep::ParticleID collections
    • algorithm ConvertParticleID.h is used to convert from edm4eic::CherenkovParticleID to edm4hep::ParticleID
  • delete deprecated factory IrtParticleID_factory

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue #__)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No

Does this PR change default behavior?

Yes, this PR will enable dRICH PID in production, since Reconstructed*ChargedParticles collections now depend on dRICH PID; cf. #734

@c-dilks
Copy link
Member Author

c-dilks commented Aug 4, 2023

I don't have time to work on this further, so commits are welcome on this branch by anyone willing to finish the job.

@c-dilks c-dilks mentioned this pull request Aug 4, 2023
30 tasks
@wdconinc
Copy link
Contributor

Two comments.

While I agree that we should move away from truth PID, and I think this should get merged by September, we don't have (and won't have) any realistic PID by then for the pfRICH and for the DIRC. In the short term we will either need to run with both the truth PID and a realistic PID which is incomplete, or with a mixed bag of realistic PID where available, with truth PID as fall back. I don't think we can remove the truth PID entirely though.

Do you have an idea of what is causing the crashes, and is someone looking into this? Again, I think this PR should be a priority for getting ready by September, so it would be good if we can make sure we know who is going to take it on. I am happy to do so if no one else is.

@c-dilks
Copy link
Member Author

c-dilks commented Aug 14, 2023

In the short term we will either need to run with both the truth PID and a realistic PID which is incomplete, or with a mixed bag of realistic PID where available, with truth PID as fall back.

This is what this PR proposes: the truth PID is still used, and this PR adds the dRICH PID for charged particles that pass through its radiators. I renamed ___/tracking/ParticlesWithTruthPID.* to ___/pid/ParticlesWithPID.*, so take a look at the diff between ParticlesWithPID on this branch and ParticlesWithTruthPID on main:

header diff
2c2
< // Copyright (C) 2022 Sylvester Joosten, Wouter Deconinck, Dmitry Romanov
---
> // Copyright (C) 2022, 2023, Sylvester Joosten, Wouter Deconinck, Dmitry Romanov, Christopher Dilks
9a10
> #include <cstddef>
16a18,19
> #include <edm4eic/CherenkovParticleIDCollection.h>
> #include <edm4hep/ParticleIDCollection.h>
19c22,24
< #include "ParticlesWithTruthPIDConfig.h"
---
> #include "ParticlesWithPIDConfig.h"
> #include "ConvertParticleID.h"
> #include "Tools.h"
24c29,33
<     using ParticlesWithAssociationNew = std::pair<std::unique_ptr<edm4eic::ReconstructedParticleCollection>, std::unique_ptr<edm4eic::MCRecoParticleAssociationCollection>>;
---
>     struct ParticlesWithAssociationNew {
>         std::unique_ptr<edm4eic::ReconstructedParticleCollection>     parts;
>         std::unique_ptr<edm4eic::MCRecoParticleAssociationCollection> assocs;
>         std::unique_ptr<edm4hep::ParticleIDCollection>                pids;
>     };
26c35
<     class ParticlesWithTruthPID : public WithPodConfig<ParticlesWithTruthPIDConfig> {
---
>     class ParticlesWithPID : public WithPodConfig<ParticlesWithPIDConfig> {
34c43,45
<                 const edm4eic::TrackParametersCollection* track_params);
---
>                 const edm4eic::TrackParametersCollection* track_params,
>                 std::vector<const edm4eic::CherenkovParticleIDCollection*> cherenkov_pid_collections
>                 );
40a52,57
> 
>         bool linkCherenkovPID(
>                 edm4eic::MutableReconstructedParticle& in_part,
>                 const edm4eic::CherenkovParticleIDCollection& in_pids,
>                 edm4hep::ParticleIDCollection& out_pids
>                 );
source diff
2c2
< // Copyright (C) 2022 Sylvester Joosten, Wouter Deconinck, Dmitry Romanov
---
> // Copyright (C) 2022, 2023, Sylvester Joosten, Wouter Deconinck, Dmitry Romanov, Christopher Dilks
4c4
< #include "ParticlesWithTruthPID.h"
---
> #include "ParticlesWithPID.h"
6a7
> #include <edm4hep/utils/vector_utils.h>
12c13
<     void ParticlesWithTruthPID::init(std::shared_ptr<spdlog::logger> logger) {
---
>     void ParticlesWithPID::init(std::shared_ptr<spdlog::logger> logger) {
17c18
<     ParticlesWithAssociationNew ParticlesWithTruthPID::process(
---
>     ParticlesWithAssociationNew ParticlesWithPID::process(
19c20,22
<             const edm4eic::TrackParametersCollection* track_params) {
---
>             const edm4eic::TrackParametersCollection* track_params,
>             std::vector<const edm4eic::CherenkovParticleIDCollection*> cherenkov_pid_collections
>             ) {
24,25c27,30
<         auto reco_particles = std::make_unique<edm4eic::ReconstructedParticleCollection>();
<         auto associations = std::make_unique<edm4eic::MCRecoParticleAssociationCollection>();
---
>         ParticlesWithAssociationNew out_colls;
>         out_colls.parts  = std::make_unique<edm4eic::ReconstructedParticleCollection>();
>         out_colls.assocs = std::make_unique<edm4eic::MCRecoParticleAssociationCollection>();
>         out_colls.pids   = std::make_unique<edm4hep::ParticleIDCollection>();
106c111
<             auto rec_part = edm4eic::MutableReconstructedParticle();
---
>             auto rec_part = out_colls.parts->create();
130c135
<             rec_part.setGoodnessOfPID(1); // perfect PID
---
>             rec_part.setGoodnessOfPID(0); // assume no PID until proven otherwise
133,134c138,147
<             // Add reconstructed particle to collection BEFORE doing association
<             reco_particles->push_back(rec_part);
---
> 
>             // link Cherenkov PID objects
>             for (const auto& cherenkov_pids : cherenkov_pid_collections) {
>                 auto success = linkCherenkovPID(rec_part, *cherenkov_pids, *(out_colls.pids));
>                 if (success)
>                     m_log->trace("      true PDG vs. CherenkovPID PDG: {:>10} vs. {:<10}",
>                             best_pid,
>                             rec_part.getParticleIDUsed().isAvailable() ? rec_part.getParticleIDUsed().getPDG() : 0
>                             );
>             }
138c151
<                 auto rec_assoc = edm4eic::MutableMCRecoParticleAssociation();
---
>                 auto rec_assoc = out_colls.assocs->create();
146,148d158
<                 // Add association to collection
<                 associations->push_back(rec_assoc);
< 
161a172,178
> 
>                     m_log->trace(" Assoc PDGs: sim.PDG | rec.PDG | rec.particleIDUsed.PDG = {:^6} | {:^6} | {:^6}",
>                                  rec_assoc.getSim().getPDG(),
>                                  rec_assoc.getRec().getPDG(),
>                                  rec_assoc.getRec().getParticleIDUsed().isAvailable() ? rec_assoc.getRec().getParticleIDUsed().getPDG() : 0);
> 
> 
170,171c187
<         // Assembling the results
<         return std::make_pair(std::move(reco_particles), std::move(associations));
---
>         return out_colls;
174c190
<     void ParticlesWithTruthPID::tracePhiToleranceOnce(const double sinPhiOver2Tolerance, double phiTolerance) {
---
>     void ParticlesWithPID::tracePhiToleranceOnce(const double sinPhiOver2Tolerance, double phiTolerance) {
179a196,326
>     }
> 
> 
>     /* link PID objects to input particle
>      * - finds `CherenkovParticleID` object in `in_pids` associated to particle `in_part`
>      *   by proximity matching to the associated track
>      * - converts this `CherenkovParticleID` object's PID hypotheses to `ParticleID` objects,
>      *   relates them to `in_part`, and adds them to the collection `out_pids` for persistency
>      * - returns `true` iff PID objects were found and linked
>      */
>     bool ParticlesWithPID::linkCherenkovPID(
>             edm4eic::MutableReconstructedParticle& in_part,
>             const edm4eic::CherenkovParticleIDCollection& in_pids,
>             edm4hep::ParticleIDCollection& out_pids
>             )
>     {
> 
>         // skip this particle, if neutral
>         if (std::abs(in_part.getCharge()) < 0.001)
>             return false;
> 
>         // structure to store list of candidate matches
>         struct ProxMatch {
>             double      match_dist;
>             std::size_t pid_idx;
>         };
>         std::vector<ProxMatch> prox_match_list;
> 
>         // get input reconstructed particle momentum angles
>         auto in_part_p   = in_part.getMomentum();
>         auto in_part_eta = edm4hep::utils::eta(in_part_p);
>         auto in_part_phi = edm4hep::utils::angleAzimuthal(in_part_p);
>         m_log->trace("Input particle: (eta,phi) = ( {:>5.4}, {:>5.4} deg )",
>                 in_part_eta,
>                 in_part_phi * 180.0 / M_PI
>                 );
> 
>         // loop over input CherenkovParticleID objects
>         for (std::size_t in_pid_idx = 0; in_pid_idx < in_pids.size(); in_pid_idx++) {
>             auto in_pid = in_pids.at(in_pid_idx);
> 
>             // get charged particle track associated to this CherenkovParticleID object
>             auto in_track = in_pid.getChargedParticle();
>             if (!in_track.isAvailable()) {
>                 m_log->error("found CherenkovParticleID object with no chargedParticle");
>                 return false;
>             }
>             if (in_track.points_size() == 0) {
>                 m_log->error("found chargedParticle for CherenkovParticleID, but it has no TrackPoints");
>                 return false;
>             }
> 
>             // get averge momentum direction of the track's TrackPoints
>             decltype(edm4eic::TrackPoint::momentum) in_track_p{0.0, 0.0, 0.0};
>             for (const auto& in_track_point : in_track.getPoints())
>                 in_track_p = in_track_p + ( in_track_point.momentum / in_track.points_size() );
>             auto in_track_eta = edm4hep::utils::eta(in_track_p);
>             auto in_track_phi = edm4hep::utils::angleAzimuthal(in_track_p);
> 
>             // calculate dist(eta,phi)
>             auto match_dist = std::hypot(
>                     in_part_eta - in_track_eta,
>                     in_part_phi - in_track_phi
>                     );
> 
>             // check if the match is close enough: within user-specified tolerances
>             auto match_is_close =
>                 std::abs(in_part_eta - in_track_eta) < m_cfg.etaTolerance &&
>                 std::abs(in_part_phi - in_track_phi) < m_cfg.phiTolerance;
>             if (match_is_close)
>                 prox_match_list.push_back(ProxMatch{match_dist, in_pid_idx});
> 
>             // logging
>             m_log->trace("  - (eta,phi) = ( {:>5.4}, {:>5.4} deg ),  match_dist = {:<5.4}{}",
>                     in_track_eta,
>                     in_track_phi * 180.0 / M_PI,
>                     match_dist,
>                     match_is_close ? " => CLOSE!" : ""
>                     );
> 
>         } // end loop over input CherenkovParticleID objects
> 
>         // check if at least one match was found
>         if (prox_match_list.size() == 0) {
>             m_log->warn("no matching CherenkovParticleID found for this particle");
>             return false;
>         }
> 
>         // choose the closest matching CherenkovParticleID object corresponding to this input reconstructed particle
>         auto closest_prox_match = *std::min_element(
>                 prox_match_list.begin(),
>                 prox_match_list.end(),
>                 [] (ProxMatch a, ProxMatch b) { return a.match_dist < b.match_dist; }
>                 );
>         auto in_pid_matched = in_pids.at(closest_prox_match.pid_idx);
>         m_log->trace("  => best match: match_dist = {:<5.4} at idx = {}",
>                 closest_prox_match.match_dist,
>                 closest_prox_match.pid_idx
>                 );
> 
>         // convert `CherenkovParticleID` object's hypotheses => set of `ParticleID` objects
>         auto out_pid_index_map = ConvertParticleID::ConvertToParticleIDs(in_pid_matched, out_pids, true);
>         if (out_pid_index_map.size() == 0) {
>             m_log->error("found CherenkovParticleID object with no hypotheses");
>             return false;
>         }
> 
>         // relate matched ParticleID objects to output particle
>         bool first = true;
>         for (const auto& [out_pids_index, out_pids_id] : out_pid_index_map) {
>             const auto& out_pid = out_pids->at(out_pids_index);
>             if (out_pid.id() != out_pids_id) { // sanity check
>                 m_log->error("indexing error in `edm4eic::ParticleID` collection");
>                 return false;
>             }
>             in_part.addToParticleIDs(out_pid);
>             if (first) {
>                 in_part.setParticleIDUsed(out_pid); // highest likelihood is the first
>                 in_part.setGoodnessOfPID(1); // FIXME: not used yet, aside from 0=noPID vs 1=hasPID
>                 first = false;
>             }
>         }
> 
>         // trace logging
>         m_log->trace("    {:.^50}"," PID results ");
>         m_log->trace("      Hypotheses (sorted):");
>         for (auto out_pid : in_part.getParticleIDs())
>             Tools::PrintHypothesisTableLine(m_log, out_pid, 8);
>         m_log->trace("    {:'^50}","");
> 
>         return true;

@c-dilks
Copy link
Member Author

c-dilks commented Aug 14, 2023

Do you have an idea of what is causing the crashes, and is someone looking into this?

Looking into it now...

Again, I think this PR should be a priority for getting ready by September, so it would be good if we can make sure we know who is going to take it on. I am happy to do so if no one else is.

It seems that both @chchatte92 and I have other higher priorities, so it would be great if you could take over here. We have this tested in reconstruction_benchmarks, though admittedly this final PID linking part is the least tested algorithm we use; it would be good to have unit testing here too (perhaps as a separate PR).

@github-actions github-actions bot added topic: documentation Improvements or additions to documentation topic: tracking Relates to tracking reconstruction topic: PID Relates to PID reconstruction labels Aug 28, 2023
@c-dilks
Copy link
Member Author

c-dilks commented Aug 30, 2023

More conflicts with main... fixing now...

@c-dilks
Copy link
Member Author

c-dilks commented Aug 30, 2023

Do you have an idea of what is causing the crashes, and is someone looking into this?

I was not able to figure out this issue.

src/algorithms/pid/ParticlesWithPID.cc Outdated Show resolved Hide resolved
src/algorithms/pid/ParticlesWithPID.cc Outdated Show resolved Hide resolved
src/algorithms/pid/ParticlesWithPID.h Outdated Show resolved Hide resolved
src/algorithms/pid/ParticlesWithPID.h Outdated Show resolved Hide resolved
@wdconinc
Copy link
Contributor

Do you have an idea of what is causing the crashes, and is someone looking into this?

I was not able to figure out this issue.

In any case, it seems that this doens't cause any crashes anymore (but, it may be because it's not enabled by default)?

Some minor comments on review.

Looking at the diffs with main and not quite sure why this is seeing things where there shouldn't be diffs.

c-dilks and others added 2 commits August 30, 2023 17:42
@c-dilks
Copy link
Member Author

c-dilks commented Aug 30, 2023

Do you have an idea of what is causing the crashes, and is someone looking into this?

I was not able to figure out this issue.

In any case, it seems that this doens't cause any crashes anymore (but, it may be because it's not enabled by default)?

Some minor comments on review.

Looking at the diffs with main and not quite sure why this is seeing things where there shouldn't be diffs.

This PR will enable dRICH PID by default, since now ReconstructedChargedParticles depends on DRICHMergedIrtCherenkovParticleID, from the dRICH algorithm DAG.

EDIT: so we will get edm4hep::ParticleID objects linked to each reconstructed charged particle (that passed through the dRICH).

// link charged particles to PID and to MC truth
app->Add(new JChainMultifactoryGeneratorT<ParticlesWithPID_factory>(
"ChargedParticlesWithAssociations",
{
"MCParticles", // edm4hep::MCParticle
"outputTrackParameters", // edm4eic::TrackParameters
"DRICHMergedIrtCherenkovParticleID" // edm4eic::CherenkovParticleID
},
{
"ReconstructedChargedParticles", // edm4eic::ReconstructedParticle
"ReconstructedChargedParticleAssociations", // edm4eic::MCRecoParticleAssociation
"ReconstructedChargedParticleIDs" // edm4hep::ParticleID
},
link_cfg,
app
));

@wdconinc
Copy link
Contributor

we will get edm4hep::ParticleID objects

Right, those are expected. I'm also seeing diffs in the calorimeter hits... That's the unexpected part.

Copy link
Contributor

@wdconinc wdconinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, there's this change which will be required since we don't have outputTrackParameters anymore.

src/global/pid/pid.cc Outdated Show resolved Hide resolved
src/global/pid/pid.cc Outdated Show resolved Hide resolved
@wdconinc
Copy link
Contributor

Ok, reenabling the tracking branch and the crash is back :-) No freebies here.

Copy link
Contributor

@wdconinc wdconinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong with my previous suggestion to replace outputTrackParameters with CentralCKFTrackParameters. We actually replaced that at the last minute with CentralCKFTrajectories.

Now this doesn't segfault because it gets passed the wrong collections (really, segfault, that's not even what should happen)...

src/global/pid/pid.cc Outdated Show resolved Hide resolved
src/global/pid/pid.cc Outdated Show resolved Hide resolved
@wdconinc
Copy link
Contributor

Based on local tests, this will still segfault but now in src/algorithms/reco/ElectronReconstruction.cc,

out_electrons->push_back(reco_part.clone());

@wdconinc
Copy link
Contributor

The remaining issue here manifests as a memory error (stack-use-after-scope) that fails the execution due to the address sanitizer:

What I think is happening is the following. Consider:

std::map<const char*, double> rIndices;
rIndices.insert({RadiatorName(kGas).c_str(), 1.00076});
rIndices.insert({RadiatorName(kAerogel).c_str(), 1.0190});
rIndices.insert({"Filter", 1.5017});
for (auto const& [rName, rIndex] : rIndices) {
auto rad = m_irtDetector->GetRadiator(rName);
if (rad)
rad->SetReferenceRefractiveIndex(rIndex);
}

with RadiatorName defined as follows:
static std::string RadiatorName(int num, std::shared_ptr<spdlog::logger> m_log=nullptr) {
if(num==kAerogel) return "Aerogel";
else if(num==kGas) return "Gas";
else {
PrintError(m_log, "unknown radiator number {}", num);
return "UNKNOWN_RADIATOR";
}
}

  • RadiatorName(kGas) returns a (unnamed temporary) string, which is in the scope of the initializer list only. The temporary is destroyed when the initializer list closes, or at the latest at the end of the line (lifetime of temporaries is only for as long as needed in the expression).
  • The c_str() on the string gets a const char* to the content of the temporary, valid for as long as the string exists and not invalidated when the string goes out of scope.
  • The map ends up storing a const char* that points to temporary memory which is deallocated almost right away.
  • When attempting to use rName this immediately fails.

Something like this may fix the issue, just avoiding const char*:

diff --git a/src/services/geometry/richgeo/IrtGeoDRICH.cc b/src/services/geometry/richgeo/IrtGeoDRICH.cc
index 10f28996..8276f44f 100644
--- a/src/services/geometry/richgeo/IrtGeoDRICH.cc
+++ b/src/services/geometry/richgeo/IrtGeoDRICH.cc
@@ -170,12 +170,12 @@ void richgeo::IrtGeoDRICH::DD4hep_to_IRT() {
   } // sector loop
 
   // set reference refractive indices // NOTE: numbers may be overridden externally
-  std::map<const char*, double> rIndices;
-  rIndices.insert({RadiatorName(kGas).c_str(),     1.00076});
-  rIndices.insert({RadiatorName(kAerogel).c_str(), 1.0190});
+  std::map<const std::string, double> rIndices;
+  rIndices.insert({RadiatorName(kGas),     1.00076});
+  rIndices.insert({RadiatorName(kAerogel), 1.0190});
   rIndices.insert({"Filter",                       1.5017});
   for (auto const& [rName, rIndex] : rIndices) {
-    auto rad = m_irtDetector->GetRadiator(rName);
+    auto rad = m_irtDetector->GetRadiator(rName.c_str());
     if (rad)
       rad->SetReferenceRefractiveIndex(rIndex);
   }

@chchatte92 @c-dilks I'm going to let you fix this (within this PR), since it is likely that there are other places in the code base where you might have used a similar pattern and you would know best where those cases might be.

@wdconinc
Copy link
Contributor

Local check with the proposed diff passes when address sanitizer enabled.

Copy link
Contributor

@wdconinc wdconinc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remaining issue with the address sanitizer was due to not having #880 merged in this branch.

Not sure why it was resulting in a heap-user-after-free on the reconstructed particle associations, but likely the incorrect associations were showing to podio or JANA2 that something was not in use anymore when it still was accessible.

@wdconinc wdconinc enabled auto-merge September 1, 2023 19:55
auto-merge was automatically disabled September 1, 2023 19:58

Merge queue setting changed

@wdconinc wdconinc added this pull request to the merge queue Sep 2, 2023
Merged via the queue into main with commit b23007f Sep 2, 2023
@wdconinc wdconinc deleted the pr/link-drich-pid branch September 2, 2023 01:36
github-merge-queue bot pushed a commit that referenced this pull request May 6, 2024
The original approach in #829 was to add more functionality into a
single factory. This pushes back on that by separating the two
responsibilities into two algorithms. This should improve modularity.
This also converts algorithms to the algorithms interface. The factory
code is rewritten from scratch and licensed to LGPL3.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
ajentsch pushed a commit that referenced this pull request May 20, 2024
The original approach in #829 was to add more functionality into a
single factory. This pushes back on that by separating the two
responsibilities into two algorithms. This should improve modularity.
This also converts algorithms to the algorithms interface. The factory
code is rewritten from scratch and licensed to LGPL3.

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: documentation Improvements or additions to documentation topic: PID Relates to PID reconstruction topic: tracking Relates to tracking reconstruction
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants