Memory leak in IOSvc? #249

Closed
giovannimarchiori opened this issue Oct 9, 2024 · 6 comments

@giovannimarchiori
Contributor

giovannimarchiori commented Oct 9, 2024

Dear experts,

after migrating my code from k4DataSvc to IOSvc, my reconstruction jobs use a very large amount of memory. It is so large that if I run over many events, or run many jobs in parallel on my 96-core machine (as I used to do without problems with k4DataSvc), the system memory (512 GB of RAM) is exhausted and I start getting many OS24 errors (too many open files).
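
For reference, a minimal Python sketch for looking up that error and the per-process open-file limit. It assumes that "OS24" refers to errno 24 (EMFILE on Linux); the helper names `describe_errno` and `open_file_limits` are hypothetical and not part of k4FWCore or Gaudi.

```python
import errno
import os
import resource


def describe_errno(code):
    """Return the symbolic name and the OS message for an errno value."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    # Assumption: the reported "OS24" corresponds to errno 24.
    name, message = describe_errno(24)
    soft, hard = open_file_limits()
    print(f"errno 24: {name} ({message})")
    print(f"open-file limit: soft={soft}, hard={hard}")
```

Comparing the soft limit with the number of jobs started in parallel can help tell whether the error comes from the descriptor limit itself or is a side effect of the memory exhaustion.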

I have checked that with a very simple steering script, which only reads a ROOT file produced with ddsim and writes it to a new file without running any other algorithms, a job using IOSvc can take 20 GB of RAM (as observed in the output of free -h), while with k4DataSvc the free RAM stays stable during the job.
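
As a complement to watching free -h, which reports system-wide memory, the peak resident memory of a single job can be measured directly. The following is a minimal sketch, assuming a Unix system and Python 3.9 or newer; the function name `peak_rss_mib` and the script name `peak_rss.py` are hypothetical, and the numbers it reports are per process rather than system-wide.

```python
import os
import subprocess
import sys


def peak_rss_mib(cmd):
    """Run cmd to completion; return (exit code, peak resident set size in MiB)."""
    proc = subprocess.Popen(cmd)
    # os.wait4 reaps this specific child and returns its resource usage.
    _, status, usage = os.wait4(proc.pid, 0)
    # Record the exit code on the Popen object, since the child is already reaped.
    proc.returncode = os.waitstatus_to_exitcode(status)
    # ru_maxrss is reported in KiB on Linux and in bytes on macOS.
    divisor = 1024.0 * 1024.0 if sys.platform == "darwin" else 1024.0
    return proc.returncode, usage.ru_maxrss / divisor


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python peak_rss.py k4run <steering script>
    code, rss = peak_rss_mib(sys.argv[1:])
    print(f"exit code {code}, peak RSS {rss:.1f} MiB")
```

Running it once per steering script gives one number per job, which makes the IOSvc and k4DataSvc cases directly comparable.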

I put my input file and two scripts, using either IOSvc or k4DataSvc, on lxplus

  • OS version: rockylinux9 (should be ~identical to alma9)
  • Compiler version: GCC 14
  • Package version: main branch
  • Reproduced by:
source /cvmfs/sw-nightlies.hsf.org/key4hep/setup.sh -r 2024-10-09 
k4run ~gmarchio/public/iosvc/digi_reco_iosvc.py # => large memory consumption
k4run ~gmarchio/public/iosvc/digi_reco_noiosvc.py # => negligible memory consumption

Could you please have a look?

Thanks a lot,
Giovanni

Tagging @BrieucF @jmcarcell @tmadlener

@BrieucF
Contributor

BrieucF commented Oct 10, 2024

I can indeed reproduce the problem on Alma9 machines.

So I tried to regenerate a SIM file with the 2024-10-09 nightlies and the behavior for this new file looks as expected. Maybe that could be the problem?

What I find weird is that the podio and edm4hep versions used to generate ~gmarchio/public/iosvc/ALLEGRO_sim.root are the same as those shipped with 2024-10-09.

@giovannimarchiori
Contributor Author

How many events are there in your new file? Mine had 2000 and was pretty big - not sure if that matters

@tmadlener
Contributor

Thanks for the report. I will have a look to figure out whether it's an issue with podio or whether this is something in the IOSvc.

@BrieucF
Contributor

BrieucF commented Oct 10, 2024

> How many events are there in your new file? Mine had 2000 and was pretty big - not sure if that matters

I generated 1000 events to have something similar to what you had. If useful, it is here: /afs/cern.ch/user/b/brfranco/work/public/giovanni_leak/ddsim_output_edm4hep.root.

@jmcarcell
Member

There is a leak in the Writer (when running without writing to an output file I don't see that much memory usage) that #250 fixes for that case, but I'm not sure yet if that's complete.

@andresailer
Contributor

Memory leak plugged

5 participants