Feature request: Improved memory management for root files #628

Open
GitFuchs opened this issue Dec 9, 2024 · 4 comments

@GitFuchs
Contributor

GitFuchs commented Dec 9, 2024

Hi,

I tried to run a simulation that stores phase space data, and the amount of data stored can be extensive.
Currently, all data is kept in memory until the end of the simulation and only then written to the hard drive.
Depending on the size of the phase space, this can be a few gigabytes of data.

Do you think it would be feasible to implement some kind of direct storage, e.g. writing the data to disk every x MB?

All the best,
Hermann
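
To make the request above concrete, here is a minimal sketch of "write to disk every x MB" in plain Python. It does not use ROOT or any GATE API: the `ChunkedWriter` class, the 32-byte record layout, and the file name are all hypothetical and only illustrate the buffering idea.

```python
import os
import struct
import tempfile


class ChunkedWriter:
    """Hypothetical helper: buffer records in memory and append them
    to a file whenever the buffer reaches a byte threshold."""

    def __init__(self, path, flush_bytes):
        self.path = path
        self.flush_bytes = flush_bytes
        self.buffer = bytearray()
        self.flush_count = 0
        # Start from an empty file so repeated runs do not append to old data.
        open(self.path, "wb").close()

    def add(self, x, y, z, energy):
        # One phase-space hit packed as four little-endian float64 values (32 bytes).
        self.buffer += struct.pack("<4d", x, y, z, energy)
        if len(self.buffer) >= self.flush_bytes:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        with open(self.path, "ab") as f:
            f.write(self.buffer)
        self.buffer.clear()
        self.flush_count += 1

    def close(self):
        # Write whatever is left below the threshold.
        self.flush()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "phsp.bin")
    writer = ChunkedWriter(path, flush_bytes=1024)
    for i in range(100):
        writer.add(float(i), 0.0, 0.0, 1.5)
    writer.close()
    return os.path.getsize(path), writer.flush_count
```

With this scheme the memory held by the writer never exceeds roughly `flush_bytes` plus one record, regardless of how many hits are stored in total.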

@BishopWolf

BishopWolf commented Dec 9, 2024

> Do you think it would be feasible to implement a kind of direct storage?

Like using heap space for this? I think this is the default, no? @tbaudier is this handled in Python or C++?

> store the data to disk every x MB?

That would be awfully difficult: we would need several ROOT files, one per thread, each with exclusive access, then combine them every x minutes, and combine all the ROOT files at the end of the simulation. I think it is better to keep the ROOT data in heap space and store on disk only temporary results from operations that are not memory-intensive, like stats or images.

Anyway, my 2c
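
The scheme described above (one file per thread with exclusive access, merged at the end) can be sketched as follows. This is a plain-Python illustration with CSV text files standing in for ROOT files; the `worker`, `merge`, and `run` names are hypothetical and not part of GATE. For real ROOT files, the `hadd` utility shipped with ROOT performs the merge step.

```python
import os
import tempfile
import threading


def worker(path, thread_id, n_events):
    # Each thread has exclusive access to its own file, so no locking is needed.
    with open(path, "w") as f:
        for i in range(n_events):
            f.write(f"{thread_id},{i}\n")


def merge(part_paths, merged_path):
    # Concatenate the per-thread files in a fixed order, then remove them.
    with open(merged_path, "w") as out:
        for p in part_paths:
            with open(p) as part:
                for line in part:
                    out.write(line)
            os.remove(p)


def run(n_threads=4, n_events=1000):
    workdir = tempfile.mkdtemp()
    parts = [os.path.join(workdir, f"part_{t}.csv") for t in range(n_threads)]
    threads = [
        threading.Thread(target=worker, args=(parts[t], t, n_events))
        for t in range(n_threads)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    merged = os.path.join(workdir, "merged.csv")
    merge(parts, merged)
    # Return the number of merged records as a simple sanity check.
    with open(merged) as f:
        return sum(1 for _ in f)
```

The merge here runs only once, after all threads have finished; a periodic merge during the run would additionally need the workers to pause or rotate their files, which is the difficult part mentioned above.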

@dsarrut
Contributor

dsarrut commented Dec 11, 2024

Hi! I don't think the data are kept in memory and only written at the end. ROOT should take care of the disk/memory tradeoff automatically. Are you sure?

@GitFuchs
Contributor Author

Hi,

Looking at the output of the currently running simulation, it very much looks like it.
Here is the top output for the currently running (multithreaded) simulation:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31822 hfuchs 20 0 29,1g 19,2g 246528 S 685,7 30,7 16604:40 python

The ROOT files are created on disk but are only about 400 bytes (!) each; most likely that is just the header.
I am expecting (in addition to some images) 3 output ROOT files of about 7 GB each.
Looking at the RAM usage, that seems to match.
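
As a rough check of the numbers above, the small sketch below converts the memory columns of the quoted top row to GiB and compares them with the expected 3 x 7 GB of output. It assumes the procps conventions for top (VIRT is the virtual address space, RES the resident memory, bare numbers are KiB, and the `g` suffix means GiB) and a locale with a decimal comma; the helper name `top_size_to_gib` is made up for this example. On-disk ROOT files are usually compressed, so the comparison is only approximate.

```python
UNIT_TO_GIB = {"k": 1 / 1024**2, "m": 1 / 1024, "g": 1.0, "t": 1024.0}


def top_size_to_gib(field):
    """Convert a top memory field such as '29,1g' or '246528' to GiB.
    A bare number is assumed to be in KiB."""
    field = field.strip().replace(",", ".")  # the locale uses a decimal comma
    suffix = field[-1].lower()
    if suffix in UNIT_TO_GIB:
        return float(field[:-1]) * UNIT_TO_GIB[suffix]
    return float(field) / 1024**2


row = "31822 hfuchs 20 0 29,1g 19,2g 246528 S 685,7 30,7 16604:40 python"
cols = row.split()
# Columns 5 to 7 of the row are VIRT, RES and SHR.
virt, res, shr = (top_size_to_gib(c) for c in cols[4:7])

# Expected output: 3 files of about 7 GB (decimal) each, converted to GiB.
expected_gib = 3 * 7e9 / 1024**3

print(f"VIRT {virt:.1f} GiB, RES {res:.1f} GiB, SHR {shr:.2f} GiB")
print(f"expected output about {expected_gib:.1f} GiB")
```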

@BishopWolf

> Looking at the RAM usage, that seems to match.

That's not RAM but virtual memory (heap). In any case, a utility that merges the ROOT files from memory and writes them to disk would be needed. This should be done no more than once per hour, as it is really intensive and would slow down the simulation for some minutes. The only advantage is that you would have temporary results, so this would be useful only for very long simulations.
