-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix #40 #42
base: master
Are you sure you want to change the base?
fix #40 #42
Conversation
tested with:
|
not sure we need to test it ... |
Test is not necessary and is up to you. I see there are plenty of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I think it is good to have tests so regression won't happen. I mean if someone for some reason decides to make those properties @cached_property again, those tests would fail then.
Before this goes in, can we figure out what is holding the h5 handle, and why using a |
@mgeplf From a perf point of view: For edges, this is way more tricky because the _population function is used much more. The most affected function is the iter_connection because a new instance is created for every iterator. I did not think about this function yesterday. I need to find a solution here because the perf downgrade is very significant. The problem comes from libsonata:
Fails also for a NodeStorage. The culprits are probably: Since the these root groups are member of both classes, I expect them to keep the context active during all the lifetime of the object instance. |
This problem suddenly became The error is
|
Don t know what is PizDaint... Well, I really spent a lot of time on this today and I don't think there is a miracle solution... For nodes it is ok, but edges with iter_connection (which calls other functions a lot) or some typical "for loop" recipes like:
managing the lifetime of the libsonata objects is a mess. I tried with context manager, decorators etc at the population level and there is always either a huge downgrade in perf or in "userfriendliness" or maintability. I don't want to create private functions that takes the libsonata object as an argument with patterns like:
And anyway, with that, we will still have the for loop problem. The only viable solution is a I tried creating a context manager for the circuit but since you cannot del the object in the exit the behaviour is strange and error prone.
and I don t want to have some random flags that prevent you to open the nodes/edges if a context is open or not. So at the end, I just added a close_context() to circuit and nodes/edges. |
@mgeplf , @asanin-epfl pls tell me what you think about this solution. |
seems fine to me. |
let me read this more carefully than yesterday |
how are you testing performance? Can you give an example that is particularly impacted by just opening the population each time? |
sure let me few minutes. |
All ipython scripts start with:
and are from a new call of ipython. iter_connection:
Version with a new instance everytime (3079c8c):
properties:
vs
As expected the property is not super impacted and the overhead is constant ~1ms per new instance. Typical for loops:
vs
|
I think you have an exemple with these values of the perf downgrade. |
I'm wondering whether all this tracking is overkill: modifying the h5 file underneath of the library isn't something that I think we should handle. I'm wondering, instead, if we should just make sure that any h5 handles are purged if the |
I kinda agree with you for the quite specific use case. |
That should be handled a different way, no? Following the pickle protocol, and only saving the paths & population names? And being able to rebuild the object that way? |
Yes! Pickling would be also great but not critical. I've managed to live without it. |
Yes sure it is easy to path the config and that's it. That's what aleksei is doing already. |
But that means we don't have to worry about invalidating the cache, right? |
Just to be sure, what I mean is to path the circuit_config to the worker and building the circuit objetc inside the function:
? |
What do you mean by that? If I open the same circuit in another process => I will have a lock failure. Currently this problem does not show up only because I have two circuits. I make a copy of original. I use the original for reading and I use the copy for writing the reduced. It is kind of hacky solution. It does not work in the new API where I provide an inplace reduction because I don't have the copied circuit. Just drop the H5 handle when you don't use it! Please ='( |
Btw, this :
was one of my first test. But this is not so easy (impossible ?) to achieve due to the multiple circular references between objects in snap. Circuit lies inside the Storages which lie inside the populations... So the only solutions I guess are:
|
That's fine. However, if don't have any references, and have |
just an illustration: with :
and:
it fails.
is ok. |
The del in python is a bit ... misleading ... |
As a temporary solution (if it s really critical) @asanin-epfl you can use the last snippet I gave in: |
Thank you. I will use it then in my code for now. |
Ok this is the very last thing I can think of, which is "acceptable" (thx @wizmer ):
Works.
Will raise just like open() context manager. But this does allow to open close open close etc ... You will need to reopen the context:
|
People can do this:
So I need to propagate the flags to the nodes/edges also. |
Running out of ideas ... |
so, any resolution here? |
Not yet - you want to try making it so that if one |
I am not sure the |
I am talking about the situation when import h5py
from bluepysnap import Circuit
def close_sonata_circuit(circuit: Circuit):
"""Close h5 file handles opened by SONATA circuit
Temporary function until https://github.com/BlueBrain/snap/pull/42 is resolved.
Args:
circuit: SONATA circuit
"""
def _close_context(pop):
"""Close the h5 context for population."""
if "_population" in pop.__dict__:
del pop.__dict__["_population"]
if circuit.nodes:
for population in circuit.nodes.values():
_close_context(population)
if circuit.edges:
for population in circuit.edges.values():
_close_context(population)
circuit = Circuit("tests/data/9cells/bglibpy_circuit_config.json")
node_pop = circuit.nodes["cortex"]
node_ids = node_pop.ids()
node_get = node_pop.get(group=[1])
ns = node_pop.size
edge_pop = circuit.edges["excvirt_cortex"]
source_node = edge_pop.source
ss = source_node.size
es = edge_pop.size
prop = edge_pop.properties([1, 2, 3, 4], ["depression_time"])
close_sonata_circuit(circuit)
del circuit
with h5py.File("tests/data/9cells/network/bglibpy/nodes/cortex.h5", "r+") as h5:
print(list(h5["/nodes/"]))
|
Given that SNAP is meant to be a productivity layer, IMO we should not be worried about supporting mutating circuits as a use case. However, we should make sure the basic functionality does not leave any dangling handles. |
We fixed @asanin-epfl problem with: 600d638 But the question is still open: how to elegantly kill all the contexts? |
No description provided.