Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix #40 #42

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

fix #40 #42

wants to merge 6 commits into from

Conversation

tomdele
Copy link
Contributor

@tomdele tomdele commented Mar 17, 2020

No description provided.

@tomdele tomdele linked an issue Mar 17, 2020 that may be closed by this pull request
@tomdele tomdele requested a review from asanin-epfl March 17, 2020 17:16
@tomdele
Copy link
Contributor Author

tomdele commented Mar 17, 2020

tested with:

from bluepysnap import Circuit
import h5py


circuit = Circuit("circuit_config_test.json")
node_pop = circuit.nodes["population"]
node_ids = node_pop.ids()
node_get = node_pop.get(group=[1])
ns = node_pop.size

edge_pop = circuit.edges["population"]
source_node = edge_pop.source
ss = source_node.size
es = edge_pop.size
prop = edge_pop.properties([1,2,3,4], ["depression_time"])

with h5py.File("nodes.sonata", "r+") as h5:
    print(list(h5["/nodes/"]))

with h5py.File("edges.sonata", "r+") as h5:
    print(list(h5["/edges/"]))

@tomdele
Copy link
Contributor Author

tomdele commented Mar 17, 2020

not sure we need to test it ...

@asanin-epfl
Copy link
Contributor

Test is not necessary and is up to you. I see there are plenty of @cached_property used. Should we consider them also?

Copy link
Contributor

@asanin-epfl asanin-epfl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think it is good to have tests so regression won't happen. I mean if someone for some reason decides to make those properties @cached_property again, those tests would fail then.

@mgeplf
Copy link
Contributor

mgeplf commented Mar 18, 2020

Before this goes in, can we figure out what is holding the h5 handle, and why using a cached_property fixes it?

@tomdele
Copy link
Contributor Author

tomdele commented Mar 18, 2020

@mgeplf
That s the opposite: I used a cached property and now I create a new local object everytime in functions which need libsonata accessor. This object is gc at the end of the processing and the context is freed.

From a perf point of view:
For nodes, this is called once for the data collection (_data function) that are cached in a dataframe afterwards and for simplistic function such as size and property_names that are cached also. So we do not even have overheads due to the uncached population.

For edges, this is way more tricky because the _population function is used much more. The most affected function is the iter_connection because a new instance is created for every iterator. I did not think about this function yesterday. I need to find a solution here because the perf downgrade is very significant.

The problem comes from libsonata:

from libsonata import NodePopulation
import h5py

nodes = NodePopulation("nodes.sonata", "", 'hippocampus_neurons')

with h5py.File("nodes.sonata", "r+") as h5:
    print(list(h5["/nodes/"]))

Fails also for a NodeStorage.

The culprits are probably:
https://github.com/BlueBrain/libsonata/blob/master/src/population.hpp#L154
https://github.com/BlueBrain/libsonata/blob/master/src/population.hpp#L177

Since the these root groups are member of both classes, I expect them to keep the context active during all the lifetime of the object instance.

@asanin-epfl
Copy link
Contributor

asanin-epfl commented Mar 18, 2020

This problem suddenly became critical not anymore, it is due to my code in the end. It happens on PizDaint with the currently deployed version of sonata-network-reduction ([email protected], [email protected])

The error is OSError: Unable to open file (file is already open for read-only)

Status code: 200
----------------
Traceback (most recent call last):
  File "test_python.py", line 21, in <module>
    with h5py.File("edges.h5", "r+") as h5:
  File "/apps/hbp/ich002/hbp-spack-deployments/softwares/27-02-2020/install/install/cray-cnl7-haswell/gcc-8.3.0/py-h5py-2.10.0-skwvif/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/apps/hbp/ich002/hbp-spack-deployments/softwares/27-02-2020/install/install/cray-cnl7-haswell/gcc-8.3.0/py-h5py-2.10.0-skwvif/lib/python3.6/site-packages/h5py/_hl/files.py", line 175, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (file is already open for read-only)

@tomdele
Copy link
Contributor Author

tomdele commented Mar 18, 2020

Don t know what is PizDaint...

Well, I really spent a lot of time on this today and I don't think there is a miracle solution...

For nodes it is ok, but edges with iter_connection (which calls other functions a lot) or some typical "for loop" recipes like:

for node_id in node_ids:
    edges.efferent_edges(node_id)
    ...

managing the lifetime of the libsonata objects is a mess. I tried with context manager, decorators etc at the population level and there is always either a huge downgrade in perf or in "userfriendliness" or maintability.

I don't want to create private functions that takes the libsonata object as an argument with patterns like:

    def _pathway_edges(self, population, node_id, properties=None):
           ....

    def pathway_edges(self, node_id, properties=None):
       population = self._population()
        return self._pathway_edges(population, source=node_id, target=None, properties=properties)

And anyway, with that, we will still have the for loop problem.

The only viable solution is a close_context() for populations and circuit.

I tried creating a context manager for the circuit but since you cannot del the object in the exit the behaviour is strange and error prone.

with Circuit("circuit_config.json") as circuit:
    edges = circuit.edges["pop"]
    edges.size

edges = circuit.edges["pop"]
edges.efferent_nodes(12)
# the context is reopen

and I don t want to have some random flags that prevent you to open the nodes/edges if a context is open or not.

So at the end, I just added a close_context() to circuit and nodes/edges.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 18, 2020

@mgeplf , @asanin-epfl pls tell me what you think about this solution.

@asanin-epfl
Copy link
Contributor

seems fine to me.

@mgeplf
Copy link
Contributor

mgeplf commented Mar 19, 2020

let me read this more carefully than yesterday

@mgeplf
Copy link
Contributor

mgeplf commented Mar 19, 2020

how are you testing performance? Can you give an example that is particularly impacted by just opening the population each time?

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

how are you testing performance? Can you give an example that is particularly impacted by just opening the population each time?

sure let me few minutes.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

All ipython scripts start with:

In [1]: from bluepysnap import Circuit

In [2]: edges = Circuit("circuit_config.json").edges["pop"]

and are from a new call of ipython.

iter_connection:
Version with the always open context (00fd7d6):

In [3]: %timeit for e in edges.iter_connections(source=np.arange(1000), target=np.arange(1000)): pass
1.57 s ± 245 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit for e in edges.iter_connections(source=np.arange(10000), target=np.arange(10000)): pass
19.5 s ± 1.93 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

Version with a new instance everytime (3079c8c):

In [3]: %timeit for e in edges.iter_connections(source=np.arange(1000), target=np.arange(1000)): pass
3.41 s ± 470 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [4]: %timeit for e in edges.iter_connections(source=np.arange(10000), target=np.arange(10000)): pass
38.1 s ± 3.39 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

properties:

In [3]: %timeit edges.properties(np.arange(1000), list(edges.property_names)[:10])
11.6 ms ± 2.63 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit edges.properties(np.arange(1000), list(edges.property_names)[:20])
15.5 ms ± 293 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [5]: %timeit edges.properties(np.arange(100000), list(edges.property_names)[:20])
93.1 ms ± 5.45 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

vs

In [3]: %timeit edges.properties(np.arange(1000), list(edges.property_names)[:10])
18.3 ms ± 1.19 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [4]: %timeit edges.properties(np.arange(1000), list(edges.property_names)[:20])
35.5 ms ± 1.72 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [5]: %timeit edges.properties(np.arange(100000), list(edges.property_names)[:20])
97.4 ms ± 1.86 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

As expected the property is not super impacted and the overhead is constant ~1ms per new instance.

Typical for loops:

In [4]: %timeit for node in np.arange(10): edges.efferent_nodes(node)
600 ms ± 15.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit for node in np.arange(100): edges.efferent_nodes(node)
3.67 s ± 847 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit for node in np.arange(100): edges.afferent_nodes(node)
130 ms ± 7.73 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit for node in np.arange(1000): edges.afferent_nodes(node)
1.63 s ± 172 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

vs

In [4]: %timeit for node in np.arange(10): edges.efferent_nodes(node)
598 ms ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [5]: %timeit for node in np.arange(100): edges.efferent_nodes(node)
4.61 s ± 1.52 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [6]: %timeit for node in np.arange(100): edges.afferent_nodes(node)
297 ms ± 6.96 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [7]: %timeit for node in np.arange(1000): edges.afferent_nodes(node)
3.93 s ± 1.12 s per loop (mean ± std. dev. of 7 runs, 1 loop each)

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

I think you have an exemple with these values of the perf downgrade.
I don t have circuits big enough to run it with much more than 10000 nodes.

@mgeplf
Copy link
Contributor

mgeplf commented Mar 19, 2020

I'm wondering whether all this tracking is overkill: modifying the h5 file underneath of the library isn't something that I think we should handle. I'm wondering, instead, if we should just make sure that any h5 handles are purged if the circuit object is deleted (which is currently not the case). Does that work for you @asanin-epfl?

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

I'm wondering whether all this tracking is overkill: modifying the h5 file underneath of the library isn't something that I think we should handle.

I kinda agree with you for the quite specific use case.
I did not mentioned it but I also wanted to solve an other problem with that: we can't pickle libsonata objects. So having something to invalidate the cache allows circuit to be pickled which can be useful for multiprocessing.

@mgeplf
Copy link
Contributor

mgeplf commented Mar 19, 2020

we can't pickle libsonata objects

That should be handled a different way, no? Following the pickle protocol, and only saving the paths & population names? And being able to rebuild the object that way?

@asanin-epfl
Copy link
Contributor

I'm wondering w.....d (which is currently not the case). Does that work for you @asanin-epfl?

Yes! Pickling would be also great but not critical. I've managed to live without it.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

That should be handled a different way, no? Following the pickle protocol, and only saving the paths & population names? And being able to rebuild the object that way?

Yes sure it is easy to path the config and that's it. That's what aleksei is doing already.

@mgeplf
Copy link
Contributor

mgeplf commented Mar 19, 2020

Yes sure it is easy to path the config and that's it. That's what aleksei is doing already.

But that means we don't have to worry about invalidating the cache, right?

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

Just to be sure, what I mean is to path the circuit_config to the worker and building the circuit objetc inside the function:

from multiprocessing import Pool

def f(circuit_path):
    circuit = Circuit(circuit_path)
    ...

if __name__ == '__main__':
    p = Pool(n)
    print(p.map(f, circuit_path))

?

@asanin-epfl
Copy link
Contributor

But that means we don't have to worry about invalidating the cache, right?

What do you mean by that? If I open the same circuit in another process => I will have a lock failure. Currently this problem does not show up only because I have two circuits. I make a copy of original. I use the original for reading and I use the copy for writing the reduced. It is kind of hacky solution. It does not work in the new API where I provide an inplace reduction because I don't have the copied circuit.

Just drop the H5 handle when you don't use it! Please ='(

@tomdele
Copy link
Contributor Author

tomdele commented Mar 19, 2020

Btw, this :

I'm wondering, instead, if we should just make sure that any h5 handles are purged if the circuit object is deleted (which is currently not the case). Does that work for you @asanin-epfl?

was one of my first test. But this is not so easy (impossible ?) to achieve due to the multiple circular references between objects in snap. Circuit lies inside the Storages which lie inside the populations...
Since the __del__() is called when the reference count reaches zero for the circuit, this is very hard to predict when this will happen in our case and it means the user must be ultra careful when extracting objects from the circuit (ie: doing this edge = circuit.edges["pop"]).

So the only solutions I guess are:

  • context manager with a finally but the behaviour is not good see: fix #40 #42 (comment) or we need to add some flags preventing the user to reopen contexts which I don t like so much.
  • the function I proposed
  • drop this entirely

@tomdele tomdele requested a review from mgeplf March 19, 2020 15:46
@mgeplf
Copy link
Contributor

mgeplf commented Mar 20, 2020

user must be ultra careful when extracting objects from the circuit

That's fine. However, if don't have any references, and have del circuit, I would expect there to be no more handles on h5, but it seems like that's not the case. Can we handle that more gracefully?

@tomdele
Copy link
Contributor Author

tomdele commented Mar 20, 2020

just an illustration:

with :

    def __del__(self):
        """Close the context for all populations."""
        def _close_context(pop):
            """Close the h5 context for population."""
            if "_population" in pop.__dict__:
                del pop.__dict__["_population"]

        if self.nodes:
            for population in self.nodes.values():
                _close_context(population)
        if self.edges:
            for population in self.edges.values():
                _close_context(population)

and:

from bluepysnap import Circuit
import h5py

circuit = Circuit("circuit_config.json")
circuit.edges["pop"].size
del circuit

with h5py.File("edges.sonata", "r+") as h5:
    print(list(h5["/edges/"]))

it fails.

from bluepysnap import Circuit
import h5py


circuit = Circuit("circuit_config.json")
circuit.edges["pop"].size

def _close_context(pop):
    """Close the h5 context for population."""
    if "_population" in pop.__dict__:
        del pop.__dict__["_population"]

if circuit.nodes:
    for population in circuit.nodes.values():
        _close_context(population)

if circuit.edges:
    for population in circuit.edges.values():
        _close_context(population)


with h5py.File("edges.sonata", "r+") as h5:
    print(list(h5["/edges/"]))

is ok.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 20, 2020

The del in python is a bit ... misleading ...

@tomdele
Copy link
Contributor Author

tomdele commented Mar 20, 2020

As a temporary solution (if it s really critical) @asanin-epfl you can use the last snippet I gave in:
#42 (comment)

@asanin-epfl
Copy link
Contributor

Thank you. I will use it then in my code for now.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 20, 2020

Ok this is the very last thing I can think of, which is "acceptable" (thx @wizmer ):

from bluepysnap import Circuit
import h5py

with  Circuit("circuit_config.json") as circuit:
    edge = circuit.edges["pop"]
    edge.size

with h5py.File("edges.sonata", "r+") as h5:
    print(list(h5["/edges/"])) 

Works.

from bluepysnap import Circuit
import h5py

with  Circuit("circuit_config.json") as circuit:
    edge = circuit.edges["pop"]
    edge.size

circuit.edges

Will raise just like open() context manager.

But this does allow to open close open close etc ... You will need to reopen the context:

from bluepysnap import Circuit
import h5py


with  Circuit("circuit_config.json") as circuit:
    edge = circuit.edges["pop"]
    edge.size


with h5py.File("edges.sonata", "r+") as h5:
    print(list(h5["/edges/"]))

with  Circuit("circuit_config.json") as circuit:
    edge = circuit.edges["pop"]
    edge.size
...

@tomdele
Copy link
Contributor Author

tomdele commented Mar 20, 2020

People can do this:

from bluepysnap import Circuit
import h5py

with  Circuit("circuit_config.json") as circuit:
    edge = circuit.edges["pop"]
    edge.size

print(edge.property_names) 

with h5py.File("edges.sonata", "r+") as h5:
    print(list(h5["/edges/"]))

So I need to propagate the flags to the nodes/edges also.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 20, 2020

Running out of ideas ...

@asanin-epfl
Copy link
Contributor

so, any resolution here?

@mgeplf
Copy link
Contributor

mgeplf commented Mar 25, 2020

Not yet - you want to try making it so that if one del circuit, it cleans up enough to not to have any handles to h5 (ignoring ones that ppl have made themselves? - then it's not our problem)

@tomdele
Copy link
Contributor Author

tomdele commented Mar 25, 2020

I am not sure the del is the solution. People will not want to del, mutate the file, reinstantiate the circuit and all the nodes/edges, query, del, mutate, reinstantiate ...

@asanin-epfl
Copy link
Contributor

I am talking about the situation when del circuit closes what has been opened by circuit. My h5 file handles are taken care of.
I tried your snippet @tomdele. I still receive the error:

import h5py
from bluepysnap import Circuit


def close_sonata_circuit(circuit: Circuit):
    """Close h5 file handles opened by SONATA circuit

    Temporary function until https://github.com/BlueBrain/snap/pull/42 is resolved.
    Args:
        circuit: SONATA circuit
    """

    def _close_context(pop):
        """Close the h5 context for population."""
        if "_population" in pop.__dict__:
            del pop.__dict__["_population"]

    if circuit.nodes:
        for population in circuit.nodes.values():
            _close_context(population)

    if circuit.edges:
        for population in circuit.edges.values():
            _close_context(population)


circuit = Circuit("tests/data/9cells/bglibpy_circuit_config.json")
node_pop = circuit.nodes["cortex"]
node_ids = node_pop.ids()
node_get = node_pop.get(group=[1])
ns = node_pop.size

edge_pop = circuit.edges["excvirt_cortex"]
source_node = edge_pop.source
ss = source_node.size
es = edge_pop.size
prop = edge_pop.properties([1, 2, 3, 4], ["depression_time"])

close_sonata_circuit(circuit)
del circuit

with h5py.File("tests/data/9cells/network/bglibpy/nodes/cortex.h5", "r+") as h5:
    print(list(h5["/nodes/"]))
Traceback (most recent call last):
  File "deleteme.py", line 20, in <module>
    with h5py.File("tests/data/9cells/network/bglibpy/nodes/cortex.h5", "r+") as h5:
  File "/Users/sanin/workspace/sonata-network-reduction/venv/lib/python3.7/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/Users/sanin/workspace/sonata-network-reduction/venv/lib/python3.7/site-packages/h5py/_hl/files.py", line 175, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 35, error message = 'Resource temporarily unavailable')
(venv) bb-fvfxv5s0j1wt:sonata-network-reduction sanin$ python deleteme.py 
Traceback (most recent call last):
  File "deleteme.py", line 42, in <module>
    with h5py.File("tests/data/9cells/network/bglibpy/nodes/cortex.h5", "r+") as h5:
  File "/Users/sanin/workspace/sonata-network-reduction/venv/lib/python3.7/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/Users/sanin/workspace/sonata-network-reduction/venv/lib/python3.7/site-packages/h5py/_hl/files.py", line 175, in make_fid
    fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to lock file, errno = 35, error message = 'Resource temporarily unavailable')

@haleepfl
Copy link
Contributor

Given that SNAP is meant to be a productivity layer, IMO we should not be worried about supporting mutating circuits as a use case. However, we should make sure the basic functionality does not leave any dangling handles.

@tomdele
Copy link
Contributor Author

tomdele commented Mar 25, 2020

We fixed @asanin-epfl problem with: 600d638

But the question is still open: how to elegantly kill all the contexts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

API to close h5 file handler of circuit?
5 participants