Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lifetime issues with mut.Morphology constructing sections with raw pointers to itself #161

Closed
mgeplf opened this issue Feb 14, 2020 · 11 comments · Fixed by #253
Closed

Lifetime issues with mut.Morphology constructing sections with raw pointers to itself #161

mgeplf opened this issue Feb 14, 2020 · 11 comments · Fixed by #253

Comments

@mgeplf
Copy link
Contributor

mgeplf commented Feb 14, 2020

Constructs like this:
Section(this, _counter, section_) (where this is mut::Morphology` - from)

This means that a raw pointer is stored here that has a different lifetime than Morphology.

This causes a use after free if the Morphology is dtor'd before the sections that have been created are removed, for example:

from morphio.mut import Morphology
m = Morphology('simple.swc')
s = m.root_sections[0]
del m # ~mut.Morphology() called
print(s.children) # use after free

Note, this is different than #29

wizmer pushed a commit to BlueBrain/NeuroR that referenced this issue Sep 17, 2020
The cut planes must be kept alive to prevent the morphology from being seeing its ref count fall to zero.

BlueBrain/MorphIO#161
@wizmer
Copy link
Contributor

wizmer commented Feb 5, 2021

I have been thinking about this for a while. For me what seemed weird is that by design, a section is not supposed to be used after the morphology has been deleted. This is the analogue of attempting to use a node after you have deleted a graph.

One proposal I have is to add a method '_raise_if_not_alive()' call at the beginning of each function so that it raises an actual error when the user attempts to do the manipulation you wrote.

Would that be good?

@mgeplf
Copy link
Contributor Author

mgeplf commented Feb 5, 2021

For me what seemed weird is that by design, a section is not supposed to be used after the morphology has been deleted.

That's the simplified test case to aide in debugging.

The issue is that one has to know this, and work around it (hence the 'Hack for morphio lifetime issue`.) Since python is a GC'd language, the expected behavior is that it should stay alive as long as you have a reference to it.

Currently this is causing corruption that is silent, hard to track down, and can cause problems later. _raise_if_not_alive would be a start, so at least it wouldn't be silent, but it would also need to not allow corruption. It also seems like you may have the same lifetime issues.

Ping @tomdele so he can be up to speed on this.

@tomdele
Copy link
Contributor

tomdele commented Feb 17, 2021

To me this problem is kinda similar to this one I had in snap:
BlueBrain/snap#42

I spent hours on this and could not find anything usable. The del from python is a much more complex beast than the c++ dtors (even if more handy).
The del will remove your variable name from the global scope but the actual __del__() is called only when the pointer counters reached 0 for all objects.

So this is not specific to MorphIO and easily reproducible in python only:

class B:
    def __init__(self, a_object, index):
        self.a = a_object
        self.index = index

    def get_next(self):
        return self.a.get_at(self.index + 1)


class A:
    def __init__(self, number_of_b):
        self.b_list = self.init_list(self, number_of_b)

    @staticmethod
    def init_list(a_object, number_of_b):
        return [B(a_object, index) for index in range(number_of_b)]

    def get_at(self, index):
        if index < len(self.b_list):
            return self.b_list[index]
        return None


if __name__ == '__main__':
    a = A(12)
    b = a.b_list[0]
    del a  --> only remove the name "a" from the global scope and does not call __del__()
    print(b.get_next().index)

So what I can say is :

del m # ~mut.Morphology() called

is probably a false assumption here. The corresponding __del__() function is not called at this line and will be called only when the reference counter of m reaches 0. That is, when all his section are deleted.

@mgeplf
Copy link
Contributor Author

mgeplf commented Feb 17, 2021

The repro is kept as simple as possible, so that's why it uses del.

The problem is that you can have references to sections that then disappear; for instance in the linked ticket: https://github.com/BlueBrain/NeuroR/pull/22/files/0c66835b70b4b543d804eb71f49875a44deaa0d4#diff-f2fc7ab6c036c0212f190055008dc918bf038637af88bef3d07a07e0d4d28ab8R294

The work-around being that one has to have keep_alive.

@tomdele
Copy link
Contributor

tomdele commented Feb 18, 2021

We discussed this further with Benoit and I have a much better understanding of the whole issue. I think the del example was a bit misleading due to the python del behavior and the scoping from python.

@wizmer brought up a solution using weak pointers instead of the raw pointers for the Section's morphology member.

It results in a behavior that makes sense imao where :
If you extract a section and you delete the mother morphology afterwards you cannot use the section's topological methods anymore. Nonetheless, the section still exists as a standalone and the non-topological accessors are still valid.

mgeplf added a commit that referenced this issue Mar 15, 2021
* when a mut::Morphology is deleted, and it resets the mut::Section::_morphology
  to null.
* added getter to mut::Section to retrieve owning morphology that will
  throw if the above pointer is null
* attempt to fix #161
mgeplf added a commit that referenced this issue Mar 17, 2021
* when a mut::Morphology is deleted, and it resets the mut::Section::_morphology
  to null.
* added getter to mut::Section to retrieve owning morphology that will
  throw if the above pointer is null
* attempt to fix #161: user should at least get an error if the lifetimes mismatch, which should help tracking errors down
@arnaudon
Copy link

Hello!
I just saw this: https://github.com/BlueBrain/NeuroR/blob/master/neuror/main.py#L310
and I was wondering if this issue solves it, or we need to keep populating this list to keep axons alive?
If not, I'll remove that on the current PR I'm making on repair.
Thanks!

@tomdele
Copy link
Contributor

tomdele commented Mar 18, 2021

Hello!
I just saw this: https://github.com/BlueBrain/NeuroR/blob/master/neuror/main.py#L310
and I was wondering if this issue solves it, or we need to keep populating this list to keep axons alive?
If not, I'll remove that on the current PR I'm making on repair.
Thanks!

The axons should be alive but you cannot use the topological functions on the sections like .children (if I understood correctly the neuror code). But the function you are pointing is quite complex and I am not sure what s behind it and what you are doing.

Basically, now you can do :

def _get_section():
    """m is garbage collected at the end of the function"""
    m = Morphology(DATA_DIR / 'simple.swc')
    s = m.root_sections[0]
    return s

section = _get_section()
section.points  # <-- you can do that
section.children  # <-- you cannot do that

@wizmer
Copy link
Contributor

wizmer commented Mar 19, 2021

Exactly ! Wow, Tom is so good.

@arnaudon
Copy link

I never doubted that! ;) I couldn't get back to that, but it seems we need to stick to current solution in the code, thanks!

@wizmer
Copy link
Contributor

wizmer commented Mar 19, 2021

Yes, but the big difference is that now, if you try to get rid of the array that keeps the morphologies alive, you'll have a meaningful error message instead of a corrupted morphology.

@arnaudon
Copy link

Lovely!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants