SLE Common Class Variables Should be Instance Variables #200

Open
christianmkuss opened this issue Dec 7, 2022 · 1 comment · May be fixed by #202

christianmkuss commented Dec 7, 2022

In ait/dsn/sle/common.py there is the following code block:

class SLE(object):
    ''' SLE interface "base" class
    The SLE class provides SLE interface-agnostic methods and attributes
    for interfacing with SLE.
    '''
    _state = 'unbound'
    _handlers = defaultdict(list)
    _data_queue = gevent.queue.Queue()
    _invoke_id = 0

    def __init__(self, *args, **kwargs):

Because these variables are defined at class level, they are shared by every instance of any class that inherits from SLE. This causes problems when more than one RAF, RCF, or CLTU instance is created in the same scope. Judging by the examples in ait/dsn/bin/examples, this is probably not a common use case, but it is one I ran into. It may also be avoidable by running each instance in its own thread.
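The sharing described above can be reproduced without AIT or gevent. The following is a minimal sketch, not the AIT-DSN code: Base, Raf, and Cltu are hypothetical stand-ins for SLE, RAF, and CLTU, and the standard library's queue.Queue is assumed as a substitute for gevent.queue.Queue.

```python
from collections import defaultdict
import queue


class Base:
    # Class-level attributes: created once, when the class body runs.
    _handlers = defaultdict(list)
    _data_queue = queue.Queue()  # assumption: stand-in for gevent.queue.Queue


class Raf(Base):  # hypothetical stand-in for RAF
    pass


class Cltu(Base):  # hypothetical stand-in for CLTU
    pass


raf, cltu = Raf(), Cltu()

# Both instances resolve the attribute to the same object on Base.
shared_queue = raf._data_queue is cltu._data_queue

# A handler registered through one instance is visible through the other.
raf._handlers["bind_return"].append("raf_bind_handler")
leaked_handlers = cltu._handlers["bind_return"]

# Data queued by one instance can be consumed by the other.
raf._data_queue.put(b"raf-frame")
stolen = cltu._data_queue.get_nowait()
```

Note that rebinding an attribute through self (for example self._state = 'ready') creates a per-instance attribute that shadows the class one, so the mutable containers (_handlers and _data_queue) are the ones most exposed to this kind of sharing.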

For example, I have configured a test bed that handles bind, start, and similar requests, then streams RAF frames over port 5307 and accepts CLTU over port 5112:

import time
from ait.dsn.sle import RAF, CLTU

raf_mngr = RAF(
    hostnames=['localhost'],
    port=5307,
    inst_id='sagr=LSE-SSC.spack=Test.rsl-fg=1.raf=onlc1',
    auth_level="none"
)

cltu_mngr = CLTU(
    hostnames=['localhost'],
    port=5112,
    inst_id="sagr=111.spack=-default.fsl-fg=1.cltu=cltu1",
    auth_level="none"
)

# RAF connect, bind, and start
raf_mngr.connect()

raf_mngr.bind()
while raf_mngr._state != "ready":
    time.sleep(1)

raf_mngr.start(None, None)
while raf_mngr._state != "active":
    time.sleep(1)

# CLTU connect, bind, and start
cltu_mngr.connect()

cltu_mngr.bind()
while cltu_mngr._state != "ready":
    time.sleep(1)

cltu_mngr.start()
while cltu_mngr._state != "active":
    time.sleep(1)

while True:
    time.sleep(0)

Running this produces the following output (with print(type(handler)) added to data_processor in ait.dsn.sle.common):

2022-12-07T14:41:05.827 | INFO     | Starting conn monitor for <class 'ait.dsn.sle.raf.RAF'>
2022-12-07T14:41:05.827 | INFO     | Configuring SLE connection...
2022-12-07T14:41:05.828 | INFO     | SLE connection configuration successful
2022-12-07T14:41:05.828 | INFO     | Sending Bind request ...
<class 'ait.dsn.sle.raf.RAF'>
2022-12-07T14:41:05.832 | INFO     | Bind successful
2022-12-07T14:41:06.834 | INFO     | Sending data start invocation ...
<class 'ait.dsn.sle.raf.RAF'>
2022-12-07T14:41:06.839 | INFO     | Start successful
<class 'ait.dsn.sle.raf.RAF'>
2022-12-07T14:41:07.850 | INFO     | Connection to DSN successful through localhost.
2022-12-07T14:41:07.851 | INFO     | Starting conn monitor for <class 'ait.dsn.sle.cltu.CLTU'>
2022-12-07T14:41:07.851 | INFO     | Configuring SLE connection...
2022-12-07T14:41:07.852 | INFO     | SLE connection configuration successful
2022-12-07T14:41:07.852 | INFO     | Sending Bind request ...
<class 'ait.dsn.sle.raf.RAF'>
2022-12-07T14:41:07.856 | INFO     | Bind unsuccessful. State already in READY or ACTIVE.
2022-12-07T14:41:07.856 | INFO     | Sending Peer Abort
2022-12-07T14:41:07.858 | INFO     | Bind successful
<class 'ait.dsn.sle.cltu.CLTU'>
2022-12-07T14:41:07.865 | ERROR    | Unable to decode PDU. Skipping ...
<class 'ait.dsn.sle.raf.RAF'>
<class 'ait.dsn.sle.cltu.CLTU'>
2022-12-07T14:41:09.863 | ERROR    | Unable to decode PDU. Skipping ...
<class 'ait.dsn.sle.raf.RAF'>
<class 'ait.dsn.sle.raf.RAF'>
<class 'ait.dsn.sle.cltu.CLTU'>
2022-12-07T14:41:11.874 | ERROR    | Unable to decode PDU. Skipping ...
<class 'ait.dsn.sle.raf.RAF'>
<class 'ait.dsn.sle.cltu.CLTU'>
2022-12-07T14:41:13.877 | ERROR    | Unable to decode PDU. Skipping ...

This shows that the CLTU instance is able to decode its own PDUs, but incoming RAF frames are put into a queue that is shared between CLTU and RAF, so an instance can be handed a PDU it does not know how to decode (hence the error "Unable to decode PDU. Skipping ...").
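One way to avoid the shared queue is to create these attributes per instance in __init__. The following is a minimal sketch under stated assumptions, not the actual patch: SleBase, RafLike, and CltuLike are hypothetical names, and queue.Queue again stands in for gevent.queue.Queue.

```python
from collections import defaultdict
import queue


class SleBase:
    """Hypothetical stand-in for the SLE base class."""

    def __init__(self):
        # Each instance gets its own state, handler table, queue, and counter.
        self._state = "unbound"
        self._handlers = defaultdict(list)
        self._data_queue = queue.Queue()  # assumption: stand-in for gevent.queue.Queue
        self._invoke_id = 0


class RafLike(SleBase):  # hypothetical stand-in for RAF
    pass


class CltuLike(SleBase):  # hypothetical stand-in for CLTU
    pass


raf, cltu = RafLike(), CltuLike()

# Data and handlers added through one instance stay with that instance.
raf._data_queue.put(b"raf-frame")
raf._handlers["bind_return"].append("raf_bind_handler")
```

With this layout, a frame queued for the RAF-like instance never reaches the CLTU-like instance, because each object owns a separate queue and handler table.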

@christianmkuss

I have changes made and can open a PR.

christianmkuss added a commit to christianmkuss/AIT-DSN that referenced this issue Dec 7, 2022
When running more than one class that inherits from SLE, _state,
_handlers, _data_queue, and _invoke_id become shared. The most
common issue this creates is if data is being inserted into the
data queue that is being shared. Data will be given to any of the
children even if the PDU does not match what the child knows about.

By making these instance level attributes, data is kept confined to the
individual member and is not shared.

Additionally, if a socket is closed abruptly, the _state should be
reset to "unbound" because the only way to reset it would be to destroy
the object.

patch_all is not required as the only thing being monkey patched for
gevent is time. By running patch_all, issues arise when running
non-gevent queues in calling applications.

Fixes NASA-AMMOS#200
christianmkuss added a commit to christianmkuss/AIT-DSN that referenced this issue Dec 7, 2022
When running more than one class that inherits from SLE, _state, _handlers,
_data_queue, and _invoke_id become shared. The most common issue this
creates is if data is being inserted into the data queue that is being
shared. Data will be given to any of the children even if the PDU
does not match what the child knows about.

By making these instance level attributes, data is kept confined
to the individual member and is not shared.

Additionally, if a socket is closed abruptly, the _state should
be reset to "unbound" because the only way to reset it would be
to destroy the object.

patch_all is not required as the only thing being monkey patched
for gevent is time. By running patch_all, issues arise when running
non-gevent queues in calling applications.

_telem_sock should only be created when connecting, not on class instantiation

Fixes NASA-AMMOS#200
christianmkuss added a commit to christianmkuss/AIT-DSN that referenced this issue Feb 24, 2023
When running more than one class that inherits from SLE, _state, _handlers,
_data_queue, and _invoke_id become shared. The most common issue this
creates is if data is being inserted into the data queue that is being
shared. Data will be given to any of the children even if the PDU
does not match what the child knows about.

By making these instance level attributes, data is kept confined
to the individual member and is not shared.

Additionally, if a socket is closed abruptly, the _state should
be reset to "unbound" because the only way to reset it would be
to destroy the object. The socket should be reset to None
if it is ever closed.

patch_all is not required as the only thing being monkey patched
for gevent is time. By running patch_all, issues arise when running
non-gevent queues in calling applications.

_telem_sock should only be created when connecting, not on class instantiation

Fixes NASA-AMMOS#200