Inevitable recursion error from inside get_result #1539

Closed
araghukas opened this issue Mar 3, 2023 · 3 comments · Fixed by #1859 · May be fixed by #1949
Labels: bug, Team Psi

Comments

@araghukas
Contributor

Environment

  • Covalent version: 0.209.1
  • Python version: 3.8
  • Operating system: macOS

What is happening?

If ct.get_result(dispatch_id, wait=True) takes long enough, eventually a RecursionError will be raised.

Traceback (most recent call last):
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/araghukas/Code/algo-multikernel/software/scripts/qcc_net/__main__.py", line 289, in <module>
    fire.Fire(QCCExperiment, name="qcc_net")
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/Users/araghukas/Code/algo-multikernel/software/scripts/qcc_net/qcc_experiment.py", line 77, in run
    records = self._run(random_seeds, self.data_dimension)
  File "/Users/araghukas/Code/algo-multikernel/software/scripts/qcc_net/qcc_experiment.py", line 165, in _run
    return self._execute(kwargs)
  File "/Users/araghukas/Code/algo-multikernel/software/scripts/qcc_net/qcc_experiment.py", line 179, in _execute
    records = ct.get_result(dispatch_id, wait=True).result
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/covalent/_results_manager/results_manager.py", line 57, in get_result
    result = _get_result_from_dispatcher(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/covalent/_results_manager/results_manager.py", line 108, in _get_result_from_dispatcher
    response = http.get(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/requests/sessions.py", line 600, in get
    return self.request("GET", url, **kwargs)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 878, in urlopen
    return self.urlopen(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 878, in urlopen
    return self.urlopen(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 878, in urlopen
    return self.urlopen(
  [Previous line repeated 964 more times]
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 449, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 444, in _make_request
    httplib_response = conn.getresponse()
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/http/client.py", line 1348, in getresponse
    response.begin()
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/http/client.py", line 335, in begin
    self.headers = self.msg = parse_headers(self.fp)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/http/client.py", line 234, in parse_headers
    return email.parser.Parser(_class=_class).parsestr(hstring)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/parser.py", line 67, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/parser.py", line 56, in parse
    feedparser.feed(data)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/feedparser.py", line 176, in feed
    self._call_parse()
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/feedparser.py", line 180, in _call_parse
    self._parse()
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/feedparser.py", line 295, in _parsegen
    if self._cur.get_content_maintype() == 'message':
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/message.py", line 594, in get_content_maintype
    ctype = self.get_content_type()
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/message.py", line 578, in get_content_type
    value = self.get('content-type', missing)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/_policybase.py", line 316, in header_fetch_parse
    return self._sanitize_header(name, value)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/_policybase.py", line 287, in _sanitize_header
    if _has_surrogates(value):
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/email/utils.py", line 57, in _has_surrogates
    s.encode()
RecursionError: maximum recursion depth exceeded while calling a Python object

How can we reproduce the issue?

Any workflow that takes long enough will trigger this error.

import time
import covalent as ct

@ct.lattice
@ct.electron
def workflow():
    time.sleep(7200)
    return 42

dispatch_id = ct.dispatch(workflow)()
result = ct.get_result(dispatch_id, wait=True)
print(result)

What should happen?

get_result(dispatch_id, wait=True) should keep waiting until the workflow finishes, without being interrupted by a RecursionError.

Any suggestions?

  1. Add a second except block inside get_result() to catch the RecursionError.
  2. Wrap the entire function body in a while True loop and return once pickle.loads() succeeds:
def get_result(dispatch_id: str, wait: bool = False, dispatcher_addr: str = None) -> Result:
    # Keep polling until a result has been fetched and unpickled.
    while True:
        try:
            result = _get_result_from_dispatcher(
                dispatch_id,
                wait,
                dispatcher_addr,
            )
            return pickle.loads(codecs.decode(result["result"].encode(), "base64"))

        except RecursionError as ex:
            # The HTTP retry chain ran out of stack; log it and start a fresh request.
            app_log.warning(ex)

        except MissingLatticeRecordError:
            app_log.warning(
                f"Dispatch ID {dispatch_id} was not found in the database. Incorrect dispatch id."
            )
            # Bare raise preserves the original traceback.
            raise
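
Until something like this lands in the library, the same idea can be applied on the caller's side. The following is a minimal sketch, not Covalent code: call_with_restart and make_flaky are hypothetical names, and the flaky function only simulates a call that raises RecursionError a couple of times before succeeding. Unlike the while True loop above, it caps the number of restarts so that a persistent failure still surfaces.

```python
import time


def call_with_restart(fn, max_restarts=3, pause=0.0):
    """Call fn(); on RecursionError, call it again from a fresh stack.

    Hypothetical helper: gives up and re-raises after max_restarts restarts.
    """
    for attempt in range(max_restarts + 1):
        try:
            return fn()
        except RecursionError:
            if attempt == max_restarts:
                raise
            time.sleep(pause)


def make_flaky(failures):
    """Build a stand-in for a long poll that fails `failures` times, then returns 42."""
    state = {"left": failures}

    def flaky():
        if state["left"] > 0:
            state["left"] -= 1
            raise RecursionError("simulated: retry chain too deep")
        return 42

    return flaky


value = call_with_restart(make_flaky(2))
print(value)  # 42
```

In the setting of this issue, fn would be something like lambda: ct.get_result(dispatch_id, wait=True), with a non-zero pause between restarts.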
araghukas added the bug label on Mar 3, 2023
@santoshkumarradha
Member

santoshkumarradha commented Oct 14, 2023

@Prasy12 can you add this one as well?
(Low priority)

Prasy12 added the Team Psi label on Oct 16, 2023
ArunPsiog linked a pull request on Nov 22, 2023 that will close this issue
@cjao
Contributor

cjao commented Nov 27, 2023

It's curious why the retry logic for get_result() involves any recursion.

@araghukas
Contributor Author

> It's curious why the retry logic for get_result() involves any recursion.

@cjao I think the recursion comes from urllib3 rather than from get_result() itself: its .urlopen() method returns a call to itself on each retry, so every retry adds another stack frame.

Relevant lines from the error msg above:

  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 878, in urlopen
    return self.urlopen(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 878, in urlopen
    return self.urlopen(
  File "/Users/araghukas/miniconda3_x86_64/envs/mkl-rosetta/lib/python3.8/site-packages/urllib3/connectionpool.py", line 878, in urlopen
    return self.urlopen(
  [Previous line repeated 964 more times]
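
To illustrate the mechanism, here is a minimal sketch, not urllib3's actual code. The names NotReady, make_fetch, poll_recursive and poll_iterative are hypothetical. poll_recursive has the same shape as the repeated `return self.urlopen(` frames above, so each retry deepens the stack; poll_iterative applies the same retry policy in a loop and keeps the stack depth constant.

```python
import sys


class NotReady(Exception):
    """Hypothetical stand-in for a 'result not ready yet' response."""


def make_fetch(ready_after):
    """Return a fake fetch() that fails ready_after times, then returns 42."""
    calls = {"n": 0}

    def fetch():
        calls["n"] += 1
        if calls["n"] <= ready_after:
            raise NotReady()
        return 42

    return fetch


def poll_recursive(fetch):
    # Retry by calling ourselves: one extra stack frame per retry.
    try:
        return fetch()
    except NotReady:
        return poll_recursive(fetch)


def poll_iterative(fetch):
    # Retry in a loop: the stack does not grow, however many retries it takes.
    while True:
        try:
            return fetch()
        except NotReady:
            continue


# Require more retries than the interpreter's recursion limit allows.
attempts = sys.getrecursionlimit() + 100

try:
    poll_recursive(make_fetch(attempts))
    recursive_outcome = "ok"
except RecursionError:
    recursive_outcome = "RecursionError"

iterative_outcome = poll_iterative(make_fetch(attempts))
print(recursive_outcome, iterative_outcome)  # RecursionError 42
```

This is why a long enough wait eventually fails: with one frame per retry, the number of retries available before the error is bounded by the recursion limit, not by the retry settings.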
