Bug / Crash for Spam #69

Open
josephsalmon opened this issue Mar 24, 2022 · 5 comments

Comments

@josephsalmon
Contributor

Currently, the code on main for spams leads to crashes:

image

@gdurif any idea why and when this started?

@gdurif

gdurif commented Mar 24, 2022

Yes, the problem comes from the conda package python-spams (which I do not maintain), possibly this issue: conda-forge/python-spams-feedstock#67

@mathurinm reported this in getspams/spams-python#17, and we are trying to solve the problem in #66.

@josephsalmon
Contributor Author

Sorry for the duplicate of #66, then.
I am closing this issue.

@gdurif

gdurif commented Mar 24, 2022

We can keep this issue open until the problem is solved: #66 is a pull request, so someone looking to report the problem could easily miss it.

@josephsalmon
Contributor Author

OK. I am reopening it until it is properly resolved.

josephsalmon reopened this Mar 24, 2022
@josephsalmon
Contributor Author

Not sure if this is related, but running a benchmark with spams now leads to the following error message on my machine:

exception calling callback for <Future at 0x7eff7e2c6e80 state=finished raised TerminatedWorkerError>
Traceback (most recent call last):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 359, in __call__
    self.parallel.dispatch_next()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 794, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 531, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/reusable_executor.py", line 177, in submit
    return super(_ReusablePoolExecutor, self).submit(
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 1115, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {SIGSEGV(-11)}
Traceback (most recent call last):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/bin/benchopt", line 33, in <module>
    sys.exit(load_entry_point('benchopt', 'console_scripts', 'benchopt')())
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/jsalmon/Documents/OpenSource/benchOpt/benchopt/cli/main.py", line 199, in run
    run_benchmark(
  File "/home/jsalmon/Documents/OpenSource/benchOpt/benchopt/runner.py", line 302, in run_benchmark
    results = Parallel(n_jobs=n_jobs)(
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 1056, in __call__
    self.retrieve()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 935, in retrieve
    self._output.extend(job.get(timeout=self.timeout))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 542, in wrap_future_result
    return future.result(timeout=timeout)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/concurrent/futures/_base.py", line 444, in result
    return self.__get_result()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
    raise self._exception
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
    callback(self)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 359, in __call__
    self.parallel.dispatch_next()
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 794, in dispatch_next
    if not self.dispatch_one_batch(self._original_iterator):
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 531, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/reusable_executor.py", line 177, in submit
    return super(_ReusablePoolExecutor, self).submit(
  File "/home/jsalmon/anaconda3/envs/benchopt_benchmark_lasso/lib/python3.8/site-packages/joblib/externals/loky/process_executor.py", line 1115, in submit
    raise self._flags.broken
joblib.externals.loky.process_executor.TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker.

The exit codes of the workers are {SIGSEGV(-11)}
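
Since the workers exit with SIGSEGV, one way to narrow this down might be to load the package in a fresh interpreter, outside joblib, and look at the exit code. The snippet below is only a sketch under assumptions: the module name `spams` and the helper names `import_exit_code` and `describe` are illustrative, and it only tests the import, so a crash that happens later inside a solver call would not be caught.

```python
import signal
import subprocess
import sys


def import_exit_code(module_name):
    """Import `module_name` in a fresh interpreter and return the exit code."""
    proc = subprocess.run(
        # -X faulthandler makes the child dump a Python traceback on a fatal signal.
        [sys.executable, "-X", "faulthandler", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    return proc.returncode


def describe(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a return code of -N means the child was killed by signal N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with code {returncode}"


if __name__ == "__main__":
    # "spams" is the module name assumed for this check.
    print(describe(import_exit_code("spams")))
```

If the import alone is reported as killed by a signal, the problem is likely in the installed binary package rather than in the benchmark or in joblib.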
