Output of the failing test run:

```
0 Exception
[hande-VirtualBox:04908] [[59073,0],0] ORTE_ERROR_LOG: Not found in file orted/pmix/pmix_server_dyn.c at line 87
1 Exception
[hande-VirtualBox:04908] [[59073,0],0] ORTE_ERROR_LOG: Not found in file orted/pmix/pmix_server_dyn.c at line 87
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 6 slots
that were requested by the application:
/home/hande/dev/dicodile/env/bin/python
Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
2 Exception
[hande-VirtualBox:04908] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
[hande-VirtualBox:04908] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
3 Exception
[hande-VirtualBox:04908] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
4 Exception
[hande-VirtualBox:04908] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
5 Exception
[hande-VirtualBox:04932] OPAL ERROR: Timeout in file base/pmix_base_fns.c at line 195
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_dpm_dyn_init() failed
--> Returned "Timeout" (-15) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
```
Output of the `plot_mandrill.py` run described below:

```
Replace is False and data exists, so doing nothing. Use replace=True to re-download the data.
[DEBUG:DICODILE] Lambda_max = 11.274413430904202
0 Exception
[hande-VirtualBox:05655] [[58362,0],0] ORTE_ERROR_LOG: Not found in file orted/pmix/pmix_server_dyn.c at line 87
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 9 slots
that were requested by the application:
/home/hande/dev/dicodile/env/bin/python
Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
1 Exception
[hande-VirtualBox:05655] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
[hande-VirtualBox:05655] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
2 Exception
[hande-VirtualBox:05655] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
3 Exception
[hande-VirtualBox:05655] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
4 Exception
[hande-VirtualBox:05655] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
5 Exception
[hande-VirtualBox:05655] 1 more process has sent help message help-orte-rmaps-base.txt / orte-rmaps-base:alloc-error
6 Exception
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
ompi_dpm_dyn_init() failed
--> Returned "Timeout" (-15) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
*** and potentially your MPI job)
[hande-VirtualBox:05664] OPAL ERROR: Timeout in file base/pmix_base_fns.c at line 195
```
Unit tests fail on Ubuntu 18.04 with Open MPI 2.1.1 after renaming `dicodile.py` to `_dicodile.py` and exposing the `dicodile` function in `__init__.py`.

While running the test `dicodile/update_z/tests/test_dicod.py::test_stopping_criterion[6-signal_support0-atom_support0]`, it returns the output shown above.
Each exception occurs at line 127 of `dicodile/dicodile/workers/reusable_workers.py` (commit `1b54bac`), while trying to spawn workers at line 120 of the same file.
The code spawns the specified number of processes (6 in this case). The processes start executing the specified `main_worker.py` script; however, execution stops at line 6 of `dicodile/dicodile/workers/main_worker.py` (commit `1b54bac`), where it tries to import from the `dicodile` package. I've tried adding lines before the import: all of them run up to the import line, but then it fails silently.

For `nb_workers = [1, 2]`, the code runs without problems. For `nb_workers = 6`, it raises an exception while spawning processes.
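One way to surface this kind of failure early, instead of a silent hang, would be to check the hostfile's slot budget before spawning. The sketch below is hypothetical and not part of dicodile's API: `count_hostfile_slots` and `check_can_spawn` are illustrative helpers, and the hostfile format assumed is Open MPI's plain `hostname slots=N` style.

```python
def count_hostfile_slots(lines):
    """Sum the slots=N entries of an Open MPI-style hostfile.

    Each line looks like "hostname slots=16"; a host listed with no
    slots=N field is counted as one slot. Comments (#) are ignored.
    """
    total = 0
    for line in lines:
        line = line.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        slots = 1  # default when no slots=N field is present
        for field in line.split()[1:]:
            if field.startswith("slots="):
                slots = int(field.split("=", 1)[1])
        total += slots
    return total


def check_can_spawn(nb_workers, hostfile_lines, already_used=1):
    """Raise before spawning if nb_workers would exceed the remaining slots.

    already_used defaults to 1 because the master process launched by
    `mpirun -np 1` occupies one slot of the same allocation.
    """
    available = count_hostfile_slots(hostfile_lines) - already_used
    if nb_workers > available:
        raise RuntimeError(
            f"requested {nb_workers} workers but only {available} slots remain"
        )


print(count_hostfile_slots(["localhost slots=16"]))  # -> 16
check_can_spawn(6, ["localhost slots=16"])           # fits, no error
```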
I thought the code was not able to access `hostfile_test`; however, I realized that the loop starting at line 110 of `dicodile/dicodile/workers/reusable_workers.py` (commit `1b54bac`) keeps running and spawns the specified number of processes in each iteration. It only complains about insufficient slots at the iteration where the number of slots in `hostfile_test` would be exceeded.

For example, in the run above, `hostfile_test` specifies 16 slots. In the 1st iteration the loop spawns 6 processes and then raises an exception; however, those processes continue to run. In the 2nd iteration it starts 6 more processes, 12 in total. In the 3rd iteration, as only 3 slots are left, it complains that there are not enough slots. I tried the same with 20 slots, and it complained in the 4th iteration, after initializing 18 processes in the first 3.
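The slot arithmetic described above can be sketched as a small model (purely illustrative, not dicodile code). It assumes that the master process launched by `mpirun -np 1` counts against the allocation, and that workers from earlier iterations keep their slots because they never exit:

```python
def first_failing_iteration(total_slots, nb_workers, master_slots=1):
    """Return the 1-based iteration at which spawning nb_workers more
    processes would exceed total_slots, given that previously spawned
    workers keep running and their slots stay occupied."""
    used = master_slots  # mpirun -np 1 takes one slot for the master
    iteration = 0
    while True:
        iteration += 1
        if used + nb_workers > total_slots:
            return iteration  # this spawn request cannot be satisfied
        used += nb_workers  # earlier workers never exit, slots stay taken


# 16 slots: iterations 1 and 2 succeed (1 + 12 = 13 slots used, 3 left),
# so the 3rd request for 6 slots fails.
print(first_failing_iteration(16, 6))  # -> 3
# 20 slots: 18 workers fit in the first 3 iterations, the 4th fails.
print(first_failing_iteration(20, 6))  # -> 4
```

Under these assumptions the model reproduces both observations: failure at the 3rd iteration with 16 slots and at the 4th with 20.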
I see a similar problem while running the `plot_mandrill.py` example with 16 slots in the hostfile, using the command `mpirun -np 1 --hostfile hostfile python -m mpi4py examples/plot_mandrill.py` (output shown above).
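For context, a single-host Open MPI hostfile granting 16 slots typically looks like the following (the hostname here is illustrative):

```
localhost slots=16
```

With `mpirun -np 1`, the master `python` process appears to count against the same allocation, which would leave 15 slots for spawned workers and would match the "3 slots left" count after two rounds of 6 workers described above.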