Test mpi versions #20
Codecov Report
@@           Coverage Diff           @@
##             main      #20   +/-   ##
=======================================
  Coverage   74.29%   74.29%
=======================================
  Files          41       41
  Lines        2587     2587
=======================================
  Hits         1922     1922
  Misses        665      665
2f39e76 to aaa4638
aaa4638 to 75a5004
I am not sure why it is still in fail-fast mode. It would be nice to have all these tests, potentially with xfail on the configurations known to cause problems?
They are stopped because of a timeout. The tests with mpich hang at some point due to #19, and the job then waits until the maximum timeout for GitHub Actions is reached. When it is reached, the tests are cancelled.
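The hang described above could be bounded locally instead of waiting for the CI limit. The following is only a minimal sketch, not something used in this PR: the helper name `run_with_timeout` is hypothetical, and it relies on nothing beyond the standard library's `subprocess.run` timeout.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (finished, returncode).

    finished is False when the command exceeded timeout_s seconds;
    in that case subprocess.run kills the child and returncode is None.
    """
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        return False, None
    return True, proc.returncode


if __name__ == "__main__":
    # Stand-in for a hanging test run: a child process that sleeps too long.
    hanging = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hanging, timeout_s=1))
```

Note that only the direct child is killed on timeout. Processes that the child spawned itself, such as MPI workers, may survive and would need separate cleanup.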
@tomMoral The main problem for tests with mpich is that we need to run the tests with
@tomMoral As far as I understand, this message is due to the Singleton feature not being implemented in mpich (see the mpich issue on GitHub). Details are explained in #19. I think with mpich we need to run the tests with:
Note: Actually, we can use the same command for both mpich and openmpi. The hostfile formats for mpich and openmpi are not the same. When we use the above command with:
I think the openmpi version should be able to stop spawned processes properly. That makes me think that the code to stop spawned processes might not be reliable.
@tomMoral I tried using mpich with a very simple MPI program that spawns a number of processes (it gets the hostfile from the environment) to see if the problem arises from the dicodile code. With openmpi I can run the program as:
If I do the same with mpich, I get the above error, i.e.:
I think this is really due to Singleton not being implemented in mpich. I propose to change the testing command to
and fix the hanging problem and other possible problems afterwards. WDYT?
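For readers who want to reproduce this kind of check, here is a minimal sketch of a spawning program written with mpi4py. It is not the program used in this thread. The environment variable name `MPI_HOSTFILE`, the helper names, and the use of the `hostfile` info key (documented by Open MPI; whether mpich honours it is not assumed here) are all assumptions.

```python
import os
import sys

# Worker body: connect to the parent and disconnect again right away.
WORKER_CODE = "from mpi4py import MPI; MPI.Comm.Get_parent().Disconnect()"


def hostfile_from_env(env, var="MPI_HOSTFILE"):
    """Return the hostfile path stored in env[var], or None if unset or empty.

    The variable name is an assumption made for this sketch.
    """
    path = env.get(var, "").strip()
    return path or None


def spawn_workers(n_workers):
    """Spawn n_workers Python processes and return the intercommunicator."""
    # Imported lazily so the helper above can be used without MPI installed.
    from mpi4py import MPI

    info = MPI.Info.Create()
    hostfile = hostfile_from_env(os.environ)
    if hostfile is not None:
        # "hostfile" is an implementation-specific info key (assumption).
        info.Set("hostfile", hostfile)
    return MPI.COMM_SELF.Spawn(
        sys.executable, args=["-c", WORKER_CODE], maxprocs=n_workers, info=info
    )


if __name__ == "__main__":
    comm = spawn_workers(2)
    comm.Disconnect()  # matches the Disconnect call made by each worker
```

Starting such a script directly with the Python interpreter relies on MPI singleton initialisation, whereas starting it under the MPI launcher does not, which is the distinction at stake in this comment.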
fac864b to cf55093
Runs the tests:
Tests with openmpi on ubuntu-18.04 fail due to #12.
Tests with mpich on both ubuntu-18.04 and ubuntu-20.04 fail due to #19.
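The xfail idea raised earlier in the conversation could be applied to the failing configurations listed above roughly as follows. This is a sketch only: the environment variables `CI_MPI_IMPL` and `CI_OS` are hypothetical names that the workflow would have to export, and `test_example` stands in for a real test.

```python
import os

import pytest

# Configurations reported as failing in this PR, mapped to the related issue.
KNOWN_FAILURES = {
    ("openmpi", "ubuntu-18.04"): "#12",
    ("mpich", "ubuntu-18.04"): "#19",
    ("mpich", "ubuntu-20.04"): "#19",
}


def known_failure(mpi_impl, os_name):
    """Return the issue reference for a known-bad configuration, else None."""
    return KNOWN_FAILURES.get((mpi_impl, os_name))


# Hypothetical variables describing the current CI matrix entry.
_ISSUE = known_failure(
    os.environ.get("CI_MPI_IMPL", ""), os.environ.get("CI_OS", "")
)


@pytest.mark.xfail(
    _ISSUE is not None,
    reason=f"known failure on this configuration, see {_ISSUE}",
    strict=False,
)
def test_example():
    # Placeholder body; a real test would exercise the MPI code path.
    assert True
```

An xfail marker only covers tests that fail. A test that hangs, as reported for mpich in #19, would still stall the job until a timeout is hit.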