[BUG]: Unit tests related to SharedProcessPool.terminate() generate stack traces and may block the remote-ci #1926

Closed
2 tasks done
yczhang-nv opened this issue Oct 2, 2024 · 0 comments · Fixed by #1929
Assignees
Labels
bug Something isn't working


Version

24.10

Which installation method(s) does this occur on?

Source

Describe the bug.

The unit tests that call SharedProcessPool.terminate() generate stack traces while killing all the subprocesses in the process pool. The stack traces are misleading to developers, and in some cases they may block the remote-ci.

To fix the issue, the stack traces should be captured or redirected elsewhere, or the tests that call SharedProcessPool.terminate() should be removed.
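
As a rough illustration of the "redirect" option, the sketch below points file descriptor 2 of each worker at a log file before the worker is terminated, so anything written to stderr by native code ends up in the file instead of the test console. This is not the change made in #1929 and does not use SharedProcessPool; `_worker`, `run_and_terminate`, and the log file name are hypothetical, and the sketch assumes a platform where the `fork` start method is available.

```python
import multiprocessing as mp
import os
import tempfile
import time


def _worker(log_path, ready):
    # Hypothetical worker: redirect fd 2 so that output written to stderr by
    # native (non-Python) code is appended to log_path.
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
    os.dup2(fd, 2)
    os.close(fd)
    # Stand-in for noisy native output such as a signal-handler stack trace.
    os.write(2, b"simulated native stderr output\n")
    ready.set()
    while True:
        time.sleep(0.1)


def run_and_terminate(num_workers=2):
    # Assumption: the "fork" start method is available (Linux, macOS).
    ctx = mp.get_context("fork")
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "worker_stderr.log")
        events = [ctx.Event() for _ in range(num_workers)]
        procs = [ctx.Process(target=_worker, args=(log_path, ev)) for ev in events]
        for proc in procs:
            proc.start()
        for ev in events:
            ev.wait(timeout=10)
        # terminate() sends SIGTERM, as in the log below.
        for proc in procs:
            proc.terminate()
        for proc in procs:
            proc.join(timeout=10)
        with open(log_path) as log_file:
            captured = log_file.read()
        return [proc.exitcode for proc in procs], captured


if __name__ == "__main__":
    exit_codes, captured = run_and_terminate()
    print(exit_codes)
    print(captured, end="")
```

Redirecting at the file-descriptor level, rather than reassigning `sys.stderr`, matters here because the traces come from a C++ signal handler that bypasses Python's stream objects.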

Minimum reproducible example

pytest -v tests/utils/test_shared_process_pool.py

Relevant log output


PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863188 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863259 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863241 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863255 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863168 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863293 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863273 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863231 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863289 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863191 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863216 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863265 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863158 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863281 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863226 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863197 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863246 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863201 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863276 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863250 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863286 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
PC: @ 0x0 (unknown)
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863297 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
*** SIGTERM (@0x3e8000cdf45) received by PID 863236 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863153 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
PC: @ 0x0 (unknown)
*** SIGTERM (@0x3e8000cdf45) received by PID 863212 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
@ 0x7f92b316e197 google::(anonymous namespace)::FailureSignalHandler()
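
Because the output of many workers is interleaved, it can help to summarize a log like the one above before reading it. The following is a minimal sketch that counts which PIDs received SIGTERM and which PID sent it; the function name `summarize_sigterm_log` and the regular expression are assumptions based only on the line format shown in this report.

```python
import re
from collections import Counter

# Matches lines such as:
# *** SIGTERM (@0x3e8000cdf45) received by PID 863188 (TID 0x7f92c523c6c0) from PID 843589; stack trace: ***
SIGTERM_LINE = re.compile(
    r"\*\*\* SIGTERM \(@0x[0-9a-fA-F]+\) received by PID (\d+) "
    r"\(TID 0x[0-9a-fA-F]+\) from PID (\d+); stack trace: \*\*\*"
)


def summarize_sigterm_log(text):
    """Return the unique receiving PIDs and a per-sender count of SIGTERM lines."""
    receivers = set()
    senders = Counter()
    for match in SIGTERM_LINE.finditer(text):
        receivers.add(int(match.group(1)))
        senders[int(match.group(2))] += 1
    return {"receivers": sorted(receivers), "senders": dict(senders)}
```

If every line reports the same sending PID, as in this log, the signals all originate from a single parent process, which is consistent with the pool being torn down by terminate().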

Full env printout

[Paste the results of print_env.sh here, it will be hidden by default]

Other/Misc.

No response

Code of Conduct

  • I agree to follow Morpheus' Code of Conduct
  • I have searched the open bugs and have found no duplicates for this bug report
@yczhang-nv yczhang-nv added the bug Something isn't working label Oct 2, 2024
@yczhang-nv yczhang-nv self-assigned this Oct 2, 2024
@morpheus-bot-test morpheus-bot-test bot moved this from Todo to Review - Ready for Review in Morpheus Boards Oct 2, 2024
@rapids-bot rapids-bot bot closed this as completed in #1929 Oct 2, 2024
@rapids-bot rapids-bot bot closed this as completed in 1a5c7a7 Oct 2, 2024
@github-project-automation github-project-automation bot moved this from Review - Ready for Review to Done in Morpheus Boards Oct 2, 2024