Description
In the CI Benchmark workflow, the benchmark step always reports success, even when the benchmarks fail.
This happens because asv run … always writes to both standard output and standard error, so the mere presence of stderr output would make the CI job look like a failure regardless of whether the benchmarks passed or failed. For this reason, the return value from the ASV application (0: success, non-zero: failure) is ignored; instead, the output is stored in a log, which is then searched to decide whether some error occurred.
Evidence
https://github.com/tardis-sn/tardis/actions/runs/9033563191/job/24824186527
The workflow step “Run benchmarks for last 5 commits if not PR” passed, even though none of the benchmarks were executed and the run reported an error.
· Discovering benchmarks
·· Uninstalling from mamba-py3.12
·· Building 7e7069a6 <master> for mamba-py3.12
·· Installing 7e7069a6 <master> into mamba-py3.12
…
File "/home/runner/work/tardis/tardis/.asv/env/0a7f40a14f159f43256c541ac3f740f8/lib/python3.12/importlib/__init__.py", line 90, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 995, in exec_module
File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
File "/home/runner/work/tardis/tardis/benchmarks/benchmark_base.py", line 18, in <module>
from tardis.transport.montecarlo import NumbaModel, opacity_state_initialize
ImportError: cannot import name 'NumbaModel' from 'tardis.transport.montecarlo' (/home/runner/work/tardis/tardis/.asv/env/0a7f40a14f159f43256c541ac3f740f8/lib/python3.12/site-packages/tardis/transport/montecarlo/__init__.py)
·· Failed to build the project and import the benchmark suite.
Actual source code
.github/workflows/benchmarks.yml
- name: Run benchmarks for last 5 commits if not PR
  if: github.event_name != 'pull_request_target'
  run: |
    git log -n 5 --pretty=format:"%H" >> tag_commits.txt
    asv run HASHFILE:tag_commits.txt | tee asv-output.log
    if grep -q failed asv-output.log; then
      echo "Some benchmarks have failed!"
      exit 1
    fi
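A note on this step: because the output of asv run is piped into tee, the exit status the shell reports for that line is normally tee's, not asv's (unless pipefail is enabled), which is another way the return value gets lost. A minimal bash sketch, for illustration only, of how asv's own exit code could still be captured:

```bash
# Illustration only: recover asv's exit code even though its output is piped
# into tee. PIPESTATUS is bash-specific; asv-output.log is the file name
# already used in the workflow step above.
asv run HASHFILE:tag_commits.txt | tee asv-output.log
ASV_EXIT_CODE="${PIPESTATUS[0]}"   # exit code of asv run, not of tee
echo "asv exited with ${ASV_EXIT_CODE}"
```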
Conclusion / Proposal
Some of these proposals could be combined; others could be applied separately.
Improve the grep command grep -q failed asv-output.log (a possible variant is sketched after these proposals).
Research whether adding the ignore-case flag resolves the problem.
-i, --ignore-case: ignore case distinctions in patterns and data.
Add more search words that match other messages reflecting an error.
Research, in the ASV documentation or source code, whether we can catch a specific return code for failing benchmarks.
In my brief experiments with ASV, the return codes are:
Executed: time asv run --verbose --show-stderr master^\!; echo "${?}".
0: Success.
1: Fatal error; ASV could not build the benchmarks.
2: Error during the execution of the benchmarks.
I don't have more ideas; so far, this looks like a good start.
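For the grep-related proposals above, a possible variant is sketched below. It is untested, and the extra pattern "error" is an assumption about what ASV prints, not a verified list of its messages.

```bash
# Sketch only: broaden the existing check by ignoring case and matching
# more than one error word. The word list is an assumption, not taken
# from ASV's documented output.
if grep -i -q -E "failed|error" asv-output.log; then
    echo "Some benchmarks have failed!"
    exit 1
fi
```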
Erratum
If all the benchmarks run well, the returned error code is 0 (success). For this reason, I think we don't need the condition if grep -q failed asv-output.log; then. But we need to explore more cases to find ways to break it.
Update 01 [2024-05-10 13:29 CST]
Executed: time asv run --verbose --show-stderr --bench transport_montecarlo_opacities HASHFILE:tag_commits.txt; echo "${?}". It returned code 2 (runtime error).
A possible solution is to check whether there are results to publish in the folder .asv/results/MACHINE_ID/. On my computer, I got this:
Command: time asv run --verbose --show-stderr --bench transport_montecarlo_opacities HASHFILE:tag_commits.txt; echo "${?}"
Result:
ls -lh .asv/results/9fb9a8132f7e/
-rw-r--r-- 1 root root 12213 May 10 13:01 303b0d39-mamba-py3.12.json
-rw-r--r-- 1 root root 1610 May 10 13:01 328ec77d-mamba-py3.12.json
-rw-r--r-- 1 root root 20967 May 10 12:57 7e7069a6-mamba-py3.12.json
-rw-r--r-- 1 root root 12219 May 10 12:58 8d70aaa5-mamba-py3.12.json
-rw-r--r-- 1 root root 12204 May 10 13:00 b668802a-mamba-py3.12.json
-rw-r--r-- 1 root root 205 May 10 12:56 machine.json
If I remove the .asv folder and run ASV with a failing build, the result is:
Command: time asv run --verbose --show-stderr --bench transport_montecarlo_opacities HASHFILE:tag_commits.txt; echo "${?}"
Result:
ls -lh .asv/results/9fb9a8132f7e/
-rw-r--r-- 1 root root 205 May 10 13:23 machine.json
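Based on those two listings, a minimal sketch of a "were results generated?" check, assuming the machine directory can be searched automatically and that machine.json should not count as a result:

```bash
# Sketch only: count result files under .asv/results/, excluding the
# machine.json metadata file seen in the listings above.
RESULT_COUNT=$(find .asv/results -name "*.json" ! -name "machine.json" | wc -l)
if [ "${RESULT_COUNT}" -eq 0 ]; then
    echo "No benchmark results were generated!"
    exit 1
fi
```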
Conclusion
With this information and these experiments, the best solution looks like this:
Create a condition that checks the returned error code. If it is 1 (build failure), mark this job in the workflow as failed.
If the error code is 2 (runtime error), check whether any benchmark results were generated; if none were, return exit 1 to fail the job.
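Putting both conditions together, a minimal sketch of what the run block of the workflow step could become. The exit-code meanings are the ones observed in the experiments above; the PIPESTATUS capture and the find-based results check are the hypothetical snippets sketched earlier, not ASV-provided features.

```bash
# Sketch only, not a tested replacement for the current workflow step.
git log -n 5 --pretty=format:"%H" >> tag_commits.txt

asv run HASHFILE:tag_commits.txt | tee asv-output.log
ASV_EXIT_CODE="${PIPESTATUS[0]}"   # asv's own exit code (bash-specific)

if [ "${ASV_EXIT_CODE}" -eq 1 ]; then
    # 1: fatal error, ASV could not build the benchmarks
    echo "ASV failed to build the benchmarks!"
    exit 1
elif [ "${ASV_EXIT_CODE}" -eq 2 ]; then
    # 2: runtime error; fail only if no results were produced at all
    RESULT_COUNT=$(find .asv/results -name "*.json" ! -name "machine.json" | wc -l)
    if [ "${RESULT_COUNT}" -eq 0 ]; then
        echo "No benchmark results were generated!"
        exit 1
    fi
fi
```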