
Integration tests seem to be non-deterministic and regularly fail on main #1108

Open
james-garner-canonical opened this issue Sep 23, 2024 · 3 comments
Labels: kind/bug, kind/CI, kind/doc, kind/test, priority/high

Comments

@james-garner-canonical (Contributor) commented Sep 23, 2024

EDIT: See my comment below for a table of current test failures on main.

Description

While trying to fix an issue with the tests run against PRs (e.g. #1088 identifies a breakage with /merge), I had multiple integration test failures on a relatively simple PR against main (#1106, pinning a dependency to the version before a recent release).


To troubleshoot this, I made a simpler PR against main (#1107 editing CONTRIBUTORS), which also has integration test failures.

Here is a table showing the number of times each test failed over 4 runs of the integration tests on these two PRs.

| Test | #1106 | #1107 |
| --- | --- | --- |
| test_app_relation_destroy_block_until_done | 4 | 4 |
| test_deploy_bundle_with_multiple_overlays_with_include_files | 4 | 4 |
| test_deploy_bundle_with_overlay_as_argument | 4 | 4 |
| test_wait_for_idle_more_units_than_needed | 3 | 3 |
| test_upgrade_local_charm - juju... | 2 | 4 |
| test_wait_for_idle_with_not_enough_units | 2 | 2 |
| test_unit_annotations - asyncio.excep... | 2 | 0 |
| test_action - juju.errors.JujuA... | 1 | 1 |
| test_upgrade_local_charm_resource | 0 | 2 |
| test_attach_resource - asyncio.except... | 0 | 1 |

It would be ideal if the tests were deterministic.

Short of fixing the tests themselves, it would be good if the current state of the tests were prominently documented in the contributing guidelines: which test failures are likely to just be flaky tests and shouldn't block a merge. A separate issue can then be opened against that list to fix the individual tests.
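One way to make that list machine-readable as well as documented (just a sketch; the marker name and registration below are illustrative, not an existing convention in this repo) is a custom pytest marker that contributors and CI can deselect:

```python
# Illustrative only: mark known-flaky integration tests so they can be
# deselected with `pytest tests/integration -m "not flaky"`.
# The marker would need to be registered, e.g. in pyproject.toml:
#   [tool.pytest.ini_options]
#   markers = ["flaky: known-intermittent integration test, see #1108"]
import pytest


@pytest.mark.flaky
def test_known_intermittent_example():
    # Placeholder body; in practice the tests listed in the table above
    # would carry the marker instead.
    assert True
```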

Urgency

Blocker for our release

Python-libjuju version

main?

Juju version

The version the GitHub workflow is using (it doesn't look like the Juju version, etc., is printed during test setup).

Reproduce / Test

Run integration tests on main.
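A rough way to build a table like the one above locally (a sketch only, not existing tooling in this repo; it assumes pytest can drive the integration suite directly and that repeated in-process runs are acceptable):

```python
# Hypothetical helper: re-run the integration suite N times and count how
# often each test fails.
import collections

import pytest

failures = collections.Counter()


class FailureTally:
    """Minimal pytest plugin that records the node IDs of failing tests."""

    def pytest_runtest_logreport(self, report):
        if report.when == "call" and report.failed:
            failures[report.nodeid] += 1


N_RUNS = 4  # the table above compares 4 runs per PR
for _ in range(N_RUNS):
    pytest.main(["tests/integration", "-q"], plugins=[FailureTally()])

for nodeid, count in failures.most_common():
    print(f"{count}/{N_RUNS}  {nodeid}")
```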

jujubot added a commit that referenced this issue Sep 23, 2024
#1106

#### Description

Pin the Python kubernetes version to fix a recent breakage in the Jenkins tests.

The latest update to the Python kubernetes library (v31, 3 days ago) breaks the Jenkins `github-check-merge-juju-python-libjuju` test due to a failure to build a new dependency (durationpy).

I thought this might be the fix for issue #1088, but that's been open since August 9. Should we switch to stricter dependency versioning across the board to avoid breakages of this nature? In setup.py, 4 dependencies now specify both minimum and maximum versions, 5 specify only a minimum, and 2 have no version specification. In tox.ini, only 1 dependency (kubernetes) specifies a (maximum) version; tox.ini should probably have the same version constraints as setup.py.
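For illustration, the stricter style would look something like this in setup.py (the version floors and ceilings here are made up, not the repo's actual constraints):

```python
# Sketch of min/max pinning for every dependency; versions are illustrative.
from setuptools import setup

setup(
    name="example",
    install_requires=[
        # Cap below the kubernetes release that introduced the durationpy build failure.
        "kubernetes>=12.0.1,<31.0.0",
        # Hypothetical floor/ceiling pair for an otherwise unpinned dependency.
        "somedep>=1.2,<2.0",
    ],
)
```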


#### QA Steps

All tests pass, except for integration tests, which are flaky (see issue #1108).
@james-garner-canonical added the kind/test, priority/normal, kind/CI, kind/bug, and kind/doc labels on Sep 23, 2024
jujubot added a commit that referenced this issue Sep 27, 2024
…remove-eol-schema

#1113

Remove `3.2.X` schemas. Delete `_client*.py` and run `make client`.

#### Description

Moving towards the goal of having schemas and generated code in python-libjuju only for the latest supported Juju versions (`3.1.9`, `3.3.6`, `3.5.4`, `3.5.3`), and using client-only schemas (see #1099), this PR removes the schemas for EOL Juju 3.2 and reruns code generation, removing the `_client*.py` files and then running `make client`.


#### QA Steps

CI steps should all continue to pass, except for integration testing, which should continue to fail with the usual suspects (see #1108 for a non-exhaustive table of tests that sometimes fail on `main`).


#### Notes

To hopefully simplify the diffs, this is the first PR in a planned sequence of PRs that will depend on each other. Subsequent PRs will be:
1. replace current schemas with latest release client only schemas (`3.1.9`, `3.3.6`) and regenerate code
2. add client only schema for `3.4.5` and regenerate code
3. add client only schema for `3.5.3` and regenerate code
@james-garner-canonical added the priority/high label and removed the priority/normal label on Oct 1, 2024
@dimaqq (Contributor) commented Oct 1, 2024

I'm running into a new, frequently (or always) failing test:
FAILED tests/integration/test_charmhub.py::test_subordinate_false_field_exists

Maybe it deserves to be added to the list.

@james-garner-canonical (Contributor, Author) commented Oct 1, 2024

commit=50b42d013aee01536416e6334f99443f2b4f1e4c
n_jobs=50
n_failing_tests=26

| Path | Test | # jobs | % jobs |
| --- | --- | --- | --- |
| test_application.py | test_app_relation_destroy_block_until_done | 50 | 100.00% |
| test_model.py | test_deploy_bundle_with_multiple_overlays_with_include_files | 50 | 100.00% |
| test_model.py | test_deploy_bundle_with_overlay_as_argument | 50 | 100.00% |
| test_model.py | test_wait_for_idle_more_units_than_needed | 41 | 82.00% |
| test_charmhub.py | test_subordinate_false_field_exists | 14 | 28.00% |
| test_application.py | test_app_destroy | 11 | 22.00% |
| test_application.py | test_upgrade_local_charm | 11 | 22.00% |
| test_application.py | test_upgrade_local_charm_resource | 8 | 16.00% |
| test_crossmodel.py | test_relate_with_offer | 8 | 16.00% |
| test_model.py | test_deploy_bundle_with_storage_constraint | 8 | 16.00% |
| test_application.py | test_app_remove_wait_flag | 7 | 14.00% |
| test_application.py | test_action | 6 | 12.00% |
| test_machine.py | test_machine_ssh | 5 | 10.00% |
| test_model.py | test_destroy_units | 5 | 10.00% |
| test_model.py | test_unit_annotations | 4 | 8.00% |
| test_model.py | test_wait_for_idle_with_not_enough_units | 4 | 8.00% |
| test_model.py | test_local_file_resource_charm | 3 | 6.00% |
| test_secrets.py | test_list_secrets | 3 | 6.00% |
| test_model.py | test_attach_resource | 2 | 4.00% |
| test_model.py | test_deploy_with_base | 2 | 4.00% |
| test_model.py | test_storage_pools_on_lxd | 2 | 4.00% |
| test_controller.py | test_secrets_backend_lifecycle | 1 | 2.00% |
| test_model.py | test_add_and_list_storage | 1 | 2.00% |
| test_model.py | test_add_manual_machine_ssh | 1 | 2.00% |
| test_model.py | test_add_manual_machine_ssh_root | 1 | 2.00% |
| test_unit.py | test_run | 1 | 2.00% |

How many failing tests does each job have?

| # tests failing | # jobs |
| --- | --- |
| 3 | `*` |
| 4 | `****` |
| 5 | `************` |
| 6 | `****************` |
| 7 | `**************` |
| 8 | `*` |
| 9 | `**` |

How many tests fail once, twice, etc?

| # fails | # tests |
| --- | --- |
| 1 | `*****` |
| 2 | `***` |
| 3 | `**` |
| 4 | `**` |
| 5 | `**` |
| 6 | `*` |
| 7 | `*` |
| 8 | `***` |
| 11 | `**` |
| 14 | `*` |
| 41 | `*` |
| 50 | `***` |
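For context, both histograms are derived from a per-job record of which tests failed, roughly like this (hypothetical data and structure, not the actual tooling used to collect the numbers above):

```python
from collections import Counter

# Hypothetical input: job id -> names of the tests that failed in that job.
failures_by_job = {
    "job-01": {"test_action", "test_app_destroy"},
    "job-02": {"test_action"},
    "job-03": {"test_action", "test_app_destroy", "test_machine_ssh"},
}

# "How many failing tests does each job have?"
per_job = Counter(len(failed) for failed in failures_by_job.values())
for n_failing, n_jobs in sorted(per_job.items()):
    print(f"{n_failing} failing tests: {'*' * n_jobs}")

# "How many tests fail once, twice, etc?"
per_test = Counter(t for failed in failures_by_job.values() for t in failed)
fail_counts = Counter(per_test.values())
for n_fails, n_tests in sorted(fail_counts.items()):
    print(f"{n_fails} fails: {'*' * n_tests}")
```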

Previous tables from 30 jobs are preserved below for reference:

commit=50b42d013aee01536416e6334f99443f2b4f1e4c
n_jobs=30
n_failing_tests=20

| Path | Test | # jobs | % jobs |
| --- | --- | --- | --- |
| test_application.py | test_app_relation_destroy_block_until_done | 30 | 100.00% |
| test_model.py | test_deploy_bundle_with_multiple_overlays_with_include_files | 30 | 100.00% |
| test_model.py | test_deploy_bundle_with_overlay_as_argument | 30 | 100.00% |
| test_model.py | test_wait_for_idle_more_units_than_needed | 24 | 80.00% |
| test_application.py | test_app_destroy | 8 | 26.67% |
| test_application.py | test_upgrade_local_charm | 8 | 26.67% |
| test_charmhub.py | test_subordinate_false_field_exists | 8 | 26.67% |
| test_application.py | test_app_remove_wait_flag | 6 | 20.00% |
| test_application.py | test_upgrade_local_charm_resource | 6 | 20.00% |
| test_application.py | test_action | 5 | 16.67% |
| test_model.py | test_destroy_units | 3 | 10.00% |
| test_model.py | test_unit_annotations | 3 | 10.00% |
| test_machine.py | test_machine_ssh | 2 | 6.67% |
| test_model.py | test_attach_resource | 2 | 6.67% |
| test_model.py | test_wait_for_idle_with_not_enough_units | 2 | 6.67% |
| test_model.py | test_add_manual_machine_ssh | 1 | 3.33% |
| test_model.py | test_add_manual_machine_ssh_root | 1 | 3.33% |
| test_model.py | test_deploy_with_base | 1 | 3.33% |
| test_model.py | test_local_file_resource_charm | 1 | 3.33% |
| test_unit.py | test_run | 1 | 3.33% |

How many failing tests does each job have?

| # tests failing | # jobs |
| --- | --- |
| 3 | `*` |
| 4 | `***` |
| 5 | `*********` |
| 6 | `**********` |
| 7 | `*****` |
| 8 | `*` |
| 9 | `*` |

How many tests fail once, twice, etc?

| # fails | # tests |
| --- | --- |
| 1 | `*****` |
| 2 | `***` |
| 3 | `**` |
| 5 | `*` |
| 6 | `**` |
| 8 | `***` |
| 24 | `*` |
| 30 | `***` |

@james-garner-canonical (Contributor, Author) commented Oct 14, 2024

Running the integration tests serially seems to have helped a lot, but we still get some intermittent failures. For example, even the quarantined integration tests regularly pass now, but just now we had this failure:

FAILED tests/integration/test_model.py::test_deploy_bundle_with_storage_constraint

https://github.com/juju/python-libjuju/actions/runs/11319443801/job/31475510517?pr=1158

and in a second run:

FAILED tests/integration/test_model.py::test_deploy_local_bundle_include_base64

https://github.com/juju/python-libjuju/actions/runs/11319443801/job/31476578410

with the third run passing.
