Fix ec2 CI testing framework #1541

Merged: 16 commits, Apr 29, 2024

@samuel40791765 samuel40791765 commented Apr 19, 2024

Issues:

Resolves P113131493

Description of changes:

This fixes the EC2 CI testing framework that we had turned off. I introduced intentional test failures to check that the framework fails correctly (as shown in the commit history of this PR). I also ran 10 builds simultaneously to see whether we would hit the sporadic test failures we had been having.
The general run fails early in the build, and the FIPS-specific one fails sooner since it's being run after the sanitizer tests. The good news is that all runs fail and succeed as anticipated, without the original sporadic issues we've been having.

Call-outs:

  1. Apparently the docker container needs to be in "privileged-mode" for the TSAN tests to work, so I've turned that on in the SSM document.
  2. A sporadic error, "E: Unable to lock the administration directory (/var/lib/dpkg/) is another process using it?", occurs when we call apt-get update right after the instance is spun up. Calling killall apt apt-get beforehand gets around the issue.
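
The second call-out can be sketched roughly as below. This is a minimal illustration, not the script used in this PR: the helper names release_apt_lock and retry are hypothetical, and the bounded retry is an extra safeguard that the PR description does not mention.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the PR).

# Stop any apt processes that may still hold the dpkg lock on a fresh instance.
release_apt_lock() {
  # killall exits non-zero when nothing matches; that is not an error here.
  killall apt apt-get 2>/dev/null || true
}

# Run a command up to <attempts> times, sleeping <delay> seconds between tries.
# Returns 0 on the first success, 1 if every attempt fails.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $i/$attempts failed: $*" >&2
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

On a freshly launched instance this might be used as: release_apt_lock; retry 5 10 apt-get update. The attempt count and delay are arbitrary values chosen for the example.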

More details on debugging can be found in the thread of P113131493.

Testing:

CI changes

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

@samuel40791765 samuel40791765 force-pushed the fix-ec2-test branch 3 times, most recently from edaa922 to 9a75435 Compare April 19, 2024 23:50
@codecov-commenter

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.27%. Comparing base (10a389e) to head (9a75435).
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1541      +/-   ##
==========================================
- Coverage   77.33%   77.27%   -0.07%     
==========================================
  Files         424      424              
  Lines       71452    71452              
==========================================
- Hits        55260    55217      -43     
- Misses      16192    16235      +43     


@samuel40791765 samuel40791765 marked this pull request as ready for review April 24, 2024 20:34
@samuel40791765 samuel40791765 requested a review from a team as a code owner April 24, 2024 20:34
@samuel40791765 samuel40791765 force-pushed the fix-ec2-test branch 3 times, most recently from a5bfb7b to 9e5a0e6 Compare April 24, 2024 22:44
smittals2 previously approved these changes Apr 25, 2024
tests/ci/cdk/cdk/ssm/general_test_run_ssm_document.yaml (outdated)
Comment on lines +79 to +82
# Wait 5 minutes for instance to "warm up"?
echo "Instances need to initialize a few minutes before SSM commands can be properly run"
sleep 300

Instead of just waiting, can this check the SSM agent status and ensure it's up and running before starting?

@samuel40791765 samuel40791765 Apr 26, 2024

SSM is accessible right after the EC2 instance is spun up; we've been using it at that point in our past test runs. The odd part is that SSM will sometimes end the session abruptly and be unavailable for a brief moment. This only happens when the instance has been recently launched, and it's the main source of the sporadic issues we've been running into (I left a comment on it in the thread of P113131493).
I'd rather sacrifice 5 minutes of waiting for stability than have the inconsistent failures happen again.
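
For comparison, a middle ground between the fixed sleep and a single status check could look something like the sketch below: treat the instance as ready only after a probe has passed several times in a row, so a brief drop-out resets the count. This is not what the PR does (it keeps the 5-minute sleep); wait_until_stable and its parameters are hypothetical names for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the PR): succeed only after the probe
# command passes <needed> consecutive times, giving up after <max_checks>
# total checks. Any failed probe resets the streak to zero.
wait_until_stable() {
  local needed="$1" max_checks="$2" interval="$3" streak=0 i=1
  shift 3
  while [ "$i" -le "$max_checks" ]; do
    if "$@"; then
      streak=$((streak + 1))
      if [ "$streak" -ge "$needed" ]; then
        return 0
      fi
    else
      # A drop-out means the instance has not settled yet; start over.
      streak=0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  return 1
}
```

The probe is left to the caller. One option, stated here as an assumption rather than something tested in this PR, is a small wrapper around aws ssm describe-instance-information that checks whether the instance's PingStatus is Online. Whether this is any faster or more reliable than the fixed wait would need to be measured.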

@samuel40791765 samuel40791765 merged commit 4d280eb into aws:main Apr 29, 2024
68 checks passed
skmcgrail added a commit that referenced this pull request May 2, 2024
d52018b Minor functions to build with Ruby's cipher module (#1564)
364d28b Changed SSL_client_hello_get0_ciphers to align with OpenSSL
behavior (#1542)
e8eb7de ppc64le: EVP_has_aes_hardware is false w/ no-asm (#1566)
d726d06 OpenBSD 7.4 and 7.5 Support (#1437)
a66c66e Remove comments about overread for entropy generation (#1551)
f8a575f Migrate from __FreeBSD__ to __FreeBSD_version (#1562)
c31d1ce Centralize handling of s2n-bignum alt/non-alt function
selection (#1547)
00f3c45 CI for other MacOS versions (#1558)
0541314 Cleanup remaing duplicate symbol definitions and turn
Wredundant-decls on (#1561)
4d280eb Fix ec2 CI testing framework (#1541)
9a4b43e Update x25519_test.cc array initialization to avoid a bug with
a GCC 13 warning (#1555)
388cbe7 Remove duplicate X509_OBJECT_new and X509_OBJECT_free
declarations (#1560)
2ea6706 Avoid 'z' format with MSVCRT (#1559)
c25dc2a Add dependency to python3-six in github action grpc (#1554)
2bdcba3 Link porting guide table to header documentation (#1540)
311ca38 Basic GH CI build/test with full range of gcc/clang (#1546)
1f19717 Add SHA3-256 KAT to FIPS self-test (#1549)
0f3548a Add EC point add/dbl to speed.cc (#1545)
d7ddfc4 Fix the NTP integration test (NTP website changed) (#1548)
8ccd85b Fix skipped tests in Mariadb integration CI (#1533)
d940162 Support vpinsrq in delocater (#1543)
4cd6d21 Remove redundant test exec libraries (#1544)
56f3569 [ML-KEM] Add experimental support for ML-KEM-512-IPD (#1516)
c295aef Upstream merge 2024 04 16 (#1535)
2e51629 Re-add function
0aebf17 Define OPENSSL_NO_TLS_PHA, typedef PSK callback signatures
(#1526)
46056cf Pull the string-based extensions APIs into their own section
960ea42 Unexport X509_VERIFY_PARAM_lookup
3c597b1 Remove X509_VERIFY_PARAM_get0_peername
9c399e5 Document some key usage accessors
2fe70b5 Simplify and document X509_supported_extension
2e04897 Const-correct X509_LOOKUP_METHOD
9826568 Replace X509_LOOKUP_ctrl with real functions
e47c056 Tidy up x509_lu.c functions a little
62e019f Clean up the by_file_ctrl x509 code to be slightly less obtuse
45c46c2 Use relative links in markdown files

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license and the ISC license.