fix: inventory canary fails because state is too large (#1362)
The Inventory Canary (a Lambda function that calculates how many
packages, docsets, etc. we have) has been failing for a long while.

The reason is that it accumulates data into an intermediary JSON object
that is saved to S3 whenever the Lambda approaches its 15-minute
timeout, then reloaded by the next Lambda invocation. However, past a
certain point the JSON payload exceeds 512MB (the maximum string size
that V8 will serialize), and the Lambda fails.
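
The checkpoint-and-resume pattern described above can be sketched roughly as follows. This is a minimal illustration, not the canary's actual code: `Checkpoint`, `InMemoryStore`, and `runOnce` are hypothetical names, an in-memory string stands in for the S3 object, and an item budget stands in for the 15-minute timeout. It only shows why the serialized state keeps growing when every key seen so far is kept.

```typescript
// Hypothetical shape of the intermediate state: a cursor plus every key seen so far.
interface Checkpoint {
  cursor: number;
  seen: string[];
}

// Hypothetical stand-in for the S3 object that holds the serialized state.
class InMemoryStore {
  private body?: string;

  // Serializes the state and returns the length of the resulting string.
  save(state: Checkpoint): number {
    this.body = JSON.stringify(state);
    return this.body.length;
  }

  // Returns the previous state, or an empty one on the first invocation.
  load(): Checkpoint {
    return this.body ? (JSON.parse(this.body) as Checkpoint) : { cursor: 0, seen: [] };
  }
}

// One simulated invocation: resume, process up to `budget` keys, checkpoint again.
function runOnce(
  keys: readonly string[],
  store: InMemoryStore,
  budget: number,
): { done: boolean; chars: number } {
  const state = store.load();
  let processed = 0;
  while (state.cursor < keys.length && processed < budget) {
    state.seen.push(keys[state.cursor]); // exact bookkeeping: state grows with every key
    state.cursor++;
    processed++;
  }
  const chars = store.save(state);
  return { done: state.cursor >= keys.length, chars };
}
```

Calling `runOnce` repeatedly until `done` is true yields a larger `chars` value on every invocation, which is the growth that eventually hits the serialization limit.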

Ultimately, we use this information for 2 purposes:

- Emit metrics about total counts of packages, submodules, docsets for
each.
- Write detailed reports about missing documentation sets and corrupted
assemblies etc.

The first one is used on the dashboard (which has not been showing data
for a while); the second one we ignore and never look at.

Solve this problem by replacing the counters with a
[HyperLogLog](https://en.wikipedia.org/wiki/HyperLogLog) structure. The
result is no longer an exact count: it may deviate from the true number
by about 1%. In return, its size is constant instead of ever-growing, so
we no longer run the risk of growing our state object too large to
serialize.
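
To illustrate the trade-off, here is a minimal hand-rolled HyperLogLog sketch. It is not the API of the `streamcount` dependency added in this commit and not the canary's actual code: the class, `hash32`, and the choice of 2^14 registers are assumptions made for the example. With m registers the standard error is about 1.04 / sqrt(m), so 16384 registers give roughly 0.8%. The 32-bit hash keeps the example short and is only adequate for counts in the low millions.

```typescript
// 32-bit FNV-1a followed by the MurmurHash3 finalizer for better bit mixing.
function hash32(value: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < value.length; i++) {
    h ^= value.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0;
}

// Illustrative HyperLogLog; the register array never grows, whatever is added.
class HyperLogLog {
  private readonly registers: Uint8Array;

  // `precision` must be at least 7 for the alpha constant used in count().
  constructor(private readonly precision: number = 14, registers?: Uint8Array) {
    this.registers = registers ?? new Uint8Array(1 << precision);
  }

  add(value: string): void {
    const h = hash32(value);
    // The top `precision` bits select a register.
    const index = h >>> (32 - this.precision);
    // The remaining bits determine the rank: leading zeros plus one.
    const rest = (h << this.precision) >>> 0;
    const rank = Math.min(Math.clz32(rest), 32 - this.precision) + 1;
    if (rank > this.registers[index]) {
      this.registers[index] = rank;
    }
  }

  count(): number {
    const m = this.registers.length;
    let sum = 0;
    let zeros = 0;
    for (let i = 0; i < m; i++) {
      sum += Math.pow(2, -this.registers[i]);
      if (this.registers[i] === 0) zeros++;
    }
    const alpha = 0.7213 / (1 + 1.079 / m);
    let estimate = (alpha * m * m) / sum;
    // Small-range correction: fall back to linear counting.
    if (estimate <= 2.5 * m && zeros > 0) {
      estimate = m * Math.log(m / zeros);
    }
    return Math.round(estimate);
  }

  // The serialized form has a bounded length, independent of how many values were added.
  serialize(): string {
    return JSON.stringify(Array.from(this.registers));
  }

  static deserialize(json: string, precision: number = 14): HyperLogLog {
    return new HyperLogLog(precision, Uint8Array.from(JSON.parse(json) as number[]));
  }
}
```

Adding the same key twice does not change the estimate, and `serialize()` stays below roughly 50,000 characters at this precision, so a checkpoint built from a handful of such counters cannot approach the string-size limit.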

This drops support for most of the "failed packages" reports we used to
create; we now only collect a list of uninstallable packages. The other
reports (missing documentation and corrupt assemblies per package
version) are no longer created.

We can add those back if we ever see the need, and collect only those
packages, instead of all data on all packages in the catalog in
triplicate.


----

*By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache-2.0 license*

---------

Signed-off-by: github-actions <[email protected]>
Co-authored-by: github-actions <[email protected]>
rix0rrr and github-actions authored Nov 23, 2023
1 parent 3a398c7 commit 8c31ff9
Showing 15 changed files with 42,928 additions and 587 deletions.
1 change: 1 addition & 0 deletions .gitignore

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

4 changes: 4 additions & 0 deletions .projen/deps.json


4 changes: 2 additions & 2 deletions .projen/tasks.json


2 changes: 2 additions & 0 deletions .projenrc.ts
@@ -62,6 +62,7 @@ const project = new CdklabsConstructLibrary({
 'semver',
 'spdx-license-list',
 'streamx',
+'streamcount',
 'tar-stream',
 'uuid',
 'yaml',
@@ -174,6 +175,7 @@ project.setScript(

 project.addGitIgnore('!/test/fixtures/tests/package.tgz');
 project.addGitIgnore('/test/integ.transliterator.ecstask.ts.snapshot/asset.*');
+project.addGitIgnore('!/src/third-party-types/*');

 project.package.addField('resolutions', {
 // https://github.com/aws/aws-cdk/issues/20319
1 change: 1 addition & 0 deletions package.json

