fix: inventory canary fails because state is too large #1362

Merged
merged 3 commits into from
Nov 23, 2023

Contributor

@rix0rrr rix0rrr commented Nov 22, 2023

The Inventory Canary (a Lambda function that calculates how many packages, docsets, etc. we have) has been failing for a long while.

The reason is that it accumulates data into an intermediate JSON object. When the Lambda approaches its 15-minute timeout, that object is saved to S3 and then reloaded by the next Lambda instance. After a certain point the JSON payload exceeds 512MB (which is the maximum string size that V8 will serialize), and the Lambda fails.
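The hand-off described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the canary's actual code: `Checkpoint`, `runUntilDeadline` and the `seen` list are made-up names, and the S3 read and write are left out, so the function only returns the string that would be stored. In a real Lambda, `remainingMs` could be backed by `context.getRemainingTimeInMillis()`.

```typescript
// Hypothetical shape of the state handed from one invocation to the next.
interface Checkpoint {
  cursor: number; // index of the next item to process
  seen: string[]; // grows with every processed item
}

interface RunResult {
  done: boolean; // true once every item has been processed
  serialized: string; // the payload that would be written to S3
}

function runUntilDeadline(
  items: readonly string[],
  previous: string | undefined, // payload saved by the previous invocation, if any
  remainingMs: () => number, // time left before the timeout
  safetyMarginMs: number, // stop early enough to still save the state
): RunResult {
  // Resume from the previous checkpoint, or start fresh.
  const state: Checkpoint =
    previous !== undefined
      ? (JSON.parse(previous) as Checkpoint)
      : { cursor: 0, seen: [] };

  // Keep working until the items run out or the deadline gets too close.
  while (state.cursor < items.length && remainingMs() > safetyMarginMs) {
    const item = items[state.cursor];
    if (item !== undefined) {
      state.seen.push(item);
    }
    state.cursor++;
  }

  // The whole state becomes one string; this is the step that hits the size limit.
  return {
    done: state.cursor >= items.length,
    serialized: JSON.stringify(state),
  };
}
```

Because `seen` gains an entry for every processed item, `serialized` gets larger with each hand-off, until V8 can no longer produce the string.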

Ultimately, we use this information for 2 purposes:

  • Emit metrics about total counts of packages, submodules, docsets for each.
  • Write detailed reports about missing documentation sets, corrupted assemblies, etc.

The first one is used on the dashboard (which has not been showing data for a while); the second one we ignore and never look at.

Solve this problem by replacing the counters with a HyperLogLog (https://en.wikipedia.org/wiki/HyperLogLog) structure. The result is no longer an exact count: it is allowed to deviate by 1% from the actual number we are looking for. In return, its size is constant instead of ever-growing, so we no longer run the risk of growing our state object too large to serialize.
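To illustrate why the state stays constant-size, here is a minimal HyperLogLog sketch in TypeScript. It is not the code from this PR: the class name `HyperLogLog`, the `hash32` helper (FNV-1a followed by a murmur3-style finalizer) and the default of 2^14 registers are assumptions made for the example. With m registers the standard error is roughly 1.04/√m, so 16384 registers give about 0.8%, which is in line with the 1% deviation mentioned above.

```typescript
// 32-bit string hash: FNV-1a followed by the murmur3 finalizer for better bit mixing.
// (Illustrative choice; any well-distributed hash would do.)
function hash32(s: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  h ^= h >>> 16;
  h = Math.imul(h, 0x85ebca6b);
  h ^= h >>> 13;
  h = Math.imul(h, 0xc2b2ae35);
  h ^= h >>> 16;
  return h >>> 0; // force an unsigned 32-bit result
}

// Minimal HyperLogLog: estimates the number of distinct strings added.
// Omits the large-range correction, so it is only suitable for counts
// far below 2^32.
class HyperLogLog {
  private readonly precision: number;
  private readonly m: number;
  private readonly registers: Uint8Array;

  constructor(precision: number = 14) {
    if (!Number.isInteger(precision) || precision < 7 || precision > 16) {
      throw new RangeError('precision must be an integer between 7 and 16');
    }
    this.precision = precision;
    this.m = 1 << precision; // number of registers
    this.registers = new Uint8Array(this.m);
  }

  // Memory used by the registers; does not depend on how many items were added.
  get sizeInBytes(): number {
    return this.registers.length;
  }

  add(value: string): void {
    const h = hash32(value);
    const index = h >>> (32 - this.precision); // top `precision` bits pick a register
    const rest = (h << this.precision) >>> 0; // remaining bits, left-aligned
    // Rank = 1-based position of the first 1 bit in the remaining bits.
    const rank = rest === 0 ? 32 - this.precision + 1 : Math.clz32(rest) + 1;
    if (rank > (this.registers[index] ?? 0)) {
      this.registers[index] = rank;
    }
  }

  count(): number {
    let sum = 0;
    let zeros = 0;
    for (let i = 0; i < this.m; i++) {
      const r = this.registers[i] ?? 0;
      sum += Math.pow(2, -r);
      if (r === 0) zeros++;
    }
    const alpha = 0.7213 / (1 + 1.079 / this.m); // bias constant, valid for m >= 128
    let estimate = (alpha * this.m * this.m) / sum;
    if (estimate <= 2.5 * this.m && zeros > 0) {
      // Small-range correction: fall back to linear counting.
      estimate = this.m * Math.log(this.m / zeros);
    }
    return Math.round(estimate);
  }
}
```

Adding the same value again never changes a register, so duplicates are not double-counted. Only the fixed-size register array needs to be carried from one Lambda instance to the next, no matter how many packages have been seen.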

This drops support for most of the "failed packages" reports we used to create; we only collect a list of uninstallable packages. The other reports (missing documentation and corrupt assemblies per package version) are no longer created.

We can add those back if we ever see the need, and then collect only the affected packages, instead of all data on all packages in the catalog in triplicate.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@rix0rrr rix0rrr requested a review from a team November 22, 2023 13:56
@rix0rrr rix0rrr changed the title from "fix: inventory canary's state is too large" to "fix: inventory canary fails because state is too large" Nov 22, 2023
Contributor

@madeline-k madeline-k left a comment

> Solve this problem by replacing the counters with a HyperLogLog structure. This is now no longer an accurate counter -- it is allowed to have a 1% deviation from the actual number we are looking for. In return, its size is constant instead of ever-growing. We now no longer run the risk of growing our state object too large to serialize.

This is very cool! TIL!

Approving, since the code looks good to me. Thanks for putting in plenty of comments explaining! (However, the build is failing.)

@cdklabs-automation cdklabs-automation added this pull request to the merge queue Nov 23, 2023
Merged via the queue into main with commit 8c31ff9 Nov 23, 2023
6 checks passed
@cdklabs-automation cdklabs-automation deleted the huijbers/inventory-canary branch November 23, 2023 13:58
3 participants