Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add trigger slog events for timer and install bundles #10332

Closed
mhofman opened this issue Oct 23, 2024 · 3 comments · Fixed by #10357
Closed

Add trigger slog events for timer and install bundles #10332

mhofman opened this issue Oct 23, 2024 · 3 comments · Fixed by #10357
Labels
cosmic-swingset package: cosmic-swingset enhancement New feature or request telemetry

Comments

@mhofman
Copy link
Member

mhofman commented Oct 23, 2024

What is the Problem Being Solved?

Timer polls and install bundles generate swingset runs (possibly empty), but unlike "bridge" and "deliver" (mailbox) triggers, we do not record a trigger event for these. This forces any contextualized slog processor (e.g. #10269 or #8680) to peek ahead at the first delivery to classify the type of run. We should also consider including more details in the existing trigger to enhance the classification.

Description of the Design

  • Add a cosmic-swingset-timer-poll slog event. Include the blockTime of the poll, and the number of wakes added to the queue.
  • Add a cosmic-swingset-install-bundle slog event. Include the endoZipBase64Sha512 of the bundler, and any error.
  • Add the body to existing cosmic-swingset-bridge-inbound slog events.
  • Consider adding messages/acks info to existing cosmic-swingset-deliver-inbound slog events

To avoid future regressions with missing triggers for swingset runs, we should consider plumbing this into runSwingset() somehow.

Security Considerations

None

Scaling Considerations

A little extra/duplicated data, but very little in comparison of the rest of the slog size.

Test Plan

Manual is good enough

Upgrade Considerations

Requires a change of the chain software, but it does not affect consensus.

@mhofman mhofman added enhancement New feature or request cosmic-swingset package: cosmic-swingset telemetry labels Oct 23, 2024
@warner
Copy link
Member

warner commented Oct 23, 2024

Yeah, that sounds reasonable to me. If it's easy, I'd add owner to the bridge-inbound records, that's one of the things I've needed to extract from the delivery to add to details fields (like identifying which oracle is submitting a PushPrice).

I think timer.poll() returns a boolean (wokeAnything in device-timer.js), not an integer, but still, including it in the slog event sounds useful.

Tell me more about "messages/acks info on cosmic-swingset-deliver-inbound" ? I don't think we have an event of that type. Are you thinking about whether a given bridge-inbound succeeded or failed? Or are you thinking about the sizes of the pending-sends and the other hangover-prevention data structures?

@mhofman
Copy link
Member Author

mhofman commented Oct 24, 2024

I'd add owner to the bridge-inbound records, that's one of the things I've needed to extract from the delivery to add to details fields (like identifying which oracle is submitting a PushPrice).

I believe all that is included in the body. See doBridgeInbound

I think timer.poll() returns a boolean (wokeAnything in device-timer.js), not an integer, but still, including it in the slog event sounds useful.

Yeah whatever the return value is. The name is addedToQueue in cosmic-swingset, which is not clear what type it was.

Tell me more about "messages/acks info on cosmic-swingset-deliver-inbound" ? I don't think we have an event of that type.

I'm talking about the deliverInvound of the mailbox, which arguably we don't really see on mainnet but that is technically enabled at that level (but are not provisioned so splats soon after I think).

@mhofman
Copy link
Member Author

mhofman commented Oct 24, 2024

Also I just remembered that any new events need to be added to the otel slog sender to avoid the problem of #8272/#9569 (which we likely should fix while we're at it).

@mergify mergify bot closed this as completed in #10357 Oct 29, 2024
mergify bot added a commit that referenced this issue Oct 29, 2024
closes: #10332 
closes: #8272
refs: #9569

## Description

Adds slog events for the bundle install and timer poll run triggers.
Add more context to the bridge and deliver inbound triggers.
Fixes the otel trace processing of upgrade events.

### Security Considerations
None

### Scaling Considerations
This adds a little more data to our slogs, but relatively minor compared to the amount of data we already generate.

### Documentation Considerations
None

### Testing Considerations

The integration test has limited coverage of the otel slog processor. I will repurpose #9569 to add testing in a3p, but in the mean time this change cannot affect operations and at worse the modification are not sufficient to our needs.

### Upgrade Considerations
Requires a chain software upgrade but the changes do not affect consensus and could be locally cherry-picked.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cosmic-swingset package: cosmic-swingset enhancement New feature or request telemetry
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants