healthchecks : Improve self logs processing and healthchecks logs. #1674

Merged
merged 9 commits into from
Apr 22, 2024

Conversation

franciscovalentecastro
Contributor

@franciscovalentecastro franciscovalentecastro commented Apr 12, 2024

Description

This PR adds several features and fixes that improve the runtime checks and self logs processing. In detail:

  • Disable export of fluent-bit self logs with severity: debug (see "add self logs debug severity grep filter" #1437, b/272779619).
    • Added integration test TestNoFluentBitDebugSelfLogs.
  • Simplify and clean up the self_logs.go implementation after the fluent-bit 2.2 upgrade bug fixes (b/328463822).
  • Remove the sourceLocation field from health-checks.log; it adds little value and makes entries harder to read.
  • Refactor all Runtime Check errors into healthchecks/error.go.

Before health-checks.log entry

{"severity":"INFO","time":"2024-04-16T16:24:01Z","logging.googleapis.com/sourceLocation":{"file":"/work/internal/healthchecks/healthchecks.go","function":"github.com/GoogleCloudPlatform/ops-agent/internal/healthchecks.HealthCheckResult.LogResult","line":"57"},"message":"[API Check] Result: PASS"}

After health-checks.log entry

{"severity":"INFO","time":"2024-04-15T21:33:08Z","message":"[API Check] Result: PASS"}
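
To make the difference between the two entries concrete, here is a minimal Go sketch that removes the `logging.googleapis.com/sourceLocation` key from a JSON log line. It only illustrates the change in format. `stripSourceLocation` is a hypothetical helper and is not how the PR changes the logger.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sourceLocationKey is the field present in the "before" entry.
const sourceLocationKey = "logging.googleapis.com/sourceLocation"

// stripSourceLocation is a hypothetical helper (not from the PR) that removes
// the sourceLocation field from a single JSON log line.
func stripSourceLocation(line string) (string, error) {
	var entry map[string]json.RawMessage
	if err := json.Unmarshal([]byte(line), &entry); err != nil {
		return "", err
	}
	delete(entry, sourceLocationKey)
	// json.Marshal writes map keys in sorted order, so the key order
	// can differ from the input line.
	out, err := json.Marshal(entry)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	before := `{"severity":"INFO","time":"2024-04-16T16:24:01Z","logging.googleapis.com/sourceLocation":{"file":"/work/internal/healthchecks/healthchecks.go","line":"57"},"message":"[API Check] Result: PASS"}`
	after, err := stripSourceLocation(before)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(after)
}
```

The remaining fields (severity, time, message) are preserved unchanged, only in sorted key order.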

Note: This link points to the committed .go files without the goldens, to make review easier.

Related issue

b/272779619, b/303073892, b/328463822

How has this been tested?

Checklist:

  • Unit tests
    • Unit tests do not apply.
    • Unit tests have been added/modified and passed for this PR.
  • Integration tests
    • Integration tests do not apply.
    • Integration tests have been added/modified and passed for this PR.
  • Documentation
    • This PR introduces no user visible changes.
    • This PR introduces user visible changes and the corresponding documentation change has been made.
  • Minor version bump
    • This PR introduces no new features.
    • This PR introduces new features, and there is a separate PR to bump the minor version since the last release already.
    • This PR bumps the version.

@franciscovalentecastro franciscovalentecastro force-pushed the fcovalente-remove-debug-logs branch from 38458ca to 29dd983 Compare April 15, 2024 16:35
@franciscovalentecastro franciscovalentecastro changed the title from "healthchecks : Improve runtime checks and self logs processing." to "healthchecks : Improve self logs processing and healthchecks logs." Apr 15, 2024
@franciscovalentecastro franciscovalentecastro force-pushed the fcovalente-remove-debug-logs branch 4 times, most recently from 993c83f to 8ffda40 Compare April 16, 2024 20:21
@franciscovalentecastro franciscovalentecastro requested review from a team and XuechunHou and removed request for a team April 18, 2024 14:14
Message: "Ops Agent failed to parse logs",
Action: "Refer to provided documentation link.",
ResourceLink: "https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent/troubleshoot-find-info",
IsFatal: true,
Member

Is this actually a fatal error? Won't fluent-bit continue to run if it fails to parse a log?

Contributor Author

It is not a fatal error. Thanks for mentioning it. I changed it to IsFatal: false.
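
For readers following the thread, this sketch shows how an `IsFatal` flag can separate errors that should stop processing from ones that should only be reported. The field names `Message`, `Action`, `ResourceLink` and `IsFatal` come from the snippet under review. The type name `HealthCheckError` and the `hasFatal` helper are assumptions made for this example and may not match healthchecks/error.go.

```go
package main

import "fmt"

// HealthCheckError is a hypothetical type; only the four field names are
// taken from the reviewed snippet.
type HealthCheckError struct {
	Message      string
	Action       string
	ResourceLink string
	IsFatal      bool
}

// Error makes HealthCheckError satisfy the standard error interface.
func (e HealthCheckError) Error() string { return e.Message }

// hasFatal is a hypothetical helper that reports whether any error in the
// list is marked as fatal.
func hasFatal(errs []HealthCheckError) bool {
	for _, e := range errs {
		if e.IsFatal {
			return true
		}
	}
	return false
}

func main() {
	// Values copied from the reviewed snippet, with IsFatal set to false
	// as agreed in the review.
	logParseErr := HealthCheckError{
		Message:      "Ops Agent failed to parse logs",
		Action:       "Refer to provided documentation link.",
		ResourceLink: "https://cloud.google.com/stackdriver/docs/solutions/agents/ops-agent/troubleshoot-find-info",
		IsFatal:      false,
	}
	fmt.Println(hasFatal([]HealthCheckError{logParseErr}))
}
```

With `IsFatal: false`, a log parse failure is reported but is not counted as fatal, which matches the reviewer's point that fluent-bit keeps running.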

@@ -182,13 +172,7 @@ func generateSelfLogsSamplingComponents(ctx context.Context) []fluentbit.Compone
// This method creates a component that enforces the `Structured Health Logs` format to
// all `ops-agent-health` logs. It sets `agentKind`, `agentVersion` and `schemaVersion`.
// It also translates `code` to the rich text message from the `selfLogTranslationList`.
Member

Remove this comment line if it's no longer needed

Contributor Author

I removed the "// It also translates `code` to the rich text message from the `selfLogTranslationList`" part of the comment, which is no longer relevant.

@franciscovalentecastro franciscovalentecastro force-pushed the fcovalente-remove-debug-logs branch from af7aaa6 to 113e038 Compare April 19, 2024 16:44
@franciscovalentecastro franciscovalentecastro merged commit 57c6c28 into master Apr 22, 2024
68 of 69 checks passed
@franciscovalentecastro franciscovalentecastro deleted the fcovalente-remove-debug-logs branch April 22, 2024 20:16
3 participants