chore (sync service): report replication lag #2043

Merged 4 commits on Nov 26, 2024

@kevin-dp kevin-dp commented Nov 26, 2024

Fixes #2031.

  • Exports the replication lag in bytes as a Prometheus metric.
  • Also creates a span for every transaction that includes the replication lag in milliseconds.
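
The PR description does not show how the byte lag is computed. One common way to express it, sketched here purely as an assumption, is the difference between two Postgres WAL positions (LSNs), which Postgres prints as two hexadecimal numbers separated by a slash (high 32 bits, then low 32 bits). The module and function names below are hypothetical and are not taken from Electric.

```elixir
defmodule LsnSketch do
  # Hypothetical helper module, not part of the PR.

  # Parses a textual Postgres LSN such as "16/B374D848" into a 64-bit integer.
  # The part before the slash holds the high 32 bits, the part after the low 32.
  def parse(lsn) when is_binary(lsn) do
    [hi, lo] = String.split(lsn, "/")
    Bitwise.bsl(String.to_integer(hi, 16), 32) + String.to_integer(lo, 16)
  end

  # Assumption: lag in bytes is the distance between the server's current WAL
  # position and the position already processed, clamped at zero.
  def lag_bytes(server_lsn, processed_lsn) do
    max(0, parse(server_lsn) - parse(processed_lsn))
  end
end
```

For example, `LsnSketch.lag_bytes("0/3000060", "0/3000000")` evaluates to `96`, since the two positions differ by `0x60` bytes.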

Note on clock drift

The replication lag in milliseconds may be affected by clock drift between Electric and Postgres. The two may run on different machines, and the lag is computed by comparing the transaction's commit timestamp (generated by PG) with Electric's own timestamp at the time it writes the transaction to the shape log.
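
A minimal sketch of that comparison, assuming both timestamps are available as `DateTime` values. It mirrors the `DateTime.diff` and `max` calls visible in the reviewed diff, but the module name and the injectable `now` argument are hypothetical, added only so that the effect of drift can be demonstrated.

```elixir
defmodule LagSketch do
  # Hypothetical helper, not part of the PR: returns the replication lag in
  # milliseconds between a Postgres commit timestamp and Electric's clock.
  def lag_ms(commit_timestamp, now \\ DateTime.utc_now()) do
    # DateTime.diff/3 returns now - commit_timestamp in the requested unit.
    # If Electric's clock is behind Postgres's clock the difference is
    # negative, so it is clamped to zero rather than reported as negative lag.
    max(0, DateTime.diff(now, commit_timestamp, :millisecond))
  end
end
```

With a commit timestamp of `~U[2024-11-26 11:46:00.000Z]`, a local time 250 ms later gives a lag of `250`, while a local time 100 ms earlier (Electric's clock running behind) gives `0`. Clamping hides negative values, but drift in the other direction still inflates the reported lag.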

@kevin-dp kevin-dp force-pushed the kevin/report-lag-size branch from e1fc9ac to ef60d00 Compare November 26, 2024 11:46
@kevin-dp kevin-dp changed the title chore (sync service): report replication lag in bytes chore (sync service): report replication lag Nov 26, 2024
@kevin-dp kevin-dp force-pushed the kevin/report-lag-size branch from 8d0444b to a133523 Compare November 26, 2024 13:23
@kevin-dp kevin-dp merged commit 4e50204 into main Nov 26, 2024
26 checks passed
@kevin-dp kevin-dp deleted the kevin/report-lag-size branch November 26, 2024 15:56
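# Electric's local time when the transaction is written to the shape log
now = DateTime.utc_now()
# Clamp at 0 so that clock drift cannot yield a negative lag
lag = Kernel.max(0, DateTime.diff(now, commit_timestamp, :millisecond))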

OpenTelemetry.with_span(

Should this just be an attribute on the handle transaction span?

kevin-dp added a commit that referenced this pull request Nov 26, 2024
This PR addresses [Kyle's comment](#2043 (review)) and adds the replication lag to the current span instead of creating a new span.

---------

Co-authored-by: Kyle Mathews <[email protected]>
Successfully merging this pull request may close these issues.

Track replication lag in otel metrics and log warning when it gets too high
3 participants