Alco/acquire lock in replication client #1850

alco · 2024-10-14T22:49:34Z

PR chain: #1842 ← #1841 ← #1845 ← #1846 ← #1850

…nager

By using the tuple {:shutdown, reason} we avoid logging scary errors about each of the connection processes linked to ConnectionManager dying. Also, it doesn't make much sense to tag the shutdown reason, given that this reason is propagated to the linked processes as an exit signal, which would result in each of the connection processes logging its exit reason using the same tag.
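
To illustrate the exit-reason behaviour described above, here is a minimal sketch. It is not the code from this PR: `ShutdownSketch.Manager` and the `{:fail, reason}` message are hypothetical stand-ins for ConnectionManager and whatever triggers its shutdown. It relies only on the standard GenServer behaviour, where stopping with `{:shutdown, reason}` counts as an orderly exit and the same reason is delivered to linked processes as an exit signal.

```elixir
defmodule ShutdownSketch.Manager do
  # Hypothetical stand-in for ConnectionManager; not the module from the PR.
  use GenServer

  def start_link(_opts \\ []), do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok), do: {:ok, %{}}

  @impl true
  def handle_cast({:fail, reason}, state) do
    # Stopping with {:shutdown, reason} is treated as an orderly exit,
    # so GenServer does not log an error report for it.
    {:stop, {:shutdown, reason}, state}
  end
end

# Trap exits so the exit signal from the linked process arrives as a message.
Process.flag(:trap_exit, true)
{:ok, pid} = ShutdownSketch.Manager.start_link()
GenServer.cast(pid, {:fail, :connection_lost})

exit_reason =
  receive do
    {:EXIT, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end

IO.inspect(exit_reason)
# => {:shutdown, :connection_lost}
```

A linked process that does not trap exits would terminate with that same `{:shutdown, :connection_lost}` reason, which is why tagging the reason further would only repeat the tag in every linked process.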

…ence

This partially reverts c886f86.

…etup phase

We already have a state machine defined for the replication connection and there's no need to check out the whole connection from the DB pool (twice) just to look up some static PG info.

This change reduces coupling between Timeline and ShapeCache modules, resulting in easier comprehension and maintenance of this small part of the code. ConnectionManager serves as the supervisor of the connection startup procedure and that's where we keep the knowledge about how a change in PG timeline affects the shape cache. As a bonus, we don't have to first start all shape consumers just to bring them all down afterwards, as we've been doing up until this change.

…the code

…setup

netlify · 2024-10-14T22:50:43Z

✅ Deploy Preview for electric-next ready!

Name	Link
🔨 Latest commit	`15d90a5`
🔍 Latest deploy log	https://app.netlify.com/sites/electric-next/deploys/670da0004e321500087182a4
😎 Deploy Preview	https://deploy-preview-1850--electric-next.netlify.app

msfstef

Nice and clean! 🚀 🔒

msfstef · 2024-10-15T08:41:55Z

packages/sync-service/lib/electric/postgres/replication_client/connection_setup.ex

+    {:query, query, state}
+  end
+
+  defp acquire_lock_result([%Postgrex.Result{columns: ["pg_advisory_lock"]}], state) do


Would we ever need to handle a Postgrex.Error from the advisory lock query? I imagine disconnections etc. are handled by the replication client and connection manager, but I don't know whether this query can ever fail.
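
For context on the question above, a minimal sketch of what an explicit error clause could look like. It assumes the query result reaches the function either as a list of result structs or as a single error struct. `LockResultSketch`, its `Result` and `Error` structs, and the return tuples are all hypothetical stand-ins so the example runs without the Postgrex dependency; in the real module the structs would be Postgrex.Result and Postgrex.Error, and the return values would have to match what the connection setup state machine expects.

```elixir
defmodule LockResultSketch do
  # Stand-in structs (hypothetical) so the sketch runs without Postgrex.
  defmodule Result do
    defstruct columns: [], rows: []
  end

  defmodule Error do
    defstruct message: nil
  end

  # Happy path: same head shape as the clause under review.
  def acquire_lock_result([%Result{columns: ["pg_advisory_lock"]}], state) do
    {:ok, Map.put(state, :lock_acquired?, true)}
  end

  # Hypothetical extra clause: surface a failed query explicitly instead of
  # crashing with a FunctionClauseError.
  def acquire_lock_result(%Error{message: message}, state) do
    {:error, {:lock_query_failed, message}, state}
  end
end

# struct!/2 builds the stand-in structs at runtime.
ok =
  LockResultSketch.acquire_lock_result(
    [struct!(LockResultSketch.Result, columns: ["pg_advisory_lock"])],
    %{}
  )

err =
  LockResultSketch.acquire_lock_result(
    struct!(LockResultSketch.Error, message: "canceling statement"),
    %{}
  )

IO.inspect({ok, err})
```

Whether such a clause is needed depends on whether a crash of the replication client is already an acceptable way to handle a failed lock query.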

packages/sync-service/lib/electric/postgres/replication_client.ex

msfstef · 2024-10-15T08:49:14Z

packages/sync-service/test/electric/postgres/lock_connection_test.exs

@@ -6,6 +6,7 @@ defmodule Electric.Postgres.LockConnectionTest do
  alias Electric.Postgres.LockConnection

  @moduletag :capture_log
+  @moduletag :skip


I assume you intend to "move" the lock tests to the replication client tests later?

alco · 2024-10-16T16:07:42Z

After working a bit more on connection process life cycle management in #1841 I've decided to scratch this change. Even though moving the advisory lock acquisition to the replication connection would help us save one idle PG connection, it would rob us of the ability to restart the replication client independently of the other connections, namely, the lock connection and the DB pool.

If we merged this PR, restarting the replication client process (which we do to ensure consistent storage of streamed transactions by shape consumers) would always necessitate restarting the whole Electric.Connection.Supervisor tree.
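
The independence argument above can be sketched with a plain `:one_for_one` supervisor. This is only an illustration under stated assumptions: the child ids are hypothetical, Agents stand in for the real connection processes, and the actual Electric process layout may differ. It shows that when the lock holder and the replication client are separate processes, restarting one leaves the other running.

```elixir
# Hypothetical child ids; Agents stand in for the real connection processes.
children = [
  Supervisor.child_spec({Agent, fn -> :lock_connection end}, id: :lock_connection),
  Supervisor.child_spec({Agent, fn -> :replication_client end}, id: :replication_client),
  Supervisor.child_spec({Agent, fn -> :db_pool end}, id: :db_pool)
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Snapshot of child id => pid for the supervisor's current children.
pids = fn ->
  for {id, pid, _type, _modules} <- Supervisor.which_children(sup), into: %{} do
    {id, pid}
  end
end

before_restart = pids.()

# Restart only the replication client stand-in.
:ok = Supervisor.terminate_child(sup, :replication_client)
{:ok, _new_pid} = Supervisor.restart_child(sup, :replication_client)

after_restart = pids.()

# The lock holder and the pool keep their pids; only the restarted child changes.
IO.inspect(before_restart.lock_connection == after_restart.lock_connection)
IO.inspect(before_restart.replication_client == after_restart.replication_client)
```

If the advisory lock were held by the replication connection itself, restarting that process would release the lock, so the processes that depend on the lock would have to be restarted along with it.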

alco added 13 commits October 14, 2024 17:40

Start replication client before the shapes supervisor

97bb219

(tmp) Skip failing test

b7e2d23

Clarify the role of ConnectionManager in the application startup sequ…

2153f5b

…ence

Move PG version querying into ReplicationClient

4d4476a

This partially reverts c886f86.

Fetch PG system identifier and timeline ID with one query

86a5de4

Move the querying for PG timeline to ReplicationClient's connection s…

edbbccd

…etup phase

We already have a state machine defined for the replication connection and there's no need to check out the whole connection from the DB pool (twice) just to look up some static PG info.

Make Timeline's dependency on PersistentKV more prominent

b2bc90a

Make Timeline's internal logic clearer through fewer indirections in …

fe25522

…the code

Update Timeline tests

ae60ac5

Acquire the exclusive connection lock as part of ReplicationClient's …

62e51d1

…setup

(tmp) Skip LockConnection tests

15d90a5

alco changed the base branch from main to alco/decouple-timeline-from-shape-cache October 14, 2024 22:49

alco mentioned this pull request Oct 14, 2024

chore(sync-service): Decouple Timeline from ShapeCache #1846

Closed

msfstef approved these changes Oct 15, 2024

alco force-pushed the alco/decouple-timeline-from-shape-cache branch from ae60ac5 to 8381a60 Compare October 16, 2024 15:58

alco closed this Oct 16, 2024

alco deleted the alco/acquire-lock-in-replication-client branch October 16, 2024 16:07
