
Send data out immediately on the write-path #1760

Open
balegas opened this issue Sep 25, 2024 · 7 comments

@balegas
Contributor

balegas commented Sep 25, 2024

When a transaction comes in, the server persists changes to shape logs before sending them to subscribers waiting for new data. This adds a fair amount of latency to the write path. In this issue we want to stream rows to clients as soon as we find a shape match, and persist changes to shape logs in the background.

To do that, we're going to buffer a row -> [shapeIds] mapping and have a process persist changes for those shapes. We don't ack a transaction from Postgres until all shapes for that transaction have been updated. This allows the server to recover gracefully from crashes: it can continue from where it stopped and simply skip shape logs that have already been written.
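
To make the ack rule concrete, here is a minimal sketch of such a buffer. All names (`WriteBuffer`, `BufferedRow`, `ackableLsn`) are hypothetical illustrations, not Electric's actual API; it assumes LSNs arrive in increasing order.

```typescript
type ShapeId = string;

interface BufferedRow {
  lsn: number;          // position in the logical replication stream
  row: string;          // serialized row change
  shapeIds: ShapeId[];  // shapes this row matched
  persisted: boolean;   // set once written to all matched shape logs
}

class WriteBuffer {
  private entries: BufferedRow[] = [];

  // Rows are streamed to subscribers first, then queued here for persistence.
  append(lsn: number, row: string, shapeIds: ShapeId[]): void {
    this.entries.push({ lsn, row, shapeIds, persisted: false });
  }

  markPersisted(lsn: number): void {
    for (const e of this.entries) if (e.lsn === lsn) e.persisted = true;
  }

  // Only ack Postgres up to the highest LSN with no unpersisted entry at or
  // below it, so after a crash the server resumes from the last ack and
  // skips shape logs that were already written.
  ackableLsn(): number {
    let ack = 0;
    for (const e of this.entries) {
      if (!e.persisted) break;
      ack = e.lsn;
    }
    return ack;
  }
}
```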

The buffer has a fixed size (configurable?). Pending transactions are written to disk:

  • at a fixed delay, say 20 ms
  • when the buffer fills (ideally we would stop ingesting more transactions from logical replication at that point)
  • when there are no pending transactions (but this feels more like an optimization, and we can handle it later)
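
The flush triggers above can be sketched as a small policy. This is illustrative only; `maxEntries` and `flushDelayMs` stand in for the configurable buffer size and fixed delay mentioned above.

```typescript
class FlushPolicy {
  constructor(
    private maxEntries: number,   // buffer capacity (configurable)
    private flushDelayMs: number, // fixed delay, e.g. 20 ms
  ) {}

  // Decide whether pending transactions should be flushed to disk now.
  shouldFlush(pending: number, msSinceLastFlush: number): boolean {
    if (pending === 0) return false;            // nothing to write
    if (pending >= this.maxEntries) return true; // buffer full; ideally also
                                                 // pause logical-repl ingestion
    return msSinceLastFlush >= this.flushDelayMs; // fixed delay elapsed
  }
}
```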

It might happen that Postgres writes to Electric faster than Electric can write shape logs. The developer would need to handle that situation by increasing the buffer size, or by accounting for the resulting growth in PG WAL size.

This task should be done after #1744, as it builds on the assumption that a single process determines which shapes need to be written (we'd have to revise the approach otherwise).

@thruflo
Contributor

thruflo commented Sep 25, 2024

Just an observation that a client may be able to re-connect after receiving a response within the buffer window. So we should serve new requests from memory where possible as well as currently blocked live requests.
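
One way to picture serving such a reconnect from memory, falling back to the on-disk shape log only when the requested offset predates the buffer window. All names here are hypothetical, and it assumes the disk read returns only entries not yet covered by the in-memory tail.

```typescript
type LogEntry = { offset: number; row: string };

function serveShapeLog(
  requestedOffset: number,
  memoryBuffer: LogEntry[],                    // recent tail, maybe not yet on disk
  readFromDisk: (offset: number) => LogEntry[], // persisted shape log
): LogEntry[] {
  const oldestBuffered = memoryBuffer[0]?.offset ?? Infinity;
  if (requestedOffset >= oldestBuffered) {
    // Everything the client needs is still buffered: serve from memory,
    // no disk read and no added latency.
    return memoryBuffer.filter((e) => e.offset >= requestedOffset);
  }
  // Otherwise read the persisted log, then append the buffered tail.
  return [...readFromDisk(requestedOffset), ...memoryBuffer];
}
```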

@marc-shapiro

Two conflicting statements: "persist changes to shape logs in the background" vs "We don't ack a transaction from Postgres until all shapes for a transaction have been updated". In fact, you can ack a transaction as soon as it is persisted, and you can perform shape matching and propagation in a parallel background task that doesn't have to ack. In case of a crash, either the transaction was not persisted, hence not ack'ed, and you get it from the server; or it was, and you get it from the persisted log. Again, it helps to have a single on-disk log common to all shapes.

@balegas
Contributor Author

balegas commented Sep 25, 2024

In case of a crash, either the transaction was not persisted hence not ack'ed, and you get it from the server

You can't get the operation from the PG WAL once it's acked. We only ack to PG so that we can recover from the point where the server crashed.

Again, it helps to have a single on-disk log common to all shapes

Yeah, we can come back to this. The reasoning is that we don't want to scan logs on reads, because reads are a lot more frequent than writes.
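
A toy comparison of the trade-off being discussed: with one common log, every read has to scan and filter all entries, while with per-shape logs a read touches only that shape's entries. Names are illustrative, not Electric's storage layer.

```typescript
type Entry = { shapeId: string; row: string };

// Single common log: each read scans the whole log, O(total entries).
function readFromCommonLog(log: Entry[], shapeId: string): string[] {
  return log.filter((e) => e.shapeId === shapeId).map((e) => e.row);
}

// Per-shape logs: each read is O(entries for that shape), no scanning,
// at the cost of fanning every write out to all matching shape logs.
function readFromShapeLogs(logs: Map<string, string[]>, shapeId: string): string[] {
  return logs.get(shapeId) ?? [];
}
```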

@balegas
Contributor Author

balegas commented Sep 25, 2024

Just an observation that a client may be able to re-connect after receiving a response within the buffer window. So we should serve new requests from memory where possible as well as currently blocked live requests.

Yeah, the issue description is not clear about that. In the original RFC I suggested doing that by scanning the common buffer, or by holding a buffer for each shape. That needs to be clarified during implementation.

Well spotted. Thanks for raising it.

@KyleAMathews
Contributor

This was completed? I thought we were still writing to disk before sending to the client? Or is that just for the initial snapshot?

@balegas
Contributor Author

balegas commented Nov 25, 2024

@robacourt can you confirm?

@balegas balegas reopened this Nov 25, 2024
@KyleAMathews
Contributor

@robacourt how much time would this cut off the latency?
