-
Notifications
You must be signed in to change notification settings - Fork 174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send data out immediately on the write-path #1760
Comments
Just an observation that a client may be able to re-connect after receiving a response within the buffer window. So we should serve new requests from memory where possible as well as currently blocked live requests. |
Two conflicting statements: "persist changes to shape logs in the background" vs "We don't ack a transaction from Postgres until all shapes for a transaction have been updated". In fact, you can ack a transaction as soon as it is persisted, and you can perform shape matching and propagation in a parallel background task that doesn't have to ack. In case of a crash, either the transaction was not persisted hence not ack'ed, and you get it trom the server; or it was and you get it from the persisted log. Again, it helps to have a single on-disk log common to all shapes. |
You can't get the operation from PG WAL once it's acked. We only ack on Pg to be able to recover from the point the server crashed.
yeah, we can come back to this. The reasoning is that we don't want to scan logs on reads because they are a lot more frequent than writes. |
Yeah, the issue description is not clear about that. In the original RFC I suggest doing that by scanning the common buffer, or holding a buffer for each shape. That needs to be clarified during implementation. Well spotted. Thanks for raising it. |
This was completed? I thought we're still writing to disk before sending to client? Or is that just for the initial snapshot? |
@robacourt can you confirm? |
@robacourt how much time would this cut off latency |
When a transactions comes in, the server persist changes to shape logs before sending them to subscribers waiting for new data. This is adding a fair amount of latency to the write path. In this issue we want to stream rows to clients as soon as we find a shape match and persist changes to shape logs in the background.
In order to do that, we're going to buffer row -> [shapeIds] and have a process to persist changes for those shape . We don't ack a transaction from Postgres until all shapes for a transaction have been updated. This allows to recover gracefully from crashes, as the server can continue from where it stopped and just skip shape logs that have alreaby been written.
The buffer has a fixed size (configurable?). Pending transactions are written to disk:
It might happen that Postgres writes to Electric at a faster speed than Electric can handle shape logs. The developer would need to handle that situation by increasing the buffer size, or account for PG WAL size increase.
This task should be done after #1744 as it builds on the assumption that a single process would determine what shapes need to be written (we'd have to revise approach otherwise)
The text was updated successfully, but these errors were encountered: