PG_LOGICAL_SLOT_PEEK_CHANGES query crashes #546
Hi! With a smaller volume of data, when the replication slot contains about 1 GB, the same behavior also occurs. There comes a point where, when trying to retrieve the changes, the connection is cut off. Cheers
Hi and sorry about the delay on this.
Hi @toluaina, I just tested with LOGICAL_SLOT_CHUNK_SIZE=100 with similar results. In one of the tests, the replication slot was consumed by pgsync (it took about 45 min). The connection to the database was active, and among the active connections the running query was:
There isn't any limit filtering the input in the query. The limit in PG_LOGICAL_SLOT_PEEK_CHANGES is the third parameter, which is NULL (upto_nchanges). Inside the code there are the following notes:
I think that in this case we also need to set the number of changes to read, i.e. the upto_nchanges parameter.
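For illustration, the difference is roughly the following (a sketch only; the slot name is hypothetical and any output-plugin options are omitted):

```sql
-- Unbounded: decodes everything pending in the slot before returning rows.
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('my_slot', NULL, NULL);

-- Bounded: the third argument (upto_nchanges) stops decoding once about
-- 100 changes have been produced (the limit is only checked at transaction
-- boundaries, so a few more rows may come back).
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('my_slot', NULL, 100);
```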
The idea was to filter on txmin and txmax and then paginate the result set with limit/offset.
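Roughly like this, if I understand correctly (a sketch with placeholder bind parameters and a hypothetical slot name, not the literal query in the codebase):

```sql
-- Decode the slot, keep only changes in the txmin..txmax window, and page
-- through the matches. xid is cast because the xid type has no ordering
-- operators.
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('my_slot', NULL, NULL)
WHERE xid::text::bigint >= :txmin
  AND xid::text::bigint <= :txmax
LIMIT :limit OFFSET :offset;
```

The catch, as noted above, is that the set-returning function still decodes everything pending in the slot before the WHERE/LIMIT are applied.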
Yes, I haven't checked it, but it's possible that the xmin value of the records is the same, because we have a process with several steps involving transactions that perform large commits. For example, in one of the steps, files are loaded from S3 into PostgreSQL tables, and the files contain millions of records. I understand this already generates a lot of transaction log, although I'm not sure whether all of the records have the same xmin. OK, thanks. Can you create a new release, or is it in a separate commit?
Hi! Thanks for the fix, but yes, max(lsn) is a very expensive query.
Current replication slots:
To solve this, wouldn't it be possible to loop with upto_nchanges, to avoid decoding all of the database files in the same interval of time? I suppose that would be slow, but I think it would work if pgsync computed the max in a custom variable.
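Something like the following loop, repeated until the slot is drained (a sketch only; the slot name is hypothetical and :chunk_max_lsn would be the value pgsync remembers between iterations):

```sql
-- 1. Peek a bounded chunk and note the highest LSN it contains.
SELECT max(lsn) AS chunk_max_lsn
FROM pg_logical_slot_peek_changes('my_slot', NULL, 1000);

-- 2. Consume (advance the slot) only up to that LSN, then go back to step 1.
SELECT count(*)
FROM pg_logical_slot_get_changes('my_slot', :chunk_max_lsn, NULL);
```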
Are you suggesting that you are still seeing the query below:
That query was removed and should no longer be present if you are running off the main branch.
Sorry, I misplaced the source folder of the code. But I checked again, and the query with the max is still very expensive:
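For reference, the expensive shape is something like this (illustrative only, not necessarily the exact query pgsync runs): the aggregate cannot return until every pending change in the slot has been decoded.

```sql
-- max() over a set-returning decode forces the whole slot to be decoded.
SELECT max(lsn)
FROM pg_logical_slot_peek_changes('my_slot', NULL, NULL);
```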
Yes, you are right, the new version works with the SLOT_SIZE:
But in the end I got the same error:
I just tested with LOGICAL_SLOT_CHUNK_SIZE=1, but it also fails with the same error.
I feel there are multiple things going on here.
PGSync version: 3.0.0
Postgres version: PostgreSQL 15.5 (AWS Aurora db.r7g.xlarge, with writer and reader nodes)
Elasticsearch/OpenSearch version: 2.1.1 (OpenSearch)
Redis version: 5.0.6 (ElastiCache)
Python version: 3.9
Error Message (if any):
Problem description:
Hi @toluaina,
We are experiencing issues in the following context:
We tried executing the query manually and encountered the same behavior:
The query terminates because PostgreSQL cuts off communication after 15-20 minutes. During execution, it uses up to 40 GB of disk space in our case.
The only way I've seen it finish is by including the limit upto_nchanges as a parameter (I tried with 100). In PostgreSQL there are two replication slots with 68 GB of data each.
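For reference, the bounded call that did finish looked roughly like this (hypothetical slot name; any output-plugin options are omitted):

```sql
-- With upto_nchanges = 100 the function stops decoding early instead of
-- working through the whole backlog held by the slot.
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('my_slot', NULL, 100);
```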
From what I've researched, this happens because the function PG_LOGICAL_SLOT_PEEK_CHANGES needs to decode the changes stored in files on disk, which consumes storage on the database instance: https://pgpedia.info/p/pg_logical_slot_peek_changes.html
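As a side note, the amount of WAL each slot is holding can be checked with a generic query like this (not pgsync-specific):

```sql
-- Approximate WAL retained per replication slot.
SELECT slot_name,
       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_wal
FROM pg_replication_slots;
```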
Also, in the past, I've seen work being done to solve this in pgsync:
#267
However, I have not found the parameter PG_LOGICAL_SLOT_UPTO_NCHANGES in either the version I use (3.0.0) or the most recent version of pgsync, so I have not been able to test it. In other versions there is a PG_LOGICAL_SLOT_UPTO_NCHANGES parameter, but it doesn't work either. Thank you very much!