Skip to content

m99coder/pglogical-poc

Repository files navigation

pglogical-poc

Replicate from PostgreSQL 11.5 to 11.10 using pglogical 2.2.2

Dependencies

# install timeout command on macOS
brew install coreutils

Setup

In this PoC we logically replicate from a PostgreSQL 11.5 to a PostgreSQL 11.10. Both instances running in Docker containers and communicating with each other. Both have pglogical 2.2.2 installed.

# start containers
docker-compose up -d

# in case we need to rebuilt the images use
docker-compose up -d --build

# running services
docker-compose ps

# stop containers
docker-compose down --rmi all

For a more realistic setup there are three tables created: users, posts, and comments, where comments has a foreign key for posts and posts has a foreign key for users. The goal of this PoC is to move everything related to a specific user: 1 row from users, x rows from posts, and y rows from comments.

pglogical currently doesn’t support sub-queries in the row_filter. So we need an alternative approach.

invalid row_filter expression "post_id = IN (SELECT id FROM posts WHERE user_id = 1)"

For simplicity we choose de-normalizing the foreign key relation from comments to posts to users by adding a user_id column to the comments table directly, that will be populated with the randomly chosen user_id values set in posts table.

Now run replication queries:

# first for the provider:
#   - pglogical.create_node
#   - pglogical.create_replication_set
#   - pglogical.replication_set_add_table
docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication -f /replication.sql

# second for the subscriber:
#   - pglogical.create_node
#   - pglogical.create_subscription
docker exec -it pglogical-poc-pgsubscriber-1 \
  psql -U postgres -d pg_logical_replication_results -f /replication.sql

And finally, check if the correct number of posts was replicated based on the arbitrary row filter user_id = 1:

# get number of posts having `user_id = 1`
docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication \
    -c 'SELECT COUNT(*) FROM posts WHERE user_id = 1;'
 count
-------
    19
(1 row)

# get number of replicated posts
docker exec -it pglogical-poc-pgsubscriber-1 \
  psql -U postgres -d pg_logical_replication_results \
    -c 'SELECT COUNT(*) FROM posts;'
 count
-------
    19
(1 row)

The actual number of posts can differ between runs, as the initial data is generated randomly. The important thing is that the two numbers are indeed equal.

Try to add more posts and comments to the provider instance and check if the replication worked.

docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication \
    -c 'INSERT INTO posts (SELECT generate_series(1001, 2000), FLOOR(random() * 50) + 1);'
INSERT 0 1000

docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication \
    -c 'INSERT INTO comments (SELECT generate_series(201, 400), FLOOR(random()* 1000) + 1, 1, (ROUND(random())::int)::boolean);'
INSERT 0 200

docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication \
    -c 'UPDATE comments
SET user_id = subquery.user_id
FROM (
  SELECT posts.user_id, comments.id
  FROM posts
  INNER JOIN comments ON posts.id = comments.post_id
) AS subquery
WHERE comments.id = subquery.id;'
UPDATE 400

Some useful SQL queries:

-- show subscription status
SELECT * FROM pglogical.show_subscription_status(
  subscription_name := 'pglogical_subscription'
);

-- show subscription table
SELECT * FROM pglogical.show_subscription_table(
  subscription_name := 'pglogical_subscription',
  relation := 'example'
);

-- show `pglogical` relations
\dt pglogical.

-- describe `pglogical.local_sync_status`
\d+ pglogical.local_sync_status

-- show local sync status
SELECT sync_status
  FROM pglogical.local_sync_status
  WHERE sync_nspname = 'public' AND sync_relname = 'example';

The sync states are defined here and mean the following:

  • \0: SYNC_STATUS_NONE (No sync)
  • i: SYNC_STATUS_INIT (Ask for sync)
  • s: SYNC_STATUS_STRUCTURE (Sync structure)
  • d: SYNC_STATUS_DATA (Sync data)
  • c: SYNC_STATUS_CONSTAINTS (Sync constraints)
  • w: SYNC_STATUS_SYNCWAIT (Table sync is waiting to get OK from main thread)
  • u: SYNC_STATUS_CATCHUP (Catching up)
  • y: SYNC_STATUS_SYNCDONE (Sync finished at LSN)
  • r: SYNC_STATUS_READY (Sync done)

Determine replication status

# check replication slots on provider
docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication
psql (11.5 (Debian 11.5-3.pgdg90+1))
Type "help" for help.

pg_logical_replication=# \x
Expanded display is on.
pg_logical_replication=# SELECT * FROM pg_replication_slots;
-[ RECORD 1 ]-------+----------------------------------------------------
slot_name           | pgl_pg_logic194f0de_provider_pglogical_subscription
plugin              | pglogical_output
slot_type           | logical
datoid              | 16384
database            | pg_logical_replication
temporary           | f
active              | t
active_pid          | 103
xmin                |
catalog_xmin        | 577
restart_lsn         | 0/1826A30
confirmed_flush_lsn | 0/1826A68

pg_logical_replication=# exit

Find column descriptions here.

# check current WAL insert LSN
docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication
psql (11.5 (Debian 11.5-3.pgdg90+1))
Type "help" for help.

pg_logical_replication=# \x
Expanded display is on.
pg_logical_replication=# SELECT pg_current_wal_insert_lsn();
-[ RECORD 1 ]-------------+----------
pg_current_wal_insert_lsn | 0/18264A8

pg_logical_replication=# exit
# check replication status on provider
docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication
psql (11.5 (Debian 11.5-3.pgdg90+1))
Type "help" for help.

pg_logical_replication=# \x
Expanded display is on.
pg_logical_replication=# SELECT * FROM pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 101
usesysid         | 10
usename          | postgres
application_name | pglogical_subscription
client_addr      | 192.168.128.3
client_hostname  |
client_port      | 58410
backend_start    | 2021-03-17 16:48:24.83939+00
backend_xmin     |
state            | streaming
sent_lsn         | 0/18264A8
write_lsn        | 0/18264A8
flush_lsn        | 0/18264A8
replay_lsn       | 0/18264A8
write_lag        |
flush_lag        |
replay_lag       |
sync_priority    | 0
sync_state       | async

pg_logical_replication=# SELECT pg_size_pretty(pg_current_wal_insert_lsn() - '0/00000000'::pg_lsn);
-[ RECORD 1 ]--+------
pg_size_pretty | 24 MB

pg_logical_replication=# SELECT
pg_logical_replication-#   pg_current_wal_insert_lsn(),
pg_logical_replication-#   replay_lsn,
pg_logical_replication-#   pg_size_pretty(pg_current_wal_insert_lsn() - replay_lsn::pg_lsn) AS diff
pg_logical_replication-# FROM pg_stat_replication;
-[ RECORD 1 ]-------------+----------
pg_current_wal_insert_lsn | 0/1826588
replay_lsn                | 0/1826588
diff                      | 0 bytes

pg_logical_replication=# exit

Find column descriptions here.

# check local sync status on subscriber
docker exec -it pglogical-poc-pgsubscriber-1 \
  psql -U postgres -d pg_logical_replication_results
psql (11.10 (Debian 11.10-1.pgdg90+1))
Type "help" for help.

pg_logical_replication_results=# \x
Expanded display is on.
pg_logical_replication_results=# SELECT * FROM pglogical.local_sync_status;
-[ RECORD 1 ]--+-----------
sync_kind      | f
sync_subid     | 2875150205
sync_nspname   | public
sync_relname   | users
sync_status    | r
sync_statuslsn | 0/183A6D8
-[ RECORD 2 ]--+-----------
sync_kind      | f
sync_subid     | 2875150205
sync_nspname   | public
sync_relname   | posts
sync_status    | r
sync_statuslsn | 0/183A6D8
-[ RECORD 3 ]--+-----------
sync_kind      | f
sync_subid     | 2875150205
sync_nspname   | public
sync_relname   | comments
sync_status    | r
sync_statuslsn | 0/183A6D8
-[ RECORD 4 ]--+-----------
sync_kind      | d
sync_subid     | 2875150205
sync_nspname   |
sync_relname   |
sync_status    | r
sync_statuslsn | 0/0

pg_logical_replication_results=# exit

Metrics & Alerting

To monitor the overall performance and the replications in particular PostgreSQL Server Exporter is used to export metrics in the Prometheus format. A valid example configuration file for Prometheus can be found here. The effective configuration file is this one.

The following endpoints provide metrics respectively:

After having setup the initial dataset, we can see some values for how many tuples have been inserted (pg_stat_database_tup_inserted) in this graph as an example.

The mappings into the Prometheus format provided by PostgreSQL Server Exporter can be found here and the default custom queries added can be found here. If you don’t want to see the default metrics, set the environment variable PG_EXPORTER_DISABLE_DEFAULT_METRICS to true.

Prometheus will be configured with some example alerts:

  • PostgreSQLMaxConnectionsReached
  • PostgreSQLHighConnections
  • PostgreSQLDown
  • PostgreSQLSlowQueries
  • PostgreSQLQPS
  • PostgreSQLCacheHitRatio

Their definition can be found here.

Monitoring

In conjunction with Prometheus, Grafana can be used to monitor a whole bunch of different metrics provided by a variety of data sources. The custom credentials for Grafana are admin:s3cr3t. Provisioning capabilities are used to configure Prometheus as data source and also already create a useful dashboard.

Benchmarking

pgbench can be used to perform a benchmark. This is a 2-step process. First you need to initialize the database and then you can run the benchmark itself.

$ # init pgbench by creating the necessary tables
$ docker exec -it pglogical-poc-pgprovider-1 \
    pgbench -U postgres -d pg_logical_replication -i

$ # run pgbench
$ docker exec -it pglogical-poc-pgprovider-1 \
    pgbench -U postgres -d pg_logical_replication -c 10 -T 300
# ...
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 1
query mode: simple
number of clients: 10
number of threads: 1
duration: 300 s
number of transactions actually processed: 148141
latency average = 20.252 ms
tps = 493.772765 (including connections establishing)
tps = 493.777011 (excluding connections establishing)

Replication

We are aiming to utilize pgbench for our replication example. A good candidate is the pgbench_history table, that is holding almost ~2.4k records per teller (after running pgbench) and there have been 10 different tellers created. One caveat exist: This table doesn’t have a primary key, so we can only replicate INSERT statements. For this example it’s sufficient, though.

pgbench creates the following data:

  • pgbench_accounts: 100,000 accounts all related to branch 1 with a balance of 0
  • pgbench_branches: 1 branch with a total balance of 0
  • pgbench_history: a kind of ledger that is empty after initialization
  • pgbench_tellers: 10 tellers all related to branch 1 with a balance of 0

After having leveraged make start init replicate (or for short make run), we can check if the pgbench_history table in pgsubscriber is filled.

# count on provider side
docker exec -it pglogical-poc-pgprovider-1 \
  psql -U postgres -d pg_logical_replication \
    -c 'SELECT COUNT(*) FROM pgbench_history WHERE tid = 1;'
 count
-------
  2631
(1 row)

# count on subscriber side
docker exec -it pglogical-poc-pgsubscriber-1 \
  psql -U postgres -d pg_logical_replication_results \
    -c 'SELECT COUNT(*) FROM pgbench_history WHERE tid = 1;'
 count
-------
  2631
(1 row)

Convenience

You can use the following make targets to simplify processes:

  • build: Build containers
  • start: Start containers
  • wait: Wait for databases to be ready
  • init: Init databases with pgbench
  • reset: Reset databases
  • replicate: Run replication
  • run: Runs start, init and replicate
  • list: List running containers
  • stop: Stop containers
  • clean: Remove containers

Debugging

For debugging purposes we can run a Docker image that only provides the psql client as follows:

$ # run `psql` within the same docker network and access `pgprovider`
$ docker run -it --net pglogical-poc_default --rm jbergknoff/postgresql-client \
    "postgresql://postgres:s3cr3t@pgprovider:5432?sslmode=disable"

Resources

Postgres

Grafana