feat: adding AMQP heartbeat/retry/retrydelay support #549

Merged: 2 commits, Jun 14, 2024

kiraum commented Jun 7, 2024

Summary

This PR updates the AMQP configuration and connection handling to improve reliability and flexibility.

Changes

Configuration Struct Update:

  • Added amqp_heartbeat (int): Heartbeat interval for the AMQP connection.
  • Added amqp_retry (bool): Enables/disables retry logic for connection attempts.
  • Added amqp_retrydelay (int): Delay between retry attempts in seconds.
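A minimal sketch of how the three options could be carried in Go and turned into durations. The struct and method names are hypothetical and are not taken from the PR's code; only the option names in the comments come from the description above. The heartbeat unit is assumed to be seconds, and the values in main are arbitrary examples.

```go
package main

import (
	"fmt"
	"time"
)

// amqpSettings is a hypothetical stand-in for the three options this PR adds.
// The field names are illustrative, not the project's actual struct fields.
type amqpSettings struct {
	Heartbeat  int  // amqp_heartbeat: heartbeat interval (unit assumed to be seconds)
	Retry      bool // amqp_retry: enables or disables retry logic
	RetryDelay int  // amqp_retrydelay: delay between retry attempts, in seconds
}

// heartbeatInterval converts the integer option into a time.Duration.
func (s amqpSettings) heartbeatInterval() time.Duration {
	return time.Duration(s.Heartbeat) * time.Second
}

// retryInterval converts the retry delay into a time.Duration.
func (s amqpSettings) retryInterval() time.Duration {
	return time.Duration(s.RetryDelay) * time.Second
}

func main() {
	// Example values only.
	s := amqpSettings{Heartbeat: 10, Retry: true, RetryDelay: 30}
	fmt.Println(s.heartbeatInterval(), s.retryInterval(), s.Retry)
}
```

Keeping the options as plain integers matches the types listed above; the conversion to time.Duration happens only where a timer or client setting needs it.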

Configuration File Update:

Updated examples/carbon-relay-ng.ini to include amqp_heartbeat, amqp_retry, and amqp_retrydelay.

AMQP Connection Handling:

  • Added retry logic for connection attempts.
  • Configured heartbeat for the AMQP connection.
  • Added a goroutine to monitor and reconnect if the connection is closed.

Benefits

  • Improved Reliability: Automatic recovery from connection issues.
  • Configurable Heartbeat: Better connection health monitoring on the client side.
  • Enhanced Flexibility: More control over connection behavior.

Logs

On the new branch, go build, go test, and make all succeed (output below). I have also been running this code in production for many days without any issues.

carbon-relay-ng$ git branch
* kiraum/adding_amqp_heartbeat_retry_support
  master

carbon-relay-ng$ cd cfg/

carbon-relay-ng/cfg$ go build && go test -v
=== RUN   TestTomlToGrafanaNetRoute
--- PASS: TestTomlToGrafanaNetRoute (0.05s)
PASS
ok      github.com/grafana/carbon-relay-ng/cfg  0.061s

carbon-relay-ng/cfg$ cd ..

carbon-relay-ng$ cd input

carbon-relay-ng/input$ go build && go test -v
=== RUN   TestAmqpSuccessfulShutdown
INFO[0000] consuming AMQP messages
INFO[0000] consumeAMQP: channel closed
INFO[0000] shutting down AMQP client
--- PASS: TestAmqpSuccessfulShutdown (0.00s)
=== RUN   TestTcpUdpShutdown
INFO[0000] listening on localhost:/udp
INFO[0000] listening on localhost:/tcp
INFO[0000] shutting down localhost:/tcp, closing socket
INFO[0000] shutting down localhost:/udp, closing socket
--- PASS: TestTcpUdpShutdown (0.00s)
=== RUN   TestTcpConnection
INFO[0000] listening on localhost:/udp
INFO[0000] listening on localhost:/tcp
WARN[0000] mock handler for 127.0.0.1:45796 returned: EOF. closing conn
INFO[0000] shutting down localhost:/tcp, closing socket
INFO[0000] shutting down localhost:/udp, closing socket
--- PASS: TestTcpConnection (0.10s)
=== RUN   TestUdpConnection
INFO[0000] listening on localhost:/tcp
INFO[0000] listening on localhost:/udp
WARN[0000] mock handler: EOF
INFO[0000] shutting down localhost:/tcp, closing socket
INFO[0000] shutting down localhost:/udp, closing socket
--- PASS: TestUdpConnection (0.10s)
PASS
ok      github.com/grafana/carbon-relay-ng/input        0.215s

carbon-relay-ng/input$ cd ..

carbon-relay-ng$ make LINUX_PACKAGE_GOARCH=amd64 build-linux
cd ui/web && go-bindata -pkg web admin_http_assets/...
find . -name '*.go' | grep -v '^\.\/vendor' | xargs gofmt -w -s
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -ldflags "-X main.Version=4800408" -o carbon-relay-ng-linux-amd64 ./cmd/carbon-relay-ng

carbon-relay-ng$ docker build -t carbon-relay-ng:test . --no-cache --progress=plain
#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 331B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load metadata for docker.io/library/alpine:latest
#3 ...

#4 [auth] library/alpine:pull token for registry-1.docker.io
#4 DONE 0.0s

#3 [internal] load metadata for docker.io/library/alpine:latest
#3 DONE 1.0s

#5 [1/5] FROM docker.io/library/alpine@sha256:77726ef6b57ddf65bb551896826ec38bc3e53f75cdde31354fbffb4f25238ebd
#5 CACHED

#6 [internal] load build context
#6 transferring context: 26.67MB 0.3s done
#6 DONE 0.3s

#7 [2/5] RUN apk --update add --no-cache ca-certificates
#7 0.284 fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/main/x86_64/APKINDEX.tar.gz
#7 0.364 fetch https://dl-cdn.alpinelinux.org/alpine/v3.20/community/x86_64/APKINDEX.tar.gz
#7 0.667 (1/1) Installing ca-certificates (20240226-r0)
#7 0.686 Executing busybox-1.36.1-r28.trigger
#7 0.692 Executing ca-certificates-20240226-r0.trigger
#7 0.732 OK: 8 MiB in 15 packages
#7 DONE 0.9s

#8 [3/5] ADD carbon-relay-ng-linux-amd64 /bin/carbon-relay-ng
#8 DONE 0.1s

#9 [4/5] ADD examples/carbon-relay-ng.ini /conf/carbon-relay-ng.ini
#9 DONE 0.0s

#10 [5/5] RUN mkdir /var/spool/carbon-relay-ng
#10 DONE 0.3s

#11 exporting to image
#11 exporting layers
#11 exporting layers 0.2s done
#11 writing image sha256:09e9cad85205be0e0442fd866c4844e1c8973c32e029941f0e8c83d7547235c6 done
#11 naming to docker.io/library/carbon-relay-ng:test done
#11 DONE 0.2s

$ docker ps
CONTAINER ID   IMAGE                  COMMAND                  CREATED         STATUS         PORTS                                                                                                                                                                                                                   NAMES
f34043708cc1   carbon-relay-ng:test   "/bin/carbon-relay-n…"   3 seconds ago   Up 2 seconds   0.0.0.0:2023->2003/tcp, 0.0.0.0:2023->2003/udp, :::2023->2003/tcp, :::2023->2003/udp, 0.0.0.0:2024->2013/tcp, 0.0.0.0:2024->2013/udp, :::2024->2013/tcp, :::2024->2013/udp, 0.0.0.0:8281->8081/tcp, :::8281->8081/tcp   carbon-relay-ng2
83ae66214cef   carbon-relay-ng:test   "/bin/carbon-relay-n…"   3 seconds ago   Up 2 seconds   0.0.0.0:2013->2003/tcp, 0.0.0.0:2013->2003/udp, :::2013->2003/tcp, :::2013->2003/udp, 0.0.0.0:2014->2013/tcp, 0.0.0.0:2014->2013/udp, :::2014->2013/tcp, :::2014->2013/udp, 0.0.0.0:8181->8081/tcp, :::8181->8081/tcp   carbon-relay-ng1
69c9d5d5ba3b   carbon-relay-ng:test   "/bin/carbon-relay-n…"   3 seconds ago   Up 2 seconds   0.0.0.0:2033->2003/tcp, 0.0.0.0:2033->2003/udp, :::2033->2003/tcp, :::2033->2003/udp, 0.0.0.0:2034->2013/tcp, 0.0.0.0:2034->2013/udp, :::2034->2013/tcp, :::2034->2013/udp, 0.0.0.0:8381->8081/tcp, :::8381->8081/tcp   carbon-relay-ng3

With this change, when I bring RabbitMQ down, carbon-relay-ng detects that the connection is closed and retries at the configured retry delay until it reconnects:

carbon-relay-ng2    | 2024-06-07 20:48:00.167 [INFO] dest carbon-default_192_168_242_31_2003 new conn online
carbon-relay-ng2    | 2024-06-07 20:48:00.167 [INFO] dest carbon-default_192_168_242_33_2003 new conn online
carbon-relay-ng2    | 2024-06-07 20:48:00.167 [INFO] dest carbon-default_192_168_242_32_2003 new conn online
carbon-relay-ng2    | 2024-06-07 20:48:00.563 [INFO] consuming AMQP messages
carbon-relay-ng2    | 2024-06-07 20:49:01.001 [INFO] stats now connected to 192.168.242.33:2003
carbon-relay-ng2    | 2024-06-07 20:50:58.361 [INFO] AMQP connection closed.
carbon-relay-ng2    | 2024-06-07 20:50:58.361 [INFO] Attempting to reconnect...
carbon-relay-ng2    | 2024-06-07 20:50:58.361 [INFO] dialing AMQP: amqp://admin:[email protected]/
carbon-relay-ng3    | 2024-06-07 20:50:58.361 [INFO] AMQP connection closed.
carbon-relay-ng3    | 2024-06-07 20:50:58.361 [INFO] Attempting to reconnect...
carbon-relay-ng3    | 2024-06-07 20:50:58.361 [INFO] dialing AMQP: amqp://admin:[email protected]/
carbon-relay-ng1    | 2024-06-07 20:50:58.360 [INFO] AMQP connection closed.
carbon-relay-ng1    | 2024-06-07 20:50:58.360 [INFO] Attempting to reconnect...
carbon-relay-ng1    | 2024-06-07 20:50:58.360 [INFO] dialing AMQP: amqp://admin:[email protected]/
carbon-relay-ng2    | 2024-06-07 20:50:58.409 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng1    | 2024-06-07 20:50:58.409 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng3    | 2024-06-07 20:50:58.409 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng1    | 2024-06-07 20:51:28.459 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng2    | 2024-06-07 20:51:28.459 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng3    | 2024-06-07 20:51:28.459 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng1    | 2024-06-07 20:51:58.509 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng2    | 2024-06-07 20:51:58.509 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng3    | 2024-06-07 20:51:58.509 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng1    | 2024-06-07 20:52:28.559 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng3    | 2024-06-07 20:52:28.559 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng2    | 2024-06-07 20:52:28.559 [ERROR] Failed to connect to AMQP server: dial tcp 192.168.248.146:5672: connect: connection refused. Retrying in 30 seconds...
carbon-relay-ng3    | 2024-06-07 20:52:59.510 [INFO] Successfully reconnected to AMQP server.
carbon-relay-ng1    | 2024-06-07 20:52:59.510 [INFO] Successfully reconnected to AMQP server.
carbon-relay-ng2    | 2024-06-07 20:52:59.510 [INFO] Successfully reconnected to AMQP server.

Thanks for maintaining this project! Please review and provide feedback if needed.

CLAassistant commented Jun 7, 2024

CLA assistant check
All committers have signed the CLA.

npazosmendez left a comment

Thank you for contributing! Just one thing

input/amqp.go: 3 review threads (outdated, resolved)

npazosmendez left a comment

Thank you!

npazosmendez merged commit b30db90 into grafana:master Jun 14, 2024
2 checks passed

kiraum commented Jun 14, 2024

Thanks for the smooth review/interactions, really appreciated! :D
