Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement OpenTelemetry tracing #302

Draft
wants to merge 17 commits into
base: master
Choose a base branch
from
Draft

Implement OpenTelemetry tracing #302

wants to merge 17 commits into from

Conversation

lf-
Copy link
Contributor

@lf- lf- commented May 6, 2023

This is based on #298 and that one should be merged first.

This PR implements and documents optional OpenTelemetry based tracing for the
Nomos backend, which should make debugging a lot easier by providing
contextually relevant logs (especially SQL logs).

This is a very basic integration and more spans could be added to trace more
parts of the system as needed. We could also integrate logs into OTel in the
future so that logs appear in context of the request and span they occurred in.

Currently this is marked as a draft pending #298.

Let me know if there's anything I can do to make this easier to review.

lf- added 17 commits April 29, 2023 21:51
docker-compose, somewhat confusingly, has two overloaded meanings for
"env file":
1. Expanded inside the container, by passing environment variables to
   the started process inside the container, which is
   services.SVC.env_file
2. Expanded inside docker-compose files, which is --env-file

We were missing both of them in different places, and the latter is
fixed by this commit.

https://docs.docker.com/compose/environment-variables/set-environment-variables/

Fixes these warnings:
WARN[0000] The "NOMOS_DB_USER" variable is not set. Defaulting to a blank string.
WARN[0000] The "NOMOS_DB_PASSWORD" variable is not set. Defaulting to a blank string.
WARN[0000] The "NOMOS_RABBITMQ_USER" variable is not set. Defaulting to a blank string.
WARN[0000] The "NOMOS_RABBITMQ_PASSWORD" variable is not set. Defaulting to a blank string.
WARN[0000] The "NOMOS_RABBITMQ_VHOST" variable is not set. Defaulting to a blank string.
This seems to be the most useful configuration to use for development so
we should ship it for ease of use.
Previously, the webhooker service was always pulling defaults and was
not actually using any of its variables in the environment file.
I found several bugs:
1. The service was taking 30 seconds (!!!?!) to start on my machine
   (Arch Linux x86_64) and taking 16 GB of memory (...). This seems to
   be some kind of strange bug in the container, which is worked around
   by setting some ulimit values.
2. MYSQL_PASS is the wrong variable for the password; the correct one is
   MYSQL_PASSWORD.
3. A root password must be set for the container to start.
4. We forgot to set the database to create, which is important to giving
   the app access to it.
Previously this was done in the tools/vagrant_provision.sh script, but
I want to deprecate Vagrant, so we need a second way.
We were also missing some environment variables related to Stripe which
the backend was angry about, so I fixed those too.
This was causing errors such as the following in docker-compose:

```
network rabbitmq declared as external, but could not be found
```

This is because `external` means that the network should be managed
outside the lifecycle of the application and must be created first. This
seems like the incorrect move in this case.
Previously, it would try to do:
define('NOMOS_TRACE_URL_FORMAT=https://blah/?a=b', '')

This fixes that to now only eat up to the first = sign using the
${parameter/pattern/string} parameter expansion form in Bash.
This fixes an issue where the service is blocking on a network
round-trip off-machine to export all the traces to finish request
processing.

Noted upstream as a potential documentation/feature issue:
open-telemetry/opentelemetry-php#993
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant