Recently relayor got increasingly user-friendly prometheus support. This post gives you a short introduction
on how to use it in your environment to keep an eye on your relays, and how to set up the foundation needed
for nice-looking Grafana dashboards of your relays' metrics and rule-based alerts that tell you when something needs your attention.
Tor's MetricsPort feature provides prometheus metrics for many tor relay properties that can help you understand operational issues, bottlenecks and generally how your relays are doing. It is important to understand that these metrics are sensitive and MUST NOT be made public.
Unlike more mature exporters like node_exporter (an exporter you should also deploy on all your tor servers), tor's built-in exporter (torrc: MetricsPort) does not come with security features like TLS and authentication. To work around this limitation we use a well-established webserver, nginx, as a reverse proxy that provides TLS and authentication, so we can collect (scrape) metrics from tor's MetricsPort over the internet without exposing them to the public.
The prometheus server software does not support conf.d style config folders, where we could drop in the configuration needed for tor MetricsPort scraping without knowing about or interfering with the rest of the configuration. Therefore we implemented support for a conf.d style folder using ansible.
In scope tasks for relayor in the prometheus context:
torrc (MetricsPort/MetricsPortPolicy)
nginx reverse proxy configuration for MetricsPort
nginx authentication: htpasswd file generation incl. random password generation
reload nginx after nginx config changes
prometheus scrape configuration for tor MetricsPort incl. authentication
prometheus scrape configuration for blackbox exporter (optional)
prometheus alert rules for tor (optional)
reload prometheus after prometheus config changes
Out of scope tasks for relayor
prometheus server installation
nginx installation
TLS certificates (letsencrypt)
blackbox exporter installation (optional)
alertmanager installation (optional)
...other ansible roles are available for that.
Overview
To explain how to use relayor's prometheus feature we will use this example setup: two tor servers, each running two tor relay instances, and one prometheus server that collects metrics from all 4 tor MetricsPorts via nginx.
Before you start using relayor's prometheus features, make sure you have relayor version 23.1.0 or newer.
Overview of the following steps
prepare prometheus server requirements
prepare tor server requirements:
promexporters folder
include promexporters folder in nginx configuration
enable prometheus features in your ansible playbook.
Prepare Prometheus Server Requirements (prometheus.example.com)
If you already have a prometheus configuration, simply copy it to /etc/prometheus/conf.d/1_prometheus.yml
and make sure no tor scrape_configs are included and the file can be appended with additional scrape jobs at the end.
If you do not have a prometheus.yml file yet, create the first section of the prometheus configuration file yourself. Make sure the filename starts with "1_..." so it gets sorted before the "tor_..." files when the global prometheus.yml file is assembled.
/etc/prometheus/conf.d/1_prometheus.yml example:
global:
  scrape_interval: 60s
  # scrape_timeout is set to the global default (10s).
rule_files:
  - "/etc/prometheus/rules/*.rules"
scrape_configs:
Also make sure promtool is installed on your prometheus server; relayor uses it to validate
the generated prometheus configuration files. The prometheus ansible role installs promtool by default.
relayor will create one configuration file per server in that conf.d folder
and assemble the conf.d/* files into the global file /etc/prometheus/prometheus.yml. Before generating the new file, it makes a backup of the previous one in the same folder. Files in the conf.d subfolder are not backed up.
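The assemble, backup and validate steps described above can be sketched with a single Ansible task. This is only an illustration of the technique, assuming the paths used in this post; it is not copied from relayor's own tasks, and the task name is made up.

```yaml
# Illustrative sketch only, not relayor's actual implementation.
- name: Assemble prometheus.yml from the conf.d fragments
  ansible.builtin.assemble:
    # Fragments are concatenated in string sort order, which is why
    # 1_prometheus.yml ends up before the tor_... files.
    src: /etc/prometheus/conf.d
    dest: /etc/prometheus/prometheus.yml
    # Keep a timestamped copy of the previous file next to dest.
    backup: true
    # Only replace dest if promtool accepts the assembled file.
    validate: promtool check config %s
```

Because validation runs against the assembled result, a broken fragment should leave the running configuration untouched.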
Tor Server Requirements (nginx)
have nginx and a TLS certificate installed for the hostname of the server (ansible_fqdn)
relayor connects to nginx on the default https port (443). If you want to use a non-default port, set the ansible variable tor_prometheus_scrape_port to your desired value.
relayor places its nginx configuration file in /etc/nginx/promexporters/tor_metricsports_relayor.conf by default, but the location can be configured.
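If you manage the tor servers with Ansible anyway, the promexporters folder could be prepared with a task like the following. This is a minimal sketch: the task name and the file mode are assumptions, and the folder path is simply the default location mentioned above.

```yaml
# Hypothetical preparation task; the mode is an assumption, adjust as needed.
- name: Create the folder for relayor's nginx MetricsPort configuration
  ansible.builtin.file:
    path: /etc/nginx/promexporters
    state: directory
    mode: "0755"
```

The folder still has to be included from the relevant server block in your nginx configuration, otherwise nginx will not pick up the file relayor places there.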
All targets are protected with HTTP basic authentication and random passwords (one per server).
Job names follow this scheme: tor-FQDN-hostname-counter, so for example the first job name is "tor-server1.example.com-0".
Since relayor is aware of all torrc settings, it also enriches the prometheus
scrape configuration with a few additional labels that tor does not include by default. They are handy when creating Grafana dashboards:
id (IP_ORPort)
relaytype (exit/nonexit)
tor_nickname
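To make the job naming scheme, authentication and extra labels more concrete, here is a rough sketch of what one scrape job could look like. The job name comes from the example above; the metrics path, username, password placeholder, IP address, ORPort and nickname are all made-up values, and the file relayor actually generates may differ.

```yaml
# Illustrative only; values marked "hypothetical" are not from relayor.
scrape_configs:
  - job_name: "tor-server1.example.com-0"
    scheme: https
    metrics_path: /tor-metrics-0          # hypothetical path behind nginx
    basic_auth:
      username: prometheus                # hypothetical username
      password: "REPLACE_WITH_GENERATED"  # placeholder for the per-server random password
    static_configs:
      - targets: ["server1.example.com:443"]
        labels:
          id: "192.0.2.10_9001"           # IP_ORPort, hypothetical values
          relaytype: nonexit
          tor_nickname: examplerelay      # hypothetical nickname
```

Inside the conf.d folder only the list item would be needed, since the scrape_configs: key is already provided by the first config section.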
Prometheus Alert Rules (optional)
If you also have an Alertmanager connected to your prometheus server, you can tell relayor to enable the included alert rules by setting this variable in your playbook:
tor_gen_prometheus_alert_rules: True
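For readers who have not written alert rules before, this is roughly what a rule file under /etc/prometheus/rules/ looks like. It is a generic example, not one of the rules shipped with relayor; the group name, alert name, duration and severity label are assumptions.

```yaml
# Generic example rule, not taken from relayor's included alert rules.
groups:
  - name: tor-example
    rules:
      - alert: TorMetricsPortDown
        # "up" is 0 when prometheus fails to scrape a target.
        expr: up{job=~"tor-.*"} == 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "MetricsPort scrape failing for {{ $labels.tor_nickname }}"
```

The tor_nickname label is available here because it is attached to the scrape target, so it also appears on the up metric.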
Blackbox Exporter (optional)
If you have a blackbox_exporter running, you can also monitor all tor ports by telling relayor where your blackbox_exporter is running (from the point of view of prometheus.example.com). Set the following variable:
tor_blackbox_exporter_host: 127.0.0.1:9115
relayor requires a simple tcp_probe module named tcp_connect in your blackbox_exporter configuration.
This is the minimal /etc/blackbox_exporter.yml configuration that would work with relayor:
modules:
  tcp_connect:
    prober: tcp
    timeout: 5s
Also make sure your blackbox exporter has IPv6 connectivity when your relays have IPv6 enabled.
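On the prometheus side, probing a tor port through the blackbox exporter usually follows the standard relabeling pattern shown below. This is a sketch with a made-up job name and ORPort; the scrape configuration relayor generates may be structured differently.

```yaml
# Standard blackbox_exporter pattern; job name and target are hypothetical.
scrape_configs:
  - job_name: "blackbox-tor-example"
    metrics_path: /probe
    params:
      module: [tcp_connect]               # the module defined above
    static_configs:
      - targets:
          - "server1.example.com:9001"    # hypothetical ORPort to probe
    relabel_configs:
      # Pass the original target to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed address as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape request to the blackbox exporter.
      - target_label: __address__
        replacement: "127.0.0.1:9115"
```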
Next Steps
Now that all metrics data is collected on the prometheus server, the natural next step is to create a Grafana dashboard that displays the data so you can make sense of it. One challenge, though, is that tor's metrics are not well documented yet.
More Alert Rules
Since relayor uses tor's OfflineMasterKeys feature by default, there is always the risk that the operator forgets to renew the signing cert. It would therefore be nice to ship an alert rule that warns operators when their signing cert is about to expire, but tor does not provide the necessary metric yet.
If you are new to prometheus I recommend starting with the prometheus documentation first.
If you do not have a prometheus server yet, you might enjoy this ansible role:
https://github.com/prometheus-community/ansible/tree/main/roles/prometheus
Create the conf.d folder (/etc/prometheus/conf.d) on the prometheus server and the promexporters folder (/etc/nginx/promexporters) on the tor server before running your playbook.
Ansible Playbook
That is probably the easiest part: enable relayor's prometheus integration in your playbook by adding at least these two variables:
and make sure the prometheus server is in your ansible inventory file and you have sudo privileges.
Now you can run your playbook and relayor should create all the files as seen in the overview diagram.
If everything went well, your prometheus web interface should show one new job per tor relay (4 in total in the example).
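As a rough orientation, a playbook using the optional settings mentioned in this post could be laid out as follows. The host group name, the use of become, the role name and the port value are assumptions for illustration; only the three variable names are taken from this post.

```yaml
# Hypothetical playbook layout; group name, role name and values are assumptions.
- hosts: relays
  become: true
  vars:
    # The variables that enable the prometheus integration go here as well.
    tor_prometheus_scrape_port: 8443              # only needed for a non-default https port
    tor_gen_prometheus_alert_rules: True          # optional alert rules
    tor_blackbox_exporter_host: "127.0.0.1:9115"  # optional blackbox exporter
  roles:
    - nusenu.relayor
```

The optional variables only take effect once the prometheus integration itself is enabled.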