v2: empty panel for cluster memory #896

Open

marcomusso opened this issue Nov 13, 2024 · 16 comments

@marcomusso
marcomusso commented Nov 13, 2024

In the Kubernetes overview on Grafana Cloud, the "Memory Usage by Cluster" panel uses this query:


    1 - (
      sum by (cluster) (
        max by (cluster, node) (
          label_replace(
            windows_memory_available_bytes{cluster=~"<CLUSTER_NAME>"}
            OR
            node_memory_MemAvailable_bytes{cluster=~"<CLUSTER_NAME>"},
            "node", "$1", "instance", "(.+)"
          )
        )
      )
      / on (cluster)
      sum by (cluster) (
        max by (cluster, node) (
          kube_node_status_capacity{cluster=~"<CLUSTER_NAME>", resource="memory"}
        )
      )
    )

Of the two metrics needed, only kube_node_status_capacity has the cluster label; it's missing from node_memory_MemAvailable_bytes.

The result is an empty panel for cluster memory.

I remember this working, so maybe something got lost?

A possible reason could be that the cluster label is added as an external label when remote-writing from the alloy-metrics instance, but not from the alloy-module-system one, which is in charge of scraping the node exporters.
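
If that's the case, I'd expect something roughly like this on the remote_write used by the node-exporter scraping instance (just a sketch; the component name and URL are placeholders, not the chart's actual generated config):

    // Sketch only: attaching the cluster name as an external label would make
    // node_memory_MemAvailable_bytes carry the same cluster label as
    // kube_node_status_capacity, so the panel query can join on it again.
    prometheus.remote_write "metrics_destination" {
      external_labels = { "cluster" = "<CLUSTER_NAME>" }

      endpoint {
        url = "https://<MIMIR_HOST>/api/prom/push"
      }
    }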

@petewall petewall self-assigned this Nov 15, 2024
@petewall
Collaborator

Would you be willing to check this with the latest (2.0.0-rc.3)? I just deployed it, am sending data to a Grafana Cloud stack, and I'm able to see data in the memory percent per cluster field. The cluster label is present.

@marcomusso
Author

Hmm, weird:

[screenshot]

I still don't see it:

[screenshot]

That metric should come from the alloy-metrics instance, right? I'll review the config to see whether something local is the reason it doesn't work in my case, but it's really strange...

@marcomusso
Author

What is the expected job name in your case? Maybe it's not integrations/unix?

@petewall
Collaborator

integrations/node_exporter

@petewall
Collaborator

Can you share your values.yaml file?

@marcomusso
Author

Sure:

---
cluster:
  # -- The name for this cluster.
  # @section -- Cluster
  name: "{{ requiredEnv "CLUSTER_NAME" }}"

#
# Global settings
#
global:
  # -- The specific platform for this cluster. Will enable compatibility for some platforms. Supported options: (empty) or "openshift".
  # @section -- Global Settings
  platform: ""

  # -- How frequently to scrape metrics.
  # @section -- Global Settings
  scrapeInterval: 60s

  # -- Sets the max_cache_size for every prometheus.relabel component. ([docs](https://grafana.com/docs/alloy/latest/reference/components/prometheus.relabel/#arguments))
  # This should be at least 2x-5x your largest scrape target or samples appended rate.
  # @section -- Global Settings
  maxCacheSize: 100000

#
# Destinations
#

# -- The list of destinations where telemetry data will be sent.
# See the [destinations documentation](https://github.com/grafana/k8s-monitoring-helm/blob/main/charts/k8s-monitoring/docs/destinations/README.md) for more information.
# @section -- Destinations
destinations:
  - name: GrafanaCloudMetrics
    type: prometheus
    url: "{{ requiredEnv "GRAFANA_CLOUD_MIMIR_HOST" }}/api/prom/push"
    auth:
      type: basic
      username: "{{ requiredEnv "GRAFANA_CLOUD_MIMIR_USER" }}"
      password: "{{ requiredEnv "GRAFANA_CLOUD_TOKEN" }}"
  - name: GrafanaCloudLogs
    type: loki
    url: "{{ requiredEnv "GRAFANA_CLOUD_LOKI_HOST" }}/loki/api/v1/push"
    auth:
      type: basic
      username: "{{ requiredEnv "GRAFANA_CLOUD_LOKI_USER" }}"
      password: "{{ requiredEnv "GRAFANA_CLOUD_TOKEN" }}"
      #tenantId: "{{ requiredEnv "GRAFANA_CLOUD_INSTANCE_ID" }}"
  - name: GrafanaCloudOTLP
    type: otlp
    protocol: http
    url: "{{ requiredEnv "GRAFANA_CLOUD_OTLP_HOST" }}/otlp"
    tenantId: "{{ requiredEnv "GRAFANA_CLOUD_INSTANCE_ID" }}"
    auth:
      #type: bearerToken # or none or basic
      #bearerToken: "{{ requiredEnv "GRAFANA_CLOUD_TOKEN" }}"
      type: basic
      username: "{{ requiredEnv "GRAFANA_CLOUD_INSTANCE_ID" }}"
      password: "{{ requiredEnv "GRAFANA_CLOUD_TOKEN" }}"
    metrics:
      enabled: false
    logs:
      enabled: false
    traces:
      enabled: true
  - name: GrafanaCloudProfiles
    type: pyroscope
    url: "{{ requiredEnv "GRAFANA_CLOUD_PYROSCOPE_HOST" }}/PYROSCOPE/api/v1/push"
    auth:
      type: basic
      username: "{{ requiredEnv "GRAFANA_CLOUD_PYROSCOPE_USER" }}"
      password: "{{ requiredEnv "GRAFANA_CLOUD_TOKEN" }}"
      #tenantId: "{{ requiredEnv "GRAFANA_CLOUD_INSTANCE_ID" }}"

#
# Features
#

# -- Cluster Monitoring enables observability and monitoring for your Kubernetes Cluster itself.
# Requires a destination that supports metrics.
# To see the valid options, please see the [Cluster Monitoring feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-cluster-metrics).
# @default -- Disabled
# @section -- Features - Cluster Metrics
clusterMetrics:
  # -- Enable gathering Kubernetes Cluster metrics.
  # @section -- Features - Cluster Metrics
  enabled: true

  # -- The destinations where cluster metrics will be sent. If empty, all metrics-capable destinations will be used.
  # @section -- Features - Cluster Metrics
  destinations: []

  # -- Which collector to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Cluster Metrics
  # @ignored
  collector: alloy-metrics

# -- Cluster events.
# Requires a destination that supports logs.
# To see the valid options, please see the [Cluster Events feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-cluster-events).
# @default -- Disabled
# @section -- Features - Cluster Events
clusterEvents:
  # -- Enable gathering Kubernetes Cluster events.
  # @section -- Features - Cluster Events
  enabled: true

  # -- The destinations where cluster events will be sent. If empty, all logs-capable destinations will be used.
  # @section -- Features - Cluster Events
  destinations: []

  # -- Which collector to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Cluster Events
  # @ignored
  collector: alloy-singleton

# -- Pod logs.
# Requires a destination that supports logs.
# To see the valid options, please see the [Pod Logs feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-pod-logs).
# @default -- Disabled
# @section -- Features - Pod Logs
podLogs:
  # -- Enable gathering Kubernetes Pod logs.
  # @section -- Features - Pod Logs
  enabled: true

  # -- The destinations where logs will be sent. If empty, all logs-capable destinations will be used.
  # @section -- Features - Pod Logs
  destinations: []

  collector: alloy-logs

# -- Application Observability.
# Requires destinations that supports metrics, logs, and traces.
# To see the valid options, please see the [Application Observability feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-application-observability).
# @default -- Disabled
# @section -- Features - Application Observability
applicationObservability:
  # -- Enable Application Observability.
  # @section -- Features - Application Observability
  enabled: true

  receivers:
    http:
      enabled: true

  # -- The destinations where application data will be sent. If empty, all capable destinations will be used.
  # @section -- Features - Application Observability
  destinations: []

  # -- Which collector to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Application Observability
  # @ignored
  collector: alloy-receiver

# -- Annotation Autodiscovery enables gathering metrics from Kubernetes Pods and Services discovered by special annotations.
# Requires a destination that supports metrics.
# To see the valid options, please see the [Annotation Autodiscovery feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-annotation-autodiscovery).
# @default -- Disabled
# @section -- Features - Annotation Autodiscovery
annotationAutodiscovery:
  # -- Enable gathering metrics from Kubernetes Pods and Services discovered by special annotations.
  # @section -- Features - Annotation Autodiscovery
  enabled: true

  # -- The destinations where cluster metrics will be sent. If empty, all metrics-capable destinations will be used.
  # @section -- Features - Annotation Autodiscovery
  destinations: []

  # -- Which collector to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Annotation Autodiscovery
  # @ignored
  collector: alloy-metrics

# -- Prometheus Operator Objects enables the gathering of metrics from objects like Probes, PodMonitors, and
# ServiceMonitors. Requires a destination that supports metrics.
# To see the valid options, please see the
# [Prometheus Operator Objects feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-prometheus-operator-objects).
# @default -- Disabled
# @section -- Features - Prometheus Operator Objects
prometheusOperatorObjects:
  # -- Enable gathering metrics from Prometheus Operator Objects.
  # @section -- Features - Prometheus Operator Objects
  enabled: true

  # -- The destinations where metrics will be sent. If empty, all metrics-capable destinations will be used.
  # @section -- Features - Prometheus Operator Objects
  destinations: []

  # -- Which collector to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Prometheus Operator Objects
  # @ignored
  collector: alloy-metrics

# -- Profiling enables gathering profiles from applications.
# Requires a destination that supports profiles.
# To see the valid options, please see the [Profiling feature documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-profiling).
# @default -- Disabled
# @section -- Features - Profiling
profiling:
  # -- Enable gathering profiles from applications.
  # @section -- Features - Profiling
  enabled: true

  # -- The destinations where profiles will be sent. If empty, all profiles-capable destinations will be used.
  # @section -- Features - Profiling
  destinations: []

  # -- Which collector to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Profiling
  # @ignored
  collector: alloy-profiles

# -- Service Integrations enables gathering telemetry data for common services and applications deployed to Kubernetes.
# To see the valid options, please see the [Service Integrations documentation](https://github.com/grafana/k8s-monitoring-helm/tree/main/charts/feature-integrations).
# @default -- No integrations enabled
# @section -- Features - Service Integrations
integrations:
  # -- Enable Service Integrations.
  # @section -- Features - Service Integrations
  enabled: true

  # -- The destinations where integration telemetry will be sent. If empty, all capable destinations will be used.
  # @section -- Features - Service Integrations
  destinations: []

  alloy:
    instances:
      - name: alloy-metrics
        labelSelectors:
          app.kubernetes.io/name: alloy-metrics
      - name: alloy-logs
        labelSelectors:
          app.kubernetes.io/name: alloy-logs
      - name: alloy-receivers
        labelSelectors:
          app.kubernetes.io/name: alloy-receiver
      - name: alloy-singleton
        labelSelectors:
          app.kubernetes.io/name: alloy-singleton

  # -- Which collectors to assign this feature to. Do not change this unless you are sure of what you are doing.
  # @section -- Features - Service Integrations
  # @ignored
  collector: alloy-metrics

# -- Self-reporting creates a single metric and log that reports anonymized information about how this Helm chart was
# configured. It reports features enabled, destinations types used, and alloy instances enabled. It does not report any
# actual telemetry data, credentials or configuration, or send any data to any destination other than the ones
# configured above.
# @section -- Features - Self-reporting
selfReporting:
  # -- Enable Self-reporting.
  # @section -- Features - Self-reporting
  enabled: true

  # -- How frequently to generate self-report metrics. This does utilize the global scrapeInterval setting.
  # @section -- Features - Self-reporting
  scrapeInterval: 5m

#
# Collectors (Alloy instances)
#

# An Alloy instance for collecting metrics.
alloy-metrics:
  # -- Deploy the Alloy instance for collecting metrics.
  # @section -- Collectors - Alloy Metrics
  enabled: true

  # -- Extra Alloy configuration to be added to the configuration file.
  # @section -- Collectors - Alloy Metrics
  extraConfig: |-
    prometheus.exporter.self "integrations_alloy_health" { }

    discovery.relabel "integrations_alloy_health" {
      targets = prometheus.exporter.self.integrations_alloy_health.targets

      rule {
        replacement = "alloy-metrics"
        target_label  = "instance"
      }

      rule {
        target_label = "job"
        replacement  = "integrations/alloy"
      }
    }

    prometheus.scrape "integrations_alloy_health" {
      targets    = discovery.relabel.integrations_alloy_health.output
      forward_to = [prometheus.remote_write.grafanacloudmetrics.receiver]
      job_name   = "integrations/alloy"
    }

  # Remote configuration from a remote config server.
  remoteConfig:
    # -- Enable fetching configuration from a remote config server.
    # @section -- Collectors - Alloy Metrics
    enabled: false

    # -- The URL of the remote config server.
    # @section -- Collectors - Alloy Metrics
    url: ""

    auth:
      # -- The type of authentication to use for the remote config server.
      # @section -- Collectors - Alloy Metrics
      type: "none"

      # -- The username to use for the remote config server.
      # @section -- Collectors - Alloy Metrics
      username: ""
      # -- The key for storing the username in the secret.
      # @section -- Collectors - Alloy Metrics
      usernameKey: "username"
      # -- Raw config for accessing the username.
      # @section -- Collectors - Alloy Metrics
      usernameFrom: ""

      # -- The password to use for the remote config server.
      # @section -- Collectors - Alloy Metrics
      password: ""
      # -- The key for storing the password in the secret.
      # @section -- Collectors - Alloy Metrics
      passwordKey: "password"
      # -- Raw config for accessing the password.
      # @section -- Collectors - Alloy Metrics
      passwordFrom: ""

    secret:
      # -- Whether to create a secret for the remote config server.
      # @section -- Collectors - Alloy Metrics
      create: true
      # -- If true, skip secret creation and embed the credentials directly into the configuration.
      # @section -- Collectors - Alloy Metrics
      embed: false
      # -- The name of the secret to create.
      # @section -- Collectors - Alloy Metrics
      name: ""
      # -- The namespace for the secret.
      # @section -- Collectors - Alloy Metrics
      namespace: ""

    # -- (string) The unique identifier for this Alloy instance.
    # @default -- `<cluster>-<namespace>-<pod-name>`
    # @section -- Collectors - Alloy Metrics
    id: ""

    # -- The frequency at which to poll the remote config server for updates.
    # @section -- Collectors - Alloy Metrics
    pollFrequency: 5m

    # -- Attributes to be added to this collector when requesting configuration.
    # @section -- Collectors - Alloy Metrics
    extraAttributes: {}

  logging:
    # -- Level at which Alloy log lines should be written.
    # @section -- Collectors - Alloy Metrics
    level: info
    # -- Format to use for writing Alloy log lines.
    # @section -- Collectors - Alloy Metrics
    format: logfmt

  liveDebugging:
    # -- Enable live debugging for the Alloy instance.
    # Requires stability level to be set to "experimental".
    # @section -- Collectors - Alloy Metrics
    enabled: false

  # @ignored
  alloy:
    configMap: {create: false}

    # Enable clustering to ensure that scraping is distributed across all instances.
    # @ignored
    clustering:
      name: alloy-metrics
      enabled: true

    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
        add: ["CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "NET_RAW", "SYS_CHROOT", "MKNOD", "AUDIT_WRITE", "SETFCAP"]
      seccompProfile:
        type: "RuntimeDefault"

  controller:
    # -- The type of controller to use for the Alloy Metrics instance.
    # @section -- Collectors - Alloy Metrics
    type: statefulset

    # -- The number of replicas for the Alloy Metrics instance.
    # @section -- Collectors - Alloy Metrics
    replicas: 1

    # @ignored
    nodeSelector:
      kubernetes.io/os: linux

    # @ignored
    podAnnotations:
      k8s.grafana.com/logs.job: integrations/alloy

  # Skip installation of the Grafana Alloy CRDs, since we don't use them in this chart
  # @ignored
  crds: {create: false}

# An Alloy instance for data sources required to be deployed on a single replica.
alloy-singleton:
  # -- Deploy the Alloy instance for data sources required to be deployed on a single replica.
  # @section -- Collectors - Alloy Singleton
  enabled: true

  # -- Extra Alloy configuration to be added to the configuration file.
  # @section -- Collectors - Alloy Singleton
  extraConfig: |-
    prometheus.exporter.self "integrations_alloy_health" { }

    discovery.relabel "integrations_alloy_health" {
      targets = prometheus.exporter.self.integrations_alloy_health.targets

      rule {
        replacement = "alloy-singleton"
        target_label  = "instance"
      }

      rule {
        target_label = "job"
        replacement  = "integrations/alloy"
      }
    }

  # Remote configuration from a remote config server.
  remoteConfig:
    # -- Enable fetching configuration from a remote config server.
    # @section -- Collectors - Alloy Singleton
    enabled: false

    # -- The URL of the remote config server.
    # @section -- Collectors - Alloy Singleton
    url: ""

    auth:
      # -- The type of authentication to use for the remote config server.
      # @section -- Collectors - Alloy Singleton
      type: "none"

      # -- The username to use for the remote config server.
      # @section -- Collectors - Alloy Singleton
      username: ""
      # -- The key for storing the username in the secret.
      # @section -- Collectors - Alloy Singleton
      usernameKey: "username"
      # -- Raw config for accessing the username.
      # @section -- Collectors - Alloy Singleton
      usernameFrom: ""

      # -- The password to use for the remote config server.
      # @section -- Collectors - Alloy Singleton
      password: ""
      # -- The key for storing the password in the secret.
      # @section -- Collectors - Alloy Singleton
      passwordKey: "password"
      # -- Raw config for accessing the password.
      # @section -- Collectors - Alloy Singleton
      passwordFrom: ""

    secret:
      # -- Whether to create a secret for the remote config server.
      # @section -- Collectors - Alloy Singleton
      create: true
      # -- If true, skip secret creation and embed the credentials directly into the configuration.
      # @section -- Collectors - Alloy Singleton
      embed: false
      # -- The name of the secret to create.
      # @section -- Collectors - Alloy Singleton
      name: ""
      # -- The namespace for the secret.
      # @section -- Collectors - Alloy Singleton
      namespace: ""

  logging:
    # -- Level at which Alloy log lines should be written.
    # @section -- Collectors - Alloy Singleton
    level: info
    # -- Format to use for writing Alloy log lines.
    # @section -- Collectors - Alloy Singleton
    format: logfmt

  liveDebugging:
    # -- Enable live debugging for the Alloy instance.
    # Requires stability level to be set to "experimental".
    # @section -- Collectors - Alloy Singleton
    enabled: false

  # @ignored
  alloy:
    # This chart is creating the configuration, so the alloy chart does not need to.
    configMap: {create: false}

    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
        add: ["CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "NET_RAW", "SYS_CHROOT", "MKNOD", "AUDIT_WRITE", "SETFCAP"]
      seccompProfile:
        type: "RuntimeDefault"

  controller:
    # -- The type of controller to use for the Alloy Singleton instance.
    # @section -- Collectors - Alloy Singleton
    type: deployment
    # -- The number of replicas for the Alloy Singleton instance.
    # This should remain a single instance to avoid duplicate data.
    # @section -- Collectors - Alloy Singleton
    replicas: 1

    # @ignored
    nodeSelector:
      kubernetes.io/os: linux

    # @ignored
    podAnnotations:
      k8s.grafana.com/logs.job: integrations/alloy

  # Skip installation of the Grafana Alloy CRDs, since we don't use them in this chart
  # @ignored
  crds: {create: false}

# An Alloy instance for collecting log data.
alloy-logs:
  # -- Deploy the Alloy instance for collecting log data.
  # @section -- Collectors - Alloy Logs
  enabled: true

  # -- Extra Alloy configuration to be added to the configuration file.
  # @section -- Collectors - Alloy Logs
  extraConfig: |-
    prometheus.exporter.self "integrations_alloy_health" { }

    discovery.relabel "integrations_alloy_health" {
      targets = prometheus.exporter.self.integrations_alloy_health.targets

      rule {
        replacement = "alloy-logs"
        target_label  = "instance"
      }

      rule {
        target_label = "job"
        replacement  = "integrations/alloy"
      }
    }

  # Remote configuration from a remote config server.
  remoteConfig:
    # -- Enable fetching configuration from a remote config server.
    # @section -- Collectors - Alloy Logs
    enabled: false

    # -- The URL of the remote config server.
    # @section -- Collectors - Alloy Logs
    url: ""

    auth:
      # -- The type of authentication to use for the remote config server.
      # @section -- Collectors - Alloy Logs
      type: "none"

      # -- The username to use for the remote config server.
      # @section -- Collectors - Alloy Logs
      username: ""
      # -- The key for storing the username in the secret.
      # @section -- Collectors - Alloy Logs
      usernameKey: "username"
      # -- Raw config for accessing the username.
      # @section -- Collectors - Alloy Logs
      usernameFrom: ""

      # -- The password to use for the remote config server.
      # @section -- Collectors - Alloy Logs
      password: ""
      # -- The key for storing the password in the secret.
      # @section -- Collectors - Alloy Logs
      passwordKey: "password"
      # -- Raw config for accessing the password.
      # @section -- Collectors - Alloy Logs
      passwordFrom: ""

    secret:
      # -- Whether to create a secret for the remote config server.
      # @section -- Collectors - Alloy Logs
      create: true
      # -- If true, skip secret creation and embed the credentials directly into the configuration.
      # @section -- Collectors - Alloy Logs
      embed: false
      # -- The name of the secret to create.
      # @section -- Collectors - Alloy Logs
      name: ""
      # -- The namespace for the secret.
      # @section -- Collectors - Alloy Logs
      namespace: ""

  logging:
    # -- Level at which Alloy log lines should be written.
    # @section -- Collectors - Alloy Logs
    level: info
    # -- Format to use for writing Alloy log lines.
    # @section -- Collectors - Alloy Logs
    format: logfmt

  liveDebugging:
    # -- Enable live debugging for the Alloy instance.
    # Requires stability level to be set to "experimental".
    # @section -- Collectors - Alloy Logs
    enabled: false

  # @ignored
  alloy:
    # This chart is creating the configuration, so the alloy chart does not need to.
    configMap: {create: false}

    # Disabling clustering by default, because the default log gathering format does not require clusters.
    clustering: {enabled: false}

    # @ignored
    mounts:
      # Mount /var/log from the host into the container for log collection.
      varlog: true
      # Mount /var/lib/docker/containers from the host into the container for log
      # collection. Set to true if your cluster puts log files inside this directory.
      dockercontainers: true

    # @ignored
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
        add: ["CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "NET_RAW", "SYS_CHROOT", "MKNOD", "AUDIT_WRITE", "SETFCAP"]
      seccompProfile:
        type: "RuntimeDefault"

  controller:
    # -- The type of controller to use for the Alloy Logs instance.
    # @section -- Collectors - Alloy Logs
    type: daemonset

    # @ignored
    nodeSelector:
      kubernetes.io/os: linux

# An Alloy instance for opening receivers to collect application data.
alloy-receiver:
  # -- Deploy the Alloy instance for opening receivers to collect application data.
  # @section -- Collectors - Alloy Receiver
  enabled: true

  # -- Extra Alloy configuration to be added to the configuration file.
  # @section -- Collectors - Alloy Receiver
  extraConfig: |-
    prometheus.exporter.self "integrations_alloy_health" { }

    discovery.relabel "integrations_alloy_health" {
      targets = prometheus.exporter.self.integrations_alloy_health.targets

      rule {
        replacement = "alloy-receivers"
        target_label  = "instance"
      }

      rule {
        target_label = "job"
        replacement  = "integrations/alloy"
      }
    }
  # Remote configuration from a remote config server.
  remoteConfig:
    # -- Enable fetching configuration from a remote config server.
    # @section -- Collectors - Alloy Receiver
    enabled: false

    # -- The URL of the remote config server.
    # @section -- Collectors - Alloy Receiver
    url: ""

    auth:
      # -- The type of authentication to use for the remote config server.
      # @section -- Collectors - Alloy Receiver
      type: "none"

      # -- The username to use for the remote config server.
      # @section -- Collectors - Alloy Receiver
      username: ""
      # -- The key for storing the username in the secret.
      # @section -- Collectors - Alloy Receiver
      usernameKey: "username"
      # -- Raw config for accessing the username.
      # @section -- Collectors - Alloy Receiver
      usernameFrom: ""

      # -- The password to use for the remote config server.
      # @section -- Collectors - Alloy Receiver
      password: ""
      # -- The key for storing the password in the secret.
      # @section -- Collectors - Alloy Receiver
      passwordKey: "password"
      # -- Raw config for accessing the password.
      # @section -- Collectors - Alloy Receiver
      passwordFrom: ""

    secret:
      # -- Whether to create a secret for the remote config server.
      # @section -- Collectors - Alloy Receiver
      create: true
      # -- If true, skip secret creation and embed the credentials directly into the configuration.
      # @section -- Collectors - Alloy Receiver
      embed: false
      # -- The name of the secret to create.
      # @section -- Collectors - Alloy Receiver
      name: ""
      # -- The namespace for the secret.
      # @section -- Collectors - Alloy Receiver
      namespace: ""

  logging:
    # -- Level at which Alloy log lines should be written.
    # @section -- Collectors - Alloy Receiver
    level: info
    # -- Format to use for writing Alloy log lines.
    # @section -- Collectors - Alloy Receiver
    format: logfmt

  liveDebugging:
    # -- Enable live debugging for the Alloy instance.
    # Requires stability level to be set to "experimental".
    # @section -- Collectors - Alloy Receiver
    enabled: true

  alloy:
    stabilityLevel: experimental
    # -- The ports to expose for the Alloy receiver.
    # @section -- Collectors - Alloy Receiver
    extraPorts:
      - name: otlp-http
        port: 4318
        targetPort: 4318
        protocol: TCP

    # This chart is creating the configuration, so the alloy chart does not need to.
    # @ignored
    configMap: {create: false}

    # @ignored
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop: ["ALL"]
        add: ["CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL", "SETGID", "SETUID", "SETPCAP", "NET_BIND_SERVICE", "NET_RAW", "SYS_CHROOT", "MKNOD", "AUDIT_WRITE", "SETFCAP"]
      seccompProfile:
        type: "RuntimeDefault"

  controller:
    # -- The type of controller to use for the Alloy Receiver instance.
    # @section -- Collectors - Alloy Receiver
    type: daemonset

    # @ignored
    nodeSelector:
      kubernetes.io/os: linux

# An Alloy instance for gathering profiles.
alloy-profiles:
  # -- Deploy the Alloy instance for gathering profiles.
  # @section -- Collectors - Alloy Profiles
  enabled: true

  # -- Extra Alloy configuration to be added to the configuration file.
  # @section -- Collectors - Alloy Profiles
  extraConfig: |-
    prometheus.exporter.self "integrations_alloy_health" { }

    discovery.relabel "integrations_alloy_health" {
      targets = prometheus.exporter.self.integrations_alloy_health.targets

      rule {
        replacement = "alloy-profiles"
        target_label  = "instance"
      }

      rule {
        target_label = "job"
        replacement  = "integrations/alloy"
      }
    }

  # Remote configuration from a remote config server.
  remoteConfig:
    # -- Enable fetching configuration from a remote config server.
    # @section -- Collectors - Alloy Profiles
    enabled: false

    # -- The URL of the remote config server.
    # @section -- Collectors - Alloy Profiles
    url: ""

    auth:
      # -- The type of authentication to use for the remote config server.
      # @section -- Collectors - Alloy Profiles
      type: "none"

      # -- The username to use for the remote config server.
      # @section -- Collectors - Alloy Profiles
      username: ""
      # -- The key for storing the username in the secret.
      # @section -- Collectors - Alloy Profiles
      usernameKey: "username"
      # -- Raw config for accessing the username.
      # @section -- Collectors - Alloy Profiles
      usernameFrom: ""

      # -- The password to use for the remote config server.
      # @section -- Collectors - Alloy Profiles
      password: ""
      # -- The key for storing the password in the secret.
      # @section -- Collectors - Alloy Profiles
      passwordKey: "password"
      # -- Raw config for accessing the password.
      # @section -- Collectors - Alloy Profiles
      passwordFrom: ""

    secret:
      # -- Whether to create a secret for the remote config server.
      # @section -- Collectors - Alloy Profiles
      create: true
      # -- If true, skip secret creation and embed the credentials directly into the configuration.
      # @section -- Collectors - Alloy Profiles
      embed: false
      # -- The name of the secret to create.
      # @section -- Collectors - Alloy Profiles
      name: ""
      # -- The namespace for the secret.
      # @section -- Collectors - Alloy Profiles
      namespace: ""

  logging:
    # -- Level at which Alloy log lines should be written.
    # @section -- Collectors - Alloy Profiles
    level: info
    # -- Format to use for writing Alloy log lines.
    # @section -- Collectors - Alloy Profiles
    format: logfmt

  liveDebugging:
    # -- Enable live debugging for the Alloy instance.
    # Requires stability level to be set to "experimental".
    # @section -- Collectors - Alloy Profiles
    enabled: false

  # @ignored
  alloy:
    # Pyroscope components are currently in public preview
    stabilityLevel: public-preview

    # This chart is creating the configuration, so the alloy chart does not need to.
    configMap: {create: false}

    # Disabling clustering because each instance will gather profiles for the workloads on the same node.
    clustering:
      name: alloy-profiles
      enabled: false

    securityContext:
      privileged: true
      runAsGroup: 0
      runAsUser: 0

  controller:
    # -- The type of controller to use for the Alloy Profiles instance.
    # @section -- Collectors - Alloy Profiles
    type: daemonset

    # @ignored
    hostPID: true

    # @ignored
    nodeSelector:
      kubernetes.io/os: linux

    # @ignored
    tolerations:
      - effect: NoSchedule
        operator: Exists

  # Skip installation of the Grafana Alloy CRDs, since we don't use them in this chart
  # @ignored
  crds: {create: false}

# -- Deploy additional manifest objects
extraObjects: []

@marcomusso
Author

OK, I can confirm that that job is not part of this Helm chart, but at this point I don't get the metric at all (i.e. that scrape job was coming from another Alloy). Still no metrics with job=integrations/node_exporter. But it's a small step forward at least.

@marcomusso
Author

marcomusso commented Nov 15, 2024

Silly doubt: is the regex "" correct in a relabelling config? Its default should be (.*), meaning match everything, while an empty string might not trigger the replace action (though it should probably work anyway). Of course that doesn't explain why it works for you... Another option is that the source label is not there in the first place: sure, we want to rewrite the value of the cluster label, but what if that label doesn't exist to begin with (I don't remember whether it gets created or the action silently fails)?

@marcomusso
Author

OK, right now I can only confirm that I don't see any node_exporter metrics coming out of that Alloy instance. I'll continue the investigation, but it feels like I'm getting closer to the solution.

@marcomusso
Author

FYI: my node exporters have the label app.kubernetes.io/name: node-exporter and not app.kubernetes.io/name: prometheus-node-exporter. Maybe that's the reason why they don't get discovered?

@marcomusso
Author

Small update:

  • At first I thought the node_exporter pods weren't being scraped because I was put off by the default app.kubernetes.io/name: prometheus-node-exporter, while my node-exporter pods carry the label app.kubernetes.io/name: node-exporter. I guess it's a default that gets overridden by the actual label used by this chart.
  • Second, I don't see anything referring to node-exporter scraping in the graph of the metrics instance. Is that correct? I was expecting the graph to include the content of the included configMap alloy-module-system, which is the one actually defining the scraping of the node_exporter pods.
  • Third, I see node metrics coming in with the correct job (integrations/node_exporter), but not node_memory_MemAvailable_bytes, just total and swap. I remember some old kernel versions didn't report that metric, so node_exporter couldn't export it either. I wonder if that's still true, but that could be specific to my setup, so at least we can rule out a bug in this chart... I'll investigate next week to find proof.

@marcomusso
Author

Another update: does the cluster label already exist when the metric hits the remote_write relabel config? Because a relabel rule needs the source label to exist, so I got it working by writing this in a custom Alloy config:

    write_relabel_config {
      source_labels = ["__address__"]
      regex = ""
      replacement = sys.env("CLUSTER_NAME")
      target_label = "cluster"
    }

which always rewrites the value of the cluster label.

Also, the docs suggest write_relabel_config shouldn't be inside the endpoint block, but in fact it has to be.
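
For context, this is roughly where the block ended up living for me (the URL is a placeholder; the rest mirrors the rule above):

    prometheus.remote_write "grafanacloudmetrics" {
      endpoint {
        url = "https://<MIMIR_HOST>/api/prom/push"

        // Nested inside endpoint, not alongside it.
        write_relabel_config {
          source_labels = ["__address__"]
          regex         = ""
          replacement   = sys.env("CLUSTER_NAME")
          target_label  = "cluster"
        }
      }
    }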

@marcomusso
Author

marcomusso commented Nov 18, 2024

Related? From what I can see, unless relabelled, the prometheus.exporter.unix component is still not compatible. grafana/agent#3180

@petewall
Collaborator

    write_relabel_config {
      source_labels = ["cluster"]
      regex = ""
      replacement = {{ $.Values.cluster.name | quote }}
      target_label = "cluster"
    }

This looks at the cluster label and, if it's not set (which is what regex = "" matches), sets it. I've tested with a number of tools and it appears to work.

@petewall
Collaborator

Related? From what I can see, unless relabelled, the prometheus.exporter.unix component is still not compatible. grafana/agent#3180

We do use the prometheus.exporter.unix component, but not to mimic Node Exporter; it provides the self-report. The main method for getting node metrics is to deploy OSS Node Exporter from its own Helm chart and scrape it like any other metric source.
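
Roughly, the idea looks like this (just a sketch with an assumed label selector and destination name, not the chart's literal generated config):

    // Sketch only: discover the node-exporter pods via an assumed label selector
    // and scrape them like any other target.
    discovery.kubernetes "node_exporter" {
      role = "pod"
    }

    discovery.relabel "node_exporter" {
      targets = discovery.kubernetes.node_exporter.targets

      // Keep only pods labeled app.kubernetes.io/name=prometheus-node-exporter.
      rule {
        source_labels = ["__meta_kubernetes_pod_label_app_kubernetes_io_name"]
        regex         = "prometheus-node-exporter"
        action        = "keep"
      }

      // The dashboards expect job="integrations/node_exporter".
      rule {
        target_label = "job"
        replacement  = "integrations/node_exporter"
      }
    }

    prometheus.scrape "node_exporter" {
      targets    = discovery.relabel.node_exporter.output
      forward_to = [prometheus.remote_write.grafanacloudmetrics.receiver]
    }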

@marcomusso
Author

I am using the latest node_exporter and a recent kernel, but I don't get why the pod is not actually exporting node_memory_MemAvailable_bytes:

/ $ wget -qO- localhost:9100/metrics | grep memory_
# HELP node_memory_MemTotal_bytes Memory information field MemTotal_bytes.
# TYPE node_memory_MemTotal_bytes gauge
node_memory_MemTotal_bytes 2.199022592e+09
# HELP node_memory_SwapTotal_bytes Memory information field SwapTotal_bytes.
# TYPE node_memory_SwapTotal_bytes gauge
node_memory_SwapTotal_bytes 0
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.1839872e+07
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.27059968e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19

So it seems like a local problem and not a problem with the chart. You can close this issue if you want; otherwise I'll report the reason here as soon as I discover it. Right now it really seems related to the local test setup (Rancher Desktop + k3d).
