Skip to content

Latest commit

 

History

History
276 lines (211 loc) · 15.5 KB

openshift-monitoring.adoc

File metadata and controls

276 lines (211 loc) · 15.5 KB

Application Monitoring on Red Hat OpenShift Container Platform (RHOCP) with Prometheus and Grafana

The following guide has been tested and is supported with Red Hat OpenShift Container Platform (RHOCP) versions 4.12, 4.14, 4.15, 4.16, 4.17.

Prerequisites

Prior to installing application monitoring on RHOCP, ensure that there is a running application that has a service endpoint for outputting metrics in Prometheus format.

This guide assumes a running application has been deployed to the RHOCP cluster inside a project/namespace called myapp, and that its Prometheus metrics endpoint is exposed on path /metrics.

Application Monitoring on RHOCP 4.6+

In RHOCP 4.6+, application monitoring can be set up by enabling monitoring for user-defined projects without the need to install an additional monitoring solution. This solution will deploy a second Prometheus Operator instance inside the openshift-user-workload-monitoring namespace that is configured to monitor all namespaces excluding the openshift- prefixed namespaces already monitored by the cluster’s default platform monitoring.

There is no separate Grafana deployment that comes with enabling user-defined monitoring and the existing Grafana instance provided with the default monitoring stack is read-only, so a community-powered Grafana Operator must be installed later to create custom dashboards for viewing your metrics.

  1. To configure the monitoring stack, follow this OpenShift guide Configuring the monitoring stack. Here, you will create two config maps, cluster-monitoring-config and user-workload-monitoring-config, that will allow you to enable monitoring for user-defined projects in the next step. After creating the user-workload-monitoring-config config map, return to this guide to continue setting up monitoring for user-defined projects.

  2. To enable monitoring for user-defined projects, follow the OpenShift guide Enabling monitoring for user-defined projects.

  3. Deploy a Service Monitor inside the myapp namespace to define the service endpoint that will be monitored by the Prometheus instance. Add the following YAML to your OpenLibertyApplication Custom Resource (CR):

    kind: OpenLibertyApplication
    spec:
      
      monitoring:
        endpoints:
          - interval: 30s
  4. Under Observe > Metrics (formerly Monitoring > Metrics), confirm that your service is being monitored by inserting your own queries specific to your app’s metrics.

  5. To deploy an additional Grafana instance for creating custom dashboards, follow this RedHat blog Custom Grafana dashboards for Red Hat OpenShift Container Platform 4. This Grafana instance will be connected to OpenShift monitoring in the openshift-monitoring namespace and additional Grafana dashboards will increase the load on the Prometheus service, so pay attention to Prometheus' CPU and memory usage when deploying dashboards. Example Grafana instance YAML files can be found here. Note: The Grafana Operator’s apiVersion and Custom Resources have been renamed since the creation of the RedHat blog above. When you are deploying the Grafana Datasource in the blog, use the following YAML instead.

    kind: GrafanaDatasource
    apiVersion: grafana.integreatly.org/v1beta1
    metadata:
      name: prometheus-grafanadatasource
      namespace: my-grafana
    spec:
      datasource:
        access: proxy
        editable: true
        isDefault: true
        jsonData:
          httpHeaderName1: 'Authorization'
          timeInterval: 5s
          tlsSkipVerify: true
        name: Prometheus
        secureJsonData:
          httpHeaderValue1: 'Bearer ${BEARER_TOKEN}'
        type: prometheus
        url: 'https://thanos-querier.openshift-monitoring.svc.cluster.local:9091'
    instanceSelector:
        matchLabels:
          dashboards: grafana-a
      plugins:
        - name: grafana-clock-panel
          version: 1.3.0
  6. Deploy the following template Grafana Dashboard grafana_dashboard.yaml.txt to see all the services currently being monitored. Please ensure that the namespace in this yaml is configured to the namespace already deployed.

  7. Under Networking > Routes, visit Grafana to see the template dashboard. You can now consume all the application metrics gathered by Prometheus on the Grafana dashboard.

Application Monitoring on RHOCP 4.2-4.5

For application monitoring on RHOCP 4.2-4.5, you need to set up your own Prometheus and Grafana deployments. Both Prometheus and Grafana can be set up via Operator Lifecycle Manager (OLM).

Deploy Prometheus - Prometheus Operator

The Prometheus Operator is an open-source project originating from CoreOS and exists as a part of their Kubernetes Operator framework. The Kubernetes Operator framework is the preferred way to deploy Prometheus on a Kubernetes system. When the Prometheus Operator is installed on the Kubernetes system, you no longer need to hand-configure the Prometheus configuration. Instead, you create CoreOS ServiceMonitor resources for each of the service endpoints that needs to be monitored: this makes daily maintenance of the Prometheus server a lot easier. An architecture overview of the Prometheus Operator is shown in the following diagram:

Prometheus Operator

Using Operator Lifecycle Manager (OLM), Prometheus operator can be easily installed and configured in RHOCP Web Console.

Use these files while working through the guide:

Install Prometheus Operator Using Operator Lifecycle Manager (OLM)

The following procedure is based on Using the Operator Lifecycle Manager to deploy Prometheus on OpenShift, with the added inclusion of OpenShift commands needed to complete each step.

  1. Create a new namespace for our Prometheus Operator deployment

    oc new-project prometheus-operator
  2. Go to OpenShift Container Platform web console and Click on Operators > OperatorHub. Using the OLM, Operators can be easily pulled, installed and subscribed on the cluster. Ensure that the Project is set to prometheus-operator. Search for Prometheus Operator and install it. Choose prometheus-operator under A specific namespace on the cluster and subscribe.

  3. Click on Overview and create a service monitor instance. A ServiceMonitor defines a service endpoint that needs to be monitored by the Prometheus instance.

  4. Inside the Service Monitor YAML file, make sure metadata.namespace is your monitoring namespace. In this case, it will be prometheus-operator. spec.namespaceSelector and spec.selector for labels should be configured to match your app deployment’s namespace and label. For example, inside the service_monitor.yaml file, an application with label app: example-app from namespace myapp will be monitored by the service monitor. If the metrics endpoint is secured, you can define a secured endpoint with authentication configuration by following the endpoint API documentation of Prometheus Operator.

  5. Create a Service Account with Cluster role and Cluster role binding to ensure you have the permission to get nodes and pods in other namespaces at the cluster scope. Refer to the service_account.yaml file. Create the YAML file and apply it.

    oc apply -f service_account.yaml
  6. Click Overview and create a Prometheus instance. A Prometheus resource can scrape the targets defined in the ServiceMonitor resource.

  7. Inside the Prometheus YAML file, make sure metadata.namespace is prometheus-operator. Ensure spec.serviceAccountName is the Service Account’s name that you have applied in the previous step. You can set the match expression to select which Service Monitors you are interested in under spec.serviceMonitorSelector.matchExpressions as in the prometheus.yaml file.

  8. Verify that the Prometheus services have successfully started.

    [root@rhel7-ocp]# oc get svc -n prometheus-operator
    NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
    prometheus-operated   ClusterIP   None             <none>        9090/TCP         19h
  9. Check the server logs from one of the target pods to see if the services are running properly.

    [root@rhel7-ocp]# oc get pods -n prometheus-operator
    NAME                                   READY     STATUS    RESTARTS   AGE
    prometheus-operator-7fccbd7c74-48m6v   1/1       Running   0          19h
    prometheus-prometheus-0                3/3       Running   1          19h
    prometheus-prometheus-1                3/3       Running   1          19h
    [root@rhel7-ocp]# oc logs prometheus-prometheus-0 -c prometheus -n prometheus-operator
  10. Expose the prometheus-operated service to use the Prometheus console externally.

    [root@rhel7-ocp]# oc expose svc/prometheus-operated -n prometheus-operator
    route.route.openshift.io/prometheus-operated exposed
    [root@rhel7-ocp]# oc get route -n prometheus-operator
    NAME         HOST/PORT                                                 PATH      SERVICES     PORT      TERMINATION   WILDCARD
    prometheus   prometheus-prometheus-operator.apps.9.37.135.153.nip.io             prometheus   web                     None
  11. Visit the Prometheus route and go to the Prometheus targets page. Check to see that the Prometheus targets page is picking up the target endpoints.

Prometheus Target Page

Deploy Grafana

Use these files while working with Grafana:

Use Grafana dashboards to visualize the metrics. Perform the following steps to deploy Grafana and ensure that Prometheus endpoints are reachable as a data source in Grafana.

  1. Choose the same namespace as Prometheus Operator deployment.

    oc project prometheus-operator
  2. Go to OpenShift Container Platform web console and click Operators > OperatorHub. Search for Grafana Operator and install it. For A specific namespace on the cluster, choose prometheus-operator, and subscribe.

  3. Click Overview and create a Grafana Data Source instance.

  4. In the Grafana Data Source YAML file, make sure metadata.namespace is prometheus-operator. Set spec.datasources.url to the URL of the target datasource. For example, inside the grafana_datasource.yaml file, the Prometheus service is prometheus-operated on port 9090, so the URL is set to http://prometheus-operated:9090.

  5. Click Overview and create a Grafana instance.

  6. In the Grafana YAML file, make sure metadata.namespace is prometheus-operator. You can define the match expression to select which Dashboards you are interested in under spec.dashboardLabelSelector.matchExpressions. For example, inside the grafana.yaml file, the Grafana will discover dashboards with app labels having a value of grafana.

  7. Click Overview and create a Grafana Dashboard instance.

  8. Copy the grafana_dashboard.yaml file to Grafana Dashboard YAML file to check the Data Source is connected and Prometheus endpoints are discoverable.

  9. Click Networking > Routes and go to Grafana’s location to see the template dashboard. You can now consume all the application metrics gathered by Prometheus on the Grafana dashboard.

    Template Dashboard
  10. When importing your own Grafana dashboard, your dashboard should be configured under spec.json in Grafana Dashboard YAML file. Make sure under "__inputs", the name matches with your Grafana Data Source’s spec.datasources. For example, inside the grafana_dashboard.yaml file, name is set to "Prometheus".

Configure Prometheus Operator to Detect Service Monitors in Other Namespaces

By default, the Prometheus Operator only watches the namespace it currently resides in, so in order to get the Prometheus Operator to detect service monitors created in other namespaces, you must apply the following configuration changes.

  1. In your monitoring namespace - in this case, the monitoring namespace is prometheus-operator - edit the OperatorGroup to add your application’s namespace, for example, myapp, to the list of targeted namesaces to be watched. This will change the olm.targetNamespaces variable that the Prometheus Operator uses for detecting namespaces to include your myapp namespace.

    oc edit operatorgroup
    spec:
        targetNamespaces:
        - prometheus-operator
        - myapp
  2. Since we have changed the prometheus-operator namespace’s OperatorGroup to monitor more than one namespace, the operators in this namespace must have the MultiNamespace installMode set to true. Prometheus Operator installed via OLM has the MultiNamespace installMode set to false by default, disabling monitoring for more

    oc edit csv prometheusoperator.0.32.0
    spec:
    installModes:
    - supported: true
        type: OwnNamespace
    - supported: true
        type: SingleNamespace
    - supported: true       # this line should be true
        type: MultiNamespace
    - supported: false
        type: AllNamespaces
  3. The same goes for the Grafana Operator, the MultiNamespace installMode should be set to true. Before editing the Grafana Operator, make sure you have zero Grafana Operator pods running. If you have any Grafana Operator pods running, the Grafana Operator will get stuck in the installation phase when it attempts to reinstall itself with the new MultiNamespace installMode. Scale the number of Grafana Operator pods to zero by editing the replicas definition in your grafana-operator deployment:

    oc edit deployment grafana-operator
    spec:
        ...
        replicas: 0     # set this value to 0

    Once you have zero Grafana Operator pods running, edit the operator using:

    oc edit csv grafana-operator.v2.0.0

    After editing the operator to have the MultiNamespace installMode set to true, check to make sure the Grafana Operator has a status of Succeeded under the Installed Operators page before scaling up the number of Grafana Operator pods again.

  4. Edit the Prometheus instance to add the serviceMonitorNamespaceSelector definition. The empty brackets {} allow Prometheus to scrape from all namespaces:

    oc edit prometheuses.monitoring.coreos.com prometheus
    spec:
    serviceMonitorNamespaceSelector: {}
  5. Restart the Prometheus Operator and Grafana Operator pods to see the changes.

Installation Complete

You now have the Prometheus and Grafana stack installed and configured to monitor your applications. Import custom dashboards and visit the Grafana route to see your metrics visualized.