OpenShift ships with a pre-installed stack for monitoring platform components out of the box. The stack is based on the open source Prometheus project and includes multiple components, such as Grafana for analyzing and visualizing metrics. Developers and other users often need to monitor services and pods in their own projects and namespaces. These users may be interested in application-level metrics exposed by their services, or in platform metrics such as CPU usage, memory usage, request rate and timeouts within specific namespaces. These requirements can be met once a cluster administrator enables monitoring for user-defined projects.
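As a brief illustration, a cluster administrator typically enables monitoring for user-defined projects by setting `enableUserWorkload` in the `cluster-monitoring-config` ConfigMap in the `openshift-monitoring` namespace. The sketch below assumes that no such ConfigMap exists yet; if one already exists, the setting would be added to its existing `config.yaml` instead.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    # Turns on the monitoring components for user-defined projects
    enableUserWorkload: true
```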
We can also create alerting rules, trigger alerts when a metric reaches a set threshold, and manage alerts within the Alerting UI in OpenShift. For example, we might want to trigger an alert when a specified number of consecutive requests to a service time out within a given time period. This can be achieved with alerting rules. An alerting rule is defined in a YAML file and contains a PromQL expression that describes the alert condition.
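To make this concrete, the following is a minimal sketch of an alerting rule expressed as a `PrometheusRule` resource. The namespace `ns1`, the job label `sample-service`, the rule names, and the metric `http_request_timeouts_total` are hypothetical and chosen only for illustration; the threshold and durations are likewise arbitrary. Note that it counts timeouts over a time window rather than strictly consecutive ones.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: sample-service-timeouts      # hypothetical rule name
  namespace: ns1                     # hypothetical project
spec:
  groups:
  - name: sample-service.rules
    rules:
    - alert: SampleServiceRequestTimeouts
      # Hypothetical counter metric: the condition holds when more than
      # 5 timeouts were recorded over the last 5 minutes.
      expr: increase(http_request_timeouts_total{job="sample-service"}[5m]) > 5
      # The condition must hold for 1 minute before the alert fires.
      for: 1m
      labels:
        severity: warning
      annotations:
        summary: Requests to sample-service are timing out.
```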
Alerting rules and their states can be viewed within the OpenShift Alerting UI. However, application and platform teams might also need to receive notifications on external systems when an alert is in the firing state. OpenShift currently supports routing alert notifications to four external receiver types: Email, Slack, PagerDuty and Webhook. With OpenShift 4.10, there is one supported way of routing alerts to external systems: a cluster administrator configures the receiver from within the Alertmanager UI or applies a YAML configuration. In the next sections, we will:
- Describe the steps for enabling user-defined project monitoring so we can monitor services within a specific namespace.
- Deploy a sample service within a specific namespace and create an alerting rule.
- Configure Slack to post OpenShift alerts to a channel within a Slack workspace.
- Describe how to configure an alert receiver using both the OpenShift UI and a YAML configuration.
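For orientation, a Slack receiver in an Alertmanager configuration might look roughly like the sketch below. In OpenShift, this configuration is stored in the `alertmanager-main` secret in the `openshift-monitoring` namespace. The receiver names, the channel `#openshift-alerts`, and the namespace `ns1` are assumptions made for this example, and the webhook URL is a placeholder to be replaced with the one generated by Slack.

```yaml
global:
  resolve_timeout: 5m
route:
  receiver: default                  # fallback receiver for unmatched alerts
  group_by:
  - namespace
  routes:
  # Send alerts from the hypothetical project ns1 to Slack
  - receiver: slack-notifications
    matchers:
    - 'namespace = "ns1"'
receivers:
- name: default
- name: slack-notifications
  slack_configs:
  - api_url: "<slack-incoming-webhook-url>"   # placeholder, not a real URL
    channel: "#openshift-alerts"              # hypothetical channel
    send_resolved: true
```

Routing on the `namespace` label keeps notifications for a single project separate from platform alerts, which continue to go to the default receiver.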