.. _minio-metrics-influxdb:
======================================
Monitoring and Alerting using InfluxDB
======================================
.. default-domain:: minio
.. contents:: Table of Contents
:local:
:depth: 1
MinIO publishes cluster and node metrics using the :prometheus-docs:`Prometheus Data Model `.
`InfluxDB `__ supports scraping MinIO metrics data for monitoring and alerting.
The procedure on this page documents the following:
- Configuring an InfluxDB service to scrape and display metrics from a MinIO deployment
- Configuring an Alert on a MinIO metric
.. admonition:: Prerequisites
:class: note
This procedure requires the following:
- An existing InfluxDB deployment configured with one or more :influxdb-docs:`notification endpoints `
- An existing MinIO deployment with network access to the InfluxDB deployment
- An :mc:`mc` installation on your local host configured to :ref:`access ` the MinIO deployment
.. cond:: k8s
This procedure assumes that all necessary network control components, such as Ingress or Load Balancers, are in place to facilitate access between the MinIO Tenant and the InfluxDB service.
Configure InfluxDB to Collect and Alert using MinIO Metrics
-----------------------------------------------------------
.. important::
This procedure specifically uses the InfluxDB UI to create a scraping endpoint.
The InfluxDB UI does not provide the same level of configuration as using `Telegraf `__ and the corresponding `Prometheus plugin `__.
Specifically:
- You cannot enable authenticated access to the MinIO metrics endpoint via the InfluxDB UI
- You cannot set a tag for collected metrics (e.g. ``url_tag``) for uniquely identifying the metrics for a given MinIO deployment
.. cond:: k8s
The Telegraf Prometheus plugin also supports Kubernetes-specific features, such as scraping the ``minio`` service for a given MinIO Tenant.
Configuring Telegraf is out of scope for this procedure.
You can use this procedure as general guidance for configuring Telegraf to scrape MinIO metrics.
.. container:: procedure
1. Configure Public Access to MinIO Metrics
Set the :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` environment variable to ``"public"`` for all nodes in the MinIO deployment.
Restart the deployment to apply the change and allow public access to MinIO metrics.
You can validate the change by attempting to ``curl`` the metrics endpoint:
.. code-block:: shell
:class: copyable
curl https://HOSTNAME/minio/v2/metrics/cluster
Replace ``HOSTNAME`` with the hostname of the load balancer or reverse proxy through which you access the MinIO deployment.
Alternatively, specify any single node as ``HOSTNAME:PORT``, where ``PORT`` is the MinIO server API port for that node.
The response body should include a list of collected MinIO metrics.
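The metrics endpoint returns Prometheus text format, so the response can be long. As a minimal sketch for checking that a specific metric is present, the hypothetical helper below (``metric_samples`` is not part of MinIO or ``mc``) filters that text down to the sample lines for one metric name:

```shell
# Hypothetical helper, not part of MinIO or mc.
# Reads Prometheus text format on stdin and prints only the sample
# lines for the metric named in $1. The "# HELP" and "# TYPE" comment
# lines are skipped because they do not start with the metric name.
metric_samples() {
    # Match the name followed by "{" (labels) or a space (no labels),
    # so that a longer metric name sharing the same prefix is excluded.
    grep -E "^$1(\{| )"
}
```

Against a live deployment, pipe the ``curl`` output into the helper, for example ``curl -s https://HOSTNAME/minio/v2/metrics/cluster | metric_samples minio_cluster_nodes_offline_total``. An empty result suggests the metric is not being published or the endpoint is not publicly readable.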
#. Log into the InfluxDB UI and Create a Bucket
Select the :influxdb-docs:`Organization ` under which you want to store MinIO metrics.
Create a :influxdb-docs:`New Bucket ` in which to store metrics for the MinIO deployment.
#. Create a new Scraping Source
Create a :influxdb-docs:`new InfluxDB Scraper `.
Specify the full URL to the MinIO deployment, including the metrics endpoint:
.. code-block:: shell
:class: copyable
https://HOSTNAME/minio/v2/metrics/cluster
Replace ``HOSTNAME`` with the hostname of the load balancer or reverse proxy through which you access the MinIO deployment.
Alternatively, specify any single node as ``HOSTNAME:PORT``, where ``PORT`` is the MinIO server API port for that node.
#. Validate the Data
Use the :influxdb-docs:`Data Explorer ` to visualize the collected MinIO data.
For example, you can set a filter on :metric:`minio_cluster_capacity_usable_total_bytes` and :metric:`minio_cluster_capacity_usable_free_bytes` to compare the total usable capacity against the free usable capacity on the MinIO deployment.
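To sanity-check the values shown in the graph, you can compute the used percentage from the two capacity metrics by hand. The sketch below assumes you have already read the two byte values from the metrics endpoint or the InfluxDB UI; ``usage_percent`` is a hypothetical helper name used only for illustration:

```shell
# Hypothetical helper for illustration only.
# usage_percent TOTAL_BYTES FREE_BYTES
# Prints the percentage of usable capacity currently in use, to one
# decimal place, computed as (total - free) / total * 100.
usage_percent() {
    awk -v total="$1" -v free="$2" 'BEGIN {
        # Guard against division by zero when no capacity is reported.
        if (total <= 0) { print "n/a"; exit }
        printf "%.1f\n", (total - free) / total * 100
    }'
}
```

For example, ``usage_percent 1000 250`` prints ``75.0``, meaning three quarters of the usable capacity is in use.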
#. Configure a Check
Create a :influxdb-docs:`new Check ` on a MinIO metric.
The following example check rules provide a baseline of alerts for a MinIO deployment.
You can modify or otherwise use these examples for guidance in building your own checks.
- Create a :guilabel:`Threshold Check` named ``MINIO_NODE_DOWN``.
Set the filter for the :metric:`minio_cluster_nodes_offline_total` key.
Set the :guilabel:`Thresholds` to :guilabel:`WARN` when the value is greater than :guilabel:`1`.
- Create a :guilabel:`Threshold Check` named ``MINIO_QUORUM_WARNING``.
Set the filter for the :metric:`minio_cluster_disk_offline_total` key.
Set the :guilabel:`Thresholds` to :guilabel:`CRITICAL` when the value is one less than your configured :ref:`Erasure Code Parity ` setting.
For example, a deployment using EC:4 should set this value to ``3``.
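The threshold logic of the two example checks can be sketched locally as follows. This is an illustration of the arithmetic only, not how InfluxDB evaluates checks. The function names are hypothetical, and treating an offline drive count at or above the threshold as critical is an assumption, since the comparison operator is chosen when you build the check in the UI:

```shell
# Hypothetical helpers that mirror the two example checks locally.

# node_down_level OFFLINE_NODES
# Prints WARN when more than one node is offline (the threshold used
# in the MINIO_NODE_DOWN example), otherwise OK.
node_down_level() {
    if [ "$1" -gt 1 ]; then echo WARN; else echo OK; fi
}

# quorum_level OFFLINE_DRIVES PARITY
# The threshold is PARITY - 1, for example 3 for EC:4.
# Assumption: a value at or above the threshold counts as CRITICAL.
quorum_level() {
    threshold=$(( $2 - 1 ))
    if [ "$1" -ge "$threshold" ]; then echo CRITICAL; else echo OK; fi
}
```

With these definitions, ``quorum_level 3 4`` prints ``CRITICAL`` and ``quorum_level 2 4`` prints ``OK``, which matches the EC:4 example above.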
Configure your :influxdb-docs:`Notification endpoints ` and :influxdb-docs:`Notification rules ` such that checks of each type trigger an appropriate response.