mirror of
https://github.com/minio/docs.git
synced 2025-08-09 13:02:53 +03:00
draft: more metrics v2/v3 rework
@@ -1,25 +1,25 @@
|
||||
.. _minio-metrics-collect-using-prometheus:
|
||||
|
||||
========================================
|
||||
Monitoring and Alerting using Prometheus
|
||||
Monitoring and alerting using Prometheus
|
||||
========================================
|
||||
|
||||
.. default-domain:: minio
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:local:
|
||||
:depth: 1
|
||||
:depth: 2
|
||||
|
||||
.. container:: extlinks-video
|
||||
|
||||
- `Monitoring with MinIO and Prometheus: Overview <https://youtu.be/A3vCDaFWNNs?ref=docs>`__
|
||||
- `Monitoring with MinIO and Prometheus: Lab <https://youtu.be/Oix9iXndSUY?ref=docs>`__
|
||||
|
||||
MinIO publishes cluster, node, bucket, and resource metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/#data-model>`.
|
||||
MinIO publishes metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/#data-model>`.
|
||||
The procedure on this page documents the following:
|
||||
|
||||
- Configuring a Prometheus service to scrape and display metrics from a MinIO deployment
|
||||
- Configuring an Alert Rule on a MinIO Metric to trigger an AlertManager action
|
||||
- Configure a Prometheus service to scrape and display metrics from a MinIO deployment
|
||||
- Configure an Alert Rule on a MinIO Metric to trigger an AlertManager action
|
||||
|
||||
.. admonition:: Prerequisites
|
||||
:class: note
|
||||
@@ -32,129 +32,82 @@ The procedure on this page documents the following:
|
||||
|
||||
- An :mc:`mc` installation on your local host configured to :ref:`access <alias>` the MinIO deployment
|
||||
|
||||
Configure Prometheus to Collect and Alert using MinIO Metrics
|
||||
-------------------------------------------------------------
|
||||
.. admonition:: Metrics Version 2 Deprecated
|
||||
:class: note
|
||||
|
||||
1) Generate the Scrape Configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, metrics version 3 replaces the deprecated :ref:`metrics version 2 <minio-metrics-v2>`.
|
||||
|
||||
Use the :mc:`mc admin prometheus generate` command to generate the scrape configuration for use by Prometheus in making scraping requests:
|
||||
MinIO recommends new monitoring configurations use :ref:`version 3 metrics <minio-metrics-and-alerts>`.
|
||||
|
||||
.. tab-set::
|
||||
Collect and alert on metrics
|
||||
----------------------------
|
||||
|
||||
.. tab-item:: MinIO Server
|
||||
Generate the scrape configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The following command generates a scrape configuration for MinIO cluster metrics.
|
||||
Use :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` to generate a scrape configuration for each :ref:`type of metric <minio-metrics-and-alerts-available-metrics>` you want to scrape with Prometheus.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS
|
||||
For example, the following command generates a scrape configuration for all version 3 audit metrics for the MinIO cluster:
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
The command returns output similar to the following:
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. tab-item:: Nodes
|
||||
mc admin prometheus generate ALIAS audit --api-version v3
|
||||
|
||||
The following command generates a scrape configuration for node metrics on the MinIO Server.
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS node
|
||||
The command returns output similar to the following:
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-node
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/node
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
|
||||
|
||||
.. tab-item:: Buckets
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/metrics/v3
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
The following command generates a scrape configuration for bucket metrics on the MinIO Server.
|
||||
To scrape multiple types of metrics, run :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` once for each type and add each generated ``job_name`` section to the ``scrape_configs`` in your Prometheus configuration.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS bucket
|
||||
The following example scrapes audit and system metrics every 60 seconds:
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-bucket
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/bucket
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. tab-item:: Resources
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
.. versionadded:: RELEASE.2023-10-07T15-07-38Z
|
||||
scrape_configs:
|
||||
- job_name: minio-job-audit
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/metrics/v3/audit
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
The following command generates a scrape configuration for resource metrics on the MinIO Server.
|
||||
- job_name: minio-job-system
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/metrics/v3/system
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
If needed, edit the generated configuration for your environment.
|
||||
Common changes include:
|
||||
|
||||
mc admin prometheus generate ALIAS resource
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-resource
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/resource
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
- Set an appropriate ``scrape_interval`` value to ensure each scraping operation completes before the next one begins.
|
||||
The recommended value is 60 seconds.
|
||||
|
||||
Some deployments require a longer scrape interval due to the number of metrics being scraped.
|
||||
To reduce the load on your MinIO and Prometheus servers, choose the longest interval that meets your monitoring requirements.
|
||||
|
||||
You can specify a ``scrape_interval`` for each job in its ``job_name`` section, or for all jobs in a separate ``global`` section.
|
||||
|
||||
- Set the ``job_name`` to a value associated with the MinIO deployment.
|
||||
|
||||
Use a unique value to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
|
||||
Use a unique value for each job to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
|
||||
|
||||
- MinIO deployments started with :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"`` can omit the ``bearer_token`` field.
|
||||
|
||||
@@ -170,177 +123,45 @@ Use the :mc:`mc admin prometheus generate` command to generate the scrape config
|
||||
|
||||
For Prometheus deployments external to the cluster, you must specify an ingress or load balancer endpoint configured to route connections to and from the MinIO Tenant.
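The ``scrape_interval`` guidance above can be illustrated with a small sketch. It assumes a hypothetical helper named ``write_sample_config`` that writes a sample configuration file. The job names and paths reuse the audit and system examples on this page, ``TOKEN`` and ``minio.example.net`` remain placeholders, and the ``120s`` override is an arbitrary illustrative value rather than a recommendation.

```shell
# Write a sample Prometheus configuration to the path given as $1.
# The system job overrides the global interval; the audit job inherits it.
write_sample_config() {
    cat > "$1" <<'EOF'
global:
  scrape_interval: 60s            # default for every job

scrape_configs:
- job_name: minio-job-audit       # inherits the 60s global interval
  bearer_token: TOKEN             # placeholder for the generated token
  metrics_path: /minio/metrics/v3/audit
  scheme: https
  static_configs:
  - targets: [minio.example.net]

- job_name: minio-job-system
  scrape_interval: 120s           # per-job override (illustrative value)
  bearer_token: TOKEN
  metrics_path: /minio/metrics/v3/system
  scheme: https
  static_configs:
  - targets: [minio.example.net]
EOF
}

# Example: write_sample_config prometheus.yaml
```

Keeping the common interval in ``global`` and overriding only the heavier jobs keeps the file short while still letting slow scrapes finish before the next one starts.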
|
||||
|
||||
2) Restart Prometheus with the Updated Configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Restart Prometheus with the updated configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Append the desired ``scrape_configs`` job generated in the previous step to the configuration file:
|
||||
|
||||
.. tab-set::
|
||||
|
||||
.. tab-item:: Cluster
|
||||
|
||||
Cluster metrics aggregate node-level metrics and, where appropriate, attach labels to metrics for the originating node.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
|
||||
.. tab-item:: Nodes
|
||||
|
||||
Node metrics are specific to node-level monitoring. List every MinIO node in the ``targets`` array for this configuration.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-node
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/node
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
|
||||
|
||||
|
||||
.. tab-item:: Bucket
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-bucket
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/bucket
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. tab-item:: Resource
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-resource
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/resource
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
Start the Prometheus cluster using the configuration file:
|
||||
Add the desired ``scrape_configs`` jobs to your Prometheus configuration file and start the Prometheus cluster:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
prometheus --config.file=prometheus.yaml
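Before starting Prometheus, it can help to check the edited file for syntax errors. The following is a minimal sketch, assuming the ``promtool`` utility that ships with Prometheus releases is available on the ``PATH``. The function name ``validate_prometheus_config`` is hypothetical.

```shell
# Validate a Prometheus configuration file before starting the service.
#   $1: path to the configuration file
# Returns 2 if the file does not exist, skips validation if promtool is
# not installed, and otherwise returns the exit status of promtool.
validate_prometheus_config() {
    config_file="$1"
    if [ ! -f "$config_file" ]; then
        echo "configuration file not found: $config_file" >&2
        return 2
    fi
    if ! command -v promtool >/dev/null 2>&1; then
        echo "promtool not found; skipping validation" >&2
        return 0
    fi
    promtool check config "$config_file"
}

# Example:
#   validate_prometheus_config prometheus.yaml && prometheus --config.file=prometheus.yaml
```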
|
||||
|
||||
3) Analyze Collected Metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Analyze collected metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Prometheus includes an :prometheus-docs:`expression browser <prometheus/latest/getting_started/#using-the-expression-browser>`.
|
||||
You can execute queries here to analyze the collected metrics.
|
||||
|
||||
.. tab-set::
|
||||
The following example queries return the samples collected over the previous five minutes for a scrape job named ``minio-job``:
|
||||
|
||||
.. tab-item:: Examples
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
The following example queries return the samples collected over the previous five minutes for a scrape job named ``minio-job``:
|
||||
minio_system_drive_used_bytes{job="minio-job"}[5m]
minio_system_drive_used_inodes{job="minio-job"}[5m]
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
minio_cluster_usage_buckets_total_bytes{job="minio-job"}[5m]
minio_cluster_usage_buckets_objects_count{job="minio-job"}[5m]

minio_node_drive_free_bytes{job="minio-job"}[5m]
minio_node_drive_free_inodes{job="minio-job"}[5m]
minio_api_requests_total{job="minio-job"}[5m]
minio_api_requests_errors_total{job="minio-job"}[5m]
|
||||
|
||||
minio_node_drive_latency_us{job="minio-job"}[5m]

minio_node_drive_offline_total{job="minio-job"}[5m]
minio_node_drive_online_total{job="minio-job"}[5m]
|
||||
Configure an alert rule
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
minio_node_drive_total{job="minio-job"}[5m]
|
||||
To trigger alerts based on metrics, configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment.
|
||||
|
||||
minio_node_drive_total_bytes{job="minio-job"}[5m]
minio_node_drive_used_bytes{job="minio-job"}[5m]

minio_node_drive_errors_timeout{job="minio-job"}[5m]
minio_node_drive_errors_availability{job="minio-job"}[5m]

minio_node_drive_io_waiting{job="minio-job"}[5m]
|
||||
|
||||
.. tab-item:: Recommended Metrics
|
||||
|
||||
MinIO recommends the following as a basic set of metrics to monitor.
|
||||
|
||||
See :ref:`minio-metrics-and-alerts` for information about all available metrics.
|
||||
|
||||
.. list-table::
|
||||
:header-rows: 1
|
||||
:widths: 40 60
|
||||
:width: 100%
|
||||
|
||||
* - Metric
|
||||
- Description
|
||||
|
||||
* - ``minio_node_drive_free_bytes``
|
||||
- Total storage available on a drive.
|
||||
|
||||
* - ``minio_node_drive_free_inodes``
|
||||
- Total free inodes.
|
||||
|
||||
* - ``minio_node_drive_latency_us``
|
||||
- Average last minute latency in µs for drive API storage operations.
|
||||
|
||||
* - ``minio_node_drive_offline_total``
|
||||
- Total drives offline in this node.
|
||||
|
||||
* - ``minio_node_drive_online_total``
|
||||
- Total drives online in this node.
|
||||
|
||||
* - ``minio_node_drive_total``
|
||||
- Total drives in this node.
|
||||
|
||||
* - ``minio_node_drive_total_bytes``
|
||||
- Total storage on a drive.
|
||||
|
||||
* - ``minio_node_drive_used_bytes``
|
||||
- Total storage used on a drive.
|
||||
|
||||
* - ``minio_node_drive_errors_timeout``
|
||||
- Total number of drive timeout errors since server start.
|
||||
|
||||
* - ``minio_node_drive_errors_availability``
|
||||
- Total number of drive I/O errors, permission denied and timeouts since server start.
|
||||
|
||||
* - ``minio_node_drive_io_waiting``
|
||||
- Total number of I/O operations waiting on drive.
|
||||
|
||||
4) Configure an Alert Rule using MinIO Metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You must configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment to trigger alerts based on collected MinIO metrics.
|
||||
|
||||
The following example alert rule files provide a baseline of alerts for a MinIO deployment.
|
||||
You can modify or otherwise use these examples as guidance in building your own alerts.
|
||||
The following example alert rules provide a baseline of alerts for a MinIO deployment.
|
||||
You can modify or use these examples as guidance for building your own alerts.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
@@ -349,7 +170,7 @@ You can modify or otherwise use these examples as guidance in building your own
|
||||
- name: minio-alerts
|
||||
rules:
|
||||
- alert: NodesOffline
|
||||
expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
|
||||
expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0
|
||||
for: 10m
|
||||
labels:
|
||||
severity: warn
|
||||
@@ -358,7 +179,7 @@ You can modify or otherwise use these examples as guidance in building your own
|
||||
description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"
|
||||
|
||||
- alert: DisksOffline
|
||||
expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
|
||||
expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0
|
||||
for: 10m
|
||||
labels:
|
||||
severity: warn
|
||||
@@ -378,5 +199,5 @@ Once triggered, Prometheus sends the alert to the configured AlertManager servic
|
||||
Dashboards
|
||||
----------
|
||||
|
||||
MinIO provides Grafana Dashboards to display metrics collected by Prometheus.
|
||||
For v2 metrics, MinIO provides Grafana Dashboards to display metrics collected by Prometheus.
|
||||
For more information, see :ref:`minio-grafana`.
|
||||
|
@@ -17,10 +17,10 @@ Metrics and Alerts
|
||||
|
||||
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, metrics version 3 replaces the deprecated :ref:`metrics version 2 <minio-metrics-v2>`.
|
||||
|
||||
MinIO publishes cluster and node metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/>`.
|
||||
MinIO publishes metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/>`.
|
||||
You can use any scraping tool to pull metrics data from MinIO for further analysis and alerting.
|
||||
|
||||
For metrics version 3, all metrics are available under the base ``/minio/metrics/v3`` endpoint by appending an additional path for each category.
|
||||
For metrics version 3, all metrics are available under the base ``/minio/metrics/v3`` endpoint. You can optionally append an additional path to scrape a single category.
|
||||
|
||||
For example, the following endpoint returns audit metrics:
|
||||
|
||||
@@ -36,66 +36,80 @@ By default, MinIO requires authentication to scrape the metrics endpoints.
|
||||
To generate the needed bearer tokens, use :mc:`mc admin prometheus generate`.
|
||||
You can also disable metrics endpoint authentication by setting :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` to ``public``.
|
||||
|
||||
MinIO provides the following scraping endpoints, relative to the base URL:
|
||||
You can also access metrics using :mc-cmd:`mc admin prometheus metrics` and the metric type for the desired category.
|
||||
For more information, see the :mc-cmd:`MinIO Admin Client reference <mc admin prometheus metrics>`.
|
||||
|
||||
MinIO provides the following types and scraping endpoints, relative to the base URL:
|
||||
|
||||
.. list-table::
|
||||
:header-rows: 1
|
||||
:widths: 30 70
|
||||
:widths: 25 25 50
|
||||
:width: 100%
|
||||
|
||||
* - Category
|
||||
- Metric type
|
||||
- Path
|
||||
|
||||
* - API
|
||||
- ``api``
|
||||
- ``/api/requests``
|
||||
|
||||
|
||||
``/bucket/api``
|
||||
|
||||
* - Audit
|
||||
- ``audit``
|
||||
- ``/audit``
|
||||
|
||||
* - Cluster
|
||||
- ``cluster``
|
||||
- ``/cluster/config``
|
||||
|
||||
|
||||
``/cluster/erasure-set``
|
||||
|
||||
|
||||
``/cluster/health``
|
||||
|
||||
|
||||
``/cluster/iam``
|
||||
|
||||
|
||||
``/cluster/usage/buckets``
|
||||
|
||||
|
||||
``/cluster/usage/objects``
|
||||
|
||||
* - Debug
|
||||
- ``debug``
|
||||
- ``/debug/go``
|
||||
|
||||
* - ILM
|
||||
- ``ilm``
|
||||
- ``/ilm``
|
||||
|
||||
* - Logger webhook
|
||||
- ``logger``
|
||||
- ``/logger/webhook``
|
||||
|
||||
* - Notification
|
||||
- ``notification``
|
||||
- ``/notification``
|
||||
|
||||
* - Replication
|
||||
- ``replication``
|
||||
- ``/replication``
|
||||
|
||||
|
||||
``/bucket/replication``
|
||||
|
||||
* - Scanner
|
||||
- ``scanner``
|
||||
- ``/scanner``
|
||||
|
||||
* - System
|
||||
- ``system``
|
||||
- ``/system/drive``
|
||||
|
||||
|
||||
``/system/memory``
|
||||
|
||||
|
||||
``/system/cpu``
|
||||
|
||||
|
||||
``/system/network/internode``
|
||||
|
||||
|
||||
``/system/process``
|
||||
|
||||
For a complete list of metrics for each endpoint, see :ref:`Available Metrics <minio-metrics-and-alerts-available-metrics>`.
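To inspect an endpoint by hand, the paths in the table can be appended to the base path and requested with any HTTP client. The sketch below uses ``curl`` and sends the token in an ``Authorization: Bearer`` header, the same way Prometheus uses the ``bearer_token`` field. The function names ``minio_metrics_url`` and ``minio_scrape`` are hypothetical, and ``https://minio.example.net`` is a placeholder hostname.

```shell
# Build the full URL for a metrics version 3 endpoint.
#   $1: base URL of the deployment (placeholder: https://minio.example.net)
#   $2: optional category path from the table, such as /audit or /system/drive
minio_metrics_url() {
    printf '%s/minio/metrics/v3%s\n' "${1%/}" "${2:-}"
}

# Fetch one endpoint using a bearer token (requires network access).
#   $3: token, for example the bearer_token value from the generated
#       scrape configuration
minio_scrape() {
    curl --silent --show-error --fail \
        --header "Authorization: Bearer $3" \
        "$(minio_metrics_url "$1" "$2")"
}

# Example:
#   minio_scrape https://minio.example.net /audit "$TOKEN"
```

Deployments that set :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` to ``public`` do not need the header.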
|
||||
@@ -103,23 +117,17 @@ For a complete list of metrics for each endpoint, see :ref:`Available Metrics <m
|
||||
.. cond:: k8s
|
||||
|
||||
The MinIO Operator supports deploying a per-tenant Prometheus instance configured to support metrics and visualization.
|
||||
|
||||
|
||||
If you deploy the Tenant with this feature disabled *but* still want the historical metric views, you can instead configure an external Prometheus service to scrape the Tenant metrics.
|
||||
Once configured, you can update the Tenant to query that Prometheus service to retrieve metric data:
|
||||
|
||||
.. cond:: linux or container or macos or windows
|
||||
|
||||
|
||||
To enable historical data visualization in MinIO Console, set the following environment variables on each node in the MinIO deployment:
|
||||
|
||||
- Set :envvar:`MINIO_PROMETHEUS_URL` to the URL of the Prometheus service
|
||||
- Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics
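As a minimal sketch, the two variables might be set in the shell environment of each node before starting the MinIO server process. Both values below are assumptions for illustration: replace the URL with the address of your Prometheus service, and the job ID with the ``job_name`` used in your scrape configuration.

```shell
# Hypothetical values; substitute the address of your Prometheus service
# and the job name used in its scrape configuration.
export MINIO_PROMETHEUS_URL="http://prometheus.example.net:9090"
export MINIO_PROMETHEUS_JOB_ID="minio-job"
```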
|
||||
|
||||
MinIO Grafana Dashboard
|
||||
-----------------------
|
||||
|
||||
MinIO also publishes two :ref:`Grafana Dashboards <minio-grafana>` for visualizing collected metrics.
|
||||
For more complete documentation on configuring a Prometheus-compatible data source for Grafana, see the :prometheus-docs:`Prometheus documentation on Grafana Support <visualization/grafana/>`.
|
||||
|
||||
.. _minio-metrics-and-alerts-available-metrics:
|
||||
|
||||
Available Metrics
|
||||
|
@@ -9,14 +9,17 @@ Metrics Version 2
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:local:
|
||||
:depth: 3
|
||||
:depth: 1
|
||||
|
||||
.. admonition:: Metrics Version 2 Deprecated
|
||||
:class: note
|
||||
|
||||
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, :ref:`metrics version 3 <minio-metrics-and-alerts>` replaces the deprecated metrics version 2.
|
||||
|
||||
The following sections describe the deprecated endpoints and metrics.
|
||||
The following sections describe the deprecated version 2 endpoints and metrics.
|
||||
|
||||
Metrics version 2 endpoints
|
||||
---------------------------
|
||||
|
||||
.. tab-set::
|
||||
|
||||
@@ -67,6 +70,304 @@ The following sections describe the deprecated endpoints and metrics.
|
||||
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.
|
||||
|
||||
|
||||
Configure Prometheus to Collect and Alert using MinIO Metrics
|
||||
-------------------------------------------------------------
|
||||
|
||||
1) Generate the Scrape Configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Use the :mc:`mc admin prometheus generate` command to generate the scrape configuration for use by Prometheus in making scraping requests:
|
||||
|
||||
.. tab-set::
|
||||
|
||||
.. tab-item:: MinIO Server
|
||||
|
||||
The following command generates a scrape configuration for MinIO cluster metrics.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
The command returns output similar to the following:
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. tab-item:: Nodes
|
||||
|
||||
The following command generates a scrape configuration for node metrics on the MinIO Server.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS node
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-node
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/node
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
|
||||
|
||||
.. tab-item:: Buckets
|
||||
|
||||
The following command generates a scrape configuration for bucket metrics on the MinIO Server.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS bucket
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-bucket
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/bucket
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. tab-item:: Resources
|
||||
|
||||
.. versionadded:: RELEASE.2023-10-07T15-07-38Z
|
||||
|
||||
The following command generates a scrape configuration for resource metrics on the MinIO Server.
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS resource
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-resource
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/resource
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
- Set an appropriate ``scrape_interval`` value to ensure each scraping operation completes before the next one begins.
|
||||
The recommended value is 60 seconds.
|
||||
|
||||
Some deployments require a longer scrape interval due to the number of metrics being scraped.
|
||||
To reduce the load on your MinIO and Prometheus servers, choose the longest interval that meets your monitoring requirements.
|
||||
|
||||
- Set the ``job_name`` to a value associated with the MinIO deployment.
|
||||
|
||||
Use a unique value to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
|
||||
|
||||
- MinIO deployments started with :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"`` can omit the ``bearer_token`` field.
|
||||
|
||||
- Set the ``scheme`` to ``http`` for MinIO deployments not using TLS.
|
||||
|
||||
- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
|
||||
|
||||
This can be any single node, or a load balancer/proxy which handles connections to the MinIO nodes.
|
||||
|
||||
.. cond:: k8s
|
||||
|
||||
For Prometheus deployments in the same cluster as the MinIO Tenant, you can specify the service DNS name for the ``minio`` service.
|
||||
|
||||
For Prometheus deployments external to the cluster, you must specify an ingress or load balancer endpoint configured to route connections to and from the MinIO Tenant.
|
||||
|
||||
2) Restart Prometheus with the Updated Configuration
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Append the desired ``scrape_configs`` job generated in the previous step to the configuration file:
|
||||
|
||||
.. tab-set::
|
||||
|
||||
.. tab-item:: Cluster
|
||||
|
||||
Cluster metrics aggregate node-level metrics and, where appropriate, attach labels to metrics for the originating node.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
|
||||
.. tab-item:: Nodes
|
||||
|
||||
Node metrics are specific to node-level monitoring. List every MinIO node in the ``targets`` array for this configuration.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-node
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/node
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
|
||||
|
||||
|
||||
.. tab-item:: Bucket
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-bucket
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/bucket
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
.. tab-item:: Resource
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 60s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-resource
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/resource
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
Start the Prometheus cluster using the configuration file:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
prometheus --config.file=prometheus.yaml
|
||||
|
||||
3) Analyze Collected Metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Prometheus includes an :prometheus-docs:`expression browser <prometheus/latest/getting_started/#using-the-expression-browser>`.
|
||||
You can execute queries here to analyze the collected metrics.
|
||||
|
||||
|
||||
The following example queries return the samples collected over the previous five minutes for a scrape job named ``minio-job``:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
minio_node_drive_free_bytes{job="minio-job"}[5m]
minio_node_drive_free_inodes{job="minio-job"}[5m]

minio_node_drive_latency_us{job="minio-job"}[5m]

minio_node_drive_offline_total{job="minio-job"}[5m]
minio_node_drive_online_total{job="minio-job"}[5m]

minio_node_drive_total{job="minio-job"}[5m]

minio_node_drive_total_bytes{job="minio-job"}[5m]
minio_node_drive_used_bytes{job="minio-job"}[5m]

minio_node_drive_errors_timeout{job="minio-job"}[5m]
minio_node_drive_errors_availability{job="minio-job"}[5m]

minio_node_drive_io_waiting{job="minio-job"}[5m]
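Expressions like these can also be evaluated outside the expression browser through the Prometheus HTTP API. The following sketch assumes a Prometheus service reachable at a placeholder address such as ``http://localhost:9090``. The function names ``prometheus_query_url`` and ``prometheus_query`` are hypothetical.

```shell
# Build the instant-query endpoint for a Prometheus server.
#   $1: base URL of the Prometheus service (placeholder: http://localhost:9090)
prometheus_query_url() {
    printf '%s/api/v1/query\n' "${1%/}"
}

# Evaluate a PromQL expression and print the JSON response
# (requires network access).
#   $2: the expression, sent URL-encoded as the "query" parameter
prometheus_query() {
    curl --silent --show-error --get \
        --data-urlencode "query=$2" \
        "$(prometheus_query_url "$1")"
}

# Example:
#   prometheus_query http://localhost:9090 'minio_node_drive_free_bytes{job="minio-job"}'
```

Passing the expression through ``--data-urlencode`` avoids having to escape the braces and quotes in the label matcher by hand.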
|
||||
|
||||
|
||||
4) Configure an Alert Rule using MinIO Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

You must configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment to trigger alerts based on collected MinIO metrics.

The following example alert rule file provides a baseline of alerts for a MinIO deployment.
You can modify or otherwise use these examples as guidance in building your own alerts.

.. code-block:: yaml
   :class: copyable

   groups:
   - name: minio-alerts
     rules:
     - alert: NodesOffline
       expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
       for: 10m
       labels:
         severity: warn
       annotations:
         summary: "Node down in MinIO deployment"
         description: "Node(s) in cluster {{ $labels.instance }} offline for more than 10 minutes"

     - alert: DisksOffline
       expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
       for: 10m
       labels:
         severity: warn
       annotations:
         summary: "Disks down in MinIO deployment"
         description: "Disk(s) in cluster {{ $labels.instance }} offline for more than 10 minutes"

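
The same pattern can be applied to the drive metrics queried earlier. As a rough sketch, the following rule compares free and total drive capacity. The group name, alert name, 10% threshold, and ``for`` duration are assumptions chosen for illustration, not MinIO recommendations. Alert expressions must evaluate to an instant vector, so the ``[5m]`` range selector is omitted:

.. code-block:: yaml

   groups:
   - name: minio-drive-alerts   # hypothetical group name
     rules:
     - alert: DriveSpaceLow     # hypothetical alert name
       # Ratio of free to total bytes per drive; 0.10 is an assumed threshold.
       expr: minio_node_drive_free_bytes{job="minio-job"} / minio_node_drive_total_bytes{job="minio-job"} < 0.10
       for: 10m
       labels:
         severity: warn
       annotations:
         summary: "Drive low on free space in MinIO deployment"
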
In the Prometheus configuration, specify the path to the alert file in the ``rule_files`` key:

.. code-block:: yaml

   rule_files:
   - minio-alerting.yml

Once an alert triggers, Prometheus sends it to the configured AlertManager service.

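
Prometheus forwards alerts only when its configuration names an AlertManager endpoint. A minimal sketch of that stanza follows. The target ``localhost:9093`` is an assumption (the AlertManager default port, on the same host), so substitute the address of your own AlertManager service:

.. code-block:: yaml

   alerting:
     alertmanagers:
     - static_configs:
       # Assumed address; replace with your AlertManager host and port.
       - targets: ['localhost:9093']
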
Available metrics
-----------------

- :ref:`Cluster Metrics <minio-available-cluster-metrics>`
- :ref:`Bucket Metrics <minio-available-bucket-metrics>`
- :ref:`Resource Metrics <minio-available-resource-metrics>`
|
@@ -10,6 +10,8 @@
|
||||
|
||||
.. mc:: mc admin prometheus generate
|
||||
|
||||
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, metrics version 3 replaces the deprecated :ref:`metrics version 2 <minio-metrics-v2>`.
|
||||
|
||||
Description
|
||||
-----------
|
||||
|
||||
@@ -19,7 +21,7 @@ The :mc:`mc admin prometheus generate` command generates a metrics scraping conf
|
||||
|
||||
.. end-mc-admin-prometheus-generate-desc
|
||||
|
||||
For more complete documentation on using MinIO with Prometheus, see :ref:`How to monitor MinIO server with Prometheus <minio-metrics-collect-using-prometheus>`
|
||||
For more complete documentation on using MinIO with Prometheus, see :ref:`How to monitor MinIO server with Prometheus <minio-metrics-collect-using-prometheus>` and :ref:`minio-metrics-and-alerts`.
|
||||
|
||||
.. admonition:: Use ``mc admin`` on MinIO Deployments Only
|
||||
:class: note
|
||||
@@ -32,12 +34,12 @@ For more complete documentation on using MinIO with Prometheus, see :ref:`How to
|
||||
|
||||
.. tab-item:: EXAMPLE
|
||||
|
||||
The following command generates a Prometheus scrape configuration that collects bucket metrics from the deployment at :term:`alias` ``myminio``:
|
||||
The following command generates a Prometheus scrape configuration that collects audit metrics from the deployment at :term:`alias` ``myminio``:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate myminio bucket
|
||||
mc admin prometheus generate myminio audit --api-version v3
|
||||
|
||||
.. tab-item:: SYNTAX
|
||||
|
||||
@@ -46,9 +48,11 @@ For more complete documentation on using MinIO with Prometheus, see :ref:`How to
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc [GLOBALFLAGS] admin prometheus generate \
|
||||
ALIAS \
|
||||
[TYPE]
|
||||
mc [GLOBALFLAGS] admin prometheus generate \
|
||||
ALIAS \
|
||||
[TYPE] \
|
||||
[--api-version v3] \
|
||||
[TYPE --bucket <bucket name> --api-version v3]
|
||||
|
||||
.. include:: /includes/common-minio-mc.rst
|
||||
:start-after: start-minio-syntax
|
||||
@@ -63,23 +67,58 @@ Parameters
|
||||
|
||||
The :mc:`alias <mc alias>` of a configured MinIO deployment for which the command generates a Prometheus-compatible configuration file.
|
||||
|
||||
.. mc-cmd:: --api-version v3
|
||||
:optional:
|
||||
|
||||
Generate a scrape configuration for metrics version 3.
|
||||
Omit to generate a metrics version 2 configuration.
|
||||
|
||||
.. mc-cmd:: --bucket
|
||||
:optional:
|
||||
|
||||
For v3 metric types that return bucket-level metrics, specify a bucket name.
|
||||
Use with :mc-cmd:`~mc admin prometheus generate --api-version`.
|
||||
|
||||
``--bucket`` works for the following v3 metric types:
|
||||
|
||||
- ``api``
|
||||
- ``replication``
|
||||
|
||||
The following example generates a configuration for API metrics from the bucket ``mybucket``:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate play api --bucket mybucket --api-version v3
|
||||
|
||||
.. mc-cmd:: TYPE
|
||||
:optional:
|
||||
|
||||
The type of metrics to scrape.
|
||||
|
||||
.. versionchanged:: RELEASE.2023-10-07T15-07-38Z
|
||||
Valid values for metrics version 3 are:
|
||||
|
||||
``resource`` metrics added
|
||||
- ``api``
|
||||
- ``audit``
|
||||
- ``cluster``
|
||||
- ``debug``
|
||||
- ``ilm``
|
||||
- ``logger``
|
||||
- ``notification``
|
||||
- ``replication``
|
||||
- ``scanner``
|
||||
- ``system``
|
||||
|
||||
Valid values are:
|
||||
If not specified, a ``v3`` command returns all metrics.
|
||||
|
||||
Valid values for metrics version 2 are:
|
||||
|
||||
- ``bucket``
|
||||
- ``cluster``
|
||||
- ``node``
|
||||
- ``resource``
|
||||
|
||||
If not specified, the command returns cluster metrics.
|
||||
If not specified, a ``v2`` command returns cluster metrics.
|
||||
Cluster metrics also include node metrics.
|
||||
|
||||
Global Flags
|
||||
@@ -90,18 +129,18 @@ Global Flags
|
||||
:end-before: end-minio-mc-globals
|
||||
|
||||
|
||||
Example
|
||||
-------
|
||||
Examples
|
||||
--------
|
||||
|
||||
Generate a scrape config for bucket metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
Generate a default metrics v3 config
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Use :mc-cmd:`mc admin prometheus generate` to generate a scrape configuration that collects bucket metrics for a MinIO deployment:
|
||||
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects all v3 metrics for a MinIO deployment:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS bucket
|
||||
mc admin prometheus generate ALIAS --api-version v3
|
||||
|
||||
- Replace ``ALIAS`` with the :mc-cmd:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
@@ -110,9 +149,67 @@ The output resembles the following:
|
||||
.. code-block:: shell
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-bucket
|
||||
- job_name: minio-job
|
||||
bearer_token: [auth token]
|
||||
metrics_path: /minio/v2/metrics/bucket
|
||||
metrics_path: /minio/metrics/v3
|
||||
scheme: http
|
||||
static_configs:
|
||||
- targets: ['localhost:9000']
|
||||
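
If you prefer not to store the token inline, Prometheus can read it from a file. The following sketch adapts the generated v3 job under that assumption. The file path is hypothetical, and the ``authorization`` block is a Prometheus setting rather than something ``mc admin prometheus generate`` emits:

.. code-block:: yaml

   scrape_configs:
   - job_name: minio-job
     authorization:
       type: Bearer
       # Hypothetical path; the file holds only the token text.
       credentials_file: /etc/prometheus/minio-token
     metrics_path: /minio/metrics/v3
     scheme: http
     static_configs:
     - targets: ['localhost:9000']
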
|
||||
|
||||
Generate a v3 bucket replication metrics config
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS replication --bucket mybucket --api-version v3
|
||||
|
||||
- Replace ``ALIAS`` with the :mc-cmd:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
The output resembles the following:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job-replication
|
||||
bearer_token: [auth token]
|
||||
metrics_path: /minio/metrics/v3/bucket/replication/mybucket
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: ['localhost:9000']
|
||||
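
Each generated job can sit alongside others in the same ``scrape_configs`` list. As a sketch, the following combines the default v3 job and the bucket replication job into one Prometheus configuration. It assumes both jobs target the same deployment over ``http`` and reuse the same token:

.. code-block:: yaml

   scrape_configs:
   # Default v3 job, as generated without a TYPE argument.
   - job_name: minio-job
     bearer_token: [auth token]
     metrics_path: /minio/metrics/v3
     scheme: http   # assumed; use https if the deployment has TLS enabled
     static_configs:
     - targets: ['localhost:9000']

   # Replication metrics for the bucket ``mybucket``.
   - job_name: minio-job-replication
     bearer_token: [auth token]
     metrics_path: /minio/metrics/v3/bucket/replication/mybucket
     scheme: http   # assumed to match the job above
     static_configs:
     - targets: ['localhost:9000']
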
|
||||
|
||||
Generate a default metrics v2 config
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
By default, :mc-cmd:`mc admin prometheus generate` generates a scrape configuration for v2 cluster metrics:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS
|
||||
|
||||
- Replace ``ALIAS`` with the :mc-cmd:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
The output resembles the following:
|
||||
|
||||
.. code-block:: shell
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: [auth token]
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: http
|
||||
static_configs:
|
||||
- targets: ['localhost:9000']
|
||||
|
||||
To generate a configuration for another metric type, specify the type.
|
||||
The following generates a scrape configuration for v2 bucket metrics:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS bucket
|
||||
|