mirror of https://github.com/minio/docs.git synced 2025-08-09 13:02:53 +03:00

draft: more metrics v2/v3 rework

Andrea Longo
2024-09-13 17:02:29 -06:00
parent 2285c68f1e
commit 3fd1af000a
4 changed files with 525 additions and 298 deletions


@@ -1,25 +1,25 @@
.. _minio-metrics-collect-using-prometheus:
========================================
Monitoring and Alerting using Prometheus
Monitoring and alerting using Prometheus
========================================
.. default-domain:: minio
.. contents:: Table of Contents
:local:
:depth: 1
:depth: 2
.. container:: extlinks-video
- `Monitoring with MinIO and Prometheus: Overview <https://youtu.be/A3vCDaFWNNs?ref=docs>`__
- `Monitoring with MinIO and Prometheus: Lab <https://youtu.be/Oix9iXndSUY?ref=docs>`__
MinIO publishes cluster, node, bucket, and resource metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/#data-model>`.
MinIO publishes metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/#data-model>`.
The procedure on this page documents the following:
- Configuring a Prometheus service to scrape and display metrics from a MinIO deployment
- Configuring an Alert Rule on a MinIO Metric to trigger an AlertManager action
- Configure a Prometheus service to scrape and display metrics from a MinIO deployment
- Configure an Alert Rule on a MinIO Metric to trigger an AlertManager action
.. admonition:: Prerequisites
:class: note
@@ -32,129 +32,82 @@ The procedure on this page documents the following:
- An :mc:`mc` installation on your local host configured to :ref:`access <alias>` the MinIO deployment
Configure Prometheus to Collect and Alert using MinIO Metrics
-------------------------------------------------------------
.. admonition:: Metrics Version 2 Deprecated
:class: note
1) Generate the Scrape Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, metrics version 3 replaces the deprecated :ref:`metrics version 2 <minio-metrics-v2>`.
Use the :mc:`mc admin prometheus generate` command to generate the scrape configuration for use by Prometheus in making scraping requests:
MinIO recommends new monitoring configurations use :ref:`version 3 metrics <minio-metrics-and-alerts>`.
.. tab-set::
Collect and alert on metrics
----------------------------
.. tab-item:: MinIO Server
Generate the scrape configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following command scrapes metrics for the MinIO cluster.
Use :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` to generate a scrape configuration for each :ref:`type of metric <minio-metrics-and-alerts-available-metrics>` you want to scrape with Prometheus.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS
For example, the following command generates a scrape configuration for all version 3 audit metrics of the MinIO cluster:
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
The command returns output similar to the following:
.. code-block:: shell
:class: copyable
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/cluster
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Nodes
mc admin prometheus generate ALIAS audit --api-version v3
The following command scrapes metrics for a node on the MinIO Server.
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS node
The command returns output similar to the following:
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: yaml
:class: copyable
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-node
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/node
scheme: https
static_configs:
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
.. tab-item:: Buckets
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/metrics/v3
scheme: https
static_configs:
- targets: [minio.example.net]
The following command scrapes metrics for buckets on the MinIO Server.
To scrape multiple types of metrics, run :mc-cmd:`mc admin prometheus generate --api-version v3 <mc admin prometheus generate --api-version>` once for each type and add each generated ``job_name`` entry to the ``scrape_configs`` section of your Prometheus configuration.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS bucket
The following example scrapes audit and system metrics every 60 seconds:
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: yaml
:class: copyable
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-bucket
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/bucket
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Resources
global:
scrape_interval: 60s
.. versionadded:: RELEASE.2023-10-07T15-07-38Z
scrape_configs:
- job_name: minio-job-audit
bearer_token: TOKEN
metrics_path: /minio/metrics/v3/audit
scheme: https
static_configs:
- targets: [minio.example.net]
The following command scrapes metrics for resources on the MinIO Server.
- job_name: minio-job-system
bearer_token: TOKEN
metrics_path: /minio/metrics/v3/system
scheme: https
static_configs:
- targets: [minio.example.net]
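The per-type workflow described above can be scripted. The following is a minimal sketch, assuming ``mc`` is installed and that an alias such as ``myminio`` already exists. The helper name ``generate_v3_jobs`` is illustrative and not part of ``mc``. It only prints each generated configuration. Merging the ``job_name`` entries into a single ``scrape_configs`` list remains a manual step.

```shell
# Sketch: print a v3 scrape configuration for each requested metric type.
# generate_v3_jobs is a hypothetical helper name, not an mc subcommand.
generate_v3_jobs() {
  alias_name="$1"
  shift
  for metric_type in "$@"; do
    echo "# --- ${metric_type} ---"
    mc admin prometheus generate "$alias_name" "$metric_type" --api-version v3
  done
}

# Example call (assumes the alias "myminio" is configured):
# generate_v3_jobs myminio audit system
```

Each section of the output starts with a comment line naming the metric type, which makes it easier to copy the matching ``job_name`` entry into the Prometheus configuration.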
.. code-block:: shell
:class: copyable
If needed, edit the generated configuration for your environment.
Common changes include:
mc admin prometheus generate ALIAS resource
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-resource
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/resource
scheme: https
static_configs:
- targets: [minio.example.net]
- Set an appropriate ``scrape_interval`` value to ensure each scraping operation completes before the next one begins.
The recommended value is 60 seconds.
Some deployments require a longer scrape interval due to the number of metrics being scraped.
To reduce the load on your MinIO and Prometheus servers, choose the longest interval that meets your monitoring requirements.
You can specify a ``scrape_interval`` for each job in its ``job_name`` section, or for all jobs in a separate ``global`` section.
- Set the ``job_name`` to a value associated with the MinIO deployment.
Use a unique value to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
Use a unique value for each job to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
- MinIO deployments started with :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"`` can omit the ``bearer_token`` field.
@@ -170,177 +123,45 @@ Use the :mc:`mc admin prometheus generate` command to generate the scrape config
For Prometheus deployments external to the cluster, you must specify an ingress or load balancer endpoint configured to route connections to and from the MinIO Tenant.
2) Restart Prometheus with the Updated Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Restart Prometheus with the updated configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Append the desired ``scrape_configs`` job generated in the previous step to the configuration file:
.. tab-set::
.. tab-item:: Cluster
Cluster metrics aggregate node-level metrics and, where appropriate, attach labels to metrics for the originating node.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/cluster
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Nodes
Node metrics are specific to node-level monitoring. List all MinIO nodes in this configuration.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-node
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/node
scheme: https
static_configs:
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
.. tab-item:: Bucket
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-bucket
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/bucket
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Resource
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-resource
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/resource
scheme: https
static_configs:
- targets: [minio.example.net]
Start the Prometheus cluster using the configuration file:
Add the desired ``scrape_configs`` jobs to your Prometheus configuration file and start the Prometheus cluster:
.. code-block:: shell
:class: copyable
prometheus --config.file=prometheus.yaml
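Before starting Prometheus, it can help to validate the edited file. The following is a minimal sketch, assuming the configuration is saved as ``prometheus.yaml`` and that ``promtool`` (distributed with Prometheus) is on the ``PATH``. The helper name ``check_prometheus_config`` is illustrative.

```shell
# Sketch: validate a Prometheus configuration file before (re)starting.
# check_prometheus_config is a hypothetical helper name.
check_prometheus_config() {
  config_file="${1:-prometheus.yaml}"
  if [ ! -f "$config_file" ]; then
    echo "missing config: $config_file" >&2
    return 1
  fi
  if command -v promtool >/dev/null 2>&1; then
    # promtool exits non-zero when the file fails validation.
    promtool check config "$config_file"
  else
    echo "promtool not found; skipping validation" >&2
  fi
}

# Example call:
# check_prometheus_config prometheus.yaml && prometheus --config.file=prometheus.yaml
```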
3) Analyze Collected Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Analyze collected metrics
~~~~~~~~~~~~~~~~~~~~~~~~~
Prometheus includes an :prometheus-docs:`expression browser <prometheus/latest/getting_started/#using-the-expression-browser>`.
You can execute queries here to analyze the collected metrics.
.. tab-set::
The following query examples return the samples collected over the last five minutes for a scrape job named ``minio-job``:
.. tab-item:: Examples
.. code-block:: shell
:class: copyable
The following query examples return the samples collected over the last five minutes for a scrape job named ``minio-job``:
minio_system_drive_used_bytes{job="minio-job"}[5m]
minio_system_drive_used_inodes{job="minio-job"}[5m]
.. code-block:: shell
:class: copyable
minio_cluster_usage_buckets_total_bytes{job="minio-job"}[5m]
minio_cluster_usage_buckets_objects_count{job="minio-job"}[5m]
minio_node_drive_free_bytes{job="minio-job"}[5m]
minio_node_drive_free_inodes{job="minio-job"}[5m]
minio_api_requests_total{job="minio-job"}[5m]
minio_api_requests_errors_total{job="minio-job"}[5m]
minio_node_drive_latency_us{job="minio-job"}[5m]
minio_node_drive_offline_total{job="minio-job"}[5m]
minio_node_drive_online_total{job="minio-job"}[5m]
Configure an alert rule
~~~~~~~~~~~~~~~~~~~~~~~
minio_node_drive_total{job="minio-job"}[5m]
To trigger alerts based on metrics, configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment.
minio_node_drive_total_bytes{job="minio-job"}[5m]
minio_node_drive_used_bytes{job="minio-job"}[5m]
minio_node_drive_errors_timeout{job="minio-job"}[5m]
minio_node_drive_errors_availability{job="minio-job"}[5m]
minio_node_drive_io_waiting{job="minio-job"}[5m]
.. tab-item:: Recommended Metrics
MinIO recommends the following as a basic set of metrics to monitor.
See :ref:`minio-metrics-and-alerts` for information about all available metrics.
.. list-table::
:header-rows: 1
:widths: 40 60
:width: 100%
* - Metric
- Description
* - ``minio_node_drive_free_bytes``
- Total storage available on a drive.
* - ``minio_node_drive_free_inodes``
- Total free inodes.
* - ``minio_node_drive_latency_us``
- Average last minute latency in µs for drive API storage operations.
* - ``minio_node_drive_offline_total``
- Total drives offline in this node.
* - ``minio_node_drive_online_total``
- Total drives online in this node.
* - ``minio_node_drive_total``
- Total drives in this node.
* - ``minio_node_drive_total_bytes``
- Total storage on a drive.
* - ``minio_node_drive_used_bytes``
- Total storage used on a drive.
* - ``minio_node_drive_errors_timeout``
- Total number of drive timeout errors since server start.
* - ``minio_node_drive_errors_availability``
- Total number of drive I/O errors, permission denied and timeouts since server start.
* - ``minio_node_drive_io_waiting``
- Total number of I/O operations waiting on drive.
4) Configure an Alert Rule using MinIO Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You must configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment to trigger alerts based on collected MinIO metrics.
The following example alert rule files provide a baseline of alerts for a MinIO deployment.
You can modify or otherwise use these examples as guidance in building your own alerts.
The following example alert rules provide a baseline of alerts for a MinIO deployment.
You can modify these examples or use them as guidance when building your own alerts.
.. code-block:: yaml
:class: copyable
@@ -349,7 +170,7 @@ You can modify or otherwise use these examples as guidance in building your own
- name: minio-alerts
rules:
- alert: NodesOffline
expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
expr: avg_over_time(minio_cluster_health_nodes_offline_count{job="minio-job"}[5m]) > 0
for: 10m
labels:
severity: warn
@@ -358,7 +179,7 @@ You can modify or otherwise use these examples as guidance in building your own
description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"
- alert: DisksOffline
expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
expr: avg_over_time(minio_system_drive_offline_count{job="minio-job"}[5m]) > 0
for: 10m
labels:
severity: warn
@@ -378,5 +199,5 @@ Once triggered, Prometheus sends the alert to the configured AlertManager servic
Dashboards
----------
MinIO provides Grafana Dashboards to display metrics collected by Prometheus.
For v2 metrics, MinIO provides Grafana Dashboards to display metrics collected by Prometheus.
For more information, see :ref:`minio-grafana`.


@@ -17,10 +17,10 @@ Metrics and Alerts
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, metrics version 3 replaces the deprecated :ref:`metrics version 2 <minio-metrics-v2>`.
MinIO publishes cluster and node metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/>`.
MinIO publishes metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/>`.
You can use any scraping tool to pull metrics data from MinIO for further analysis and alerting.
For metrics version 3, all metrics are available under the base ``/minio/metrics/v3`` endpoint by appending an additional path for each category.
For metrics version 3, all metrics are available under the base ``/minio/metrics/v3`` endpoint. To return only one category of metrics, append the path for that category.
For example, the following endpoint returns audit metrics:
@@ -36,66 +36,80 @@ By default, MinIO requires authentication to scrape the metrics endpoints.
To generate the needed bearer tokens, use :mc:`mc admin prometheus generate`.
You can also disable metrics endpoint authentication by setting :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` to ``public``.
MinIO provides the following scraping endpoints, relative to the base URL:
You can also access metrics using :mc-cmd:`mc admin prometheus metrics` and the metric type for the desired category.
For more information, see the :mc-cmd:`MinIO Admin Client reference <mc admin prometheus metrics>`.
MinIO provides the following types and scraping endpoints, relative to the base URL:
.. list-table::
:header-rows: 1
:widths: 30 70
:widths: 25 25 50
:width: 100%
* - Category
- Metric type
- Path
* - API
- ``api``
- ``/api/requests``
``/bucket/api``
* - Audit
- ``audit``
- ``/audit``
* - Cluster
- ``cluster``
- ``/cluster/config``
``/cluster/erasure-set``
``/cluster/health``
``/cluster/iam``
``/cluster/usage/buckets``
``/cluster/usage/objects``
* - Debug
- ``debug``
- ``/debug/go``
* - ILM
- ``ilm``
- ``/ilm``
* - Logger webhook
- ``logger``
- ``/logger/webhook``
* - Notification
- ``notification``
- ``/notification``
* - Replication
- ``replication``
- ``/replication``
``/bucket/replication``
* - Scanner
- ``scanner``
- ``/scanner``
* - System
- ``system``
- ``/system/drive``
``/system/memory``
``/system/cpu``
``/system/network/internode``
``/system/process``
For a complete list of metrics for each endpoint, see :ref:`Available Metrics <minio-metrics-and-alerts-available-metrics>`.
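To make the path layout concrete, the following sketch composes full scrape URLs from the base endpoint and a category path taken from the table above. The host ``minio.example.net`` and the helper name ``v3_metrics_url`` are placeholders. The commented ``curl`` call assumes authentication is enabled and that ``TOKEN`` holds a bearer token produced by :mc:`mc admin prometheus generate`.

```shell
# Sketch: build a metrics v3 URL from a category path such as /system/drive.
# v3_metrics_url is a hypothetical helper name.
v3_metrics_url() {
  base_url="$1"      # for example https://minio.example.net
  category_path="$2" # for example /audit or /cluster/health
  printf '%s/minio/metrics/v3%s\n' "$base_url" "$category_path"
}

v3_metrics_url https://minio.example.net /audit
v3_metrics_url https://minio.example.net /system/drive

# A manual scrape with a bearer token might look like this (assumed host):
# curl -H "Authorization: Bearer $TOKEN" "$(v3_metrics_url https://minio.example.net /audit)"
```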
@@ -103,23 +117,17 @@ For a complete list of metrics for each endpoint, see :ref:`Available Metrics <m
.. cond:: k8s
The MinIO Operator supports deploying a per-tenant Prometheus instance configured to support metrics and visualization.
If you deploy the Tenant with this feature disabled *but* still want the historical metric views, you can instead configure an external Prometheus service to scrape the Tenant metrics.
Once configured, you can update the Tenant to query that Prometheus service to retrieve metric data:
.. cond:: linux or container or macos or windows
To enable historical data visualization in MinIO Console, set the following environment variables on each node in the MinIO deployment:
- Set :envvar:`MINIO_PROMETHEUS_URL` to the URL of the Prometheus service
- Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics
MinIO Grafana Dashboard
-----------------------
MinIO also publishes two :ref:`Grafana Dashboards <minio-grafana>` for visualizing collected metrics.
For more complete documentation on configuring a Prometheus-compatible data source for Grafana, see the :prometheus-docs:`Prometheus documentation on Grafana Support <visualization/grafana/>`.
.. _minio-metrics-and-alerts-available-metrics:
Available Metrics


@@ -9,14 +9,17 @@ Metrics Version 2
.. contents:: Table of Contents
:local:
:depth: 3
:depth: 1
.. admonition:: Metrics Version 2 Deprecated
:class: note
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, :ref:`metrics version 3 <minio-metrics-and-alerts>` replaces the deprecated metrics version 2.
The following sections describe the deprecated endpoints and metrics.
The following sections describe the deprecated version 2 endpoints and metrics.
Metrics version 2 endpoints
---------------------------
.. tab-set::
@@ -67,6 +70,304 @@ The following sections describe the deprecated endpoints and metrics.
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.
Configure Prometheus to Collect and Alert using MinIO Metrics
-------------------------------------------------------------
1) Generate the Scrape Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the :mc:`mc admin prometheus generate` command to generate the scrape configuration for use by Prometheus in making scraping requests:
.. tab-set::
.. tab-item:: MinIO Server
The following command generates a scrape configuration for MinIO cluster metrics.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
The command returns output similar to the following:
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/cluster
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Nodes
The following command generates a scrape configuration for node metrics on the MinIO Server.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS node
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-node
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/node
scheme: https
static_configs:
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
.. tab-item:: Buckets
The following command generates a scrape configuration for bucket metrics on the MinIO Server.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS bucket
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-bucket
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/bucket
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Resources
.. versionadded:: RELEASE.2023-10-07T15-07-38Z
The following command generates a scrape configuration for resource metrics on the MinIO Server.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS resource
Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-resource
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/resource
scheme: https
static_configs:
- targets: [minio.example.net]
- Set an appropriate ``scrape_interval`` value to ensure each scraping operation completes before the next one begins.
The recommended value is 60 seconds.
Some deployments require a longer scrape interval due to the number of metrics being scraped.
To reduce the load on your MinIO and Prometheus servers, choose the longest interval that meets your monitoring requirements.
- Set the ``job_name`` to a value associated with the MinIO deployment.
Use a unique value to ensure isolation of the deployment metrics from any others collected by that Prometheus service.
- MinIO deployments started with :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"`` can omit the ``bearer_token`` field.
- Set the ``scheme`` to http for MinIO deployments not using TLS.
- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.
This can be any single node, or a load balancer/proxy which handles connections to the MinIO nodes.
.. cond:: k8s
For Prometheus deployments in the same cluster as the MinIO Tenant, you can specify the service DNS name for the ``minio`` service.
For Prometheus deployments external to the cluster, you must specify an ingress or load balancer endpoint configured to route connections to and from the MinIO Tenant.
2) Restart Prometheus with the Updated Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Append the desired ``scrape_configs`` job generated in the previous step to the configuration file:
.. tab-set::
.. tab-item:: Cluster
Cluster metrics aggregate node-level metrics and, where appropriate, attach labels to metrics for the originating node.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/cluster
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Nodes
Node metrics are specific to node-level monitoring. List all MinIO nodes in this configuration.
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-node
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/node
scheme: https
static_configs:
- targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]
.. tab-item:: Bucket
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-bucket
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/bucket
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Resource
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 60s
scrape_configs:
- job_name: minio-job-resource
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/resource
scheme: https
static_configs:
- targets: [minio.example.net]
Start the Prometheus cluster using the configuration file:
.. code-block:: shell
:class: copyable
prometheus --config.file=prometheus.yaml
3) Analyze Collected Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prometheus includes an :prometheus-docs:`expression browser <prometheus/latest/getting_started/#using-the-expression-browser>`.
You can execute queries here to analyze the collected metrics.
The following query examples return the samples collected over the last five minutes for a scrape job named ``minio-job``:
.. code-block:: shell
:class: copyable
minio_node_drive_free_bytes{job="minio-job"}[5m]
minio_node_drive_free_inodes{job="minio-job"}[5m]
minio_node_drive_latency_us{job="minio-job"}[5m]
minio_node_drive_offline_total{job="minio-job"}[5m]
minio_node_drive_online_total{job="minio-job"}[5m]
minio_node_drive_total{job="minio-job"}[5m]
minio_node_drive_total_bytes{job="minio-job"}[5m]
minio_node_drive_used_bytes{job="minio-job"}[5m]
minio_node_drive_errors_timeout{job="minio-job"}[5m]
minio_node_drive_errors_availability{job="minio-job"}[5m]
minio_node_drive_io_waiting{job="minio-job"}[5m]
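Selectors of this form can also be sent to the Prometheus HTTP API instead of the expression browser. This is a sketch only: it assumes Prometheus listens on ``http://localhost:9090``, and ``promql_selector`` is an illustrative helper name.

```shell
# Sketch: build a range selector for one metric and one scrape job.
# promql_selector is a hypothetical helper name.
promql_selector() {
  metric_name="$1"
  job_name="$2"
  range="$3" # for example 5m
  printf '%s{job="%s"}[%s]\n' "$metric_name" "$job_name" "$range"
}

promql_selector minio_node_drive_free_bytes minio-job 5m

# Example instant query against the HTTP API (assumed address):
# curl -G http://localhost:9090/api/v1/query \
#   --data-urlencode "query=$(promql_selector minio_node_drive_free_bytes minio-job 5m)"
```

Note that the label matcher uses ``=`` between the label name and the quoted value.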
4) Configure an Alert Rule using MinIO Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You must configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment to trigger alerts based on collected MinIO metrics.
The following example alert rule files provide a baseline of alerts for a MinIO deployment.
You can modify or otherwise use these examples as guidance in building your own alerts.
.. code-block:: yaml
:class: copyable
groups:
- name: minio-alerts
rules:
- alert: NodesOffline
expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
for: 10m
labels:
severity: warn
annotations:
summary: "Node down in MinIO deployment"
description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"
- alert: DisksOffline
expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
for: 10m
labels:
severity: warn
annotations:
summary: "Disks down in MinIO deployment"
description: "Disk(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"
In the Prometheus configuration, specify the path to the alert file in the ``rule_files`` key:
.. code-block:: yaml
rule_files:
- minio-alerting.yml
Once triggered, Prometheus sends the alert to the configured AlertManager service.
Available metrics
-----------------
- :ref:`Cluster Metrics <minio-available-cluster-metrics>`
- :ref:`Bucket Metrics <minio-available-bucket-metrics>`
- :ref:`Resource Metrics <minio-available-resource-metrics>`


@@ -10,6 +10,8 @@
.. mc:: mc admin prometheus generate
Starting with MinIO Server :minio-release:`RELEASE.2024-07-15T19-02-30Z` and MinIO Client :mc-release:`RELEASE.2024-07-11T18-01-28Z`, metrics version 3 replaces the deprecated :ref:`metrics version 2 <minio-metrics-v2>`.
Description
-----------
@@ -19,7 +21,7 @@ The :mc:`mc admin prometheus generate` command generates a metrics scraping conf
.. end-mc-admin-prometheus-generate-desc
For more complete documentation on using MinIO with Prometheus, see :ref:`How to monitor MinIO server with Prometheus <minio-metrics-collect-using-prometheus>`
For more complete documentation on using MinIO with Prometheus, see :ref:`How to monitor MinIO server with Prometheus <minio-metrics-collect-using-prometheus>` and :ref:`minio-metrics-and-alerts`.
.. admonition:: Use ``mc admin`` on MinIO Deployments Only
:class: note
@@ -32,12 +34,12 @@ For more complete documentation on using MinIO with Prometheus, see :ref:`How to
.. tab-item:: EXAMPLE
The following command generates a Prometheus scrape configuration that collects bucket metrics from the deployment at :term:`alias` ``myminio``:
The following command generates a Prometheus scrape configuration that collects audit metrics from the deployment at :term:`alias` ``myminio``:
.. code-block:: shell
:class: copyable
mc admin prometheus generate myminio bucket
mc admin prometheus generate myminio audit --api-version v3
.. tab-item:: SYNTAX
@@ -46,9 +48,11 @@ For more complete documentation on using MinIO with Prometheus, see :ref:`How to
.. code-block:: shell
:class: copyable
mc [GLOBALFLAGS] admin prometheus generate \
ALIAS \
[TYPE]
mc [GLOBALFLAGS] admin prometheus generate \
ALIAS \
[TYPE] \
[--api-version v3] \
[TYPE --bucket <bucket name> --api-version v3]
.. include:: /includes/common-minio-mc.rst
:start-after: start-minio-syntax
@@ -63,23 +67,58 @@ Parameters
The :mc:`alias <mc alias>` of a configured MinIO deployment for which the command generates a Prometheus-compatible configuration file.
.. mc-cmd:: --api-version v3
:optional:
Generate a scrape configuration for metrics version 3.
Omit to generate a metrics version 2 configuration.
.. mc-cmd:: --bucket
:optional:
For v3 metric types that return bucket-level metrics, specify a bucket name.
Use with :mc-cmd:`~mc admin prometheus generate --api-version`.
``--bucket`` works for the following v3 metric types:
- ``api``
- ``replication``
The following example generates a configuration for API metrics from the bucket ``mybucket``:
.. code-block:: shell
:class: copyable
mc admin prometheus generate play api --bucket mybucket --api-version v3
.. mc-cmd:: TYPE
:optional:
The type of metrics to scrape.
.. versionchanged:: RELEASE.2023-10-07T15-07-38Z
Valid values for metrics version 3 are:
``resource`` metrics added
- ``api``
- ``audit``
- ``cluster``
- ``debug``
- ``ilm``
- ``logger``
- ``notification``
- ``replication``
- ``scanner``
- ``system``
Valid values are:
If not specified, a ``v3`` command returns all metrics.
Valid values for metrics version 2 are:
- ``bucket``
- ``cluster``
- ``node``
- ``resource``
If not specified, the command returns cluster metrics.
If not specified, a ``v2`` command returns cluster metrics.
Cluster metrics also include node metrics.
Global Flags
@@ -90,18 +129,18 @@ Global Flags
:end-before: end-minio-mc-globals
Example
-------
Examples
--------
Generate a scrape config for bucket metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Generate a default metrics v3 config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use :mc-cmd:`mc admin prometheus generate` to generate a scrape configuration that collects bucket metrics for a MinIO deployment:
Use :mc-cmd:`mc admin prometheus generate --api-version v3` to generate a scrape configuration that collects all v3 metrics for a MinIO deployment:
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS bucket
mc admin prometheus generate ALIAS --api-version v3
- Replace ``ALIAS`` with the :mc-cmd:`alias <mc alias>` of the MinIO deployment.
@@ -110,9 +149,67 @@ The output resembles the following:
.. code-block:: shell
scrape_configs:
- job_name: minio-job-bucket
- job_name: minio-job
bearer_token: [auth token]
metrics_path: /minio/v2/metrics/bucket
metrics_path: /minio/metrics/v3
scheme: http
static_configs:
- targets: ['localhost:9000']
Generate a v3 bucket replication metrics config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following example generates a scrape configuration for v3 replication metrics of bucket ``mybucket``:
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS replication --bucket mybucket --api-version v3
- Replace ``ALIAS`` with the :mc-cmd:`alias <mc alias>` of the MinIO deployment.
The output resembles the following:
.. code-block:: shell
scrape_configs:
- job_name: minio-job-replication
bearer_token: [auth token]
metrics_path: /minio/metrics/v3/bucket/replication/mybucket
scheme: https
static_configs:
- targets: ['localhost:9000']
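The ``metrics_path`` in this output follows a bucket-level pattern. As a rough sketch, the helper below (a hypothetical name, not part of ``mc``) rebuilds that path from a metric type and a bucket name. Only the ``replication`` form is confirmed by the output above. The ``api`` form is an assumption made by analogy with the ``/bucket/api`` endpoint.

```shell
# Sketch: compose a bucket-level metrics v3 path.
# v3_bucket_metrics_path is a hypothetical helper name.
v3_bucket_metrics_path() {
  metric_type="$1" # replication (shown in the output above) or api (assumed)
  bucket_name="$2"
  printf '/minio/metrics/v3/bucket/%s/%s\n' "$metric_type" "$bucket_name"
}

v3_bucket_metrics_path replication mybucket
# prints /minio/metrics/v3/bucket/replication/mybucket
```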
Generate a default metrics v2 config
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
By default, :mc-cmd:`mc admin prometheus generate` generates a scrape configuration for v2 cluster metrics:
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS
- Replace ``ALIAS`` with the :mc-cmd:`alias <mc alias>` of the MinIO deployment.
The output resembles the following:
.. code-block:: shell
scrape_configs:
- job_name: minio-job
bearer_token: [auth token]
metrics_path: /minio/v2/metrics/cluster
scheme: http
static_configs:
- targets: ['localhost:9000']
To generate a configuration for another metric type, specify the type.
The following generates a scrape configuration for v2 bucket metrics:
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS bucket