.. Source: docs/source/operations/monitoring/metrics-v2.rst, mirror of https://github.com/minio/docs.git (synced 2025-08-06 14:42:56 +03:00).
   Last commit: a5bfd27b80 "partial edits" by Andrea Longo, 2024-09-27 14:50:06 -06:00.


.. _minio-metrics-v2:

=================
Metrics Version 2
=================

.. default-domain:: minio

.. contents:: Table of Contents
   :local:
   :depth: 1

Metrics version 2
-----------------

MinIO exposes all version 2 metrics under the base ``/minio/v2/metrics`` endpoint.
Append a category path to the base endpoint to retrieve the metrics for that category.
For example, the following endpoint returns bucket metrics:

.. code-block:: shell
   :class: copyable

   http://HOSTNAME:PORT/minio/v2/metrics/bucket

Replace ``HOSTNAME:PORT`` with the :abbr:`FQDN (Fully Qualified Domain Name)` and port of the MinIO deployment.
For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.

.. tab-set::

   .. tab-item:: Cluster Metrics

      You can scrape :ref:`cluster-level metrics <minio-available-cluster-metrics>` using the following URL endpoint:

      .. code-block:: shell
         :class: copyable

         http://HOSTNAME:PORT/minio/v2/metrics/cluster

      Replace ``HOSTNAME:PORT`` with the :abbr:`FQDN (Fully Qualified Domain Name)` and port of the MinIO deployment.
      For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.

   .. tab-item:: Bucket Metrics

      .. versionchanged:: MinIO RELEASE.2023-07-21T21-12-44Z

         Bucket metrics have moved to use their own, separate endpoint.

      .. versionchanged:: RELEASE.2023-08-31T15-31-16Z

      You can scrape :ref:`bucket-level metrics <minio-available-bucket-metrics>` using the following URL endpoint:

      .. code-block:: shell
         :class: copyable

         http://HOSTNAME:PORT/minio/v2/metrics/bucket

      Replace ``HOSTNAME:PORT`` with the :abbr:`FQDN (Fully Qualified Domain Name)` and port of the MinIO deployment.
      For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.

   .. tab-item:: Resource Metrics

      .. versionadded:: RELEASE.2023-10-07T15-07-38Z

      You can scrape :ref:`resource metrics <minio-available-resource-metrics>` using the following URL endpoint:

      .. code-block:: shell
         :class: copyable

         http://HOSTNAME:PORT/minio/v2/metrics/resource

      Replace ``HOSTNAME:PORT`` with the :abbr:`FQDN (Fully Qualified Domain Name)` and port of the MinIO deployment.
      For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.

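
To spot-check an endpoint outside Prometheus, you can request it directly.
The sketch below is illustrative rather than an official MinIO tool: ``metrics_url`` is a hypothetical helper, ``minio.example.net:9000`` is a placeholder address, and the commented ``curl`` call assumes the deployment accepts the same bearer token that Prometheus would send.

.. code-block:: shell
   :class: copyable

   # Build a v2 metrics URL from a scheme, HOSTNAME:PORT, and an optional category.
   # metrics_url is a hypothetical helper, not part of MinIO or mc.
   metrics_url() {
     scheme="$1"; hostport="$2"; category="${3:-}"
     url="${scheme}://${hostport}/minio/v2/metrics"
     if [ -n "$category" ]; then
       url="${url}/${category}"
     fi
     printf '%s\n' "$url"
   }

   # Print the endpoint for each category described above.
   for category in cluster bucket resource; do
     metrics_url http minio.example.net:9000 "$category"
   done

   # To fetch the metrics, pass the URL to curl (TOKEN is a placeholder):
   #   curl -H "Authorization: Bearer TOKEN" \
   #     "$(metrics_url https minio.example.net:9000 cluster)"

Keeping the URL construction in one place makes it easy to switch between a single node and a load balancer address.
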
Configure Prometheus to collect and alert using MinIO Metrics
-------------------------------------------------------------
1) Generate a v2 scrape configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use the :mc:`mc admin prometheus generate` command to generate the scrape configuration that Prometheus uses to scrape MinIO metrics:

.. tab-set::

   .. tab-item:: MinIO Server

      The following command generates a scrape configuration for cluster metrics.

      .. code-block:: shell
         :class: copyable

         mc admin prometheus generate ALIAS

      Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.

      The command returns output similar to the following:

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/cluster
             scheme: https
             static_configs:
               - targets: [minio.example.net]

   .. tab-item:: Nodes

      The following command generates a scrape configuration for node-level metrics.

      .. code-block:: shell
         :class: copyable

         mc admin prometheus generate ALIAS node

      Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.

      The command returns output similar to the following:

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job-node
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/node
             scheme: https
             static_configs:
               - targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]

   .. tab-item:: Buckets

      The following command generates a scrape configuration for bucket metrics.

      .. code-block:: shell
         :class: copyable

         mc admin prometheus generate ALIAS bucket

      Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.

      The command returns output similar to the following:

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job-bucket
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/bucket
             scheme: https
             static_configs:
               - targets: [minio.example.net]

   .. tab-item:: Resources

      .. versionadded:: RELEASE.2023-10-07T15-07-38Z

      The following command generates a scrape configuration for resource metrics.

      .. code-block:: shell
         :class: copyable

         mc admin prometheus generate ALIAS resource

      Replace :mc-cmd:`ALIAS <mc admin prometheus generate ALIAS>` with the :mc:`alias <mc alias>` of the MinIO deployment.

      The command returns output similar to the following:

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job-resource
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/resource
             scheme: https
             static_configs:
               - targets: [minio.example.net]

- Set an appropriate ``scrape_interval`` value to ensure each scraping operation completes before the next one begins.
  The recommended value is 60 seconds.

  Some deployments require a longer scrape interval due to the number of metrics being scraped.
  To reduce the load on your MinIO and Prometheus servers, choose the longest interval that meets your monitoring requirements.

- Set the ``job_name`` to a value associated with the MinIO deployment.

  Use a unique value to ensure isolation of the deployment metrics from any others collected by that Prometheus service.

- MinIO deployments started with :envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"`` can omit the ``bearer_token`` field.

- Set the ``scheme`` to ``http`` for MinIO deployments not using TLS.

- Set the ``targets`` array with a hostname that resolves to the MinIO deployment.

  This can be any single node, or a load balancer/proxy which handles connections to the MinIO nodes.

  .. cond:: k8s

     For Prometheus deployments in the same cluster as the MinIO Tenant, you can specify the service DNS name for the ``minio`` service.

     For Prometheus deployments external to the cluster, you must specify an ingress or load balancer endpoint configured to route connections to and from the MinIO Tenant.

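
The guidance above maps onto a scrape job as in the following sketch.
``render_scrape_job`` is a hypothetical shell helper, not part of ``mc`` or Prometheus; it prints one ``scrape_configs`` entry and leaves out ``bearer_token`` when no token is given, as a deployment with public metrics would allow.

.. code-block:: shell
   :class: copyable

   # render_scrape_job is a hypothetical helper that prints one scrape_configs entry.
   # Arguments: job name, scheme (http or https), metrics path, target, optional token.
   render_scrape_job() {
     job_name="$1"; scheme="$2"; metrics_path="$3"; target="$4"; token="${5:-}"
     printf '  - job_name: %s\n' "$job_name"
     # Deployments with MINIO_PROMETHEUS_AUTH_TYPE="public" can omit the token.
     if [ -n "$token" ]; then
       printf '    bearer_token: %s\n' "$token"
     fi
     printf '    metrics_path: %s\n' "$metrics_path"
     printf '    scheme: %s\n' "$scheme"
     printf '    static_configs:\n'
     printf '      - targets: [%s]\n' "$target"
   }

   # Example: a non-TLS deployment with public metrics, so no bearer_token line.
   printf 'global:\n  scrape_interval: 60s\n\nscrape_configs:\n'
   render_scrape_job minio-job http /minio/v2/metrics/cluster minio.example.net

Passing a fifth argument adds the ``bearer_token`` line for deployments that require authentication.
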
2) Restart Prometheus with the updated configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Append the desired ``scrape_configs`` job generated in the previous step to the configuration file:

.. tab-set::

   .. tab-item:: Cluster

      Cluster metrics aggregate node-level metrics and, where appropriate, attach labels to metrics for the originating node.

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/cluster
             scheme: https
             static_configs:
               - targets: [minio.example.net]

   .. tab-item:: Nodes

      Node metrics are specific to node-level monitoring.
      List all MinIO nodes in the ``targets`` array for this configuration.

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job-node
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/node
             scheme: https
             static_configs:
               - targets: [minio-1.example.net, minio-2.example.net, minio-N.example.net]

   .. tab-item:: Bucket

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job-bucket
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/bucket
             scheme: https
             static_configs:
               - targets: [minio.example.net]

   .. tab-item:: Resource

      .. code-block:: yaml
         :class: copyable

         global:
           scrape_interval: 60s

         scrape_configs:
           - job_name: minio-job-resource
             bearer_token: TOKEN
             metrics_path: /minio/v2/metrics/resource
             scheme: https
             static_configs:
               - targets: [minio.example.net]

Start the Prometheus service using the configuration file:

.. code-block:: shell
   :class: copyable

   prometheus --config.file=prometheus.yaml

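
Before starting Prometheus, it can help to validate the edited file.
The following sketch assumes the ``promtool`` utility that ships with Prometheus is on the ``PATH``; ``start_prometheus_checked`` is a hypothetical wrapper name.

.. code-block:: shell
   :class: copyable

   # start_prometheus_checked is a hypothetical wrapper: it validates the
   # configuration with promtool and starts Prometheus only if the check passes.
   start_prometheus_checked() {
     config_file="${1:-prometheus.yaml}"
     if ! command -v promtool >/dev/null 2>&1; then
       echo "promtool not found; skipping validation" >&2
     elif ! promtool check config "$config_file"; then
       echo "invalid configuration: $config_file" >&2
       return 1
     fi
     prometheus --config.file="$config_file"
   }

   # Usage (not run here, since it starts a long-running process):
   #   start_prometheus_checked prometheus.yaml
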
3) Analyze collected metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Prometheus includes an :prometheus-docs:`expression browser <prometheus/latest/getting_started/#using-the-expression-browser>`.
You can execute queries here to analyze the collected metrics.
The following example queries each return the samples that Prometheus collected over the last five minutes for a scrape job named ``minio-job``:

.. code-block:: shell
   :class: copyable

   minio_node_drive_free_bytes{job="minio-job"}[5m]
   minio_node_drive_free_inodes{job="minio-job"}[5m]
   minio_node_drive_latency_us{job="minio-job"}[5m]
   minio_node_drive_offline_total{job="minio-job"}[5m]
   minio_node_drive_online_total{job="minio-job"}[5m]
   minio_node_drive_total{job="minio-job"}[5m]
   minio_node_drive_total_bytes{job="minio-job"}[5m]
   minio_node_drive_used_bytes{job="minio-job"}[5m]
   minio_node_drive_errors_timeout{job="minio-job"}[5m]
   minio_node_drive_errors_availability{job="minio-job"}[5m]
   minio_node_drive_io_waiting{job="minio-job"}[5m]

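
The same selectors can also be sent to the Prometheus HTTP API instead of the expression browser.
This sketch assumes Prometheus listens on its default ``localhost:9090`` address; ``drive_query`` is a hypothetical helper that only builds the selector string.

.. code-block:: shell
   :class: copyable

   # drive_query is a hypothetical helper that builds a range-vector selector
   # for one metric, restricted to a single scrape job.
   # Arguments: metric name, job name (default minio-job), window (default 5m).
   drive_query() {
     metric="$1"; job="${2:-minio-job}"; window="${3:-5m}"
     printf '%s{job="%s"}[%s]\n' "$metric" "$job" "$window"
   }

   # Print the selector for one of the drive metrics listed above.
   drive_query minio_node_drive_free_bytes

   # To run the query (not executed here), send it to the instant query
   # endpoint and let curl URL-encode the expression:
   #   curl -sG "http://localhost:9090/api/v1/query" \
   #     --data-urlencode "query=$(drive_query minio_node_drive_free_bytes)"
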
4) Configure an alert rule using MinIO Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You must configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment to trigger alerts based on collected MinIO metrics.
The following example alert rule file provides a baseline of alerts for a MinIO deployment.
You can modify or otherwise use these examples as guidance in building your own alerts.

.. code-block:: yaml
   :class: copyable

   groups:
   - name: minio-alerts
     rules:
     - alert: NodesOffline
       expr: avg_over_time(minio_cluster_nodes_offline_total{job="minio-job"}[5m]) > 0
       for: 10m
       labels:
         severity: warn
       annotations:
         summary: "Node down in MinIO deployment"
         description: "Node(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"

     - alert: DisksOffline
       expr: avg_over_time(minio_cluster_drive_offline_total{job="minio-job"}[5m]) > 0
       for: 10m
       labels:
         severity: warn
       annotations:
         summary: "Disks down in MinIO deployment"
         description: "Disk(s) in cluster {{ $labels.instance }} offline for more than 5 minutes"

In the Prometheus configuration, specify the path to the alert file in the ``rule_files`` key:

.. code-block:: yaml

   rule_files:
     - minio-alerting.yml

Once triggered, Prometheus sends the alert to the configured AlertManager service.
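
A rule file can be checked for syntax errors before Prometheus loads it.
The sketch below assumes ``promtool`` is installed alongside Prometheus and that the rule file is named ``minio-alerting.yml`` as in the example above; ``check_alert_rules`` is a hypothetical wrapper.

.. code-block:: shell
   :class: copyable

   # check_alert_rules is a hypothetical wrapper around "promtool check rules".
   # It reports the result and returns non-zero when validation fails.
   check_alert_rules() {
     rules_file="${1:-minio-alerting.yml}"
     if promtool check rules "$rules_file"; then
       echo "rules OK: $rules_file"
     else
       echo "rules failed validation: $rules_file" >&2
       return 1
     fi
   }

   # Usage (requires promtool on the PATH):
   #   check_alert_rules minio-alerting.yml
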
Dashboards
----------
For v2 metrics, MinIO provides Grafana Dashboards to display the metrics collected by Prometheus.
For more information, see :ref:`minio-grafana`.

.. _minio-metrics-and-alerts-available-v2-metrics:

Available v2 metrics
--------------------

- :ref:`Cluster Metrics <minio-available-cluster-metrics>`
- :ref:`Bucket Metrics <minio-available-bucket-metrics>`
- :ref:`Resource Metrics <minio-available-resource-metrics>`

.. _minio-available-cluster-metrics:

.. include:: /includes/common-metrics-cluster.md
   :parser: myst_parser.sphinx_

.. _minio-available-bucket-metrics:

.. include:: /includes/common-metrics-bucket.md
   :parser: myst_parser.sphinx_

.. _minio-available-resource-metrics:

.. include:: /includes/common-metrics-resource.md
   :parser: myst_parser.sphinx_