mirror of https://github.com/minio/docs.git synced 2025-07-28 19:42:10 +03:00

Grafana and metric updates (#953)

- Adds a new page for Grafana to overview.
- Replaces the list of metrics in the Metrics and Alerts page with an
include to pull the list of metrics maintained in GitHub.
- Removes use of the :metric: role throughout the docs.
- Adds note about the introduction of a new bucket metric endpoint.

Partially addresses #930
Partially addresses #931
Partially addresses #898
Closes #864

Staged:
- http://192.241.195.202:9000/staging/grafana/operations/monitoring/grafana.html
Daryl White
2023-08-17 09:01:46 -05:00
committed by GitHub
parent 1a1c340c3c
commit 20644952de
17 changed files with 634 additions and 524 deletions

1
.gitignore vendored

@ -19,4 +19,5 @@ source/developers/haskell/*.md
source/developers/java/*.md
source/developers/javascript/*.md
source/developers/python/*.md
source/operations/monitoring/*.md
*.inv


@ -864,7 +864,7 @@ services:
.. policy-action:: admin:Prometheus
Allows access to MinIO :ref:`metrics <minio-metrics-and-alerts-endpoints>`.
Allows access to MinIO :ref:`metrics <minio-metrics-and-alerts>`.
Only required if MinIO requires authentication for scraping metrics.
.. policy-action:: admin:ListBatchJobs


@ -37,7 +37,7 @@ Deployment Metrics
MinIO provides a Prometheus-compatible endpoint for supporting time-series querying of metrics.
MinIO deployments :ref:`configured to enable Prometheus scraping <minio-metrics-and-alerts-endpoints>` provide a detailed metrics view through the MinIO Console.
MinIO deployments :ref:`configured to enable Prometheus scraping <minio-metrics-and-alerts>` provide a detailed metrics view through the MinIO Console.
Server Logs
-----------


@ -311,9 +311,3 @@ a notification.
:class: copyable
mc cp ~/data/new-object.txt ALIAS/BUCKET
Webhook Metrics
---------------
MinIO publishes several :ref:`metrics <minio-metrics-and-alerts>` for monitoring webhook endpoints.
See :ref:`minio-metrics-and-alerts-webhook` for a list of available metrics.


@ -125,9 +125,7 @@ As the cluster or workload increases, scanner performance decreases as it yields
Consider regularly checking cluster metrics, capacity, and resource usage to ensure the cluster hardware is scaling alongside cluster and workload growth:
- :ref:`minio-metrics-and-alerts-capacity`
- :ref:`minio-metrics-and-alerts-lifecycle-management`
- :ref:`minio-metrics-and-alerts-scanner`
- :ref:`minio-metrics-and-alerts`
.. toctree::
:hidden:


@ -535,5 +535,11 @@ for display. This is intentional (For now).
These are nested and linked.
Images
------
.. image:: /images/minio-console/minio-console.png
:width: 600px
:alt: MinIO Console Landing Page provides a view of the Object Browser for the authenticated user
:align: center

Binary file not shown. Size: 474 KiB

Binary file not shown. Size: 469 KiB


@ -38,7 +38,10 @@ MinIO Pre-requisites
- Load balancer to handle routing of requests (for example, `NGINX <https://www.nginx.com/>`__)
* - :octicon:`circle`
- :ref:`Prometheus / Grafana <minio-metrics-collect-using-prometheus>` setup for monitoring and metrics
- :ref:`Prometheus <minio-metrics-collect-using-prometheus>` setup for monitoring and metrics
* - :octicon:`circle`
- :ref:`Grafana configured <minio-grafana>` for dashboards
* - :octicon:`circle`
- (optional) :mc:`mc` installed on the local host system


@ -70,4 +70,6 @@ See :ref:`minio-healthcheck-api` for more information.
/operations/monitoring/metrics-and-alerts
/operations/monitoring/minio-logging
/operations/monitoring/healthcheck-probe
/operations/monitoring/healthcheck-probe
/operations/monitoring/grafana


@ -15,7 +15,7 @@ Monitoring and Alerting using Prometheus
- `Monitoring with MinIO and Prometheus: Overview <https://youtu.be/A3vCDaFWNNs?ref=docs>`__
- `Monitoring with MinIO and Prometheus: Lab <https://youtu.be/Oix9iXndSUY?ref=docs>`__
MinIO publishes cluster and node metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/#data-model>`.
MinIO publishes cluster, node, and bucket metrics using the :prometheus-docs:`Prometheus Data Model <concepts/data_model/#data-model>`.
The procedure on this page documents the following:
- Configuring a Prometheus service to scrape and display metrics from a MinIO deployment
@ -40,12 +40,40 @@ Configure Prometheus to Collect and Alert using MinIO Metrics
Use the :mc-cmd:`mc admin prometheus generate` command to generate the scrape configuration for use by Prometheus in making scraping requests:
.. code-block:: shell
:class: copyable
.. tab-set::
mc admin prometheus generate ALIAS
.. tab-item:: MinIO Server
Replace :mc-cmd:`ALIAS <mc admin prometheus generate TARGET>` with the :mc:`alias <mc alias>` of the MinIO deployment.
The following command scrapes metrics for the MinIO cluster.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS
Replace :mc-cmd:`ALIAS <mc admin prometheus generate TARGET>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. tab-item:: Nodes
The following command scrapes metrics for nodes on the MinIO Server.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS node
Replace :mc-cmd:`ALIAS <mc admin prometheus generate TARGET>` with the :mc:`alias <mc alias>` of the MinIO deployment.
.. tab-item:: Buckets
The following command scrapes metrics for buckets on the MinIO Server.
.. code-block:: shell
:class: copyable
mc admin prometheus generate ALIAS bucket
Replace :mc-cmd:`ALIAS <mc admin prometheus generate TARGET>` with the :mc:`alias <mc alias>` of the MinIO deployment.
The command returns output similar to the following:
@ -81,21 +109,44 @@ The command returns output similar to the following:
2) Restart Prometheus with the Updated Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Append the ``scrape_configs`` job generated in the previous step to the configuration file:
Append the desired ``scrape_configs`` job generated in the previous step to the configuration file:
.. code-block:: yaml
:class: copyable
.. tab-set::
.. tab-item:: Cluster metrics
For server metrics:
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 15s
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/cluster
scheme: https
static_configs:
- targets: [minio.example.net]
.. tab-item:: Bucket metrics
.. code-block:: yaml
:class: copyable
global:
scrape_interval: 15s
scrape_configs:
- job_name: minio-job-bucket
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/bucket
scheme: https
static_configs:
- targets: [minio.example.net]
global:
scrape_interval: 15s
scrape_configs:
- job_name: minio-job
bearer_token: TOKEN
metrics_path: /minio/v2/metrics/cluster
scheme: https
static_configs:
- targets: [minio.example.net]
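The scrape jobs above differ only in the job name and the ``metrics_path``. The following is a minimal sketch, not part of ``mc`` or MinIO, that builds one job per metrics type and renders a configuration file in the same shape. The helper names, the ``minio-job-node`` job name, and the ``/minio/v2/metrics/node`` path are assumptions that follow the pattern of the cluster and bucket jobs shown here; prefer the output of ``mc admin prometheus generate`` when it differs.

```python
# Illustrative helpers only: these names are hypothetical, not a MinIO API.
VALID_TYPES = ("cluster", "node", "bucket")


def build_scrape_job(target, token, metrics_type="cluster", scheme="https"):
    """Return one Prometheus scrape job as a dict for the given metrics type."""
    if metrics_type not in VALID_TYPES:
        raise ValueError(f"metrics_type must be one of {VALID_TYPES}")
    # Assumption: the cluster job keeps the plain name, other types get a suffix.
    if metrics_type == "cluster":
        job_name = "minio-job"
    else:
        job_name = f"minio-job-{metrics_type}"
    return {
        "job_name": job_name,
        "bearer_token": token,
        "metrics_path": f"/minio/v2/metrics/{metrics_type}",
        "scheme": scheme,
        "static_configs": [{"targets": [target]}],
    }


def render_yaml(jobs, scrape_interval="15s"):
    """Render the jobs as YAML text without requiring a YAML library."""
    lines = [
        "global:",
        f"  scrape_interval: {scrape_interval}",
        "",
        "scrape_configs:",
    ]
    for job in jobs:
        lines.append(f"  - job_name: {job['job_name']}")
        lines.append(f"    bearer_token: {job['bearer_token']}")
        lines.append(f"    metrics_path: {job['metrics_path']}")
        lines.append(f"    scheme: {job['scheme']}")
        lines.append("    static_configs:")
        for static_config in job["static_configs"]:
            targets = ", ".join(static_config["targets"])
            lines.append(f"      - targets: [{targets}]")
    return "\n".join(lines)


if __name__ == "__main__":
    jobs = [build_scrape_job("minio.example.net", "TOKEN", t) for t in VALID_TYPES]
    print(render_yaml(jobs))
```

Replace ``TOKEN`` and ``minio.example.net`` with the values for your deployment, as in the configurations above.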
Start the Prometheus cluster using the configuration file:
@ -122,9 +173,9 @@ The following query examples return metrics collected by Prometheus:
minio_cluster_capacity_usable_free_bytes{job="minio-job"}[5m]
See :ref:`minio-metrics-and-alerts-available-metrics` for a complete list of published metrics.
See :ref:`minio-metrics-and-alerts` for information about metrics.
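The same query can also be issued outside the Prometheus UI. The sketch below assumes a Prometheus server reachable at ``http://localhost:9090`` (a placeholder address) and uses the Prometheus HTTP API instant-query endpoint, ``/api/v1/query``. The function names are hypothetical, and the range selector is omitted so the query returns the latest sample only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url, expr):
    """Build the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


def run_instant_query(base_url, expr, timeout=10):
    """Run the query and return the list of result series."""
    with urlopen(instant_query_url(base_url, expr), timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"Prometheus query failed: {payload}")
    return payload["data"]["result"]


if __name__ == "__main__":
    # Placeholder address; only the URL is printed, no request is sent.
    expr = 'minio_cluster_capacity_usable_free_bytes{job="minio-job"}'
    print(instant_query_url("http://localhost:9090", expr))
```

Call ``run_instant_query`` with the address of your own Prometheus deployment to retrieve the current values.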
4) Configure an Alert Rule using MinIO Metrics
4) Configure an Alert Rule using MinIO Metrics
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
You must configure :prometheus-docs:`Alert Rules <prometheus/latest/configuration/alerting_rules/>` on the Prometheus deployment to trigger alerts based on collected MinIO metrics.
@ -184,3 +235,9 @@ To enable historical data visualization in MinIO Console, set the following envi
- Set :envvar:`MINIO_PROMETHEUS_JOB_ID` to the unique job ID assigned to the collected metrics
Restart the MinIO deployment and visit the :ref:`Monitoring <minio-console-monitoring>` pane to see the historical data views.
Dashboards
----------
MinIO provides Grafana Dashboards to display metrics collected by Prometheus.
For more information, see :ref:`minio-grafana`.


@ -0,0 +1,60 @@
.. _minio-grafana:
===================================
Monitor a MinIO Server with Grafana
===================================
.. default-domain:: minio
.. contents:: Table of Contents
:local:
:depth: 2
`Grafana <https://grafana.com/>`__ allows you to query, visualize, alert on and understand your metrics no matter where they are stored.
Create, explore, and share dashboards with your team and foster a data-driven culture.
Prerequisites
-------------
- An existing :prometheus-docs:`Prometheus deployment <prometheus/latest/installation/>` with backing :prometheus-docs:`Alert Manager <alerting/latest/overview/>`
- An existing MinIO deployment with network access to the Prometheus deployment
- `Grafana installed <https://grafana.com/grafana/download>`__
MinIO Grafana Dashboard
-----------------------
MinIO provides two official Grafana Dashboards you can download from the Grafana Dashboard portal.
1. :ref:`MinIO Server metrics <minio-server-grafana-metrics>`
2. :ref:`MinIO Bucket metrics <minio-buckets-grafana-metrics>`
To track changes to the Grafana dashboard, introspect the JSON files for the `server <https://github.com/minio/minio/blob/master/docs/metrics/prometheus/grafana/minio-dashboard.json>`__ or `bucket <https://github.com/minio/minio/blob/master/docs/metrics/prometheus/grafana/minio-bucket.json>`__ dashboards in the MinIO Server GitHub repository.
.. _minio-server-grafana-metrics:
MinIO Server Metrics Dashboard
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Visualize MinIO metrics with the official MinIO Grafana dashboard for the MinIO Server available on the `Grafana dashboard portal <https://grafana.com/grafana/dashboards/13502-minio-dashboard/>`__.
MinIO provides a Grafana Dashboard for MinIO Server metrics.
For specifics on the dashboard's configuration, see the `JSON file on GitHub <https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/minio-dashboard.json>`__.
.. image:: /images/grafana-minio.png
:width: 600px
:alt: A sample of the MinIO Grafana dashboard showing many different captured metrics on a MinIO Server.
:align: center
.. _minio-buckets-grafana-metrics:
MinIO Bucket Metrics Dashboard
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Visualize MinIO bucket metrics with the official MinIO Grafana dashboard for buckets available on the `Grafana dashboard portal <https://grafana.com/grafana/dashboards/19237-minio-bucket-dashboard//>`__.
Bucket metrics can be viewed in the Grafana dashboard using the `bucket JSON file on GitHub <https://raw.githubusercontent.com/minio/minio/master/docs/metrics/prometheus/grafana/minio-bucket.json>`__.
.. image:: /images/grafana-bucket.png
:width: 600px
:alt: A sample of the MinIO Grafana dashboard showing many different captured metrics for MinIO buckets.
:align: center


@ -35,8 +35,8 @@ the server, such as a transient network issue or potential downtime.
The healthcheck probe alone cannot determine if a MinIO server is offline - only
that the current host machine cannot reach the server. Consider configuring
a Prometheus :ref:`alert <minio-metrics-and-alerts-alerting>` using the
:metric:`minio_cluster_nodes_offline_total` metric to detect whether one or
a Prometheus :ref:`alert <minio-metrics-and-alerts>` using the
``minio_cluster_nodes_offline_total`` metric to detect whether one or
more MinIO nodes are offline.
Cluster Write Quorum
@ -63,13 +63,13 @@ The healthcheck probe alone cannot determine if a MinIO server is offline or
processing write operations normally - only whether enough MinIO servers are
online to meet write quorum requirements based on the configured
:ref:`erasure code parity <minio-ec-parity>`. Consider configuring a Prometheus
:ref:`alert <minio-metrics-and-alerts-alerting>` using one of the following
:ref:`alert <minio-metrics-and-alerts>` using one of the following
metrics to detect potential issues or errors on the MinIO cluster:
- :metric:`minio_cluster_nodes_offline_total` to alert if one or more
- ``minio_cluster_nodes_offline_total`` to alert if one or more
MinIO nodes are offline.
- :metric:`minio_node_disk_free_bytes` to alert if the cluster is running
- ``minio_node_disk_free_bytes`` to alert if the cluster is running
low on free drive space.
Cluster Read Quorum
@ -96,8 +96,8 @@ The healthcheck probe alone cannot determine if a MinIO server is offline or
processing read operations normally - only whether enough MinIO servers are
online to meet read quorum requirements based on the configured
:ref:`erasure code parity <minio-ec-parity>`. Consider configuring a Prometheus
:ref:`alert <minio-metrics-and-alerts-alerting>` using the
:metric:`minio_cluster_nodes_offline_total` metric to detect whether one or more
:ref:`alert <minio-metrics-and-alerts>` using the
``minio_cluster_nodes_offline_total`` metric to detect whether one or more
MinIO nodes are offline.
Cluster Maintenance Check
@ -125,6 +125,5 @@ The healthcheck probe alone cannot determine if a MinIO server is offline - only
whether enough MinIO servers will be online after taking the node down for
maintenance to meet read and write quorum requirements based on the configured
:ref:`erasure code parity <minio-ec-parity>`. Consider configuring a Prometheus
:ref:`alert <minio-metrics-and-alerts-alerting>` using the
:metric:`minio_cluster_nodes_offline_total` metric to detect whether one or more
:ref:`alert <minio-metrics-and-alerts>` using the ``minio_cluster_nodes_offline_total`` metric to detect whether one or more
MinIO nodes are offline.
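As a rough illustration of the condition such an alert checks, the sketch below reads ``minio_cluster_nodes_offline_total`` from metrics text in the Prometheus exposition format and reports whether any node is offline. The sample text, its label names, and the function names are made up for the example; in practice Prometheus evaluates this condition through an alert rule.

```python
# Made-up sample in the Prometheus text exposition format (labels are illustrative).
SAMPLE = """\
# HELP minio_cluster_nodes_offline_total Total number of MinIO nodes offline.
# TYPE minio_cluster_nodes_offline_total gauge
minio_cluster_nodes_offline_total{server="minio.example.net:9000"} 1
minio_node_disk_free_bytes{disk="/data1",server="minio.example.net:9000"} 5.0e+11
"""


def read_metric(exposition_text, metric_name):
    """Return every sample value recorded for metric_name."""
    values = []
    for raw_line in exposition_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "{" in line:
            # name{labels} value [timestamp]
            head, _, tail = line.rpartition("}")
            name = head.split("{", 1)[0]
        else:
            # name value [timestamp]
            name, _, tail = line.partition(" ")
        if name != metric_name:
            continue
        fields = tail.split()
        if fields:
            values.append(float(fields[0]))
    return values


def nodes_offline(exposition_text):
    """True if any sample reports one or more offline nodes."""
    samples = read_metric(exposition_text, "minio_cluster_nodes_offline_total")
    return any(value > 0 for value in samples)


if __name__ == "__main__":
    print(nodes_offline(SAMPLE))
```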

File diff suppressed because it is too large.


@ -94,7 +94,7 @@ Configure InfluxDB to Collect and Alert using MinIO Metrics
Use the :influxdb-docs:`DataExplorer <query-data/execute-queries/data-explorer/>` to visualize the collected MinIO data.
For example, you can set a filter on :metric:`minio_cluster_capacity_usable_total_bytes` and :metric:`minio_cluster_capacity_usable_free_bytes` to compare the total usable against total free space on the MinIO deployment.
For example, you can set a filter on ``minio_cluster_capacity_usable_total_bytes`` and ``minio_cluster_capacity_usable_free_bytes`` to compare the total usable against total free space on the MinIO deployment.
#. Configure a Check
@ -105,13 +105,13 @@ Configure InfluxDB to Collect and Alert using MinIO Metrics
- Create a :guilabel:`Threshold Check` named ``MINIO_NODE_DOWN``.
Set the filter for the :metric:`minio_cluster_nodes_offline_total` key.
Set the filter for the ``minio_cluster_nodes_offline_total`` key.
Set the :guilabel:`Thresholds` to :guilabel:`WARN` when the value is greater than :guilabel:`1`
- Create a :guilabel:`Threshold Check` named ``MINIO_QUORUM_WARNING``.
Set the filter for the :metric:`minio_cluster_disk_offline_total` key.
Set the filter for the ``minio_cluster_disk_offline_total`` key.
Set the :guilabel:`Thresholds` to :guilabel:`CRITICAL` when the value is one less than your configured :ref:`Erasure Code Parity <minio-erasure-coding>` setting.


@ -43,7 +43,7 @@ Syntax
.. code-block:: shell
:class: copyable
mc admin prometheus generate TARGET
mc admin prometheus generate TARGET TYPE
The command accepts the following arguments:
@ -52,3 +52,11 @@ Syntax
The :mc:`alias <mc alias>` of a configured MinIO deployment for which
the command generates a Prometheus-compatible configuration file.
.. mc-cmd:: TYPE
The type of metrics to scrape.
Valid values are ``cluster``, ``node``, or ``bucket``.
If not specified, the command returns cluster metrics.


@ -601,7 +601,7 @@ logging. See :ref:`minio-metrics-and-alerts` for more information.
.. envvar:: MINIO_PROMETHEUS_AUTH_TYPE
Specifies the authentication mode for the Prometheus
:ref:`scraping endpoints <minio-metrics-and-alerts-endpoints>`.
:ref:`scraping endpoints <minio-metrics-and-alerts>`.
- ``jwt`` - *Default* MinIO requires that the scraping client specify a JWT
token for authenticating requests. Use