mirror of https://github.com/minio/docs.git synced 2025-05-28 00:41:14 +03:00
docs/source/monitoring/metrics-alerts/minio-metrics-and-alerts.rst

Metrics and Alerts

MinIO uses Prometheus for metrics and alerts. Prometheus is an open-source systems and service monitoring system that supports analyzing and alerting on collected metrics. The Prometheus ecosystem includes multiple integrations <operating/integrations/>, allowing wide latitude in processing and storing collected metrics.

  • MinIO publishes Prometheus-compatible scraping endpoints for cluster and node-level metrics. See minio-metrics-and-alerts-endpoints for more information.
  • For alerts, use Prometheus Alerting Rules <prometheus/latest/configuration/alerting_rules/> and the Alertmanager <alerting/latest/overview/> to trigger alerts based on collected metrics. See minio-metrics-and-alerts-alerting for more information.

MinIO publishes collected metrics data using Prometheus-compatible data structures. Any Prometheus-compatible scraping software can ingest and process MinIO metrics for analysis, visualization, and alerting.

Metrics

MinIO provides a scraping endpoint for cluster-level metrics:

http://minio.example.net:9000/minio/v2/metrics/cluster

Replace minio.example.net with the hostname of any node in the MinIO deployment. For deployments with a load balancer managing connections between MinIO nodes, specify the address of the load balancer instead.

Create a new scraping configuration <prometheus/latest/configuration/configuration/#scrape_config> to begin collecting metrics from the MinIO deployment. See minio-metrics-collect-using-prometheus for a complete tutorial.

The following example shows a scrape_configs entry for collecting cluster metrics:

scrape_configs:
- job_name: minio-job
  bearer_token: <secret>
  metrics_path: /minio/v2/metrics/cluster
  scheme: https
  static_configs:
  - targets: ['minio.example.net:9000']

job_name

The name of the scraping job.

bearer_token

The JWT token generated by mc admin prometheus generate.

Omit this field if the MinIO deployment was started with MINIO_PROMETHEUS_AUTH_TYPE set to public.

targets

The endpoint for the MinIO deployment. You can specify any node in the deployment for collecting cluster metrics. For clusters with a load balancer managing connections between MinIO nodes, specify the address of the load balancer.

By default, MinIO requires authentication to scrape the metrics endpoints. Use the mc admin prometheus generate command to generate the bearer token for the scrape_configs.bearer_token field. Alternatively, you can disable metrics endpoint authentication by setting MINIO_PROMETHEUS_AUTH_TYPE to public.
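As a minimal sketch of the unauthenticated case, the following scrape job assumes the deployment was started with MINIO_PROMETHEUS_AUTH_TYPE set to public and therefore omits bearer_token. The job name and target reuse the placeholder values from the example above.

```yaml
scrape_configs:
- job_name: minio-job
  # No bearer_token here: this assumes MINIO_PROMETHEUS_AUTH_TYPE=public
  # is set on the MinIO deployment.
  metrics_path: /minio/v2/metrics/cluster
  scheme: https
  static_configs:
  # Placeholder hostname; use a node or load balancer address.
  - targets: ['minio.example.net:9000']
```

With public access, any client that can reach the endpoint can read the metrics, so this configuration is likely best limited to trusted networks.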

Visualizing Metrics

The MinIO Console uses the metrics collected by Prometheus to populate the Dashboard metrics:

MinIO Console Dashboard displaying Monitoring Data

Set the MINIO_PROMETHEUS_URL environment variable to the URL of the Prometheus service to allow the Console to retrieve and display collected metrics. See minio-metrics-collect-using-prometheus for a complete example.

MinIO also publishes a Grafana Dashboard for visualizing collected metrics. For more complete documentation on configuring a Prometheus data source for Grafana, see Grafana Support for Prometheus <visualization/grafana/>.

Available Metrics

MinIO publishes the following metrics. Each metric includes a label for the MinIO server that generated it.

Object Metrics

minio_bucket_objects_size_distribution

Distribution of object sizes in the bucket. Includes a label for the bucket name.

Replication Metrics

These metrics are only populated for MinIO clusters with minio-bucket-replication-serverside enabled.

minio_bucket_replication_failed_bytes

Total number of bytes that failed to replicate at least once.

minio_bucket_replication_pending_bytes

Total number of bytes pending replication.

minio_bucket_replication_received_bytes

Total number of bytes replicated to this bucket from another source bucket.

minio_bucket_replication_sent_bytes

Total number of bytes replicated to the target bucket.

minio_bucket_replication_pending_count

Total number of replication operations pending for this bucket.

minio_bucket_replication_failed_count

Total number of replication operations failed for this bucket.
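The replication metrics can feed an alerting rule. The following is a sketch only: it treats minio_bucket_replication_pending_count as a current backlog value, and the group name, alert name, threshold, duration, and the bucket label used in the annotation are all assumptions to adjust for your deployment.

```yaml
groups:
- name: minio-replication            # hypothetical group name
  rules:
  - alert: MinioReplicationBacklog   # hypothetical alert name
    # Fires when the pending replication count stays above the
    # (assumed) threshold of 1000 operations for 30 minutes.
    expr: minio_bucket_replication_pending_count > 1000
    for: 30m
    labels:
      severity: warning
    annotations:
      # Assumes the series carries a "bucket" label.
      summary: "Replication backlog on bucket {{ $labels.bucket }}"
```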

Bucket Metrics

minio_bucket_usage_object_total

Total number of objects.

minio_bucket_usage_total_bytes

Total bucket size in bytes.

Cache Metrics

minio_cache_hits_total

Total number of disk cache hits.

minio_cache_missed_total

Total number of disk cache misses.

minio_cache_sent_bytes

Total number of bytes served from cache.

minio_cache_total_bytes

Total size of cache disk in bytes.

minio_cache_usage_info

Total percentage cache usage. A value of 1 indicates high usage and 0 indicates low usage. The level label is also set.

minio_cache_used_bytes

Current cache usage in bytes.

Cluster Metrics

minio_cluster_capacity_raw_free_bytes

Total free capacity online in the cluster.

minio_cluster_capacity_raw_total_bytes

Total capacity online in the cluster.

minio_cluster_capacity_usable_free_bytes

Total free usable capacity online in the cluster.

minio_cluster_capacity_usable_total_bytes

Total usable capacity online in the cluster.
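The usable capacity metrics combine naturally into a utilization ratio. The sketch below records the fraction of usable capacity in use and alerts when free usable capacity drops below 10%. The rule names, threshold, and duration are hypothetical, and the division assumes both series carry identical label sets.

```yaml
groups:
- name: minio-capacity   # hypothetical group name
  rules:
  # Recording rule: fraction of usable capacity in use (0 to 1).
  - record: minio:cluster_capacity_usable_used:ratio   # hypothetical name
    expr: 1 - (minio_cluster_capacity_usable_free_bytes / minio_cluster_capacity_usable_total_bytes)
  # Alert when less than 10% of usable capacity remains free.
  - alert: MinioUsableCapacityLow   # hypothetical alert name
    expr: minio_cluster_capacity_usable_free_bytes / minio_cluster_capacity_usable_total_bytes < 0.10
    for: 15m
    labels:
      severity: warning
    annotations:
      summary: "MinIO usable free capacity is below 10%"
```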

Node Metrics

minio_cluster_nodes_offline_total

Total number of MinIO nodes offline.

minio_cluster_nodes_online_total

Total number of MinIO nodes online.

minio_heal_objects_error_total

Objects for which healing failed in the current self-healing run.

minio_heal_objects_heal_total

Objects healed in the current self-healing run.

minio_heal_objects_total

Objects scanned in the current self-healing run.

minio_heal_time_last_activity_nano_seconds

Time elapsed (in nanoseconds) since the last self-healing activity. Set to -1 until the initial self-heal.

minio_inter_node_traffic_received_bytes

Total number of bytes received from other peer nodes.

minio_inter_node_traffic_sent_bytes

Total number of bytes sent to the other peer nodes.

minio_node_disk_free_bytes

Total storage available on a disk.

minio_node_disk_total_bytes

Total storage on a disk.

minio_node_disk_used_bytes

Total storage used on a disk.

minio_node_file_descriptor_limit_total

Limit on total number of open file descriptors for the MinIO Server process.

minio_node_file_descriptor_open_total

Total number of open file descriptors by the MinIO Server process.

minio_node_io_rchar_bytes

Total bytes read by the process from the underlying storage system, including cache (/proc/[pid]/io rchar).

minio_node_io_read_bytes

Total bytes read by the process from the underlying storage system (/proc/[pid]/io read_bytes).

minio_node_io_wchar_bytes

Total bytes written by the process to the underlying storage system, including page cache (/proc/[pid]/io wchar).

minio_node_io_write_bytes

Total bytes written by the process to the underlying storage system (/proc/[pid]/io write_bytes).

minio_node_process_starttime_seconds

Start time for MinIO process per node, time in seconds since Unix epoch.

minio_node_process_uptime_seconds

Uptime for MinIO process per node in seconds.

minio_node_syscall_read_total

Total read syscalls to the kernel (/proc/[pid]/io syscr).

minio_node_syscall_write_total

Total write syscalls to the kernel (/proc/[pid]/io syscw).
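As an illustration of how the node metrics above might drive alerts, the following sketch flags offline nodes and servers approaching their file descriptor limit. The alert names, thresholds, and durations are hypothetical, and the server label referenced in the annotation is an assumed name for the per-server label.

```yaml
groups:
- name: minio-nodes   # hypothetical group name
  rules:
  # Fires when any MinIO node has been reported offline for 5 minutes.
  - alert: MinioNodesOffline
    expr: minio_cluster_nodes_offline_total > 0
    for: 5m
    labels:
      severity: critical
    annotations:
      summary: "{{ $value }} MinIO node(s) offline"
  # Fires when open file descriptors exceed 80% of the process limit.
  - alert: MinioFileDescriptorsNearLimit
    expr: minio_node_file_descriptor_open_total / minio_node_file_descriptor_limit_total > 0.8
    for: 10m
    labels:
      severity: warning
    annotations:
      # Assumes the per-server label is named "server".
      summary: "MinIO server {{ $labels.server }} is near its file descriptor limit"
```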

S3 Metrics

minio_s3_requests_error_total

Total number of S3 requests with errors.

minio_s3_requests_inflight_total

Total number of S3 requests currently in flight.

minio_s3_requests_total

Total number of S3 requests.

minio_s3_time_ttbf_seconds_distribution

Distribution of the time to first byte across API calls.

minio_s3_traffic_received_bytes

Total number of S3 bytes received.

minio_s3_traffic_sent_bytes

Total number of S3 bytes sent.
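The request totals can be combined into an error ratio. This sketch assumes minio_s3_requests_error_total and minio_s3_requests_total behave as monotonically increasing counters, so rate() applies. The alert name, 5% threshold, and time windows are hypothetical.

```yaml
groups:
- name: minio-s3   # hypothetical group name
  rules:
  - alert: MinioS3ErrorRateHigh   # hypothetical alert name
    # Ratio of failed to total S3 requests over the last 5 minutes,
    # summed across all series. Yields no alert when there is no traffic.
    expr: |
      sum(rate(minio_s3_requests_error_total[5m]))
        /
      sum(rate(minio_s3_requests_total[5m])) > 0.05
    for: 10m
    labels:
      severity: warning
    annotations:
      summary: "More than 5% of MinIO S3 requests are failing"
```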

Software Metrics

minio_software_commit_info

Git commit hash for the MinIO release.

minio_software_version_info

MinIO release tag for the server.

Alerts

You can configure alerts using Prometheus Alerting Rules <prometheus/latest/configuration/alerting_rules/> based on the collected MinIO metrics. The Prometheus Alertmanager <alerting/latest/overview/> manages the alerts produced by the configured alerting rules. Prometheus also supports a Webhook Receiver <operating/integrations/#alertmanager-webhook-receiver> for publishing alerts to mechanisms that Alertmanager does not support directly.
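A minimal sketch of how these pieces connect in the Prometheus configuration file: rule_files points Prometheus at one or more files of alerting rules, and the alerting block tells it where to send firing alerts. The rule file name and the Alertmanager address are hypothetical placeholders.

```yaml
# prometheus.yml (fragment)
rule_files:
# Hypothetical file containing alerting rule groups for MinIO metrics.
- minio-alerts.yml

alerting:
  alertmanagers:
  - static_configs:
    # Hypothetical Alertmanager address; 9093 is its default port.
    - targets: ['alertmanager.example.net:9093']
```

Prometheus evaluates the rules against the scraped MinIO metrics and forwards firing alerts to the listed Alertmanager, which handles grouping, routing, and notification.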
