mirror of
https://github.com/minio/docs.git
synced 2025-07-28 19:42:10 +03:00
Docs Multiplatform Slice
This commit is contained in:
@ -0,0 +1,511 @@
|
||||
.. _minio-metrics-collect-using-prometheus:
|
||||
.. _minio-metrics-and-alerts:
|
||||
|
||||
======================================
|
||||
Collect MinIO Metrics Using Prometheus
|
||||
======================================
|
||||
|
||||
.. default-domain:: minio
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:local:
|
||||
:depth: 1
|
||||
|
||||
MinIO leverages `Prometheus <https://prometheus.io/>`__ for metrics and alerts.
|
||||
MinIO publishes Prometheus-compatible scraping endpoints for cluster and
|
||||
node-level metrics. See :ref:`minio-metrics-and-alerts-endpoints` for more
|
||||
information.
|
||||
|
||||
The procedure on this page documents scraping the MinIO metrics
|
||||
endpoints using a Prometheus instance, including deploying and configuring
|
||||
a simple Prometheus server for collecting metrics.
|
||||
|
||||
This procedure is not a replacement for the official
|
||||
:prometheus-docs:`Prometheus Documentation <>`. Any specific guidance
|
||||
related to configuring, deploying, and using Prometheus is made on a best-effort
|
||||
basis.
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
Install and Configure ``mc`` with Access to the MinIO Cluster
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This procedure uses :mc:`mc` for performing operations on the MinIO
|
||||
deployment. Install ``mc`` on a machine with network access to the
|
||||
deployment. See the ``mc`` :ref:`Installation Quickstart <mc-install>` for
|
||||
more complete instructions.
|
||||
|
||||
Prometheus Service
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This procedure provides instructions for deploying Prometheus for rapid local
|
||||
evaluation and development. All other environments should have an existing
|
||||
Prometheus or Prometheus-compatible service with access to the MinIO cluster.
|
||||
|
||||
Procedure
|
||||
---------
|
||||
|
||||
1) Generate the Bearer Token
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
MinIO by default requires authentication for requests made to the metrics
|
||||
endpoints. While this step is not required for MinIO deployments started with
|
||||
:envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``"public"``, you can still use the
|
||||
command output for retrieving a Prometheus ``scrape_configs`` entry.
|
||||
|
||||
Use the :mc-cmd:`mc admin prometheus generate` command to generate a
|
||||
JWT bearer token for use by Prometheus in making authenticated scraping
|
||||
requests:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin prometheus generate ALIAS
|
||||
|
||||
Replace :mc-cmd:`ALIAS <mc admin prometheus generate TARGET>` with the
|
||||
:mc:`alias <mc alias>` of the MinIO deployment.
|
||||
|
||||
The command returns output similar to the following:
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
The ``targets`` array can contain the hostname for any node in the deployment.
|
||||
For clusters with a load balancer managing connections between MinIO nodes,
|
||||
specify the address of the load balancer.
|
||||
|
||||
Add the output block to the
|
||||
:prometheus-docs:`scrape_config
|
||||
<prometheus/latest/configuration/configuration/#scrape_config>` section of
|
||||
the Prometheus configuration.
|
||||
|
||||
2) Configure and Run Prometheus
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Follow the Prometheus :prometheus-docs:`Getting Started
|
||||
<prometheus/latest/getting_started/#downloading-and-running-prometheus>` guide
|
||||
to download and run Prometheus locally.
|
||||
|
||||
Append the ``scrape_configs`` job generated in the previous step to the
|
||||
configuration file:
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
global:
|
||||
scrape_interval: 15s
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: TOKEN
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: [minio.example.net]
|
||||
|
||||
Start the Prometheus server using the configuration file:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
prometheus --config.file=prometheus.yaml
|
||||
|
||||
3) Analyze Collected Metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Prometheus includes an
|
||||
:prometheus-docs:`expression browser
|
||||
<prometheus/latest/getting_started/#using-the-expression-browser>`. You can
|
||||
execute queries here to analyze the collected metrics.
|
||||
|
||||
The following query examples return metrics collected by Prometheus:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
minio_cluster_disk_online_total{job="minio-job"}[5m]
|
||||
minio_cluster_disk_offline_total{job="minio-job"}[5m]
|
||||
|
||||
minio_bucket_usage_object_total{job="minio-job"}[5m]
|
||||
|
||||
minio_cluster_capacity_usable_free_bytes{job="minio-job"}[5m]
|
||||
|
||||
See :ref:`minio-metrics-and-alerts-available-metrics` for a complete
|
||||
list of published metrics.
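
The same kind of query can also be issued outside the expression browser
through the Prometheus HTTP API. The following Python sketch is illustrative
only. It assumes a local evaluation Prometheus listening on
``http://localhost:9090`` and the ``minio-job`` job name used in this procedure.
The helper names ``build_query_url`` and ``run_query`` are hypothetical and not
part of MinIO or Prometheus.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local evaluation Prometheus on its default port.
PROMETHEUS_URL = "http://localhost:9090"


def build_query_url(base_url: str, expression: str) -> str:
    """Return the instant-query URL (/api/v1/query) for a PromQL expression."""
    query = urllib.parse.urlencode({"query": expression})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def run_query(base_url: str, expression: str, timeout: float = 10.0) -> list:
    """Run an instant query and return the list of result series."""
    url = build_query_url(base_url, expression)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload}")
    return payload["data"]["result"]


# Example call (requires a reachable Prometheus server):
#   run_query(PROMETHEUS_URL, 'minio_cluster_nodes_offline_total{job="minio-job"}')
```

An expression without a range selector such as ``[5m]`` returns one current
sample per series, which is usually the easier shape to consume from a script.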
|
||||
|
||||
4) Visualize Collected Metrics
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The :minio-git:`MinIO Console <console>` supports visualizing collected metrics
|
||||
from Prometheus. Set the :envvar:`MINIO_PROMETHEUS_URL` environment variable
to the URL of the Prometheus service on each MinIO server
in the deployment:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
export MINIO_PROMETHEUS_URL="https://prometheus.example.net"
|
||||
|
||||
If you set a custom ``job_name`` for the Prometheus scraping job, you must also
|
||||
set :envvar:`MINIO_PROMETHEUS_JOB_ID` to match that job name.
|
||||
|
||||
Restart the deployment using :mc-cmd:`mc admin service restart` to apply the
|
||||
changes.
|
||||
|
||||
The MinIO Console uses the metrics collected by the ``minio-job`` scraping
|
||||
job to populate the Dashboard metrics:
|
||||
|
||||
.. image:: /images/minio-console/console-metrics.png
|
||||
:width: 600px
|
||||
:alt: MinIO Console Dashboard displaying Monitoring Data
|
||||
:align: center
|
||||
|
||||
MinIO also publishes a `Grafana Dashboard
|
||||
<https://grafana.com/grafana/dashboards/13502>`_ for visualizing collected
|
||||
metrics. For more complete documentation on configuring a Prometheus data source
|
||||
for Grafana, see :prometheus-docs:`Grafana Support for Prometheus
|
||||
<visualization/grafana/>`.
|
||||
|
||||
Prometheus includes a :prometheus-docs:`graphing interface
|
||||
<prometheus/latest/getting_started/#using-the-graphing-interface>` for
|
||||
visualizing collected metrics.
|
||||
|
||||
.. _minio-metrics-and-alerts-endpoints:
|
||||
|
||||
Metrics
|
||||
-------
|
||||
|
||||
MinIO provides a scraping endpoint for cluster-level metrics:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
http://minio.example.net:9000/minio/v2/metrics/cluster
|
||||
|
||||
Replace ``http://minio.example.net`` with the hostname of any node in the MinIO
|
||||
deployment. For deployments with a load balancer managing connections between
|
||||
MinIO nodes, specify the address of the load balancer.
|
||||
|
||||
Create a new :prometheus-docs:`scraping configuration
|
||||
<prometheus/latest/configuration/configuration/#scrape_config>` to begin
|
||||
collecting metrics from the MinIO deployment. See
|
||||
:ref:`minio-metrics-collect-using-prometheus` for a complete tutorial.
|
||||
|
||||
The following example describes a ``scrape_configs`` entry for collecting
|
||||
cluster metrics.
|
||||
|
||||
.. code-block:: yaml
|
||||
:class: copyable
|
||||
|
||||
scrape_configs:
|
||||
- job_name: minio-job
|
||||
bearer_token: <secret>
|
||||
metrics_path: /minio/v2/metrics/cluster
|
||||
scheme: https
|
||||
static_configs:
|
||||
- targets: ['minio.example.net:9000']
|
||||
|
||||
.. list-table::
|
||||
:stub-columns: 1
|
||||
:widths: 20 80
|
||||
:width: 100%
|
||||
|
||||
* - ``job_name``
|
||||
- The name of the scraping job.
|
||||
|
||||
* - ``bearer_token``
|
||||
- The JWT token generated by :mc-cmd:`mc admin prometheus generate`.
|
||||
|
||||
Omit this field if the MinIO deployment was started with
|
||||
:envvar:`MINIO_PROMETHEUS_AUTH_TYPE` set to ``public``.
|
||||
|
||||
* - ``targets``
|
||||
- The endpoint for the MinIO deployment. You can specify any node in the
|
||||
deployment for collecting cluster metrics. For clusters with a load
|
||||
balancer managing connections between MinIO nodes, specify the
|
||||
address of the load balancer.
|
||||
|
||||
MinIO by default requires authentication for scraping the metrics endpoints.
|
||||
Use the :mc-cmd:`mc admin prometheus generate` command to generate the
|
||||
necessary bearer tokens for use with configuring the
|
||||
``scrape_configs.bearer_token`` field. You can alternatively disable
|
||||
metrics endpoint authentication by setting
|
||||
:envvar:`MINIO_PROMETHEUS_AUTH_TYPE` to ``public``.
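
To check the endpoint and token before configuring Prometheus, you can request
the metrics directly. The following Python sketch is a minimal illustration, not
an official client. It assumes the bearer token is sent in an
``Authorization: Bearer`` header (which is how Prometheus sends
``bearer_token``). The function names ``parse_metrics`` and
``fetch_cluster_metrics`` are hypothetical, and the parser is deliberately
simplified.

```python
import urllib.request


def parse_metrics(text: str) -> list:
    """Parse Prometheus text exposition lines into (series, value) tuples.

    Simplified: comment lines are skipped and optional timestamps are ignored.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # Label values may contain spaces, so split after the closing brace.
            head, _, tail = line.rpartition("}")
            series, fields = head + "}", tail.split()
        else:
            parts = line.split()
            series, fields = parts[0], parts[1:]
        if fields:
            samples.append((series, float(fields[0])))
    return samples


def fetch_cluster_metrics(base_url: str, token: str = "", timeout: float = 10.0) -> list:
    """Request the cluster metrics endpoint and return the parsed samples."""
    request = urllib.request.Request(f"{base_url.rstrip('/')}/minio/v2/metrics/cluster")
    if token:
        # Omit the token for deployments started with the "public" auth type.
        request.add_header("Authorization", f"Bearer {token}")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

For example, ``fetch_cluster_metrics("https://minio.example.net:9000", token)``
would return the current samples if the deployment is reachable and the token
is valid.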
|
||||
|
||||
Visualizing Metrics
|
||||
~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The MinIO Console uses the metrics collected by Prometheus to populate the
|
||||
Dashboard metrics:
|
||||
|
||||
.. image:: /images/minio-console/console-metrics.png
|
||||
:width: 600px
|
||||
:alt: MinIO Console displaying Prometheus-backed Monitoring Data
|
||||
:align: center
|
||||
|
||||
Set the :envvar:`MINIO_PROMETHEUS_URL` environment variable to the URL of the
|
||||
Prometheus service to allow the Console to retrieve and display collected
|
||||
metrics. See :ref:`minio-metrics-collect-using-prometheus` for a complete
|
||||
example.
|
||||
|
||||
MinIO also publishes a `Grafana Dashboard
|
||||
<https://grafana.com/grafana/dashboards/13502>`_ for visualizing collected
|
||||
metrics. For more complete documentation on configuring a Prometheus data source
|
||||
for Grafana, see :prometheus-docs:`Grafana Support for Prometheus
|
||||
<visualization/grafana/>`.
|
||||
|
||||
.. _minio-metrics-and-alerts-available-metrics:
|
||||
|
||||
Available Metrics
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
MinIO publishes the following metrics, where each metric includes a label for
|
||||
the MinIO server which generated that metric.
|
||||
|
||||
Object Metrics
|
||||
++++++++++++++
|
||||
|
||||
.. metric:: minio_bucket_objects_size_distribution
|
||||
|
||||
Distribution of object sizes in the bucket. Includes a label for the bucket
name.
|
||||
|
||||
Replication Metrics
|
||||
+++++++++++++++++++
|
||||
|
||||
These metrics are only populated for MinIO clusters with
|
||||
:ref:`minio-bucket-replication-serverside` enabled.
|
||||
|
||||
.. metric:: minio_bucket_replication_failed_bytes
|
||||
|
||||
Total number of bytes that failed to replicate at least once.
|
||||
|
||||
.. metric:: minio_bucket_replication_pending_bytes
|
||||
|
||||
Total bytes pending to replicate.
|
||||
|
||||
.. metric:: minio_bucket_replication_received_bytes
|
||||
|
||||
Total number of bytes replicated to this bucket from another source bucket.
|
||||
|
||||
.. metric:: minio_bucket_replication_sent_bytes
|
||||
|
||||
Total number of bytes replicated to the target bucket.
|
||||
|
||||
.. metric:: minio_bucket_replication_pending_count
|
||||
|
||||
Total number of replication operations pending for this bucket.
|
||||
|
||||
.. metric:: minio_bucket_replication_failed_count
|
||||
|
||||
Total number of replication operations failed for this bucket.
|
||||
|
||||
Bucket Metrics
|
||||
++++++++++++++
|
||||
|
||||
.. metric:: minio_bucket_usage_object_total
|
||||
|
||||
Total number of objects
|
||||
|
||||
.. metric:: minio_bucket_usage_total_bytes
|
||||
|
||||
Total bucket size in bytes
|
||||
|
||||
Cache Metrics
|
||||
+++++++++++++
|
||||
|
||||
.. metric:: minio_cache_hits_total
|
||||
|
||||
Total number of disk cache hits
|
||||
|
||||
.. metric:: minio_cache_missed_total
|
||||
|
||||
Total number of disk cache misses
|
||||
|
||||
.. metric:: minio_cache_sent_bytes
|
||||
|
||||
Total number of bytes served from cache
|
||||
|
||||
.. metric:: minio_cache_total_bytes
|
||||
|
||||
Total size of cache disk in bytes
|
||||
|
||||
.. metric:: minio_cache_usage_info
|
||||
|
||||
Total percentage cache usage. A value of ``1`` indicates high usage and ``0``
indicates low usage. The ``level`` label is also set.
|
||||
|
||||
.. metric:: minio_cache_used_bytes
|
||||
|
||||
Current cache usage in bytes
|
||||
|
||||
Cluster Metrics
|
||||
+++++++++++++++
|
||||
|
||||
.. metric:: minio_cluster_capacity_raw_free_bytes
|
||||
|
||||
Total free capacity online in the cluster.
|
||||
|
||||
.. metric:: minio_cluster_capacity_raw_total_bytes
|
||||
|
||||
Total capacity online in the cluster.
|
||||
|
||||
.. metric:: minio_cluster_capacity_usable_free_bytes
|
||||
|
||||
Total free usable capacity online in the cluster.
|
||||
|
||||
.. metric:: minio_cluster_capacity_usable_total_bytes
|
||||
|
||||
Total usable capacity online in the cluster.
|
||||
|
||||
Node Metrics
|
||||
++++++++++++
|
||||
|
||||
.. metric:: minio_cluster_nodes_offline_total
|
||||
|
||||
Total number of MinIO nodes offline.
|
||||
|
||||
.. metric:: minio_cluster_nodes_online_total
|
||||
|
||||
Total number of MinIO nodes online.
|
||||
|
||||
.. metric:: minio_heal_objects_error_total
|
||||
|
||||
Objects for which healing failed in current self healing run
|
||||
|
||||
.. metric:: minio_heal_objects_heal_total
|
||||
|
||||
Objects healed in current self healing run
|
||||
|
||||
.. metric:: minio_heal_objects_total
|
||||
|
||||
Objects scanned in current self healing run
|
||||
|
||||
.. metric:: minio_heal_time_last_activity_nano_seconds
|
||||
|
||||
Time elapsed (in nanoseconds) since the last self-healing activity. This is set
to ``-1`` until the initial self-heal.
|
||||
|
||||
.. metric:: minio_inter_node_traffic_received_bytes
|
||||
|
||||
Total number of bytes received from other peer nodes.
|
||||
|
||||
.. metric:: minio_inter_node_traffic_sent_bytes
|
||||
|
||||
Total number of bytes sent to the other peer nodes.
|
||||
|
||||
.. metric:: minio_node_disk_free_bytes
|
||||
|
||||
Total storage available on a disk.
|
||||
|
||||
.. metric:: minio_node_disk_total_bytes
|
||||
|
||||
Total storage on a disk.
|
||||
|
||||
.. metric:: minio_node_disk_used_bytes
|
||||
|
||||
Total storage used on a disk.
|
||||
|
||||
.. metric:: minio_node_file_descriptor_limit_total
|
||||
|
||||
Limit on total number of open file descriptors for the MinIO Server process.
|
||||
|
||||
.. metric:: minio_node_file_descriptor_open_total
|
||||
|
||||
Total number of open file descriptors by the MinIO Server process.
|
||||
|
||||
.. metric:: minio_node_io_rchar_bytes
|
||||
|
||||
Total bytes read by the process from the underlying storage system including
|
||||
cache, ``/proc/[pid]/io rchar``
|
||||
|
||||
.. metric:: minio_node_io_read_bytes
|
||||
|
||||
Total bytes read by the process from the underlying storage system,
|
||||
``/proc/[pid]/io read_bytes``
|
||||
|
||||
.. metric:: minio_node_io_wchar_bytes
|
||||
|
||||
Total bytes written by the process to the underlying storage system including
|
||||
page cache, ``/proc/[pid]/io wchar``
|
||||
|
||||
.. metric:: minio_node_io_write_bytes
|
||||
|
||||
Total bytes written by the process to the underlying storage system,
|
||||
``/proc/[pid]/io write_bytes``
|
||||
|
||||
.. metric:: minio_node_process_starttime_seconds
|
||||
|
||||
Start time for MinIO process per node, time in seconds since Unix epoch.
|
||||
|
||||
.. metric:: minio_node_process_uptime_seconds
|
||||
|
||||
Uptime for MinIO process per node in seconds.
|
||||
|
||||
.. metric:: minio_node_syscall_read_total
|
||||
|
||||
Total read SysCalls to the kernel. ``/proc/[pid]/io syscr``
|
||||
|
||||
.. metric:: minio_node_syscall_write_total
|
||||
|
||||
Total write SysCalls to the kernel. ``/proc/[pid]/io syscw``
|
||||
|
||||
S3 Metrics
|
||||
++++++++++
|
||||
|
||||
.. metric:: minio_s3_requests_error_total
|
||||
|
||||
Total number of S3 requests with errors
|
||||
|
||||
.. metric:: minio_s3_requests_inflight_total
|
||||
|
||||
Total number of S3 requests currently in flight
|
||||
|
||||
.. metric:: minio_s3_requests_total
|
||||
|
||||
Total number of S3 requests
|
||||
|
||||
.. metric:: minio_s3_time_ttbf_seconds_distribution
|
||||
|
||||
Distribution of the time to first byte across API calls.
|
||||
|
||||
.. metric:: minio_s3_traffic_received_bytes
|
||||
|
||||
Total number of S3 bytes received.
|
||||
|
||||
.. metric:: minio_s3_traffic_sent_bytes
|
||||
|
||||
Total number of S3 bytes sent
|
||||
|
||||
Software Metrics
|
||||
++++++++++++++++
|
||||
|
||||
.. metric:: minio_software_commit_info
|
||||
|
||||
Git commit hash for the MinIO release.
|
||||
|
||||
.. metric:: minio_software_version_info
|
||||
|
||||
MinIO Release tag for the server
|
||||
|
||||
.. _minio-metrics-and-alerts-alerting:
|
||||
|
||||
Alerts
|
||||
------
|
||||
|
||||
You can configure alerts using Prometheus :prometheus-docs:`Alerting Rules
|
||||
<prometheus/latest/configuration/alerting_rules/>` based on the collected MinIO
|
||||
metrics. The Prometheus :prometheus-docs:`Alertmanager
|
||||
<alerting/latest/overview/>` supports managing alerts produced by the configured
|
||||
alerting rules. Prometheus also supports a :prometheus-docs:`Webhook Receiver
|
||||
<operating/integrations/#alertmanager-webhook-receiver>` for publishing alerts
|
||||
to mechanisms not supported by Prometheus Alertmanager.
|
130
source/operations/monitoring/healthcheck-probe.rst
Normal file
@ -0,0 +1,130 @@
|
||||
.. _minio-healthcheck-api:
|
||||
|
||||
===============
|
||||
Healthcheck API
|
||||
===============
|
||||
|
||||
.. default-domain:: minio
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:local:
|
||||
:depth: 1
|
||||
|
||||
MinIO exposes unauthenticated endpoints for probing node uptime and cluster
|
||||
:ref:`high availability <minio-ec-parity>` for simple healthchecks. These
|
||||
endpoints return an HTTP status code indicating whether the underlying
|
||||
resource is healthy or satisfies read/write quorum. MinIO exposes no other data
|
||||
through these endpoints.
|
||||
|
||||
Node Liveness
|
||||
-------------
|
||||
|
||||
Use the following endpoint to test if a MinIO server is online:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
curl -I https://minio.example.net:9000/minio/health/live
|
||||
|
||||
Replace ``https://minio.example.net:9000`` with the DNS hostname of the
|
||||
MinIO server to check.
|
||||
|
||||
A response code of ``200 OK`` indicates the MinIO server is
|
||||
online and functional. Any other HTTP code indicates an issue with reaching
|
||||
the server, such as a transient network issue or potential downtime.
|
||||
|
||||
The healthcheck probe alone cannot determine if a MinIO server is offline - only
|
||||
that the current host machine cannot reach the server. Consider configuring
|
||||
a Prometheus :ref:`alert <minio-metrics-and-alerts-alerting>` using the
|
||||
:metric:`minio_cluster_nodes_offline_total` metric to detect whether one or
|
||||
more MinIO nodes are offline.
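
The liveness check can also be scripted. The following Python sketch mirrors
the ``curl -I`` command above by sending a ``HEAD`` request. It is a minimal
illustration under the assumptions stated in the comments, and the helper names
``liveness_url``, ``probe_status``, and ``is_live`` are hypothetical.

```python
import urllib.error
import urllib.request


def liveness_url(base_url: str) -> str:
    """Build the node liveness URL for a MinIO server address."""
    return f"{base_url.rstrip('/')}/minio/health/live"


def probe_status(url: str, timeout: float = 5.0):
    """Send a HEAD request and return the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with a non-2xx status code.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout, or TLS error.
        return None


def is_live(status) -> bool:
    """Only 200 OK counts as online; anything else needs investigation."""
    return status == 200


# Example call (requires a reachable MinIO server):
#   is_live(probe_status(liveness_url("https://minio.example.net:9000")))
```

A result of ``None`` only shows that the probing host could not reach the
server, which matches the caveat above about what the probe can determine.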
|
||||
|
||||
Cluster Write Quorum
|
||||
--------------------
|
||||
|
||||
Use the following endpoint to test if a MinIO cluster has
|
||||
:ref:`write quorum <minio-ec-parity>`:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
curl -I https://minio.example.net:9000/minio/health/cluster
|
||||
|
||||
Replace ``https://minio.example.net:9000`` with the DNS hostname of a node
|
||||
in the MinIO cluster to check. For clusters using a load balancer to manage
|
||||
incoming connections, specify the hostname for the load balancer.
|
||||
|
||||
A response code of ``200 OK`` indicates that the MinIO cluster has
|
||||
sufficient MinIO servers online to meet write quorum. A response code of
|
||||
``503 Service Unavailable`` indicates the cluster does not currently have
|
||||
write quorum.
|
||||
|
||||
The healthcheck probe alone cannot determine if a MinIO server is offline or
|
||||
processing write operations normally - only whether enough MinIO servers are
|
||||
online to meet write quorum requirements based on the configured
|
||||
:ref:`erasure code parity <minio-ec-parity>`. Consider configuring a Prometheus
|
||||
:ref:`alert <minio-metrics-and-alerts-alerting>` using one of the following
|
||||
metrics to detect potential issues or errors on the MinIO cluster:
|
||||
|
||||
- :metric:`minio_cluster_nodes_offline_total` to alert if one or more
|
||||
MinIO nodes are offline.
|
||||
|
||||
- :metric:`minio_node_disk_free_bytes` to alert if the cluster is running
|
||||
low on free disk space.
|
||||
|
||||
Cluster Read Quorum
|
||||
--------------------
|
||||
|
||||
Use the following endpoint to test if a MinIO cluster has
|
||||
:ref:`read quorum <minio-ec-parity>`:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
curl -I https://minio.example.net:9000/minio/health/cluster/read
|
||||
|
||||
Replace ``https://minio.example.net:9000`` with the DNS hostname of a node
|
||||
in the MinIO cluster to check. For clusters using a load balancer to manage
|
||||
incoming connections, specify the hostname for the load balancer.
|
||||
|
||||
A response code of ``200 OK`` indicates that the MinIO cluster has
|
||||
sufficient MinIO servers online to meet read quorum. A response code of
|
||||
``503 Service Unavailable`` indicates the cluster does not currently have
|
||||
read quorum.
|
||||
|
||||
The healthcheck probe alone cannot determine if a MinIO server is offline or
|
||||
processing read operations normally - only whether enough MinIO servers are
|
||||
online to meet read quorum requirements based on the configured
|
||||
:ref:`erasure code parity <minio-ec-parity>`. Consider configuring a Prometheus
|
||||
:ref:`alert <minio-metrics-and-alerts-alerting>` using the
|
||||
:metric:`minio_cluster_nodes_offline_total` metric to detect whether one or more
|
||||
MinIO nodes are offline.
|
||||
|
||||
Cluster Maintenance Check
|
||||
-------------------------
|
||||
|
||||
Use the following endpoint to test if the MinIO cluster can maintain
|
||||
both :ref:`read <minio-ec-parity>` and :ref:`write <minio-ec-parity>` quorum
|
||||
if the specified MinIO server is taken down for maintenance:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
curl -I https://minio.example.net:9000/minio/health/cluster?maintenance=true
|
||||
|
||||
Replace ``https://minio.example.net:9000`` with the DNS hostname of a node
|
||||
in the MinIO cluster to check. For clusters using a load balancer to manage
|
||||
incoming connections, specify the hostname for the load balancer.
|
||||
|
||||
A response code of ``200 OK`` indicates that the MinIO cluster has
|
||||
sufficient MinIO servers online to maintain quorum if the specified MinIO server
is taken down. A response code of
|
||||
``412 Precondition Failed`` indicates the cluster will lose quorum if the
|
||||
MinIO server goes offline.
|
||||
|
||||
The healthcheck probe alone cannot determine if a MinIO server is offline - only
|
||||
whether enough MinIO servers will be online after taking the node down for
|
||||
maintenance to meet read and write quorum requirements based on the configured
|
||||
:ref:`erasure code parity <minio-ec-parity>`. Consider configuring a Prometheus
|
||||
:ref:`alert <minio-metrics-and-alerts-alerting>` using the
|
||||
:metric:`minio_cluster_nodes_offline_total` metric to detect whether one or more
|
||||
MinIO nodes are offline.
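
The cluster-level checks differ only in path and in the status code that
signals a problem. The following Python sketch summarizes that mapping as
described on this page. It performs no network access: the status code is
whatever an HTTP client such as ``curl -I`` returned. The names
``CLUSTER_CHECKS``, ``check_url``, and ``interpret`` are hypothetical.

```python
# Paths as documented in the sections above.
CLUSTER_CHECKS = {
    "write": "/minio/health/cluster",
    "read": "/minio/health/cluster/read",
    "maintenance": "/minio/health/cluster?maintenance=true",
}


def check_url(base_url: str, check: str) -> str:
    """Build the healthcheck URL for a node or load balancer address."""
    return base_url.rstrip("/") + CLUSTER_CHECKS[check]


def interpret(check: str, status) -> str:
    """Translate a status code into the meaning documented for that check."""
    if check not in CLUSTER_CHECKS:
        raise KeyError(check)
    if status is None:
        return "unreachable"
    if status == 200:
        return "healthy"
    if check in ("write", "read") and status == 503:
        return "quorum lost"
    if check == "maintenance" and status == 412:
        return "quorum would be lost"
    return f"unexpected status {status}"
```

For example, ``interpret("maintenance", 412)`` returns
``"quorum would be lost"``, meaning the node should stay online for now.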
|
290
source/operations/monitoring/minio-logging.rst
Normal file
@ -0,0 +1,290 @@
|
||||
.. _minio-logging:
|
||||
|
||||
===================================================
|
||||
Publish Server or Audit Logs to an External Service
|
||||
===================================================
|
||||
|
||||
.. default-domain:: minio
|
||||
|
||||
.. contents:: Table of Contents
|
||||
:local:
|
||||
:depth: 1
|
||||
|
||||
MinIO publishes all :mc:`minio server` operations to the system console.
|
||||
Reading these logs depends on how the server process is managed.
|
||||
For example, if the server is managed through a ``systemd`` script,
|
||||
you can read the logs using ``journalctl -u SERVICENAME.service``. Replace
|
||||
``SERVICENAME`` with the name of the MinIO service.
|
||||
|
||||
MinIO also supports publishing server logs and audit logs to an HTTP webhook.
|
||||
|
||||
- :ref:`Server logs <minio-logging-publish-server-logs>` contain the same
|
||||
:mc:`minio server` operations logged to the system console. Server logs
|
||||
support general monitoring and troubleshooting of operations.
|
||||
|
||||
- :ref:`Audit logs <minio-logging-publish-audit-logs>` are more granular
|
||||
descriptions of each operation on the MinIO deployment. Audit logging
|
||||
supports security standards and regulations which require detailed tracking
|
||||
of operations.
|
||||
|
||||
MinIO publishes logs as JSON documents using a ``PUT`` request to each configured
|
||||
endpoint. The endpoint server is responsible for processing each JSON document.
|
||||
MinIO requires explicit configuration of each webhook endpoint and does *not*
|
||||
publish logs to a webhook by default.
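
For local testing, a very small receiver is enough to see the documents MinIO
sends. The following Python sketch is an illustration only and is not suitable
for production. It assumes an unauthenticated endpoint on ``127.0.0.1:8080``.
It accepts ``PUT`` as described above and, as a precaution, ``POST`` as well.
The names ``parse_log_document``, ``LogWebhookHandler``, and ``serve`` are
hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_log_document(body: bytes) -> dict:
    """Decode one request body into a log document (a JSON object)."""
    document = json.loads(body.decode("utf-8"))
    if not isinstance(document, dict):
        raise ValueError("expected a JSON object")
    return document


class LogWebhookHandler(BaseHTTPRequestHandler):
    def _handle(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            document = parse_log_document(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON, bad encoding, and non-object payloads.
            self.send_response(400)
            self.end_headers()
            return
        print(json.dumps(document))  # replace with real processing
        self.send_response(200)
        self.end_headers()

    do_PUT = _handle
    do_POST = _handle


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer((host, port), LogWebhookHandler).serve_forever()
```

Calling ``serve()`` starts the receiver. The webhook endpoint configured in
MinIO would then be ``http://127.0.0.1:8080`` under the assumptions above.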
|
||||
|
||||
.. _minio-logging-publish-server-logs:
|
||||
|
||||
Publish Server Logs to HTTP Webhook
|
||||
-----------------------------------
|
||||
|
||||
You can configure a new HTTP webhook endpoint to which MinIO publishes
|
||||
:mc:`minio server` logs using either environment variables *or* by setting
|
||||
runtime configuration settings.
|
||||
|
||||
.. tab-set::
|
||||
|
||||
.. tab-item:: Environment Variables
|
||||
|
||||
MinIO supports specifying the :mc:`minio server` log HTTP webhook endpoint
|
||||
and associated configuration settings using :ref:`environment variables
|
||||
<minio-sever-envvar-logging-regular>`.
|
||||
|
||||
The following example code sets *all* environment variables related to
|
||||
configuring a log HTTP webhook endpoint. The minimum *required* variables
|
||||
are:
|
||||
|
||||
- :envvar:`MINIO_LOGGER_WEBHOOK_ENABLE`
|
||||
- :envvar:`MINIO_LOGGER_WEBHOOK_ENDPOINT`
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
export MINIO_LOGGER_WEBHOOK_ENABLE_<IDENTIFIER>="on"
export MINIO_LOGGER_WEBHOOK_ENDPOINT_<IDENTIFIER>="https://webhook-1.example.net"
export MINIO_LOGGER_WEBHOOK_AUTH_TOKEN_<IDENTIFIER>="TOKEN"
|
||||
|
||||
- Replace ``<IDENTIFIER>`` with a unique descriptive string for the
|
||||
HTTP webhook endpoint. Use the same ``<IDENTIFIER>`` for all environment
|
||||
variables related to the new log HTTP webhook.
|
||||
|
||||
If the specified ``<IDENTIFIER>`` matches an existing log endpoint,
|
||||
the new settings *override* any existing settings for that endpoint.
|
||||
Use :mc-cmd:`mc admin config get logger_webhook <mc admin config get>`
|
||||
to review the currently configured log HTTP webhook endpoints.
|
||||
|
||||
- Replace ``https://webhook-1.example.net`` with the URL of the HTTP
|
||||
webhook endpoint.
|
||||
|
||||
- Replace ``TOKEN`` with a JSON Web Token (JWT) to use for authenticating
|
||||
to the webhook endpoints. Omit for endpoints which do not require
|
||||
authentication.
|
||||
|
||||
Restart the MinIO server to apply the new configuration settings. You
|
||||
must specify the same environment variables and settings on
|
||||
*all* MinIO servers in the deployment.
|
||||
|
||||
.. tab-item:: Configuration Settings
|
||||
|
||||
MinIO supports adding or updating log HTTP webhook endpoints on a MinIO
|
||||
deployment using the :mc-cmd:`mc admin config set` command and the
|
||||
:mc-conf:`logger_webhook` configuration key. You must restart the
|
||||
MinIO deployment to apply any new or updated configuration settings.
|
||||
|
||||
The following example code sets *all* settings related to configuring
|
||||
a log HTTP webhook endpoint. The minimum *required* setting is
|
||||
:mc-conf:`logger_webhook endpoint <logger_webhook.endpoint>`:
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
mc admin config set ALIAS/ logger_webhook:IDENTIFIER \
|
||||
endpoint="https://webhook-1.example.net" \
|
||||
auth_token="TOKEN"
|
||||
|
||||
- Replace ``IDENTIFIER`` with a unique descriptive string for the
HTTP webhook endpoint. Use the same ``IDENTIFIER`` for all configuration
settings related to the new log HTTP webhook.
|
||||
|
||||
If the specified ``IDENTIFIER`` matches an existing log endpoint,
|
||||
the new settings *override* any existing settings for that endpoint.
|
||||
Use :mc-cmd:`mc admin config get logger_webhook <mc admin config get>`
|
||||
to review the currently configured log HTTP webhook endpoints.
|
||||
|
||||
- Replace ``https://webhook-1.example.net`` with the URL of the HTTP
|
||||
webhook endpoint.
|
||||
|
||||
- Replace ``TOKEN`` with a JSON Web Token (JWT) to use for authenticating
|
||||
to the webhook endpoints. Omit for endpoints which do not require
|
||||
authentication.
|
||||
|
||||
.. _minio-logging-publish-audit-logs:
|
||||
|
||||
Publish Audit Logs to HTTP Webhook
|
||||
----------------------------------
|
||||
|
||||
You can configure a new HTTP webhook endpoint to which MinIO publishes audit
|
||||
logs using either environment variables *or* by setting runtime configuration
|
||||
settings:
|
||||
|
||||
.. tab-set::
|
||||
|
||||
.. tab-item:: Environment Variables
|
||||
|
||||
MinIO supports specifying the audit log HTTP webhook endpoint and
|
||||
associated configuration settings using :ref:`environment variables
|
||||
<minio-sever-envvar-logging-audit>`.
|
||||
|
||||
The following example code sets *all* environment variables related to
|
||||
configuring an audit log HTTP webhook endpoint. The minimum *required*
|
||||
variables are:
|
||||
|
||||
- :envvar:`MINIO_AUDIT_WEBHOOK_ENABLE`
|
||||
- :envvar:`MINIO_AUDIT_WEBHOOK_ENDPOINT`
|
||||
|
||||
.. code-block:: shell
|
||||
:class: copyable
|
||||
|
||||
export MINIO_AUDIT_WEBHOOK_ENABLE_<IDENTIFIER>="on"
export MINIO_AUDIT_WEBHOOK_ENDPOINT_<IDENTIFIER>="https://webhook-1.example.net"
export MINIO_AUDIT_WEBHOOK_AUTH_TOKEN_<IDENTIFIER>="TOKEN"
export MINIO_AUDIT_WEBHOOK_CLIENT_CERT_<IDENTIFIER>="cert.pem"
export MINIO_AUDIT_WEBHOOK_CLIENT_KEY_<IDENTIFIER>="cert.key"
|
||||
|
||||
- Replace ``<IDENTIFIER>`` with a unique descriptive string for the
|
||||
HTTP webhook endpoint. Use the same ``<IDENTIFIER>`` for all environment
|
||||
variables related to the new audit log HTTP webhook.
|
||||
|
||||
If the specified ``<IDENTIFIER>`` matches an existing log endpoint,
|
||||
the new settings *override* any existing settings for that endpoint.
|
||||
Use :mc-cmd:`mc admin config get audit_webhook <mc admin config get>`
|
||||
to review the currently configured audit log HTTP webhook endpoints.
|
||||
|
||||
- Replace ``https://webhook-1.example.net`` with the URL of the HTTP
|
||||
webhook endpoint.
|
||||
|
||||
- Replace ``TOKEN`` with a JSON Web Token (JWT) to use for authenticating
|
||||
to the webhook endpoints. Omit for endpoints which do not require
|
||||
authentication.
|
||||
|
||||
- Replace ``cert.pem`` and ``cert.key`` with the public certificate and private
key of the X.509 TLS certificate to present to the HTTP webhook server.
|
||||
Omit for endpoints which do not require clients to present TLS
|
||||
certificates.
|
||||
|
||||
Restart the MinIO server to apply the new configuration settings. You
must specify the same environment variables and settings on
*all* MinIO servers in the deployment.

.. tab-item:: Configuration Settings

MinIO supports adding or updating audit log HTTP webhook endpoints on a
MinIO deployment using the :mc-cmd:`mc admin config set` command and the
:mc-conf:`audit_webhook` configuration key. You must restart the MinIO
deployment to apply any new or updated configuration settings.

The following example code sets *all* settings related to configuring
an audit log HTTP webhook endpoint. The minimum *required* setting is
:mc-conf:`audit_webhook endpoint <audit_webhook.endpoint>`:

.. code-block:: shell
   :class: copyable

   mc admin config set ALIAS/ audit_webhook:IDENTIFIER \
      endpoint="https://webhook-1.example.net" \
      auth_token="TOKEN" \
      client_cert="cert.pem" \
      client_key="cert.key"

- Replace ``IDENTIFIER`` with a unique descriptive string for the
  HTTP webhook endpoint. Use the same ``IDENTIFIER`` for all
  settings related to the new audit log HTTP webhook.

  If the specified ``IDENTIFIER`` matches an existing log endpoint,
  the new settings *override* any existing settings for that endpoint.
  Use :mc-cmd:`mc admin config get audit_webhook <mc admin config get>`
  to review the currently configured audit log HTTP webhook endpoints.

- Replace ``https://webhook-1.example.net`` with the URL of the HTTP
  webhook endpoint.

- Replace ``TOKEN`` with a JSON Web Token (JWT) to use for authenticating
  to the webhook endpoint. Omit for endpoints which do not require
  authentication.

- Replace ``cert.pem`` and ``cert.key`` with the public and private key
  of the X.509 TLS certificate to present to the HTTP webhook server.
  Omit for endpoints which do not require clients to present TLS
  certificates.

Audit Log Structure
~~~~~~~~~~~~~~~~~~~

MinIO audit logs resemble the following JSON document:

- The ``api.timeToFirstByte`` and ``api.timeToResponse`` fields are expressed
  in nanoseconds.

- For :ref:`erasure coded setups <minio-erasure-coding>`,
  ``tags.objectErasureMap`` provides per-object details on the following:

  - The :ref:`Server Pool <minio-intro-server-pool>` on which the object
    operation was performed.

  - The :ref:`erasure set <minio-ec-erasure-set>` on which the object
    operation was performed.

  - The list of disks in the erasure set which participated in the
    object operation.

.. code-block:: json

   {
      "version": "1",
      "deploymentid": "bc0e4d1e-bacc-42eb-91ad-2d7f3eacfa8d",
      "time": "2019-08-12T21:34:37.187817748Z",
      "api": {
         "name": "PutObject",
         "bucket": "testbucket",
         "object": "hosts",
         "status": "OK",
         "statusCode": 200,
         "timeToFirstByte": "366333ns",
         "timeToResponse": "16438202ns"
      },
      "remotehost": "127.0.0.1",
      "requestID": "15BA4A72C0C70AFC",
      "userAgent": "MinIO (linux; amd64) minio-go/v6.0.32 mc/2019-08-12T18:27:13Z",
      "requestHeader": {
         "Authorization": "AWS4-HMAC-SHA256 Credential=minio/20190812/us-east-1/s3/aws4_request,SignedHeaders=host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length,Signature=d3f02a6aeddeb29b06e1773b6a8422112890981269f2463a26f307b60423177c",
         "Content-Length": "686",
         "Content-Type": "application/octet-stream",
         "User-Agent": "MinIO (linux; amd64) minio-go/v6.0.32 mc/2019-08-12T18:27:13Z",
         "X-Amz-Content-Sha256": "STREAMING-AWS4-HMAC-SHA256-PAYLOAD",
         "X-Amz-Date": "20190812T213437Z",
         "X-Amz-Decoded-Content-Length": "512"
      },
      "responseHeader": {
         "Accept-Ranges": "bytes",
         "Content-Length": "0",
         "Content-Security-Policy": "block-all-mixed-content",
         "ETag": "a414c889dc276457bd7175f974332cb0-1",
         "Server": "MinIO/DEVELOPMENT.2019-08-12T21-28-07Z",
         "Vary": "Origin",
         "X-Amz-Request-Id": "15BA4A72C0C70AFC",
         "X-Xss-Protection": "1; mode=block"
      },
      "tags": {
         "objectErasureMap": {
            "object": {
               "poolId": 1,
               "setId": 10,
               "disks": [
                  "http://server01/mnt/pool1/disk01",
                  "http://server02/mnt/pool1/disk02",
                  "http://server03/mnt/pool1/disk03",
                  "http://server04/mnt/pool1/disk04"
               ]
            }
         }
      }
   }
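
As a rough illustration of the nanosecond duration fields, the following
sketch strips the ``ns`` suffix from the two ``api`` timing values in the
example document and converts them to coarser units using integer division.
The variable names are hypothetical and chosen only for this example.

.. code-block:: shell
   :class: copyable

   # Values copied from the "api" object of the example audit log entry.
   time_to_first_byte="366333ns"
   time_to_response="16438202ns"

   # Remove the trailing "ns" unit to obtain plain integers.
   ttfb_ns="${time_to_first_byte%ns}"
   ttr_ns="${time_to_response%ns}"

   # Integer division truncates toward zero.
   echo "timeToFirstByte: $((ttfb_ns / 1000)) microseconds"
   echo "timeToResponse: $((ttr_ns / 1000000)) milliseconds"
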