Recover after Hardware Failure
Distributed MinIO deployments rely on Erasure Coding <minio-erasure-coding> to provide built-in tolerance for multiple drive or node failures. Depending on the deployment topology and the selected erasure code parity, MinIO can tolerate the loss of up to half the drives or nodes in the deployment while maintaining read access ("read quorum") to objects.
The following table lists the typical types of failure in a MinIO deployment and links to procedures for recovering from each:
| Failure Type | Description |
|---|---|
| Drive Failure <minio-restore-hardware-failure-drive> | MinIO supports hot-swapping failed drives with new healthy drives. |
| Node Failure <minio-restore-hardware-failure-node> | MinIO detects when a node rejoins the deployment and begins proactively healing data previously stored on that node shortly after it rejoins the cluster. |
| Site Failure <minio-restore-hardware-failure-site> | MinIO Site Replication supports complete resynchronization of buckets, objects, and replication-eligible configuration settings after total site loss (see the example following this table). |
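For the site failure case, one way to verify site replication health from any surviving peer is `mc admin replicate status`; a minimal sketch, assuming an alias named `site1` for one of the peers (a placeholder, not a value from this guide):

```shell
# List the configured site replication peers.
mc admin replicate info site1

# Report replication status across peers, including any pending
# resynchronization after a lost site is restored or replaced.
mc admin replicate status site1
```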
Because MinIO can operate in a degraded state without significant performance loss, administrators can schedule hardware replacement in proportion to the rate and severity of hardware failure. A "normal" failure (a single drive or node) may allow for a more relaxed replacement timeframe, while a "critical" failure (multiple drives or nodes) may require a faster response.
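To gauge the current extent of a failure, you can check per-server drive and node status with `mc admin info`; a minimal sketch, again assuming the placeholder alias `myminio`:

```shell
# Summarize the deployment, including online/offline status
# for each server and its drives.
mc admin info myminio
```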
For nodes with one or more drives that have partially failed or are operating in a degraded state (increasing drive errors, SMART warnings, timeouts in MinIO logs, etc.), you can safely unmount the drive if the cluster has sufficient remaining healthy drives to maintain read and write quorum <minio-ec-parity>. Missing drives are less disruptive to the deployment than drives that reliably produce read and write errors.
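A minimal sketch of taking a degraded drive offline, assuming a device `/dev/sdf` mounted at `/mnt/drive4` (both placeholders; substitute the paths from your own deployment):

```shell
# Review SMART health data before pulling the drive (requires smartmontools).
smartctl -a /dev/sdf

# Unmount the degraded drive. MinIO continues serving requests from the
# remaining drives as long as read and write quorum hold.
umount /mnt/drive4
```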
MinIO Professional Support
MinIO SUBNET users can log in and create a new issue related to drive, node, or site failures. Coordinating with MinIO Engineering via SUBNET can help ensure successful recovery of production MinIO deployments, including root-cause analysis and health diagnostics.
Community users can seek support on the MinIO Community Slack. Community Support is best-effort only and has no SLAs around responsiveness.
- /operations/data-recovery/recover-after-drive-failure
- /operations/data-recovery/recover-after-node-failure
- /operations/data-recovery/recover-after-site-failure