mirror of
https://github.com/minio/docs.git
synced 2025-07-28 19:42:10 +03:00
Adding concept docs for operation and administration sections (#519)
Creates an administration/concepts.rst file. Adds content to the operation/concepts.rst file stub.
@ -10,35 +10,24 @@ Erasure Coding
|
||||
:local:
|
||||
:depth: 2
|
||||
|
||||
MinIO Erasure Coding is a data redundancy and availability feature that allows
|
||||
MinIO deployments to automatically reconstruct objects on-the-fly despite the
|
||||
loss of multiple drives or nodes in the cluster. Erasure Coding provides
|
||||
object-level healing with less overhead than adjacent technologies such as
|
||||
RAID or replication.
|
||||
MinIO Erasure Coding is a data redundancy and availability feature that allows MinIO deployments to automatically reconstruct objects on-the-fly despite the loss of multiple drives or nodes in the cluster.
|
||||
Erasure Coding provides object-level healing with significantly less overhead than adjacent technologies such as RAID or replication.
|
||||
|
||||
MinIO splits each new object into data and parity blocks, where parity blocks
|
||||
support reconstruction of missing or corrupted data blocks. MinIO writes these
|
||||
blocks to a single :ref:`erasure set <minio-ec-erasure-set>` in the deployment.
|
||||
Since erasure set drives are striped across the deployment, a given node
|
||||
typically contains only a portion of data or parity blocks for each object.
|
||||
MinIO can therefore tolerate the loss of multiple drives or nodes in the
|
||||
deployment depending on the configured parity and deployment topology.
|
||||
MinIO splits each new object into data and parity blocks, where parity blocks support reconstruction of missing or corrupted data blocks.
|
||||
MinIO writes these blocks to a single :ref:`erasure set <minio-ec-erasure-set>` in the deployment.
|
||||
Since erasure set drives are striped across the server pool, a given node contains only a portion of data or parity blocks for each object.
|
||||
MinIO can therefore tolerate the loss of multiple drives or nodes in the deployment depending on the configured parity and deployment topology.
|
||||
|
||||
.. image:: /images/erasure-code.jpg
|
||||
:width: 600px
|
||||
:alt: MinIO Erasure Coding example
|
||||
:align: center
|
||||
|
||||
At maximum parity, MinIO can tolerate the loss of up to half the drives per
|
||||
erasure set (``N/2-1``) and still perform read and write operations. MinIO
|
||||
defaults to 4 parity blocks per object with tolerance for the loss of 4 drives
|
||||
per erasure set. For more complete information on selecting erasure code parity,
|
||||
see :ref:`minio-ec-parity`.
|
||||
At maximum parity, MinIO can tolerate the loss of one fewer than half the drives per erasure set (:math:`(N / 2) - 1`, where :math:`N` is the number of drives in the set) and still perform read and write operations.
|
||||
MinIO defaults to 4 parity blocks per object with tolerance for the loss of 4 drives per erasure set.
|
||||
For more complete information on selecting erasure code parity, see :ref:`minio-ec-parity`.
|
||||
|
||||
Use the MinIO `Erasure Code Calculator
|
||||
<https://min.io/product/erasure-code-calculator?ref=docs>`__ when planning and
|
||||
designing your MinIO deployment to explore the effect of erasure code settings
|
||||
on your intended topology.
|
||||
Use the MinIO `Erasure Code Calculator <https://min.io/product/erasure-code-calculator?ref=docs>`__ when planning and designing your MinIO deployment to explore the effect of erasure code settings on your intended topology.
|
||||
|
||||
Zero-Parity Deployments
|
||||
-----------------------
|
||||
@ -53,50 +42,45 @@ Zero-parity deployments depend on the underlying storage for resiliency and avai
|
||||
Erasure Sets
|
||||
------------
|
||||
|
||||
An *Erasure Set* is a set of drives in a MinIO deployment that support Erasure
|
||||
Coding. MinIO evenly distributes object data and parity blocks among the drives
|
||||
in the Erasure Set. MinIO randomly and uniformly distributes the data and parity
|
||||
blocks across drives in the erasure set with *no overlap*. Each unique object
|
||||
has no more than one data or parity block per drive in the set.
|
||||
An *Erasure Set* is a set of drives in a MinIO deployment that support Erasure Coding.
|
||||
MinIO evenly distributes object data and parity blocks among the drives in the Erasure Set.
|
||||
MinIO randomly and uniformly distributes the data and parity blocks across drives in the erasure set with *no overlap*.
|
||||
Each unique object has no more than one data or parity block per drive in the set.
|
||||
|
||||
MinIO calculates the number and size of *Erasure Sets* by dividing the total
|
||||
number of drives in the :ref:`Server Pool <minio-intro-server-pool>` into sets
|
||||
consisting of between 4 and 16 drives each.
|
||||
MinIO calculates the number and size of *Erasure Sets* by dividing the total number of drives in the :ref:`Server Pool <minio-intro-server-pool>` into sets consisting of between 4 and 16 drives each.
|
||||
|
||||
Use the MinIO
|
||||
`Erasure Coding Calculator <https://min.io/product/erasure-code-calculator>`__
|
||||
to determine the optimal erasure set size for your preferred MinIO topology.
|
||||
For clusters, pools, or deployments with more than 16 drives, MinIO divides the drives into multiple erasure sets of the same number of drives.
|
||||
For this reason, the total number of drives in a deployment must be divisible evenly by a number between 4 and 16.
|
||||
|
||||
For example, 20 drives are divided into 2 erasure sets of 10 drives each.
|
||||
28 drives are divided into 2 erasure sets of 14 drives each.
|
||||
40 drives are divided into 4 erasure sets of 10 drives each.
|
||||
|
||||
Because numbers such as 17, 19, or 34 cannot be evenly divided by any number between 4 and 16, you cannot create a deployment with that number of drives.
|
||||
Add or remove drives to return to an allowable number of drives.
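
The divisibility rule above can be checked with a short script. This is a minimal sketch, not MinIO's actual set-selection logic: the function name ``valid_erasure_set_sizes`` is hypothetical, and printing the largest valid size is an assumption that happens to match the 20, 28, and 40 drive examples.

```python
def valid_erasure_set_sizes(total_drives, min_size=4, max_size=16):
    """Return every set size in [min_size, max_size] that divides total_drives evenly."""
    return [size for size in range(min_size, max_size + 1)
            if total_drives % size == 0]


for drives in (20, 28, 40, 17, 19, 34):
    sizes = valid_erasure_set_sizes(drives)
    if sizes:
        # Assumption for illustration only: show the largest valid set size.
        largest = max(sizes)
        print(f"{drives} drives: {drives // largest} sets of {largest}")
    else:
        print(f"{drives} drives: no valid erasure set size")
```

An empty result means the drive count cannot be split into equal sets of 4 to 16 drives.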
|
||||
|
||||
Use the MinIO `Erasure Coding Calculator <https://min.io/product/erasure-code-calculator>`__ to determine the optimal erasure set size for your preferred MinIO topology.
|
||||
|
||||
.. _minio-ec-parity:
|
||||
|
||||
Erasure Code Parity (``EC:N``)
|
||||
------------------------------
|
||||
|
||||
MinIO uses a Reed-Solomon algorithm to split objects into data and parity blocks
|
||||
based on the :ref:`Erasure Set <minio-ec-erasure-set>` size in the deployment.
|
||||
For a given erasure set of size ``M``, MinIO splits objects into ``N`` parity
|
||||
blocks and ``M-N`` data blocks.
|
||||
MinIO uses a Reed-Solomon algorithm to split objects into data and parity blocks based on the :ref:`Erasure Set <minio-ec-erasure-set>` size in the deployment.
|
||||
For a given erasure set of size ``M``, MinIO splits objects into ``N`` parity blocks and ``M-N`` data blocks.
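
The ``M`` and ``N`` arithmetic can be sketched as follows. The helper name ``split_blocks`` is hypothetical, and the usable ratio is simply data blocks divided by set size, which ignores any metadata or filesystem overhead.

```python
def split_blocks(set_size, parity):
    """Return (data_blocks, parity_blocks, usable_ratio) for one erasure set.

    set_size is M (drives in the erasure set); parity is N (parity blocks).
    """
    if not 0 <= parity <= set_size // 2:
        raise ValueError("parity must be between 0 and half the erasure set size")
    data = set_size - parity  # M - N data blocks
    return data, parity, data / set_size


for parity in (4, 6, 8):
    data, par, ratio = split_blocks(16, parity)
    print(f"EC:{parity}: {data} data + {par} parity blocks, {ratio:.1%} usable")
```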
|
||||
|
||||
MinIO uses the ``EC:N`` notation to refer to the number of parity blocks (``N``)
|
||||
in the deployment. MinIO defaults to ``EC:4`` or 4 parity blocks per object.
|
||||
MinIO uses the same ``EC:N`` value for all erasure sets and
|
||||
:ref:`server pools <minio-intro-server-pool>` in the deployment.
|
||||
MinIO uses the ``EC:N`` notation to refer to the number of parity blocks (``N``) in the deployment.
|
||||
MinIO defaults to ``EC:4`` or 4 parity blocks per object.
|
||||
MinIO uses the same ``EC:N`` value for all erasure sets and :ref:`server pools <minio-intro-server-pool>` in the deployment.
|
||||
|
||||
MinIO can tolerate the loss of up to ``N`` drives per erasure set and
|
||||
continue performing read and write operations ("quorum"). If ``N`` is equal
|
||||
to exactly 1/2 the drives in the erasure set, MinIO write quorum requires
|
||||
``N+1`` drives to avoid data inconsistency ("split-brain").
|
||||
MinIO can tolerate the loss of up to ``N`` drives per erasure set and continue performing read and write operations ("quorum").
|
||||
If ``N`` is equal to exactly 1/2 the drives in the erasure set, MinIO write quorum requires :math:`N + 1` drives to avoid data inconsistency ("split-brain").
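
The quorum rule described above can be expressed as a small calculation. This is a sketch derived only from the two sentences above; the function name ``quorum`` is hypothetical and is not part of any MinIO API.

```python
def quorum(set_size, parity):
    """Return (read_quorum, write_quorum) in drives for one erasure set."""
    read_q = set_size - parity   # up to `parity` drives may be lost
    write_q = set_size - parity
    if parity * 2 == set_size:
        # Parity is exactly half the set: writes need N + 1 drives.
        write_q += 1
    return read_q, write_q


print(quorum(16, 4))  # (12, 12)
print(quorum(16, 8))  # (8, 9)
```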
|
||||
|
||||
Setting the parity for a deployment is a balance between availability
|
||||
and total usable storage. Higher parity values increase resiliency to drive
|
||||
or node failure at the cost of usable storage, while lower parity provides
|
||||
maximum storage with reduced tolerance for drive/node failures.
|
||||
Use the MinIO `Erasure Code Calculator
|
||||
<https://min.io/product/erasure-code-calculator?ref=docs>`__ to explore the
|
||||
effect of parity on your planned cluster deployment.
|
||||
Setting the parity for a deployment is a balance between availability and total usable storage.
|
||||
Higher parity values increase resiliency to drive or node failure at the cost of usable storage, while lower parity provides maximum storage with reduced tolerance for drive/node failures.
|
||||
Use the MinIO `Erasure Code Calculator <https://min.io/product/erasure-code-calculator?ref=docs>`__ to explore the effect of parity on your planned cluster deployment.
|
||||
|
||||
The following table lists the outcome of varying erasure code parity levels on
|
||||
a MinIO deployment consisting of 1 node and 16 1TB drives:
|
||||
The following table lists the outcome of varying erasure code parity levels on a MinIO deployment consisting of 1 node and 16 1TB drives:
|
||||
|
||||
.. list-table:: Outcome of Parity Settings on a 16 Drive MinIO Cluster
|
||||
:header-rows: 1
|
||||
@ -132,14 +116,14 @@ a MinIO deployment consisting of 1 node and 16 1TB drives:
|
||||
Storage Classes
|
||||
~~~~~~~~~~~~~~~
|
||||
|
||||
MinIO supports storage classes with Erasure Coding to allow applications to
|
||||
specify per-object :ref:`parity <minio-ec-parity>`. Each storage class specifies
|
||||
a ``EC:N`` parity setting to apply to objects created with that class.
|
||||
MinIO supports redundancy storage classes with Erasure Coding to allow applications to specify per-object :ref:`parity <minio-ec-parity>`.
|
||||
Each storage class specifies an ``EC:N`` parity setting to apply to objects created with that class.
|
||||
|
||||
MinIO storage classes are *distinct* from Amazon Web Services
|
||||
:s3-docs:`storage classes <storage-class-intro.html>`. MinIO storage classes
|
||||
define *parity settings per object*, while AWS storage classes define *storage
|
||||
tiers per object*.
|
||||
MinIO storage classes for erasure coding are *distinct* from Amazon Web Services :s3-docs:`storage classes <storage-class-intro.html>` used for tiering.
|
||||
MinIO erasure coding storage classes define *parity settings per object*, while AWS storage classes define *storage tiers per object*.
|
||||
|
||||
.. note::
|
||||
For transitioning objects between storage classes for tiering purposes in MinIO, refer to the documentation on :ref:`lifecycle management <minio-lifecycle-management-tiering>`.
|
||||
|
||||
MinIO provides the following two storage classes:
|
||||
|
||||
@ -148,8 +132,7 @@ MinIO provides the following two storage classes:
|
||||
.. tab-item:: STANDARD
|
||||
|
||||
The ``STANDARD`` storage class is the default class for all objects.
|
||||
MinIO sets the ``STANDARD`` parity based on the number of volumes
|
||||
in the Erasure Set:
|
||||
MinIO sets the ``STANDARD`` parity based on the number of volumes in the Erasure Set:
|
||||
|
||||
.. list-table::
|
||||
:header-rows: 1
|
||||
@ -171,67 +154,51 @@ MinIO provides the following two storage classes:
|
||||
You can override the default ``STANDARD`` parity using either:
|
||||
|
||||
- The :envvar:`MINIO_STORAGE_CLASS_STANDARD` environment variable, *or*
|
||||
- The :mc:`mc admin config` command to modify the
|
||||
``storage_class.standard`` configuration setting.
|
||||
- The :mc:`mc admin config` command to modify the ``storage_class.standard`` configuration setting.
|
||||
|
||||
The maximum value is half of the total drives in the
|
||||
:ref:`Erasure Set <minio-ec-erasure-set>`. The minimum value is ``2``.
|
||||
The maximum value is half of the total drives in the :ref:`Erasure Set <minio-ec-erasure-set>`.
|
||||
The minimum value is ``2``.
|
||||
|
||||
``STANDARD`` parity *must* be greater than or equal to
|
||||
``REDUCED_REDUNDANCY``. If ``REDUCED_REDUNDANCY`` is unset, ``STANDARD``
|
||||
parity *must* be greater than 2.
|
||||
``STANDARD`` parity *must* be greater than or equal to ``REDUCED_REDUNDANCY``.
|
||||
If ``REDUCED_REDUNDANCY`` is unset, ``STANDARD`` parity *must* be greater than 2.
|
||||
|
||||
.. tab-item:: REDUCED_REDUNDANCY
|
||||
|
||||
The ``REDUCED_REDUNDANCY`` storage class allows creating objects with
|
||||
lower parity than ``STANDARD``. ``REDUCED_REDUNDANCY`` requires
|
||||
*at least* 5 drives in the MinIO deployment.
|
||||
The ``REDUCED_REDUNDANCY`` storage class allows creating objects with lower parity than ``STANDARD``.
|
||||
``REDUCED_REDUNDANCY`` requires *at least* 5 drives in the MinIO deployment.
|
||||
|
||||
MinIO sets the ``REDUCED_REDUNDANCY`` parity to ``EC:2`` by default.
|
||||
You can override ``REDUCED_REDUNDANCY`` storage class parity using
|
||||
either:
|
||||
You can override ``REDUCED_REDUNDANCY`` storage class parity using either:
|
||||
|
||||
- The :envvar:`MINIO_STORAGE_CLASS_RRS` environment variable, *or*
|
||||
- The :mc:`mc admin config` command to modify the
|
||||
``storage_class.rrs`` configuration setting.
|
||||
- The :mc:`mc admin config` command to modify the ``storage_class.rrs`` configuration setting.
|
||||
|
||||
``REDUCED_REDUNDANCY`` parity *must* be less than or equal to
|
||||
``STANDARD``.
|
||||
``REDUCED_REDUNDANCY`` parity *must* be less than or equal to ``STANDARD``.
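
The parity constraints on the two storage classes can be summarized in a small validation sketch. The function ``check_storage_classes`` is hypothetical and does not reflect how MinIO validates its configuration. It covers only the minimum, maximum, and ordering rules stated above, and it assumes the documented ``EC:2`` default for ``REDUCED_REDUNDANCY``.

```python
def check_storage_classes(set_size, standard, rrs=2):
    """Return a list of violated constraints (empty when the settings pass)."""
    problems = []
    if standard < 2:
        problems.append("STANDARD parity is below the minimum of 2")
    if standard > set_size // 2:
        problems.append("STANDARD parity exceeds half the erasure set")
    if rrs > standard:
        problems.append("REDUCED_REDUNDANCY parity exceeds STANDARD")
    return problems


print(check_storage_classes(16, 4))      # []
print(check_storage_classes(16, 9))      # one problem: above half the set
print(check_storage_classes(16, 3, 4))   # one problem: RRS above STANDARD
```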
|
||||
|
||||
MinIO references the ``x-amz-storage-class`` header in request metadata for
|
||||
determining which storage class to assign an object. The specific syntax
|
||||
or method for setting headers depends on your preferred method for
|
||||
interfacing with the MinIO server.
|
||||
MinIO references the ``x-amz-storage-class`` header in request metadata for determining which storage class to assign an object.
|
||||
The specific syntax or method for setting headers depends on your preferred method for interfacing with the MinIO server.
|
||||
|
||||
- For the :mc:`mc` command line tool, certain commands include a specific
|
||||
option for setting the storage class. For example, the :mc:`mc cp` command
|
||||
has the :mc-cmd:`~mc cp storage-class` option for specifying the
|
||||
storage class to assign to the object being copied.
|
||||
- For the :mc:`mc` command line tool, certain commands include a specific option for setting the storage class.
|
||||
For example, the :mc:`mc cp` command has the :mc-cmd:`~mc cp storage-class` option for specifying the storage class to assign to the object being copied.
|
||||
|
||||
- For MinIO SDKs, the ``S3Client`` object has specific methods for setting
|
||||
request headers. For example, the ``minio-go`` SDK ``S3Client.PutObject``
|
||||
method takes a ``PutObjectOptions`` data structure as a parameter.
|
||||
The ``PutObjectOptions`` data structure includes the ``StorageClass``
|
||||
option for specifying the storage class to assign to the object being
|
||||
created.
|
||||
- For MinIO SDKs, the ``S3Client`` object has specific methods for setting request headers.
|
||||
For example, the ``minio-go`` SDK ``S3Client.PutObject`` method takes a ``PutObjectOptions`` data structure as a parameter.
|
||||
The ``PutObjectOptions`` data structure includes the ``StorageClass`` option for specifying the storage class to assign to the object being created.
|
||||
|
||||
|
||||
.. _minio-ec-bitrot-protection:
|
||||
|
||||
BitRot Protection
|
||||
-----------------
|
||||
Bit Rot Protection
|
||||
------------------
|
||||
|
||||
.. TODO- ReWrite w/ more detail.
|
||||
|
||||
Silent data corruption or bitrot is a serious problem faced by disk drives
|
||||
resulting in data getting corrupted without the user’s knowledge. The reasons
|
||||
are manifold (ageing drives, current spikes, bugs in disk firmware, phantom
|
||||
writes, misdirected reads/writes, driver errors, accidental overwrites) but the
|
||||
result is the same - compromised data.
|
||||
Silent data corruption, or bit rot, is a serious problem for disk drives in which data becomes corrupted without the user’s knowledge.
|
||||
The corruption of data occurs when the electrical charge on a portion of the disk disperses or changes with no notification to or input from the user.
|
||||
Many events can lead to such a silent corruption of stored data.
|
||||
For example, ageing drives, current spikes, bugs in disk firmware, phantom writes, misdirected reads/writes, driver errors, accidental overwrites, or a random cosmic ray can each lead to a bit change.
|
||||
Whatever the cause, the result is the same: compromised data.
|
||||
|
||||
MinIO’s optimized implementation of the HighwayHash algorithm ensures that it
|
||||
will never read corrupted data - it captures and heals corrupted objects on the
|
||||
fly. Integrity is ensured from end to end by computing a hash on READ and
|
||||
verifying it on WRITE from the application, across the network and to the
|
||||
memory/drive. The implementation is designed for speed and can achieve hashing
|
||||
speeds over 10 GB/sec on a single core on Intel CPUs.
|
||||
MinIO’s optimized implementation of the :minio-git:`HighwayHash algorithm <highwayhash/blob/master/README.md>` ensures that it captures and heals corrupted objects on the fly.
|
||||
Integrity is ensured from end to end by computing a hash on READ and verifying it on WRITE from the application, across the network, and to the memory or drive.
|
||||
The implementation is designed for speed and can achieve hashing speeds over 10 GB/sec on a single core on Intel CPUs.
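
The general idea of detecting bit rot with a checksum can be illustrated as follows. This is a sketch only: it uses BLAKE2b from the Python standard library as a stand-in for HighwayHash, and the helper names ``checksum`` and ``is_intact`` are hypothetical. It does not reflect MinIO's on-disk format or its healing process.

```python
import hashlib


def checksum(block: bytes) -> str:
    # Stand-in hash (BLAKE2b), not HighwayHash.
    return hashlib.blake2b(block, digest_size=32).hexdigest()


def is_intact(block: bytes, stored_checksum: str) -> bool:
    """Recompute the checksum and compare it with the stored value."""
    return checksum(block) == stored_checksum


original = b"object data block"
stored = checksum(original)

corrupted = bytearray(original)
corrupted[0] ^= 0x01  # flip a single bit to simulate bit rot

print(is_intact(original, stored))          # True
print(is_intact(bytes(corrupted), stored))  # False
```

A block that fails the check would then be a candidate for reconstruction from the remaining data and parity blocks.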
|
||||
|