mirror of
https://github.com/minio/docs.git
synced 2025-07-28 19:42:10 +03:00
DOCS-369: Fix incorrect erasure code parity calculation
@ -16,84 +16,48 @@ loss of multiple drives or nodes in the cluster. Erasure Coding provides
|
|||||||
object-level healing with less overhead than adjacent technologies such as
|
object-level healing with less overhead than adjacent technologies such as
|
||||||
RAID or replication.
|
RAID or replication.
|
||||||
|
|
||||||
Erasure Coding splits objects into data and parity blocks, where parity blocks
|
MinIO splits each new object into data and parity blocks, where parity blocks
|
||||||
support reconstruction of missing or corrupted data blocks. MinIO distributes
|
support reconstruction of missing or corrupted data blocks. MinIO writes these
|
||||||
both data and parity blocks across :mc:`minio server` nodes and drives in an
|
blocks to a single :ref:`erasure set <minio-ec-erasure-set>` in the deployment.
|
||||||
:ref:`Erasure Set <minio-ec-erasure-set>`. Depending on the configured parity,
|
Since erasure set drives are striped across the deployment, a given node
|
||||||
number of nodes, and number of drives per node in the Erasure Set, MinIO can
|
typically contains only a portion of data or parity blocks for each object.
|
||||||
tolerate the loss of up to half (``N/2``) of drives and still retrieve stored
|
MinIO can therefore tolerate the loss of multiple drives or nodes in the
|
||||||
objects.
|
deployment depending on the configured parity and deployment topology.
|
||||||
|
|
||||||
For example, consider a small-scale MinIO deployment consisting of a
|
.. image:: /images/erasure-code.jpg
|
||||||
single :ref:`Server Pool <minio-intro-server-pool>` with 4 :mc:`minio server`
|
:width: 600px
|
||||||
nodes. Each node in the deployment has 4 locally attached ``1Ti`` drives for
|
:alt: MinIO Erasure Coding example
|
||||||
a total of 16 drives.
|
:align: center
|
||||||
|
|
||||||
MinIO creates :ref:`Erasure Sets <minio-ec-erasure-set>` by dividing the total
|
At maximum parity, MinIO can tolerate the loss of one less than half the drives per
|
||||||
number of drives in the deployment into sets consisting of between 4 and 16
|
erasure set (``N/2-1``) and still perform read and write operations. MinIO
|
||||||
drives each. In the example deployment, the largest possible Erasure Set size
|
defaults to 4 parity blocks per object with tolerance for the loss of 4 drives
|
||||||
that evenly divides into the total number of drives is ``16``.
|
per erasure set. For more complete information on selecting erasure code parity,
|
||||||
|
see :ref:`minio-ec-parity`.
|
||||||
|
|
||||||
MinIO uses a Reed-Solomon algorithm to split objects into data and parity blocks
|
Erasure coding requires a minimum of 4 drives and is only available with
|
||||||
based on the size of the Erasure Set. MinIO then uniformly distributes the
|
:ref:`distributed <minio-installation-comparison>` MinIO deployments. Erasure
|
||||||
data and parity blocks across the Erasure Set drives such that each drive
|
coding is a core requirement for the following MinIO features:
|
||||||
in the set contains no more than one block per object. MinIO uses
|
|
||||||
the ``EC:N`` notation to refer to the number of parity blocks (``N``) in the
|
|
||||||
Erasure Set.
|
|
||||||
|
|
||||||
The number of parity blocks in a deployment controls the deployment's relative
|
- :ref:`Object Versioning <minio-bucket-versioning>`
|
||||||
data redundancy. Higher levels of parity allow for higher tolerance of drive
|
- :ref:`Server-Side Replication <minio-bucket-replication>`
|
||||||
loss at the cost of total available storage. For example, using EC:4 in our
|
- :ref:`Write-Once Read-Many Locking <minio-bucket-locking>`
|
||||||
example deployment results in 12 data blocks and 4 parity blocks. The parity
|
|
||||||
blocks take up some portion of space in the deployment, reducing total storage.
|
|
||||||
*However*, the parity blocks allow MinIO to reconstruct the object with only
|
|
||||||
8 data blocks, increasing resilience to data corruption or loss.
|
|
||||||
|
|
||||||
The following table lists the outcome of varying EC levels on the example
|
Use the MinIO `Erasure Code Calculator
|
||||||
deployment:
|
<https://min.io/product/erasure-code-calculator?ref=docs>`__ when planning and
|
||||||
|
designing your MinIO deployment to explore the effect of erasure code settings
|
||||||
.. list-table:: Outcome of Parity Settings on a 16 Drive MinIO Cluster
|
on your intended topology.
|
||||||
:header-rows: 1
|
|
||||||
:widths: 20 20 20 20 20
|
|
||||||
:width: 100%
|
|
||||||
|
|
||||||
* - Parity
|
|
||||||
- Total Storage
|
|
||||||
- Storage Ratio
|
|
||||||
- Minimum Drives for Read Operations
|
|
||||||
- Minimum Drives for Write Operations
|
|
||||||
|
|
||||||
* - ``EC: 4`` (Default)
|
|
||||||
- 12 Tebibytes
|
|
||||||
- 0.750
|
|
||||||
- 12
|
|
||||||
- 13
|
|
||||||
|
|
||||||
* - ``EC: 6``
|
|
||||||
- 10 Tebibytes
|
|
||||||
- 0.625
|
|
||||||
- 10
|
|
||||||
- 11
|
|
||||||
|
|
||||||
* - ``EC: 8``
|
|
||||||
- 8 Tebibytes
|
|
||||||
- 0.500
|
|
||||||
- 8
|
|
||||||
- 9
|
|
||||||
|
|
||||||
- For more information on Erasure Sets, see :ref:`minio-ec-erasure-set`.
|
|
||||||
|
|
||||||
- For more information on selecting Erasure Code Parity, see
|
|
||||||
:ref:`minio-ec-parity`
|
|
||||||
|
|
||||||
.. _minio-ec-erasure-set:
|
.. _minio-ec-erasure-set:
|
||||||
|
|
||||||
Erasure Sets
|
Erasure Sets
|
||||||
------------
|
------------
|
||||||
|
|
||||||
An *Erasure Set* is a set of drives in a MinIO deployment that support
|
An *Erasure Set* is a set of drives in a MinIO deployment that support Erasure
|
||||||
Erasure Coding. MinIO evenly distributes object data and parity blocks among
|
Coding. MinIO evenly distributes object data and parity blocks among the drives
|
||||||
the drives in the Erasure Set.
|
in the Erasure Set. MinIO randomly and uniformly distributes the data and parity
|
||||||
|
blocks across drives in the erasure set with *no overlap*. Each unique object
|
||||||
|
has no more than one data or parity block per drive in the set.
|
||||||
|
|
||||||
MinIO calculates the number and size of *Erasure Sets* by dividing the total
|
MinIO calculates the number and size of *Erasure Sets* by dividing the total
|
||||||
number of drives in the :ref:`Server Pool <minio-intro-server-pool>` into sets
|
number of drives in the :ref:`Server Pool <minio-intro-server-pool>` into sets
|
||||||
@ -131,50 +95,59 @@ Erasure Code Parity (``EC:N``)
|
|||||||
------------------------------
|
------------------------------
|
||||||
|
|
||||||
MinIO uses a Reed-Solomon algorithm to split objects into data and parity blocks
|
MinIO uses a Reed-Solomon algorithm to split objects into data and parity blocks
|
||||||
based on the size of the Erasure Set. MinIO uses parity blocks to automatically
|
based on the :ref:`Erasure Set <minio-ec-erasure-set>` size in the deployment.
|
||||||
heal damaged or missing data blocks when reconstructing an object. MinIO uses
|
For a given erasure set of size ``M``, MinIO splits objects into ``N`` parity
|
||||||
the ``EC:N`` notation to refer to the number of parity blocks (``N``) in the
|
blocks and ``M-N`` data blocks.
|
||||||
Erasure Set.
|
|
||||||
|
|
||||||
MinIO uses a hash of an object's name to determine into which Erasure Set to
|
MinIO uses the ``EC:N`` notation to refer to the number of parity blocks (``N``)
|
||||||
store that object. MinIO always uses that erasure set for objects with a
|
in the deployment. MinIO defaults to ``EC:4`` or 4 parity blocks per object.
|
||||||
matching name. For example, MinIO stores all :ref:`versions
|
MinIO uses the same ``EC:N`` value for all erasure sets and
|
||||||
<minio-bucket-versioning>` of an object in the same Erasure Set.
|
:ref:`server pools <minio-intro-server-pool>` in the deployment.
|
||||||
|
|
||||||
After MinIO selects an object's Erasure Set, it divides the object based on the
|
MinIO can tolerate the loss of up to ``N`` drives per erasure set and
|
||||||
number of drives in the set and the configured parity. MinIO creates:
|
continue performing read and write operations ("quorum"). If ``N`` is equal
|
||||||
|
to exactly 1/2 the drives in the erasure set, MinIO write quorum requires
|
||||||
|
``N+1`` drives to avoid data inconsistency ("split-brain").
|
||||||
|
|
||||||
- ``(Erasure Set Drives) - EC:N`` Data Blocks, *and*
|
Setting the parity for a deployment is a balance between availability
|
||||||
- ``EC:N`` Parity Blocks.
|
and total usable storage. Higher parity values increase resiliency to drive
|
||||||
|
or node failure at the cost of usable storage, while lower parity provides
|
||||||
|
maximum storage with reduced tolerance for drive/node failures.
|
||||||
|
Use the MinIO `Erasure Code Calculator
|
||||||
|
<https://min.io/product/erasure-code-calculator?ref=docs>`__ to explore the
|
||||||
|
effect of parity on your planned cluster deployment.
|
||||||
|
|
||||||
MinIO randomly and uniformly distributes the data and parity blocks across
|
The following table lists the outcome of varying erasure code parity levels on
|
||||||
drives in the erasure set with *no overlap*. While a drive may contain both data
|
a MinIO deployment consisting of 1 node and 16 1 TiB drives:
|
||||||
and parity blocks for multiple unique objects, a single unique object has no
|
|
||||||
more than one block per drive in the set. For versioned objects, MinIO selects
|
|
||||||
the same drives for both data and parity storage while maintaining zero overlap
|
|
||||||
on any single drive.
|
|
||||||
|
|
||||||
The specified parity for an object also dictates the minimum number of Erasure
|
.. list-table:: Outcome of Parity Settings on a 16 Drive MinIO Cluster
|
||||||
Set drives ("Quorum") required for MinIO to either read or write that object:
|
:header-rows: 1
|
||||||
|
:widths: 20 20 20 20 20
|
||||||
|
:width: 100%
|
||||||
|
|
||||||
.. _minio-read-quorum:
|
* - Parity
|
||||||
|
- Total Storage
|
||||||
|
- Storage Ratio
|
||||||
|
- Minimum Drives for Read Operations
|
||||||
|
- Minimum Drives for Write Operations
|
||||||
|
|
||||||
Read Quorum
|
* - ``EC: 4`` (Default)
|
||||||
The minimum number of Erasure Set drives required for MinIO to
|
- 12 Tebibytes
|
||||||
serve read operations. MinIO can automatically reconstruct an object
|
- 0.750
|
||||||
with corrupted or missing data blocks if enough drives are online to
|
- 12
|
||||||
provide Read Quorum for that object.
|
- 12
|
||||||
|
|
||||||
MinIO Read Quorum is ``DRIVES - (EC:N)``.
|
|
||||||
|
|
||||||
.. _minio-write-quorum:
|
* - ``EC: 6``
|
||||||
|
- 10 Tebibytes
|
||||||
|
- 0.625
|
||||||
|
- 10
|
||||||
|
- 10
|
||||||
|
|
||||||
Write Quorum
|
* - ``EC: 8``
|
||||||
The minimum number of Erasure Set drives required for MinIO
|
- 8 Tebibytes
|
||||||
to serve write operations. MinIO requires enough available drives to
|
- 0.500
|
||||||
eliminate the risk of split-brain scenarios.
|
- 8
|
||||||
|
- 9
|
||||||
MinIO Write Quorum is ``(DRIVES - (EC:N)) + 1``.
|
|
||||||
|
|
||||||
.. _minio-ec-storage-class:
|
.. _minio-ec-storage-class:
|
||||||
|
|
||||||
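The parity arithmetic behind the updated table can be sketched in a few lines of Python. This is an illustration only, not MinIO's implementation: the function name ``parity_outcome`` and its parameters are hypothetical, and the sketch assumes a single erasure set of equally sized drives. It applies the rule stated in the new text: read quorum equals the number of data blocks, and write quorum needs one extra drive only when parity is exactly half the erasure set.

```python
def parity_outcome(set_size, parity, drive_tib=1.0):
    """Estimate usable storage and quorum for one erasure set.

    set_size  -- number of drives in the erasure set (hypothetical parameter)
    parity    -- the N in EC:N, parity blocks per object
    drive_tib -- capacity of each drive in TiB (assumed identical drives)
    """
    if set_size < 1:
        raise ValueError("erasure set must contain at least one drive")
    if not 0 <= parity <= set_size // 2:
        raise ValueError("parity must be between 0 and half the erasure set size")

    data_blocks = set_size - parity
    read_quorum = data_blocks
    # At exactly half parity, writes need one extra drive to avoid split-brain.
    write_quorum = data_blocks + 1 if parity * 2 == set_size else data_blocks

    return {
        "usable_tib": data_blocks * drive_tib,
        "storage_ratio": data_blocks / set_size,
        "read_quorum": read_quorum,
        "write_quorum": write_quorum,
    }


if __name__ == "__main__":
    # Reproduce the rows of the 16-drive example table.
    for parity in (4, 6, 8):
        print(f"EC:{parity}", parity_outcome(16, parity))
```

For the 16-drive example, this yields read and write quorum of 12 for ``EC:4``, 10 for ``EC:6``, and 8 (read) and 9 (write) for ``EC:8``, which matches the corrected values on the new side of the diff.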
@ -225,6 +198,8 @@ MinIO provides the following two storage classes:
|
|||||||
The maximum value is half of the total drives in the
|
The maximum value is half of the total drives in the
|
||||||
:ref:`Erasure Set <minio-ec-erasure-set>`.
|
:ref:`Erasure Set <minio-ec-erasure-set>`.
|
||||||
|
|
||||||
|
The minimum value is ``2``.
|
||||||
|
|
||||||
``STANDARD`` parity *must* be greater than or equal to
|
``STANDARD`` parity *must* be greater than or equal to
|
||||||
``REDUCED_REDUNDANCY``. If ``REDUCED_REDUNDANCY`` is unset, ``STANDARD``
|
``REDUCED_REDUNDANCY``. If ``REDUCED_REDUNDANCY`` is unset, ``STANDARD``
|
||||||
parity *must* be greater than 2
|
parity *must* be greater than 2
|
||||||
|
BIN
source/images/erasure-code.jpg
Normal file
Binary file not shown.
After Width: | Height: | Size: 106 KiB |
@ -25,10 +25,13 @@ Standalone Deployments
|
|||||||
to the full set of MinIO's advanced S3 features and functionality.
|
to the full set of MinIO's advanced S3 features and functionality.
|
||||||
|
|
||||||
Distributed Deployments
|
Distributed Deployments
|
||||||
Two or more MinIO servers with multiple storage volumes per server.
|
One or more MinIO servers with *at least* four total storage volumes across
|
||||||
Distributed deployments are best for production environments and
|
all servers. Distributed deployments are best for production environments and
|
||||||
workloads and support all of MinIO's core and advanced S3 features and
|
workloads and support all of MinIO's core and advanced S3 features and
|
||||||
functionality.
|
functionality.
|
||||||
|
|
||||||
|
MinIO recommends a baseline topology of 4 nodes with 4 drives each
|
||||||
|
for production environments.
|
||||||
|
|
||||||
.. _minio-installation-comparison:
|
.. _minio-installation-comparison:
|
||||||
|
|
||||||
|
@ -43,7 +43,7 @@ Cluster Write Quorum
|
|||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
Use the following endpoint to test if a MinIO cluster has
|
Use the following endpoint to test if a MinIO cluster has
|
||||||
:ref:`write quorum <minio-write-quorum>`:
|
:ref:`write quorum <minio-ec-parity>`:
|
||||||
|
|
||||||
.. code-block:: shell
|
.. code-block:: shell
|
||||||
:class: copyable
|
:class: copyable
|
||||||
@ -76,7 +76,7 @@ Cluster Read Quorum
|
|||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
Use the following endpoint to test if a MinIO cluster has
|
Use the following endpoint to test if a MinIO cluster has
|
||||||
:ref:`read quorum <minio-read-quorum>`:
|
:ref:`read quorum <minio-ec-parity>`:
|
||||||
|
|
||||||
.. code-block:: shell
|
.. code-block:: shell
|
||||||
:class: copyable
|
:class: copyable
|
||||||
@ -104,7 +104,7 @@ Cluster Maintenance Check
|
|||||||
-------------------------
|
-------------------------
|
||||||
|
|
||||||
Use the following endpoint to test if the MinIO cluster can maintain
|
Use the following endpoint to test if the MinIO cluster can maintain
|
||||||
both :ref:`read <minio-read-quorum>` and :ref:`write <minio-write-quorum>`
|
both :ref:`read <minio-ec-parity>` and :ref:`write <minio-ec-parity>`
|
||||||
if the specified MinIO server is taken down for maintenance:
|
if the specified MinIO server is taken down for maintenance:
|
||||||
|
|
||||||
.. code-block:: shell
|
.. code-block:: shell
|
||||||
|