DOCS-763 part 2: Eco feedback (#853)
Context: 89bc154bbf
Responding to Eco's review points.
Staged: http://192.241.195.202:9000/staging/DOCS-763-2/linux/index.html
@@ -35,7 +35,7 @@ A production MinIO deployment consists of at least 4 MinIO hosts with homogeneou
Each MinIO host in this pool has matching compute, storage, and network configurations

MinIO provides best performance when using direct-attached storage (DAS), such as NVMe or SSD drives attached to a PCI-E controller board on the host machine.
MinIO provides best performance when using locally-attached storage, such as NVMe or SSD drives attached to a PCI-E controller board on the host machine.
Storage controllers should present XFS-formatted drives in "Just a Bunch of Drives" (JBOD) configurations with no RAID, pooling, or other hardware/software resiliency layers.
MinIO recommends against caching, either at the drive or the controller layer.
Either type of caching can cause :abbr:`I/O (Input / Output)` spikes as the cache fills and clears, resulting in unpredictable performance.
@@ -49,8 +49,8 @@ MinIO provides best performance when using direct-attached storage (DAS), such a

MinIO automatically groups drives in the pool into :ref:`erasure sets <minio-ec-erasure-set>`.
Erasure sets are the foundational component of MinIO :ref:`availability and resiliency <minio_availability-resiliency>`.
MinIO stripes erasure sets across the nodes in the pool to maintain even distribution of erasure set drives.
MinIO then shards objects into data and parity blocks based on the deployment :ref:`parity <minio-ec-parity>` and distributes them across an erasure set.
MinIO stripes erasure sets symmetrically across the nodes in the pool to maintain even distribution of erasure set drives.
MinIO then partitions objects into data and parity shards based on the deployment :ref:`parity <minio-ec-parity>` and distributes them across an erasure set.

For a more complete discussion of MinIO redundancy and healing, see :ref:`minio-erasure-coding`.
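As a rough sketch of the shard arithmetic (not MinIO's actual Reed-Solomon encoding), an erasure set of ``N`` drives with parity ``EC:M`` stores each object as ``N - M`` data shards plus ``M`` parity shards; the ``shard_counts`` helper and the 8-drive set size below are purely illustrative.

.. code-block:: python

   # Illustrative only: shows the shard counts, not the Reed-Solomon encoding itself.
   def shard_counts(set_drives: int, parity: int) -> tuple[int, int]:
       """Return (data_shards, parity_shards) for one object on one erasure set."""
       data_shards = set_drives - parity
       return data_shards, parity

   # e.g. parity EC:4 on a hypothetical 8-drive erasure set -> 4 data + 4 parity shards
   print(shard_counts(8, 4))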
@@ -61,7 +61,7 @@ MinIO automatically groups drives in the pool into :ref:`erasure sets <minio-ec-

With the default parity of ``EC:4``, MinIO shards the object into 4 data and 4 parity blocks, distributing them across the drives in the erasure set.

MinIO uses a deterministic algorithm to select the erasure set for a given object.
MinIO uses a deterministic hashing algorithm based on object name and path to select the erasure set for a given object.
For each unique object namespace ``BUCKET/PREFIX/[PREFIX/...]/OBJECT.EXTENSION``, MinIO always selects the same erasure set for read/write operations.
MinIO handles all routing within pools and erasure sets, making the select/read/write process entirely transparent to applications.
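The sketch below only illustrates the deterministic-selection idea; the ``select_erasure_set`` helper and its SHA-256 hash are assumptions for illustration and do not reproduce MinIO's internal hashing algorithm.

.. code-block:: python

   import hashlib

   def select_erasure_set(bucket: str, object_key: str, num_sets: int) -> int:
       """Hash the full object namespace so the same name always maps to the same set."""
       namespace = f"{bucket}/{object_key}"   # BUCKET/PREFIX/[PREFIX/...]/OBJECT.EXTENSION
       digest = hashlib.sha256(namespace.encode()).digest()
       return int.from_bytes(digest[:8], "big") % num_sets

   # Repeated reads and writes of the same object name route to the same erasure set.
   assert select_erasure_set("invoices", "2024/04/inv-001.pdf", 4) == \
          select_erasure_set("invoices", "2024/04/inv-001.pdf", 4)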
@@ -93,6 +93,7 @@ You can expand a MinIO deployment's available storage through :ref:`pool expansi
The pool which contains the correct erasure set then responds to the operation, remaining entirely transparent to the application.

If you modify the MinIO topology through pool expansion, you can update your applications by modifying the load balancer to include the new pool's nodes.
Applications can continue using the load balancer address for the MinIO deployment without any updates or modifications.
This ensures even distribution of requests across all pools, while applications continue using the single load balancer URL for MinIO operations.

.. figure:: /images/architecture/architecture-load-balancer-multi-pool.svg
@@ -104,9 +105,7 @@ You can expand a MinIO deployment's available storage through :ref:`pool expansi
Once identified, MinIO partitions the object and distributes the data and parity shards across the appropriate set.

Client applications can use any S3-compatible SDK or library to interact with the MinIO deployment.
MinIO publishes its own :ref:`drivers <minio-drivers>` specifically intended for use with S3-compatible deployments.
Regardless of the driver, the S3 API uses HTTP methods like ``GET`` and ``POST`` for all operations.
Neither MinIO nor S3 implements proprietary wire protocols or other low-level interfaces for normal operations.
MinIO publishes its own :ref:`SDK <minio-drivers>` specifically intended for use with S3-compatible deployments.

.. figure:: /images/architecture/architecture-multiple-clients.svg
   :figwidth: 100%
@@ -115,11 +114,12 @@ Client applications can use any S3-compatible SDK or library to interact with th
Clients using a variety of S3-compatible SDKs can perform operations against the same MinIO deployment.

MinIO uses a strict implementation of the S3 API, including requiring clients to sign all operations using AWS :s3-api:`Signature V4 <sig-v4-authenticating-requests.html>` or the legacy Signature V2.
AWS signature calculation uses the client-provided headers, such that any modification to those headers by load balancers, proxies, security programs, or other components can result in signature mismatch errors.
AWS signature calculation uses the client-provided headers, such that any modification to those headers by load balancers, proxies, security programs, or other components will result in signature mismatch errors and request failure.
Ensure any such intermediate components support pass-through of unaltered headers from client to server.
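A simplified sketch of why header rewrites break Signature V4: the signed headers feed into the canonical request that both client and server hash, so any change in transit yields a different signature. The helper below only approximates the real canonical-request format, and all values are placeholders.

.. code-block:: python

   import hashlib

   def canonical_request_hash(method: str, path: str, headers: dict, payload_hash: str) -> str:
       """Approximate SigV4 canonical request: signed headers are part of the hashed input."""
       canonical_headers = "".join(f"{k.lower()}:{v.strip()}\n" for k, v in sorted(headers.items()))
       signed_headers = ";".join(sorted(k.lower() for k in headers))
       canonical = "\n".join([method, path, "", canonical_headers, signed_headers, payload_hash])
       return hashlib.sha256(canonical.encode()).hexdigest()

   headers = {"host": "minio.example.net", "x-amz-date": "20240101T000000Z"}
   original = canonical_request_hash("GET", "/test-bucket/hello.txt", headers, "UNSIGNED-PAYLOAD")

   # A proxy that rewrites a signed header produces a different hash than the one
   # the client signed, so the server rejects the request with a signature mismatch.
   headers["host"] = "10.0.0.5:9000"
   assert canonical_request_hash("GET", "/test-bucket/hello.txt", headers, "UNSIGNED-PAYLOAD") != original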
The complexity of signature calculation typically makes interfacing via ``curl`` or similar REST clients difficult or impractical.
MinIO recommends using S3-compatible drivers which perform the signature calculation automatically as part of operations.
While the S3 API uses HTTP methods like ``GET`` and ``POST`` for all operations, applications typically use an SDK for S3 operations.
In particular, the complexity of signature calculation typically makes interfacing via ``curl`` or similar REST clients impractical.
MinIO recommends using S3-compatible SDKs or libraries which perform the signature calculation automatically as part of operations.
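For example, a minimal sketch with the MinIO Python SDK, which performs the Signature V4 calculation on every request; the endpoint, credentials, bucket name, and file path below are placeholders.

.. code-block:: python

   from minio import Minio

   # Point the client at the deployment or its load balancer; the SDK signs each request.
   client = Minio(
       "minio.example.net",
       access_key="ACCESS_KEY",
       secret_key="SECRET_KEY",
       secure=True,
   )

   # Standard S3 operations over HTTP(S), no manual signature calculation required.
   if not client.bucket_exists("test-bucket"):
       client.make_bucket("test-bucket")
   client.fput_object("test-bucket", "hello.txt", "/tmp/hello.txt")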
@@ -170,12 +170,12 @@ Deploying a global load balancer or similar network appliance with support for s
The load balancer should meet the same requirements as single-site deployments regarding connection balancing and header preservation.
MinIO replication handles transient failures by queuing objects for replication.

MinIO replication can automatically heal a site that has partial data loss due to transient or sustained downtime.
MinIO replication can automatically heal a site that has partial or total data loss due to transient or sustained downtime.
If a peer site completely fails, you can remove that site from the configuration entirely.
The load balancer configuration should also remove that site to avoid routing client requests to the offline site.

You can then restore the peer site, either after repairing the original hardware or replacing it entirely, by adding it back to the site replication configuration.
MinIO automatically begins resynchronizing content.
MinIO automatically begins resynchronizing existing data while continuously replicating new data.

.. figure:: /images/architecture/architecture-load-balancer-multi-site-healing.svg
   :figwidth: 100%
@@ -63,7 +63,9 @@ MinIO requires :ref:`read and write quorum <minio-read-quorum>` to perform read

Write quorum depends on the configured parity and the size of the erasure set.
If parity is less than 1/2 (half) the number of erasure set drives, write quorum equals parity and functions similarly to read quorum.
MinIO automatically "upgrades" the parity of objects written to a degraded erasure set to ensure that object can meet the same :abbr:`SLA (Service Level Agreement)` as objects in healthy erasure sets.

MinIO automatically increases the parity of objects written to a degraded erasure set to ensure that object can meet the same :abbr:`SLA (Service Level Agreement)` as objects in healthy erasure sets.
The parity upgrade behavior provides an additional layer of risk mitigation, but cannot replace the long-term solution of repairing or replacing damaged drives to bring the erasure set back to full healthy status.

.. figure:: /images/availability/availability-erasure-sharding-degraded-write.svg
   :figwidth: 100%
@@ -74,7 +76,6 @@ Write quorum depends on the configured parity and the size of the erasure set.
MinIO writes the object with an upgraded parity of ``EC:6`` to ensure this object meets the same SLA as other objects.

With the default parity of ``EC:4``, the deployment can tolerate the loss of 4 drives per erasure set and still serve write operations.
MinIO can perform "parity upgrade" up to 1/2 the drives in the erasure set.

If parity equals 1/2 (half) the number of erasure set drives, write quorum equals parity + 1 (one) to avoid data inconsistency due to "split brain" scenarios.
For example, if exactly half the drives in the erasure set become isolated due to a network fault, MinIO would consider quorum lost as it cannot establish an N+1 group of drives for the write operation.
@@ -88,7 +89,7 @@ If parity equals 1/2 (half) the number of erasure set drives, write quorum equal
If parity is ``EC:8``, this erasure set cannot meet write quorum and MinIO rejects write operations to that set.
Since the erasure set still maintains read quorum, read operations to existing objects can still succeed.

An erasure set which loses more drives than the configured parity has suffered data loss.
An erasure set which permanently loses more drives than the configured parity has suffered data loss.
For maximum parity configurations, the erasure set goes into "read only" mode if drive loss equals parity.
For the maximum erasure set size of 16 and maximum parity of 8, this would require the loss of 9 drives for data loss to occur.
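The thresholds described above can be restated in a small sketch; the ``erasure_set_status`` helper is hypothetical, ignores parity upgrade behavior, and simply encodes the read-only and data-loss boundaries for a 16-drive set with ``EC:8``.

.. code-block:: python

   def erasure_set_status(set_drives: int, parity: int, lost_drives: int) -> str:
       """Restates the thresholds above: drive losses beyond parity mean data loss."""
       if lost_drives > parity:
           return "data loss"            # read and write quorum both lost
       if parity == set_drives // 2 and lost_drives == parity:
           return "read only"            # maximum-parity set keeps read quorum only
       return "read/write"

   # Maximum erasure set size of 16 with maximum parity EC:8:
   print(erasure_set_status(16, 8, 8))   # read only
   print(erasure_set_status(16, 8, 9))   # data loss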
@@ -100,10 +101,12 @@ An erasure set which loses more drives than the configured parity has suffered d
This erasure set has lost more drives than the configured parity of ``EC:4`` and has therefore lost both read and write quorum.
MinIO cannot recover any data stored on this erasure set.

MinIO further mitigates the risk of erasure set failure by "striping" erasure set drives across each node in the pool.
Transient or temporary drive failures, such as due to a failed storage controller or connecting hardware, may recover back to normal operational status within the erasure set.

MinIO further mitigates the risk of erasure set failure by "striping" erasure set drives symmetrically across each node in the pool.
MinIO automatically calculates the optimal erasure set size based on the number of nodes and drives, where the maximum set size is 16 (sixteen).
It then selects one drive per node going across the pool for each erasure set, circling around if the erasure set stripe size is greater than the number of nodes.
This topology improves resiliency to the loss of a single node, or even a storage controller on that node.
This topology provides resiliency to the loss of a single node, or even a storage controller on that node.

.. figure:: /images/availability/availability-erasure-sharding-striped.svg
   :figwidth: 100%
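A sketch of the striping idea under assumed numbers (4 nodes with 4 drives each and an 8-drive stripe); the ``stripe_erasure_sets`` helper is illustrative and not MinIO's actual set-calculation logic.

.. code-block:: python

   def stripe_erasure_sets(nodes: int, drives_per_node: int, set_size: int):
       """Walk the pool node by node, taking one drive per node and wrapping around."""
       order = [(node, drive) for drive in range(drives_per_node) for node in range(nodes)]
       total = nodes * drives_per_node
       return [order[i:i + set_size] for i in range(0, total, set_size)]

   # 4 nodes x 4 drives with an 8-drive stripe: each set spans every node twice,
   # so losing one node removes only 2 of the 8 drives in any given set.
   for erasure_set in stripe_erasure_sets(nodes=4, drives_per_node=4, set_size=8):
       print(erasure_set)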
@@ -130,7 +133,8 @@ Each erasure set is independent of all others in the same pool.
One pool has a degraded erasure set.
While MinIO can no longer serve read/write operations to that erasure set, it can continue to serve operations on healthy erasure sets in that pool.

Since erasure sets are independent, you cannot restore data to a completely degraded erasure set using other erasure sets.
However, the lost data may still impact workloads which rely on the assumption of 100% data availability.
Furthermore, each erasure set is fully independent of the other such that you cannot restore data to a completely degraded erasure set using other erasure sets.
You must use :ref:`Site <minio-site-replication-overview>` or :ref:`Bucket <minio-bucket-replication>` replication to create a :abbr:`BC/DR (Business Continuity / Disaster Recovery)`-ready remote deployment for restoring lost data.

For multi-pool MinIO deployments, each pool requires at least one erasure set maintaining read/write quorum to continue performing operations.
@@ -15,6 +15,9 @@ Erasure Coding provides object-level healing with significantly less overhead th

MinIO partitions each new object into data and parity shards, where parity shards support reconstruction of missing or corrupted data shards.
MinIO writes these shards to a single :ref:`erasure set <minio-ec-erasure-set>` in the deployment.
MinIO can use either data or parity shards to reconstruct an object, as long as the erasure set has :ref:`read quorum <minio-read-quorum>`.
For example, MinIO can use parity shards local to the node receiving a request instead of specifically filtering only those nodes or drives containing data shards.

Since erasure set drives are striped across the server pool, a given node contains only a portion of data or parity shards for each object.
MinIO can therefore tolerate the loss of multiple drives or nodes in the deployment depending on the configured parity and deployment topology.
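As a toy illustration of shard-based reconstruction (MinIO uses Reed-Solomon coding with multiple parity shards; a single XOR parity shard is shown here only to make the idea concrete):

.. code-block:: python

   from functools import reduce

   def xor_bytes(a: bytes, b: bytes) -> bytes:
       return bytes(x ^ y for x, y in zip(a, b))

   data_shards = [b"shard-aa", b"shard-bb", b"shard-cc"]
   parity_shard = reduce(xor_bytes, data_shards)          # parity = s0 ^ s1 ^ s2

   # Pretend shard 1 was lost to a failed drive; the parity shard rebuilds it.
   survivors = [s for i, s in enumerate(data_shards) if i != 1]
   rebuilt = reduce(xor_bytes, survivors, parity_shard)    # parity ^ s0 ^ s2 == s1
   assert rebuilt == data_shards[1]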