.. _minio-architecture:

=======================
Deployment Architecture
=======================

.. default-domain:: minio

.. contents:: Table of Contents
   :local:
   :depth: 2

.. meta::
   :keywords: topology, architecture, deployment, production
   :description: Information on MinIO Deployment architecture and topology in production environments

This page provides an overview of MinIO deployment architectures from a production perspective.
For information on specific hardware or software configurations, see:

- :ref:`Hardware Checklist <minio-hardware-checklist>`
- :ref:`Security Checklist <minio-security-checklist>`
- :ref:`Software Checklist <minio-software-checklists>`
- :ref:`Thresholds and Limits <minio-server-limits>`

Distributed MinIO Deployments
-----------------------------

A production MinIO deployment consists of at least 4 MinIO hosts with homogeneous storage and compute resources.
MinIO aggregates these resources together as a :ref:`pool <minio-intro-server-pool>` and presents itself as a single object storage service.

.. figure:: /images/architecture/architecture-4-node-deploy.svg
   :figwidth: 100%
   :alt: 4 Node MinIO deployment with homogeneous storage and compute resources
   :align: center

   Each MinIO host in this pool has matching compute, storage, and network configurations

MinIO provides best performance when using locally-attached storage, such as NVMe or SSD drives attached to a PCI-E controller board on the host machine.
Storage controllers should present XFS-formatted drives in "Just a Bunch of Drives" (JBOD) configurations with no RAID, pooling, or other hardware/software resiliency layers.
MinIO recommends against caching, either at the drive or the controller layer.
Either type of caching can cause :abbr:`I/O (Input / Output)` spikes as the cache fills and clears, resulting in unpredictable performance.

.. figure:: /images/architecture/architecture-one-node-DAS.svg
   :figwidth: 100%
   :alt: MinIO Server diagram of Direct-Attached Storage via SAS to a PCI-E Storage Controller
   :align: center

   Each SSD connects by SAS to a PCI-E-attached storage controller operating in HBA mode

MinIO automatically groups drives in the pool into :ref:`erasure sets <minio-ec-erasure-set>`.
Erasure sets are the foundational component of MinIO :ref:`availability and resiliency <minio-availability-resiliency>`.
MinIO stripes erasure sets symmetrically across the nodes in the pool to maintain even distribution of erasure set drives.
MinIO then partitions objects into data and parity shards based on the deployment :ref:`parity <minio-ec-parity>` and distributes them across an erasure set.

For a more complete discussion of MinIO redundancy and healing, see :ref:`minio-erasure-coding`.

.. figure:: /images/architecture/architecture-erasure-set-shard.svg
   :figwidth: 100%
   :alt: Diagram of object being sharded into eight data and eight parity blocks, distributed across sixteen drives
   :align: center

   With the maximum parity of ``EC:8``, MinIO shards the object into 8 data and 8 parity blocks, distributing them across the drives in the erasure set.
   All erasure sets in this pool have the same stripe size and shard distribution.
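
MinIO erasure coding is Reed-Solomon based. For a concrete sense of the ``EC:8`` arithmetic, the following standalone sketch uses the open source ``github.com/klauspost/reedsolomon`` package, purely as an illustration and not MinIO's internal code path, to split a payload into 8 data and 8 parity shards and rebuild it after half of the shards are lost:

.. code-block:: go

   // Illustration only: an EC:8 stripe produces 8 data + 8 parity shards,
   // and any 8 surviving shards are sufficient to reconstruct the object.
   package main

   import (
       "bytes"
       "fmt"
       "log"

       "github.com/klauspost/reedsolomon"
   )

   func main() {
       const dataShards, parityShards = 8, 8 // EC:8 on a 16-drive erasure set

       enc, err := reedsolomon.New(dataShards, parityShards)
       if err != nil {
           log.Fatal(err)
       }

       payload := bytes.Repeat([]byte("example object data "), 1024)

       // Split the payload into 8 data shards, then compute 8 parity shards.
       shards, err := enc.Split(payload)
       if err != nil {
           log.Fatal(err)
       }
       if err := enc.Encode(shards); err != nil {
           log.Fatal(err)
       }

       // Simulate losing 8 of the 16 drives by discarding every other shard.
       for i := 0; i < parityShards; i++ {
           shards[i*2] = nil
       }

       // The 8 remaining shards are enough to rebuild the missing ones.
       if err := enc.Reconstruct(shards); err != nil {
           log.Fatal(err)
       }
       ok, err := enc.Verify(shards)
       fmt.Println("reconstructed and verified:", ok, err)
   }

In a deployment, MinIO performs this work server-side; applications never handle shards directly.
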
MinIO uses a deterministic hashing algorithm based on object name and path to select the erasure set for a given object.
For each unique object namespace ``BUCKET/PREFIX/[PREFIX/...]/OBJECT.EXTENSION``, MinIO always selects the same erasure set for read/write operations.
MinIO handles all routing within pools and erasure sets, making the select/read/write process entirely transparent to applications.
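
The selection step can be pictured with a toy model. The sketch below is illustrative only and does not reproduce MinIO's actual hashing algorithm; it simply shows why a deterministic hash of the object namespace always yields the same erasure set:

.. code-block:: go

   package main

   import (
       "fmt"
       "hash/crc32"
   )

   // selectErasureSet maps an object's full namespace (bucket/prefix/object)
   // to one of setCount erasure sets. Because the hash is a pure function of
   // the name, every read or write for that name lands on the same set.
   func selectErasureSet(objectPath string, setCount int) int {
       checksum := crc32.ChecksumIEEE([]byte(objectPath))
       return int(checksum % uint32(setCount))
   }

   func main() {
       const setCount = 4 // for example, a pool whose drives form four erasure sets

       for _, name := range []string{
           "media/videos/2023/intro.mp4",
           "media/videos/2023/intro.mp4", // same name, same erasure set
           "logs/app/2023-01-01.log",
       } {
           fmt.Printf("%-30s -> erasure set %d\n", name, selectErasureSet(name, setCount))
       }
   }

In an actual deployment this selection happens inside MinIO; applications never choose or address an erasure set.
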
.. figure:: /images/architecture/architecture-erasure-set-retrieve-object.svg
   :figwidth: 100%
   :alt: Diagram of object retrieval from only data shards
   :align: center

   MinIO reconstructs objects from data or parity shards transparently before returning the object to the requesting client.

Each MinIO server has a complete picture of the distributed topology, such that an application can connect and direct operations against any node in the deployment.
The MinIO responding node automatically handles routing internal requests to other nodes in the deployment *and* returning the final response to the client.

Applications typically should not manage those connections, as any changes to the deployment topology would require application updates.
Production environments should instead deploy a load balancer or similar network control plane component to manage connections to the MinIO deployment.
For example, you can deploy an NGINX load balancer to perform "least connections" or "round robin" load balancing against the available nodes in the deployment.

.. figure:: /images/architecture/architecture-load-balancer-8-node.svg
   :figwidth: 100%
   :alt: Diagram of an eight node MinIO deployment behind a load balancer
   :align: center

   The load balancer routes the request to any node in the deployment.
   The receiving node handles any internode requests thereafter.
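
From the application's perspective, only the load balancer address matters. The following minimal sketch uses the MinIO Go SDK with a hypothetical balancer hostname ``minio.example.net``; the credentials are placeholders:

.. code-block:: go

   package main

   import (
       "log"

       "github.com/minio/minio-go/v7"
       "github.com/minio/minio-go/v7/pkg/credentials"
   )

   func main() {
       // The endpoint is the load balancer, not any individual MinIO node.
       client, err := minio.New("minio.example.net", &minio.Options{
           Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
           Secure: true,
       })
       if err != nil {
           log.Fatal(err)
       }
       log.Println("connected to", client.EndpointURL())
   }

If the deployment topology behind the balancer changes, this application code does not.
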
You can expand a MinIO deployment's available storage through :ref:`pool expansion <expand-minio-distributed>`.
Each pool consists of an independent group of nodes with their own erasure sets.
MinIO must query each pool to determine the correct erasure set to which it directs read and write operations, such that each additional pool increases the internode traffic per call.
The pool which contains the correct erasure set then responds to the operation, remaining entirely transparent to the application.

If you modify the MinIO topology through pool expansion, update the load balancer to include the new pool's nodes rather than modifying your applications.
Applications can continue using the load balancer address for the MinIO deployment without any updates or modifications.
This ensures even distribution of requests across all pools, while applications continue using the single load balancer URL for MinIO operations.

.. figure:: /images/architecture/architecture-load-balancer-multi-pool.svg
   :figwidth: 100%
   :alt: Diagram of a multi-pool MinIO deployment behind a load balancer
   :align: center

   The PUT request requires checking each pool for the correct erasure set.
   Once identified, MinIO partitions the object and distributes the data and parity shards across the appropriate set.

Client applications can use any S3-compatible SDK or library to interact with the MinIO deployment.
MinIO publishes its own :ref:`SDK <minio-drivers>` specifically intended for use with S3-compatible deployments.

.. figure:: /images/architecture/architecture-multiple-clients.svg
   :figwidth: 100%
   :alt: Diagram of multiple S3-compatible clients using SDKs to connect to MinIO

   Clients using a variety of S3-compatible SDKs can perform operations against the same MinIO deployment.
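
A minimal sketch using the MinIO Go SDK (``github.com/minio/minio-go/v7``); the endpoint, credentials, and bucket name below are placeholders:

.. code-block:: go

   package main

   import (
       "context"
       "log"
       "strings"

       "github.com/minio/minio-go/v7"
       "github.com/minio/minio-go/v7/pkg/credentials"
   )

   func main() {
       client, err := minio.New("minio.example.net", &minio.Options{
           Creds:  credentials.NewStaticV4("ACCESS_KEY", "SECRET_KEY", ""),
           Secure: true,
       })
       if err != nil {
           log.Fatal(err)
       }

       ctx := context.Background()
       bucket := "example-bucket"

       // Create the bucket if it does not already exist.
       exists, err := client.BucketExists(ctx, bucket)
       if err != nil {
           log.Fatal(err)
       }
       if !exists {
           if err := client.MakeBucket(ctx, bucket, minio.MakeBucketOptions{}); err != nil {
               log.Fatal(err)
           }
       }

       // The SDK signs and streams the request; MinIO shards and distributes
       // the object across an erasure set server-side.
       body := strings.NewReader("hello from the MinIO Go SDK")
       _, err = client.PutObject(ctx, bucket, "greetings/hello.txt", body,
           int64(body.Len()), minio.PutObjectOptions{})
       if err != nil {
           log.Fatal(err)
       }
       log.Println("uploaded greetings/hello.txt")
   }
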
MinIO uses a strict implementation of the S3 API, including requiring clients to sign all operations using AWS :s3-api:`Signature V4 <sig-v4-authenticating-requests.html>` or the legacy Signature V2.
AWS signature calculation uses the client-provided headers, such that any modification to those headers by load balancers, proxies, security programs, or other components will result in signature mismatch errors and request failure.
Ensure any such intermediate components support pass-through of unaltered headers from client to server.

While the S3 API uses HTTP methods like ``GET`` and ``POST`` for all operations, applications typically use an SDK for S3 operations.
In particular, the complexity of signature calculation typically makes interfacing via ``curl`` or similar REST clients impractical.
MinIO recommends using S3-compatible SDKs or libraries which perform the signature calculation automatically as part of operations.

.. _minio-deployment-architecture-replicated:

Replicated MinIO Deployments
----------------------------

MinIO :ref:`site replication <minio-site-replication-overview>` provides support for synchronizing distinct independent deployments.
You can deploy peer sites in different racks, datacenters, or geographic regions to support functions like :abbr:`BC/DR (Business Continuity / Disaster Recovery)` or geo-local read/write performance in a globally distributed MinIO object store.

.. figure:: /images/architecture/architecture-multi-site.svg
   :figwidth: 100%
   :alt: Diagram of a multi-site deployment with three MinIO peer sites

   A MinIO multi-site deployment with three peers.
   Write operations on one peer replicate to all other peers in the configuration automatically.

Replication performance primarily depends on the network latency between each peer site.
With geographically distributed peer sites, high latency between sites can result in significant replication lag.
This can compound with workloads that are near or at the deployment's overall performance capacity, as the replication process itself requires sufficient free :abbr:`I/O (Input / Output)` to synchronize objects.

.. figure:: /images/architecture/architecture-multi-site-latency.svg
   :figwidth: 100%
   :alt: Diagram of a multi-site deployment with latency between sites

   In this peer configuration, the latency between Site A and its peer sites is 100ms.
   The soonest the object fully synchronizes to all sites is at least 110ms.

Deploying a global load balancer or similar network appliance with support for site-to-site failover protocols is critical to the functionality of multi-site deployments.
The load balancer should support a health probe/check setting to detect the failure of one site and automatically redirect applications to any remaining healthy peer.
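
A minimal sketch of such a probe, assuming MinIO's cluster health-check endpoint (``/minio/health/cluster``) and a hypothetical site hostname; a load balancer typically performs this check natively, and the snippet only illustrates the signal it consumes:

.. code-block:: go

   package main

   import (
       "fmt"
       "log"
       "net/http"
       "time"
   )

   func main() {
       // The cluster health endpoint reports whether the site has enough
       // online nodes and drives to serve requests.
       client := &http.Client{Timeout: 5 * time.Second}

       resp, err := client.Get("https://minio-site-a.example.net/minio/health/cluster")
       if err != nil {
           log.Fatal(err)
       }
       defer resp.Body.Close()

       // A 200 response marks the site healthy; anything else should trigger
       // failover to a remaining healthy peer.
       fmt.Println("site A healthy:", resp.StatusCode == http.StatusOK)
   }
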
.. figure:: /images/architecture/architecture-load-balancer-multi-site.svg
   :figwidth: 100%
   :alt: Diagram of a site replication deployment with two sites

   The Load Balancer automatically routes client requests using configured logic (geo-local, latency, etc.).
   Data written to one site automatically replicates to the other peer site.

The load balancer should meet the same requirements as single-site deployments regarding connection balancing and header preservation.
MinIO replication handles transient failures by queuing objects for replication.