mirror of
https://github.com/minio/docs.git
synced 2025-04-25 17:22:39 +03:00
573 lines
18 KiB
ReStructuredText
573 lines
18 KiB
ReStructuredText
.. _deploy-minio-distributed:
|
|
|
|
====================================
|
|
Deploy MinIO: Multi-Node Multi-Drive
|
|
====================================
|
|
|
|
.. default-domain:: minio
|
|
|
|
.. contents:: Table of Contents
|
|
:local:
|
|
:depth: 1
|
|
|
|
Overview
|
|
--------
|
|
|
|
A distributed MinIO deployment consists of 4 or more drives/volumes managed by
|
|
one or more :mc:`minio server` process, where the processes manage pooling the
|
|
compute and storage resources into a single aggregated object storage resource.
|
|
Each MinIO server has a complete picture of the distributed topology, such that
|
|
an application can connect to any node in the deployment and perform S3
|
|
operations.
|
|
|
|
Distributed deployments implicitly enable :ref:`erasure coding
|
|
<minio-erasure-coding>`, MinIO's data redundancy and availability feature that
|
|
allows deployments to automatically reconstruct objects on-the-fly despite the
|
|
loss of multiple drives or nodes in the cluster. Erasure coding provides
|
|
object-level healing with less overhead than adjacent technologies such as RAID
|
|
or replication.
|
|
|
|
Depending on the configured :ref:`erasure code parity <minio-ec-parity>`, a
|
|
distributed deployment with ``m`` servers and ``n`` disks per server can
|
|
continue serving read and write operations with only ``m/2`` servers or
|
|
``m*n/2`` drives online and accessible.
|
|
|
|
Distributed deployments also support the following features:
|
|
|
|
- :ref:`Server-Side Object Replication <minio-bucket-replication-serverside>`
|
|
- :ref:`Write-Once Read-Many Locking <minio-bucket-locking>`
|
|
- :ref:`Object Versioning <minio-bucket-versioning>`
|
|
|
|
.. _deploy-minio-distributed-prereqs:
|
|
|
|
Prerequisites
|
|
-------------
|
|
|
|
Networking and Firewalls
|
|
~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Each node should have full bidirectional network access to every other node in
|
|
the deployment. For containerized or orchestrated infrastructures, this may
|
|
require specific configuration of networking and routing components such as
|
|
ingress or load balancers. Certain operating systems may also require setting
|
|
firewall rules. For example, the following command explicitly opens the default
|
|
MinIO server API port ``9000`` for servers running firewalld :
|
|
|
|
.. code-block:: shell
|
|
:class: copyable
|
|
|
|
firewall-cmd --permanent --zone=public --add-port=9000/tcp
|
|
firewall-cmd --reload
|
|
|
|
All MinIO servers in the deployment *must* use the same listen port.
|
|
|
|
If you set a static :ref:`MinIO Console <minio-console>` port (e.g. ``:9001``)
|
|
you must *also* grant access to that port to ensure connectivity from external
|
|
clients.
|
|
|
|
MinIO **strongly recomends** using a load balancer to manage connectivity to the
|
|
cluster. The Load Balancer should use a "Least Connections" algorithm for
|
|
routing requests to the MinIO deployment, since any MinIO node in the deployment
|
|
can receive, route, or process client requests.
|
|
|
|
The following load balancers are known to work well with MinIO:
|
|
|
|
- `NGINX <https://www.nginx.com/products/nginx/load-balancing/>`__
|
|
- `HAProxy <https://cbonte.github.io/haproxy-dconv/2.3/intro.html#3.3.5>`__
|
|
|
|
Configuring firewalls or load balancers to support MinIO is out of scope for
|
|
this procedure.
|
|
|
|
Sequential Hostnames
|
|
~~~~~~~~~~~~~~~~~~~~
|
|
|
|
MinIO *requires* using expansion notation ``{x...y}`` to denote a sequential
|
|
series of MinIO hosts when creating a server pool. MinIO therefore *requires*
|
|
using sequentially-numbered hostnames to represent each
|
|
:mc:`minio server` process in the deployment.
|
|
|
|
Create the necessary DNS hostname mappings *prior* to starting this procedure.
|
|
For example, the following hostnames would support a 4-node distributed
|
|
deployment:
|
|
|
|
- ``minio1.example.com``
|
|
- ``minio2.example.com``
|
|
- ``minio3.example.com``
|
|
- ``minio4.example.com``
|
|
|
|
You can specify the entire range of hostnames using the expansion notation
|
|
``minio{1...4}.example.com``.
|
|
|
|
Configuring DNS to support MinIO is out of scope for this procedure.
|
|
|
|
.. _deploy-minio-distributed-prereqs-storage:
|
|
|
|
Local JBOD Storage with Sequential Mounts
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
.. |deployment| replace:: deployment
|
|
|
|
.. include:: /includes/common-installation.rst
|
|
:start-after: start-local-jbod-desc
|
|
:end-before: end-local-jbod-desc
|
|
|
|
.. admonition:: Network File System Volumes Break Consistency Guarantees
|
|
:class: note
|
|
|
|
MinIO's strict **read-after-write** and **list-after-write** consistency
|
|
model requires local disk filesystems.
|
|
|
|
MinIO cannot provide consistency guarantees if the underlying storage
|
|
volumes are NFS or a similar network-attached storage volume.
|
|
|
|
For deployments that *require* using network-attached storage, use
|
|
NFSv4 for best results.
|
|
|
|
Considerations
|
|
--------------
|
|
|
|
Homogeneous Node Configurations
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
MinIO strongly recommends selecting substantially similar hardware
|
|
configurations for all nodes in the deployment. Ensure the hardware (CPU,
|
|
memory, motherboard, storage adapters) and software (operating system, kernel
|
|
settings, system services) is consistent across all nodes.
|
|
|
|
Deployment may exhibit unpredictable performance if nodes have heterogeneous
|
|
hardware or software configurations. Workloads that benefit from storing aged
|
|
data on lower-cost hardware should instead deploy a dedicated "warm" or "cold"
|
|
MinIO deployment and :ref:`transition <minio-lifecycle-management-tiering>`
|
|
data to that tier.
|
|
|
|
Erasure Coding Parity
|
|
~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
MinIO :ref:`erasure coding <minio-erasure-coding>` is a data redundancy and
|
|
availability feature that allows MinIO deployments to automatically reconstruct
|
|
objects on-the-fly despite the loss of multiple drives or nodes in the cluster.
|
|
Erasure Coding provides object-level healing with less overhead than adjacent
|
|
technologies such as RAID or replication. Distributed deployments implicitly
|
|
enable and rely on erasure coding for core functionality.
|
|
|
|
Erasure Coding splits objects into data and parity blocks, where parity blocks
|
|
support reconstruction of missing or corrupted data blocks. The number of parity
|
|
blocks in a deployment controls the deployment's relative data redundancy.
|
|
Higher levels of parity allow for higher tolerance of drive loss at the cost of
|
|
total available storage.
|
|
|
|
MinIO defaults to ``EC:4`` , or 4 parity blocks per
|
|
:ref:`erasure set <minio-ec-erasure-set>`. You can set a custom parity
|
|
level by setting the appropriate
|
|
:ref:`MinIO Storage Class environment variable
|
|
<minio-server-envvar-storage-class>`. Consider using the MinIO
|
|
`Erasure Code Calculator <https://min.io/product/erasure-code-calculator>`__ for
|
|
guidance in selecting the appropriate erasure code parity level for your
|
|
cluster.
|
|
|
|
Capacity-Based Planning
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
MinIO generally recommends planning capacity such that
|
|
:ref:`server pool expansion <expand-minio-distributed>` is only required after
|
|
2+ years of deployment uptime.
|
|
|
|
For example, consider an application suite that is estimated to produce 10TB of
|
|
data per year. The MinIO deployment should provide *at minimum*:
|
|
|
|
``10TB + 10TB + 10TB = 30TB``
|
|
|
|
MinIO recommends adding buffer storage to account for potential growth in
|
|
stored data (e.g. 40TB of total usable storage). As a rule-of-thumb, more
|
|
capacity initially is preferred over frequent just-in-time expansion to meet
|
|
capacity requirements.
|
|
|
|
Since MinIO :ref:`erasure coding <minio-erasure-coding>` requires some
|
|
storage for parity, the total **raw** storage must exceed the planned **usable**
|
|
capacity. Consider using the MinIO `Erasure Code Calculator
|
|
<https://min.io/product/erasure-code-calculator>`__ for guidance in planning
|
|
capacity around specific erasure code settings.
|
|
|
|
Recommended Operating Systems
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
.. cond:: linux
|
|
|
|
This tutorial assumes all hosts running MinIO use a
|
|
:ref:`recommended Linux operating system <minio-installation-platform-support>`
|
|
such as RHEL8+ or Ubuntu 18.04+.
|
|
|
|
.. cond:: macos
|
|
|
|
This tutorial assumes all hosts running MinIO use a non-EOL macOS version (10.14+).
|
|
|
|
.. cond:: Windows
|
|
|
|
This tutorial assumes all hosts running MinIO use a non-EOL Windows distribution.
|
|
|
|
Support for running distributed MinIO deployments on Windows is *experimental*.
|
|
|
|
Pre-Existing Data
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
When starting a new MinIO server in a distributed environment, the storage devices must not have existing data.
|
|
|
|
Once you start the MinIO server, all interactions with the data must be done through the S3 API.
|
|
Use the :ref:`MinIO Client <minio-client>`, the :ref:`MinIO Console <minio-console>`, or one of the MinIO :ref:`Software Development Kits <minio-drivers>` to work with the buckets and objects.
|
|
|
|
.. warning::
|
|
|
|
Modifying files on the backend drives can result in data corruption or data loss.
|
|
|
|
.. _deploy-minio-distributed-baremetal:
|
|
|
|
Deploy Distributed MinIO
|
|
------------------------
|
|
|
|
The following procedure creates a new distributed MinIO deployment consisting
|
|
of a single :ref:`Server Pool <minio-intro-server-pool>`.
|
|
|
|
All commands provided below use example values. Replace these values with
|
|
those appropriate for your deployment.
|
|
|
|
Review the :ref:`deploy-minio-distributed-prereqs` before starting this
|
|
procedure.
|
|
|
|
1) Install the MinIO Binary on Each Node
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
.. cond:: linux
|
|
|
|
.. include:: /includes/linux/common-installation.rst
|
|
:start-after: start-install-minio-binary-desc
|
|
:end-before: end-install-minio-binary-desc
|
|
|
|
.. cond:: macos
|
|
|
|
.. include:: /includes/macos/common-installation.rst
|
|
:start-after: start-install-minio-binary-desc
|
|
:end-before: end-install-minio-binary-desc
|
|
|
|
2) Create the ``systemd`` Service File
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
.. include:: /includes/linux/common-installation.rst
|
|
:start-after: start-install-minio-systemd-desc
|
|
:end-before: end-install-minio-systemd-desc
|
|
|
|
3) Create the Service Environment File
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Create an environment file at ``/etc/default/minio``. The MinIO
|
|
service uses this file as the source of all
|
|
:ref:`environment variables <minio-server-environment-variables>` used by
|
|
MinIO *and* the ``minio.service`` file.
|
|
|
|
The following examples assumes that:
|
|
|
|
- The deployment has a single server pool consisting of four MinIO server hosts
|
|
with sequential hostnames.
|
|
|
|
.. code-block:: shell
|
|
|
|
minio1.example.com minio3.example.com
|
|
minio2.example.com minio4.example.com
|
|
|
|
- All hosts have four locally-attached disks with sequential mount-points:
|
|
|
|
.. code-block:: shell
|
|
|
|
/mnt/disk1/minio /mnt/disk3/minio
|
|
/mnt/disk2/minio /mnt/disk4/minio
|
|
|
|
- The deployment has a load balancer running at ``https://minio.example.net``
|
|
that manages connections across all four MinIO hosts.
|
|
|
|
Modify the example to reflect your deployment topology:
|
|
|
|
.. code-block:: shell
|
|
:class: copyable
|
|
|
|
# Set the hosts and volumes MinIO uses at startup
|
|
# The command uses MinIO expansion notation {x...y} to denote a
|
|
# sequential series.
|
|
#
|
|
# The following example covers four MinIO hosts
|
|
# with 4 drives each at the specified hostname and drive locations.
|
|
# The command includes the port that each MinIO server listens on
|
|
# (default 9000)
|
|
|
|
MINIO_VOLUMES="https://minio{1...4}.example.net:9000/mnt/disk{1...4}/minio"
|
|
|
|
# Set all MinIO server options
|
|
#
|
|
# The following explicitly sets the MinIO Console listen address to
|
|
# port 9001 on all network interfaces. The default behavior is dynamic
|
|
# port selection.
|
|
|
|
MINIO_OPTS="--console-address :9001"
|
|
|
|
# Set the root username. This user has unrestricted permissions to
|
|
# perform S3 and administrative API operations on any resource in the
|
|
# deployment.
|
|
#
|
|
# Defer to your organizations requirements for superadmin user name.
|
|
|
|
MINIO_ROOT_USER=minioadmin
|
|
|
|
# Set the root password
|
|
#
|
|
# Use a long, random, unique string that meets your organizations
|
|
# requirements for passwords.
|
|
|
|
MINIO_ROOT_PASSWORD=minio-secret-key-CHANGE-ME
|
|
|
|
# Set to the URL of the load balancer for the MinIO deployment
|
|
# This value *must* match across all MinIO servers. If you do
|
|
# not have a load balancer, set this value to to any *one* of the
|
|
# MinIO hosts in the deployment as a temporary measure.
|
|
MINIO_SERVER_URL="https://minio.example.net:9000"
|
|
|
|
You may specify other :ref:`environment variables
|
|
<minio-server-environment-variables>` or server commandline options as required
|
|
by your deployment. All MinIO nodes in the deployment should include the same
|
|
environment variables with the same values for each variable.
|
|
|
|
4) Add TLS/SSL Certificates
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
.. include:: /includes/common-installation.rst
|
|
:start-after: start-install-minio-tls-desc
|
|
:end-before: end-install-minio-tls-desc
|
|
|
|
5) Run the MinIO Server Process
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Issue the following commands on each node in the deployment to start the
|
|
MinIO service:
|
|
|
|
.. include:: /includes/linux/common-installation.rst
|
|
:start-after: start-install-minio-start-service-desc
|
|
:end-before: end-install-minio-start-service-desc
|
|
|
|
6) Open the MinIO Console
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
.. include:: /includes/common-installation.rst
|
|
:start-after: start-install-minio-console-desc
|
|
:end-before: end-install-minio-console-desc
|
|
|
|
7) Next Steps
|
|
~~~~~~~~~~~~~
|
|
|
|
- Create an :ref:`alias <minio-mc-alias>` for accessing the deployment using
|
|
:mc:`mc`.
|
|
|
|
- :ref:`Create users and policies to control access to the deployment
|
|
<minio-authentication-and-identity-management>`.
|
|
|
|
|
|
.. _deploy-minio-distributed-recommendations:
|
|
|
|
Deployment Recommendations
|
|
--------------------------
|
|
|
|
Minimum Nodes per Deployment
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
For all production deployments, MinIO recommends a *minimum* of 4 nodes per
|
|
:ref:`server pool <minio-intro-server-pool>` with 4 drives per server.
|
|
With the default :ref:`erasure code parity <minio-erasure-coding>` setting of
|
|
``EC:4``, this topology can continue serving read and write operations
|
|
despite the loss of up to 4 drives *or* one node.
|
|
|
|
The minimum recommendation reflects MinIO's experience with assisting enterprise
|
|
customers in deploying on a variety of IT infrastructures while maintaining the
|
|
desired SLA/SLO. While MinIO may run on less than the minimum recommended
|
|
topology, any potential cost savings come at the risk of decreased reliability.
|
|
|
|
Server Hardware
|
|
~~~~~~~~~~~~~~~
|
|
|
|
MinIO is hardware agnostic and runs on a variety of hardware architectures
|
|
ranging from ARM-based embedded systems to high-end x64 and POWER9 servers.
|
|
|
|
The following recommendations match MinIO's
|
|
`Reference Hardware <https://min.io/product/reference-hardware>`__ for
|
|
large-scale data storage:
|
|
|
|
.. list-table::
|
|
:stub-columns: 1
|
|
:widths: 20 80
|
|
:width: 100%
|
|
|
|
* - Processor
|
|
- Dual Intel Xeon Scalable Gold CPUs with 8 cores per socket.
|
|
|
|
* - Memory
|
|
- 128GB of Memory per pod
|
|
|
|
* - Network
|
|
- Minimum of 25GbE NIC and supporting network infrastructure between nodes.
|
|
|
|
MinIO can make maximum use of drive throughput, which can fully saturate
|
|
network links between MinIO nodes or clients. Large clusters may require
|
|
100GbE network infrastructure to fully utilize MinIO's per-node
|
|
performance potential.
|
|
|
|
* - Drives
|
|
- SATA/SAS NVMe/SSD with a minimum of 8 drives per server.
|
|
|
|
Drives should be :abbr:`JBOD (Just a Bunch of Disks)` arrays with
|
|
no RAID or similar technologies. MinIO recommends XFS formatting for
|
|
best performance.
|
|
|
|
Use the same type of disk (NVMe, SSD, or HDD) with the same capacity
|
|
across all nodes in the deployment. MinIO does not distinguish drive
|
|
types when using the underlying storage and does not benefit from mixed
|
|
storage types. Additionally. MinIO limits the size used per disk to the
|
|
smallest drive in the deployment. For example, if the deployment has 15
|
|
10TB disks and 1 1TB disk, MinIO limits the per-disk capacity to 1TB.
|
|
|
|
Networking
|
|
~~~~~~~~~~
|
|
|
|
MinIO recommends high speed networking to support the maximum possible
|
|
throughput of the attached storage (aggregated drives, storage controllers,
|
|
and PCIe busses). The following table provides general guidelines for the
|
|
maximum storage throughput supported by a given NIC:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:width: 100%
|
|
:widths: 40 60
|
|
|
|
* - NIC bandwidth (Gbps)
|
|
- Estimated Aggregated Storage Throughput (GBps)
|
|
|
|
* - 10GbE
|
|
- 1GBps
|
|
|
|
* - 25GbE
|
|
- 2.5GBps
|
|
|
|
* - 50GbE
|
|
- 5GBps
|
|
|
|
* - 100GbE
|
|
- 10GBps
|
|
|
|
CPU Allocation
|
|
~~~~~~~~~~~~~~
|
|
|
|
MinIO can perform well with consumer-grade processors. MinIO can take advantage
|
|
of CPUs which support AVX-512 SIMD instructions for increased performance of
|
|
certain operations.
|
|
|
|
MinIO benefits from allocating CPU based on the expected per-host network
|
|
throughput. The following table provides general guidelines for allocating CPU
|
|
for use by based on the total network bandwidth supported by the host:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:width: 100%
|
|
:widths: 40 60
|
|
|
|
* - Host NIC Bandwidth
|
|
- Recommended Pod vCPU
|
|
|
|
* - 10GbE or less
|
|
- 8 vCPU per pod.
|
|
|
|
* - 25GbE
|
|
- 16 vCPU per pod.
|
|
|
|
* - 50GbE
|
|
- 32 vCPU per pod.
|
|
|
|
* - 100GbE
|
|
- 64 vCPU per pod.
|
|
|
|
|
|
|
|
Memory Allocation
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
MinIO benefits from allocating memory based on the total storage of each host.
|
|
The following table provides general guidelines for allocating memory for use
|
|
by MinIO server processes based on the total amount of local storage on the
|
|
host:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:width: 100%
|
|
:widths: 40 60
|
|
|
|
* - Total Host Storage
|
|
- Recommended Host Memory
|
|
|
|
* - Up to 1 Tebibyte (Ti)
|
|
- 8GiB
|
|
|
|
* - Up to 10 Tebibyte (Ti)
|
|
- 16GiB
|
|
|
|
* - Up to 100 Tebibyte (Ti)
|
|
- 32GiB
|
|
|
|
* - Up to 1 Pebibyte (Pi)
|
|
- 64GiB
|
|
|
|
* - More than 1 Pebibyte (Pi)
|
|
- 128GiB
|
|
|
|
.. _minio-requests-per-node:
|
|
|
|
Requests Per Node
|
|
~~~~~~~~~~~~~~~~~
|
|
|
|
You can calculate the maximum number of concurrent requests per host with this formula:
|
|
|
|
:math:`totalRam / ramPerRequest`
|
|
|
|
To calculate the amount of RAM used for each request, use this formula:
|
|
|
|
:math:`((2MiB + 128KiB) * driveCount) + (2 * 10MiB) + (2 * 1 MiB)`
|
|
|
|
10MiB is the default erasure block size v1.
|
|
1 MiB is the default erasure block size v2.
|
|
|
|
The following table lists the maximum concurrent requests on a node based on the number of host drives and the *free* system RAM:
|
|
|
|
.. list-table::
|
|
:header-rows: 1
|
|
:width: 100%
|
|
|
|
* - Number of Drives
|
|
- 32 GiB of RAM
|
|
- 64 GiB of RAM
|
|
- 128 GiB of RAM
|
|
- 256 GiB of RAM
|
|
- 512 GiB of RAM
|
|
|
|
* - 4 Drives
|
|
- 1,074
|
|
- 2,149
|
|
- 4,297
|
|
- 8,595
|
|
- 17,190
|
|
|
|
* - 8 Drives
|
|
- 840
|
|
- 1,680
|
|
- 3,361
|
|
- 6,722
|
|
- 13,443
|
|
|
|
* - 16 Drives
|
|
- 585
|
|
- 1,170
|
|
- 2.341
|
|
- 4,681
|
|
- 9,362
|