==========
``mc sql``
==========

.. default-domain:: minio

.. contents:: Table of Contents
   :local:
   :depth: 2

.. mc:: mc sql

Syntax
------

.. start-mc-sql-desc

The :mc:`mc sql` command provides an S3 Select interface for performing SQL queries on objects in the specified MinIO deployment.

.. end-mc-sql-desc

See :s3-docs:`Selecting content from objects <selecting-content-from-objects>` for more information on S3 Select behavior and limitations.

.. tab-set::

   .. tab-item:: EXAMPLE

      The following command queries all objects in the ``mydata`` bucket on the ``myminio`` MinIO deployment:

      .. code-block:: shell
         :class: copyable

         mc sql --recursive --query "select * from S3Object" myminio/mydata

   .. tab-item:: SYNTAX

      The command has the following syntax:

      .. code-block:: shell
         :class: copyable

         mc [GLOBALFLAGS] sql                      \
                          --query "string"         \
                          [--csv-input "string"]   \
                          [--compression "string"] \
                          [--csv-output "string"]  \
                          [--csv-output-header "string"] \
                          [--enc-c "string"]       \
                          [--json-input "string"]  \
                          [--json-output "string"] \
                          [--recursive]            \
                          ALIAS

      .. include:: /includes/common-minio-mc.rst
         :start-after: start-minio-syntax
         :end-before: end-minio-syntax

Parameters
~~~~~~~~~~

.. mc-cmd:: ALIAS
   :required:

   The full path to the bucket or object to run the SQL query against.
   Specify the :ref:`alias <alias>` of a configured S3 service as the prefix to the ``ALIAS`` path.
   For example:

   .. code-block:: shell

      mc sql [FLAGS] play/mybucket

.. mc-cmd:: --query, e
   :required:

   The SQL statement to execute on the specified :mc-cmd:`~mc sql ALIAS` directory or object.
   Wrap the entire SQL query in double quotes ``"``.

   Defaults to ``"select * from S3Object"``.

.. mc-cmd:: --csv-input
   :optional:

   The data format for ``.csv`` input objects.
   Specify a string of comma-separated ``key=value,...`` pairs.
   See :ref:`mc-sql-csv-format` for more information on valid keys.

.. mc-cmd:: --compression
   :optional:

   The compression type of the input object.
   Specify one of the following supported values:

   - ``GZIP``
   - ``BZIP2``
   - ``NONE`` (default)

   Compression schemes supported by MinIO backend only:

   - ``ZSTD`` `Zstandard <https://facebook.github.io/zstd/>`__
   - ``LZ4`` `LZ4 <https://lz4.github.io/lz4/>`__ stream
   - ``S2`` `S2 <https://github.com/klauspost/compress/tree/master/s2#s2-compression>`__ framed stream
   - ``SNAPPY`` `Snappy <http://google.github.io/snappy/>`__ framed stream

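As a rough sketch of how ``--compression`` combines with ``--csv-input``, the following script builds a small gzip-compressed, semicolon-delimited CSV file and queries it.
The ``myminio`` alias, the ``mydata`` bucket, the ``readings.csv.gz`` object name, and the sample columns are assumptions chosen for illustration.
The script only calls ``mc`` if it is installed.

```shell
# Illustrative sketch: alias, bucket, file name, and sample data are hypothetical.
WORKDIR=$(mktemp -d)
SAMPLE="$WORKDIR/readings.csv"

# Write a semicolon-delimited CSV with a header row, then gzip it.
printf 'device;power\nsensor-a;120\nsensor-b;95\n' > "$SAMPLE"
gzip -f "$SAMPLE"    # produces readings.csv.gz and removes readings.csv

if command -v mc >/dev/null 2>&1; then
  # Upload the compressed object, then tell mc sql how to decompress and parse it.
  mc cp "$SAMPLE.gz" myminio/mydata/readings.csv.gz
  mc sql --compression GZIP \
         --csv-input "rd=\n,fh=USE,fd=;" \
         --query "select count(s.power) from S3Object s" \
         myminio/mydata/readings.csv.gz
fi
```

The ``--csv-input`` value describes the layout of the decompressed data, so it stays the same whether or not the object is compressed.
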
.. mc-cmd:: --csv-output
   :optional:

   The data format for ``.csv`` output.
   Specify a string of comma-separated ``key=value,...`` pairs.
   See :ref:`mc-sql-csv-format` for more information on valid keys.

   See the S3 API :s3-api:`CSVOutput <API_CSVOutput.html>` for more information.

.. mc-cmd:: --csv-output-header
   :optional:

   The header row of the ``.csv`` output file.
   Specify a string of comma-separated fields as ``field1,field2,...``.

   Omit to output a ``.csv`` with no header row.

.. block include of enc-c

.. include:: /includes/common-minio-sse.rst
   :start-after: start-minio-mc-sse-c-only
   :end-before: end-minio-mc-sse-options

.. mc-cmd:: --json-input
   :optional:

   The data format for ``.json`` or ``.ndjson`` input objects.
   Specify the type of the JSON contents as ``type=<VALUE>``.
   The value can be either:

   - ``DOCUMENT`` - JSON `document <https://www.json.org/json-en.html>`__.
   - ``LINES`` - JSON `lines <http://jsonlines.org/>`__.

   See the S3 API :s3-api:`JSONInput <API_JSONInput.html>` for more information.

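A minimal sketch of querying JSON Lines data with ``--json-input`` might look like the following.
The ``myminio`` alias, the ``mydata`` bucket, the ``readings.json`` object name, and the sample records are hypothetical.
The script only calls ``mc`` if it is installed.

```shell
# Illustrative sketch: alias, bucket, object name, and sample records are hypothetical.
WORKDIR=$(mktemp -d)
SAMPLE="$WORKDIR/readings.json"

# JSON Lines: one complete JSON object per line.
cat > "$SAMPLE" <<'EOF'
{"device": "sensor-a", "power": 120}
{"device": "sensor-b", "power": 95}
EOF

if command -v mc >/dev/null 2>&1; then
  mc cp "$SAMPLE" myminio/mydata/readings.json
  # type=LINES treats each line of the object as a separate record.
  mc sql --json-input "type=LINES" \
         --query "select s.device from S3Object s where s.power > 100" \
         myminio/mydata/readings.json
fi
```

Use ``type=DOCUMENT`` instead when the object holds a single JSON document rather than one record per line.
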
.. mc-cmd:: --json-output
   :optional:

   The data format for the ``.json`` output.
   Supports the ``rd=value`` key, where ``rd`` is the ``RecordDelimiter`` for the JSON document.

   Omit to use the default newline character ``\n``.

   See the S3 API :s3-api:`JSONOutput <API_JSONOutput.html>` for more information.

.. mc-cmd:: --recursive, r
   :optional:

   Recursively searches the specified :mc-cmd:`~mc sql ALIAS` directory using the :mc-cmd:`~mc sql --query` SQL statement.

Global Flags
~~~~~~~~~~~~

.. include:: /includes/common-minio-mc.rst
   :start-after: start-minio-mc-globals
   :end-before: end-minio-mc-globals

Examples
--------

Select all Columns in all Objects in a Bucket
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc:`mc sql` with the :mc-cmd:`~mc sql --recursive` and :mc-cmd:`~mc sql --query` options to apply the query to all objects in a bucket:

.. code-block:: shell
   :class: copyable

   mc sql --recursive --query "select * from S3Object" ALIAS/PATH

- Replace :mc-cmd:`ALIAS <mc sql ALIAS>` with the :ref:`alias <alias>` of the MinIO deployment.

- Replace :mc-cmd:`PATH <mc sql ALIAS>` with the path to the bucket on the MinIO deployment.

Run an Aggregation Query on an Object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc:`mc sql` with the :mc-cmd:`~mc sql --query` option to query an object on a MinIO deployment:

.. code-block:: shell

   mc sql --query "select count(s.power) from S3Object s" ALIAS/PATH

- Replace :mc-cmd:`ALIAS <mc sql ALIAS>` with the :ref:`alias <alias>` of the MinIO deployment.

- Replace :mc-cmd:`PATH <mc sql ALIAS>` with the path to the object on the MinIO deployment.

Behavior
--------

Input Formats
~~~~~~~~~~~~~

:mc:`mc sql` supports the following input formats:

.. list-table:: Input Format Types
   :header-rows: 1

   * - Type
     - ``content-type`` Value

   * - ``.csv``
     - ``text/csv``

   * - ``.json``
     - ``application/json``

   * - ``.parquet``
     - none

For ``.csv`` file types, use :mc-cmd:`mc sql --csv-input` to specify the CSV data format.
See :ref:`mc-sql-csv-format` for more information on CSV formatting fields.

For ``.json`` file types, use :mc-cmd:`mc sql --json-input` to specify the JSON data format.

For ``.parquet`` file types, :mc:`mc sql` automatically interprets the data format.

:mc:`mc sql` determines the type by the file extension of the target object.
For example, an object named ``data.json`` is interpreted as a JSON file.

You can query data of a supported type but a different extension if the object has the appropriate ``content-type``.
For more information, see :mc-cmd:`mc cp --attr`.

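As a hedged sketch of the ``content-type`` approach, the following script uploads CSV data under a ``.txt`` extension and then queries it.
It assumes that :mc-cmd:`mc cp --attr` accepts a ``content-type`` key in its ``key=value`` list.
The ``myminio`` alias, the ``mydata`` bucket, and the ``readings.txt`` object name are hypothetical.
The script only calls ``mc`` if it is installed.

```shell
# Illustrative sketch: alias, bucket, object name, and sample data are hypothetical.
WORKDIR=$(mktemp -d)
SAMPLE="$WORKDIR/readings.txt"

# CSV content stored under an extension that mc sql would not recognize as CSV.
printf 'device,power\nsensor-a,120\nsensor-b,95\n' > "$SAMPLE"

if command -v mc >/dev/null 2>&1; then
  # Assumption: --attr accepts a content-type key that sets the object's content type.
  mc cp --attr "content-type=text/csv" "$SAMPLE" myminio/mydata/readings.txt
  # With the content type set, the object can be queried despite its extension.
  mc sql --query "select * from S3Object" myminio/mydata/readings.txt
fi
```
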
.. _mc-sql-csv-format:

CSV Formatting Fields
~~~~~~~~~~~~~~~~~~~~~

The following table lists valid key-value pairs for use with :mc-cmd:`mc sql --csv-input` and :mc-cmd:`mc sql --csv-output`.
Certain key pairs are only valid for :mc-cmd:`~mc sql --csv-input`.
See the documentation for S3 API :s3-api:`CSVInput <API_CSVInput.html>` for more information on S3 CSV formatting.

.. list-table::
   :header-rows: 1
   :widths: 20 20 60
   :width: 100%

   * - Key
     - ``--csv-input`` Only
     - Description

   * - ``rd``
     -
     - The character that separates each record (row) in the input ``.csv`` file.

       Corresponds to ``RecordDelimiter`` in the S3 API ``CSVInput``.

   * - ``fd``
     -
     - The character that separates each field in a record. Defaults to ``,``.

       Corresponds to ``FieldDelimiter`` in the S3 API ``CSVInput``.

   * - ``qc``
     -
     - The character used for escaping when the ``fd`` character is part of a value. Defaults to ``"``.

       Corresponds to ``QuoteCharacter`` in the S3 API ``CSVInput``.

   * - ``qec``
     -
     - The character used for escaping a quotation mark ``"`` character inside an already escaped value.

       Corresponds to ``QuoteEscapeCharacter`` in the S3 API ``CSVInput``.

   * - ``fh``
     - Yes
     - The content of the first line in the ``.csv`` file.

       Specify one of the following supported values:

       - ``NONE`` - The first line is not a header.
       - ``IGNORE`` - Ignore the first line.
       - ``USE`` - The first line is a header.

       For ``NONE`` or ``IGNORE``, you must specify column positions ``_#`` to identify a column in the :mc-cmd:`~mc sql --query` statement.

       For ``USE``, you can specify header values to identify a column in the :mc-cmd:`~mc sql --query` statement.

       Corresponds to ``FieldHeaderInfo`` in the S3 API ``CSVInput``.

   * - ``cc``
     - Yes
     - The character used to indicate a record should be ignored.
       The character *must* appear at the beginning of the record.

       Corresponds to ``Comment`` in the S3 API ``CSVInput``.

   * - ``qrd``
     - Yes
     - Specify ``TRUE`` to indicate that fields may contain record delimiter values (``rd``).

       Defaults to ``FALSE``.

       Corresponds to ``AllowQuotedRecordDelimiter`` in the S3 API ``CSVInput``.

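The difference between ``fh=USE`` and ``fh=NONE`` can be sketched with two small files, one with a header row and one without.
The ``myminio`` alias, the ``mydata`` bucket, the file names, and the sample columns are assumptions for illustration.
The script only calls ``mc`` if it is installed.

```shell
# Illustrative sketch: alias, bucket, file names, and sample data are hypothetical.
WORKDIR=$(mktemp -d)
WITH_HEADER="$WORKDIR/with-header.csv"
NO_HEADER="$WORKDIR/no-header.csv"

printf 'device,power\nsensor-a,120\nsensor-b,95\n' > "$WITH_HEADER"
# Same records without the header row.
tail -n +2 "$WITH_HEADER" > "$NO_HEADER"

if command -v mc >/dev/null 2>&1; then
  mc cp "$WITH_HEADER" "$NO_HEADER" myminio/mydata/

  # fh=USE: refer to columns by their header names.
  mc sql --csv-input "fh=USE" \
         --query "select s.device, s.power from S3Object s" \
         myminio/mydata/with-header.csv

  # fh=NONE: no header row, so refer to columns by position (_1, _2, ...).
  # --csv-output "fd=;" asks for semicolon-delimited results.
  mc sql --csv-input "fh=NONE" --csv-output "fd=;" \
         --query "select s._1, s._2 from S3Object s" \
         myminio/mydata/no-header.csv
fi
```

Only ``rd``, ``fd``, ``qc``, and ``qec`` apply to ``--csv-output``; the ``fh``, ``cc``, and ``qrd`` keys are input-only, as noted in the table.
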
S3 Compatibility
~~~~~~~~~~~~~~~~

.. include:: /includes/common-minio-mc.rst
   :start-after: start-minio-mc-s3-compatibility
   :end-before: end-minio-mc-s3-compatibility