==========
``mc sql``
==========

.. default-domain:: minio

.. contents:: Table of Contents
   :local:
   :depth: 2

.. mc:: mc sql

Syntax
------

.. start-mc-sql-desc

The :mc:`mc sql` command provides an S3 Select interface for performing SQL queries on objects in the specified MinIO deployment.

.. end-mc-sql-desc

See :s3-docs:`Selecting content from objects <selecting-content-from-objects>` for more information on S3 Select behavior and limitations.

.. tab-set::

   .. tab-item:: EXAMPLE

      The following command queries all objects in the ``mydata`` bucket on the ``myminio`` MinIO deployment:

      .. code-block:: shell
         :class: copyable

         mc sql --recursive --query "select * from S3Object" myminio/mydata

   .. tab-item:: SYNTAX

      The command has the following syntax:

      .. code-block:: shell
         :class: copyable

         mc [GLOBALFLAGS] sql \
            --query "string" \
            [--csv-input "string"] \
            [--compression "string"] \
            [--csv-output "string"] \
            [--csv-output-header "string"] \
            [--enc-c "string"] \
            [--json-input "string"] \
            [--json-output "string"] \
            [--recursive] \
            ALIAS

      .. include:: /includes/common-minio-mc.rst
         :start-after: start-minio-syntax
         :end-before: end-minio-syntax

Parameters
~~~~~~~~~~

.. mc-cmd:: ALIAS
   :required:

   The full path to the bucket or object to run the SQL query against.
   Specify the :ref:`alias <alias>` of a configured S3 service as the prefix to the ``ALIAS`` path.
   For example:

   .. code-block:: shell

      mc sql [FLAGS] play/mybucket

.. mc-cmd:: --query, e
   :required:

   The SQL statement to execute on the specified :mc-cmd:`~mc sql ALIAS` directory or object.
   Wrap the entire SQL query in double quotes ``"``.

   Defaults to ``"select * from S3Object"``.

.. mc-cmd:: --csv-input
   :optional:

   The data format for ``.csv`` input objects.
   Specify a string of comma-separated ``key=value,...`` pairs.
   See :ref:`mc-sql-csv-format` for more information on valid keys.

.. mc-cmd:: --compression
   :optional:

   The compression type of the input object.
   Specify one of the following supported values:

   - ``GZIP``
   - ``BZIP2``
   - ``NONE`` (default)

   The following compression schemes are supported only by MinIO backends:

   - ``ZSTD`` `Zstandard <https://facebook.github.io/zstd/>`__
   - ``LZ4`` `LZ4 <https://lz4.github.io/lz4/>`__ stream
   - ``S2`` `S2 <https://github.com/klauspost/compress/tree/master/s2#s2-compression>`__ framed stream
   - ``SNAPPY`` `Snappy <http://google.github.io/snappy/>`__ framed stream

.. mc-cmd:: --csv-output
   :optional:

   The data format for ``.csv`` output.
   Specify a string of comma-separated ``key=value,...`` pairs.
   See :ref:`mc-sql-csv-format` for more information on valid keys.

   See the S3 API :s3-api:`CSVOutput <API_CSVOutput.html>` for more information.

.. mc-cmd:: --csv-output-header
   :optional:

   The header row of the ``.csv`` output file.
   Specify a string of comma-separated fields as ``field1,field2,...``.

   Omit to output a ``.csv`` with no header row.

.. block include of enc-c

.. include:: /includes/common-minio-sse.rst
   :start-after: start-minio-mc-sse-c-only
   :end-before: end-minio-mc-sse-options

.. mc-cmd:: --json-input
   :optional:

   The data format for ``.json`` or ``.ndjson`` input objects.
   Specify the type of the JSON contents as ``type=<VALUE>``.
   The value can be either:

   - ``DOCUMENT`` - JSON `document <https://www.json.org/json-en.html>`__.
   - ``LINES`` - JSON `lines <http://jsonlines.org/>`__.

   See the S3 API :s3-api:`JSONInput <API_JSONInput.html>` for more information.

.. mc-cmd:: --json-output
   :optional:

   The data format for the ``.json`` output.
   Supports the ``rd=value`` key, where ``rd`` is the ``RecordDelimiter`` for the JSON document.

   Omit to use the default newline character ``\n``.

   See the S3 API :s3-api:`JSONOutput <API_JSONOutput.html>` for more information.

.. mc-cmd:: --recursive, r
   :optional:

   Recursively searches the specified :mc-cmd:`~mc sql ALIAS` directory using the :mc-cmd:`~mc sql --query` SQL statement.

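
As a minimal sketch of how the JSON input option combines with ``--query``, the following commands create a small JSON Lines file, copy it to a deployment, and select one field from each record.
The ``myminio`` alias, the ``iot-devices`` bucket, and the ``readings.json`` file with its ``device`` and ``power`` fields are hypothetical placeholders, not part of any real deployment.

.. code-block:: shell

   # Hypothetical sample data: one JSON document per line (JSON Lines).
   printf '%s\n' \
     '{"device":"d1","power":10}' \
     '{"device":"d2","power":20}' > readings.json

   # Copy the file to the (assumed) bucket on the (assumed) alias.
   mc cp readings.json myminio/iot-devices/readings.json

   # Tell mc sql that the object holds JSON Lines, then select one field.
   mc sql --json-input "type=LINES" \
          --query "select s.device from S3Object s" \
          myminio/iot-devices/readings.json

Use ``type=DOCUMENT`` instead when the object contains a single JSON document rather than one record per line.
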
Global Flags
~~~~~~~~~~~~

.. include:: /includes/common-minio-mc.rst
   :start-after: start-minio-mc-globals
   :end-before: end-minio-mc-globals

Examples
--------

Select all Columns in all Objects in a Bucket
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc:`mc sql` with the :mc-cmd:`~mc sql --recursive` and :mc-cmd:`~mc sql --query` options to apply the query to all objects in a bucket:

.. code-block:: shell
   :class: copyable

   mc sql --recursive --query "select * from S3Object" ALIAS/PATH

- Replace :mc-cmd:`ALIAS <mc sql ALIAS>` with the :ref:`alias <alias>` of the MinIO deployment.

- Replace :mc-cmd:`PATH <mc sql ALIAS>` with the path to the bucket on the MinIO deployment.

Run an Aggregation Query on an Object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc:`mc sql` with the :mc-cmd:`~mc sql --query` option to query an object on a MinIO deployment:

.. code-block:: shell

   mc sql --query "select count(s.power) from S3Object s" ALIAS/PATH

- Replace :mc-cmd:`ALIAS <mc sql ALIAS>` with the :ref:`alias <alias>` of the MinIO deployment.

- Replace :mc-cmd:`PATH <mc sql ALIAS>` with the path to the object on the MinIO deployment.

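
Query a Compressed CSV Object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The following sketch combines :mc-cmd:`~mc sql --compression` and :mc-cmd:`~mc sql --csv-input` with an aggregation query.
It assumes a hypothetical ``myminio`` alias and ``iot-devices`` bucket, and it generates its own sample file, so the file name and the ``device`` and ``power`` columns are illustrative only.

.. code-block:: shell

   # Hypothetical sample data: a header row plus three semicolon-delimited records.
   printf 'device;power\nd1;10\nd2;20\nd3;30\n' > power-ratio.csv
   gzip -c power-ratio.csv > power-ratio.csv.gz

   # Copy the compressed file to the (assumed) bucket.
   mc cp power-ratio.csv.gz myminio/iot-devices/power-ratio.csv.gz

   # Declare the compression type and CSV layout, then count the power values.
   mc sql --compression GZIP \
          --csv-input "rd=\n,fh=USE,fd=;" \
          --query "select count(s.power) from S3Object s" \
          myminio/iot-devices/power-ratio.csv.gz

Because ``fh=USE`` treats the first line as a header, the query can refer to the ``power`` column by name.
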
Behavior
--------

Input Formats
~~~~~~~~~~~~~

:mc:`mc sql` supports the following input formats:

.. list-table:: Input Format Types
   :header-rows: 1

   * - Type
     - ``content-type`` Value

   * - ``.csv``
     - ``text/csv``

   * - ``.json``
     - ``application/json``

   * - ``.parquet``
     - none

For ``.csv`` file types, use :mc-cmd:`mc sql --csv-input` to specify the CSV data format.
See :ref:`mc-sql-csv-format` for more information on CSV formatting fields.

For ``.json`` file types, use :mc-cmd:`mc sql --json-input` to specify the JSON data format.

For ``.parquet`` file types, :mc:`mc sql` automatically interprets the data format.

:mc:`mc sql` determines the type by the file extension of the target object.
For example, an object named ``data.json`` is interpreted as a JSON file.

You can query data of a supported type but a different extension if the object has the appropriate ``content-type``.
For more information, see :mc-cmd:`mc cp --attr`.

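
The following sketch shows one way to set the ``content-type`` on an object whose extension does not match its contents.
It assumes that :mc-cmd:`mc cp --attr` accepts a ``content-type=<value>`` pair for this purpose, and the ``myminio`` alias, ``iot-devices`` bucket, and ``readings.txt`` file are hypothetical.

.. code-block:: shell

   # Hypothetical file: CSV data stored under a .txt extension.
   printf 'device,power\nd1,10\nd2,20\n' > readings.txt

   # Assumption: the content-type attribute marks the object as CSV.
   mc cp --attr "content-type=text/csv" readings.txt myminio/iot-devices/readings.txt

   # The extension is .txt, so mc sql relies on the content-type to pick the format.
   mc sql --query "select * from S3Object" myminio/iot-devices/readings.txt
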
.. _mc-sql-csv-format:

CSV Formatting Fields
~~~~~~~~~~~~~~~~~~~~~

The following table lists valid key-value pairs for use with :mc-cmd:`mc sql --csv-input` and :mc-cmd:`mc sql --csv-output`.
Certain key pairs are only valid for :mc-cmd:`~mc sql --csv-input`.

See the documentation for S3 API :s3-api:`CSVInput <API_CSVInput.html>` for more information on S3 CSV formatting.

.. list-table::
   :header-rows: 1
   :widths: 20 20 60
   :width: 100%

   * - Key
     - ``--csv-input`` Only
     - Description

   * - ``rd``
     -
     - The character that separates each record (row) in the input ``.csv`` file.

       Corresponds to ``RecordDelimiter`` in the S3 API ``CSVInput``.

   * - ``fd``
     -
     - The character that separates each field in a record. Defaults to ``,``.

       Corresponds to ``FieldDelimiter`` in the S3 API ``CSVInput``.

   * - ``qc``
     -
     - The character used for escaping when the ``fd`` character is part of a value. Defaults to ``"``.

       Corresponds to ``QuoteCharacter`` in the S3 API ``CSVInput``.

   * - ``qec``
     -
     - The character used for escaping a quotation mark ``"`` character inside an already escaped value.

       Corresponds to ``QuoteEscapeCharacter`` in the S3 API ``CSVInput``.

   * - ``fh``
     - Yes
     - The content of the first line in the ``.csv`` file.

       Specify one of the following supported values:

       - ``NONE`` - The first line is not a header.
       - ``IGNORE`` - Ignore the first line.
       - ``USE`` - The first line is a header.

       For ``NONE`` or ``IGNORE``, you must specify column positions ``_#`` to identify a column in the :mc-cmd:`~mc sql --query` statement.

       For ``USE``, you can specify header values to identify a column in the :mc-cmd:`~mc sql --query` statement.

       Corresponds to ``FieldHeaderInfo`` in the S3 API ``CSVInput``.

   * - ``cc``
     - Yes
     - The character used to indicate a record should be ignored.
       The character *must* appear at the beginning of the record.

       Corresponds to ``Comment`` in the S3 API ``CSVInput``.

   * - ``qrd``
     - Yes
     - Specify ``TRUE`` to indicate that fields may contain record delimiter values (``rd``).
       Defaults to ``FALSE``.

       Corresponds to ``AllowQuotedRecordDelimiter`` in the S3 API ``CSVInput``.

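
As an illustration of how these keys fit together, the following sketch queries a headerless, pipe-delimited file by column position and writes the results with a different field delimiter.
The ``myminio`` alias, ``iot-devices`` bucket, and ``readings.csv`` file are hypothetical, and the exact output layout may differ by deployment and ``mc`` version.

.. code-block:: shell

   # Hypothetical sample data: pipe-delimited records with no header row.
   printf 'd1|10\nd2|20\nd3|30\n' > readings.csv

   # Copy the file to the (assumed) bucket.
   mc cp readings.csv myminio/iot-devices/readings.csv

   # fh=NONE: there is no header row, so the query uses column positions (_1, _2).
   # fd=| describes the input file; fd=; on --csv-output changes the result delimiter.
   mc sql --csv-input "rd=\n,fd=|,fh=NONE" \
          --csv-output "fd=;" \
          --query "select s._2, s._1 from S3Object s" \
          myminio/iot-devices/readings.csv

If the file had a header row, ``fh=USE`` would let the query refer to columns by their header names instead of ``_1`` and ``_2``.
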
S3 Compatibility
~~~~~~~~~~~~~~~~

.. include:: /includes/common-minio-mc.rst
   :start-after: start-minio-mc-s3-compatibility
   :end-before: end-minio-mc-s3-compatibility