==========
``mc sql``
==========

.. default-domain:: minio

.. contents:: Table of Contents
   :local:
   :depth: 2

.. mc:: mc sql

Syntax
------

.. start-mc-sql-desc

The :mc:`mc sql` command provides an S3 Select interface for performing SQL queries on objects in the specified MinIO deployment.

.. end-mc-sql-desc

See :s3-docs:`Selecting content from objects ` for more information on S3 Select behavior and limitations.

.. tab-set::

   .. tab-item:: EXAMPLE

      The following command queries all objects in the ``mydata`` bucket on the ``myminio`` MinIO deployment:

      .. code-block:: shell
         :class: copyable

         mc sql --recursive --query "select * from S3Object" myminio/mydata

   .. tab-item:: SYNTAX

      The command has the following syntax:

      .. code-block:: shell
         :class: copyable

         mc [GLOBALFLAGS] sql               \
            --query "string"                \
            [--csv-input "string"]          \
            [--compression "string"]        \
            [--csv-output "string"]         \
            [--csv-output-header "string"]  \
            [--enc-c "string"]              \
            [--json-input "string"]         \
            [--json-output "string"]        \
            [--recursive]                   \
            ALIAS

      .. include:: /includes/common-minio-mc.rst
         :start-after: start-minio-syntax
         :end-before: end-minio-syntax

Parameters
~~~~~~~~~~

.. mc-cmd:: ALIAS
   :required:

   The full path to the bucket or object to run the SQL query against.
   Specify the :ref:`alias ` of a configured S3 service as the prefix to the ``ALIAS`` path.
   For example:

   .. code-block:: shell

      mc sql [FLAGS] play/mybucket

.. mc-cmd:: --query, e
   :required:

   The SQL statement to execute on the specified :mc-cmd:`~mc sql ALIAS` directory or object.
   Wrap the entire SQL query in double quotes ``"``.

   Defaults to ``"select * from S3Object"``.

.. mc-cmd:: --csv-input
   :optional:

   The data format for ``.csv`` input objects.
   Specify a string of comma-separated ``key=value,...`` pairs.
   See :ref:`mc-sql-csv-format` for more information on valid keys.

.. mc-cmd:: --compression
   :optional:

   The compression type of the input object.
   Specify one of the following supported values:

   - ``GZIP``
   - ``BZIP2``
   - ``NONE`` (default)

   Compression schemes supported by MinIO backend only:

   - ``ZSTD`` `Zstandard `__
   - ``LZ4`` `LZ4 `__ stream
   - ``S2`` `S2 `__ framed stream
   - ``SNAPPY`` `Snappy `__ framed stream

.. mc-cmd:: --csv-output
   :optional:

   The data format for ``.csv`` output.
   Specify a string of comma-separated ``key=value,...`` pairs.
   See :ref:`mc-sql-csv-format` for more information on valid keys.

   See the S3 API :s3-api:`CSVOutput ` for more information.

.. mc-cmd:: --csv-output-header
   :optional:

   The header row of the ``.csv`` output file.
   Specify a string of comma-separated fields as ``field1,field2,...``.

   Omit to output a ``.csv`` with no header row.

.. block include of enc-c

.. include:: /includes/common-minio-sse.rst
   :start-after: start-minio-mc-sse-c-only
   :end-before: end-minio-mc-sse-options

.. mc-cmd:: --json-input
   :optional:

   The data format for ``.json`` or ``.ndjson`` input objects.
   Specify the type of the JSON contents as ``type=``.

   The value can be either:

   - ``DOCUMENT`` - JSON `document `__.
   - ``LINES`` - JSON `lines `__.

   See the S3 API :s3-api:`JSONInput ` for more information.

.. mc-cmd:: --json-output
   :optional:

   The data format for the ``.json`` output.
   Supports the ``rd=value`` key, where ``rd`` is the ``RecordDelimiter`` for the JSON document.

   Omit to use the default newline character ``\n``.

   See the S3 API :s3-api:`JSONOutput ` for more information.

.. mc-cmd:: --recursive, r
   :optional:

   Recursively searches the specified :mc-cmd:`~mc sql ALIAS` directory using the :mc-cmd:`~mc sql --query` SQL statement.

Global Flags
~~~~~~~~~~~~

.. include:: /includes/common-minio-mc.rst
   :start-after: start-minio-mc-globals
   :end-before: end-minio-mc-globals

Examples
--------

Select all Columns in all Objects in a Bucket
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc:`mc sql` with the :mc-cmd:`~mc sql --recursive` and :mc-cmd:`~mc sql --query` options to apply the query to all objects in a bucket:

.. code-block:: shell
   :class: copyable

   mc sql --recursive --query "select * from S3Object" ALIAS/PATH

- Replace :mc-cmd:`ALIAS ` with the :ref:`alias ` of the MinIO deployment.

- Replace :mc-cmd:`PATH ` with the path to the bucket on the MinIO deployment.

Run an Aggregation Query on an Object
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use :mc:`mc sql` with the :mc-cmd:`~mc sql --query` option to query an object on a MinIO deployment:

.. code-block:: shell

   mc sql --query "select count(s.power) from S3Object s" ALIAS/PATH

- Replace :mc-cmd:`ALIAS ` with the :ref:`alias ` of the MinIO deployment.

- Replace :mc-cmd:`PATH ` with the path to the object on the MinIO deployment.

Behavior
--------

Input Formats
~~~~~~~~~~~~~

:mc:`mc sql` supports the following input formats:

.. list-table:: Input Format Types
   :header-rows: 1

   * - Type
     - ``content-type`` Value

   * - ``.csv``
     - ``text/csv``

   * - ``.json``
     - ``application/json``

   * - ``.parquet``
     - none

For ``.csv`` file types, use :mc-cmd:`mc sql --csv-input` to specify the CSV data format.
See :ref:`mc-sql-csv-format` for more information on CSV formatting fields.

For ``.json`` file types, use :mc-cmd:`mc sql --json-input` to specify the JSON data format.

For ``.parquet`` file types, :mc:`mc sql` automatically interprets the data format.

:mc:`mc sql` determines the type by the file extension of the target object.
For example, an object named ``data.json`` is interpreted as a JSON file.

You can query data of a supported type but a different extension if the object has the appropriate ``content-type``.
For more information, see :mc-cmd:`mc cp --attr`.

.. _mc-sql-csv-format:

CSV Formatting Fields
~~~~~~~~~~~~~~~~~~~~~

The following table lists valid key-value pairs for use with :mc-cmd:`mc sql --csv-input` and :mc-cmd:`mc sql --csv-output`.
Certain keys are valid only for :mc-cmd:`~mc sql --csv-input`.

See the documentation for S3 API :s3-api:`CSVInput ` for more information on S3 CSV formatting.

.. list-table::
   :header-rows: 1
   :widths: 20 20 60
   :width: 100%

   * - Key
     - ``--csv-input`` Only
     - Description

   * - ``rd``
     -
     - The character that separates each record (row) in the input ``.csv`` file.

       Corresponds to ``RecordDelimiter`` in the S3 API ``CSVInput``.

   * - ``fd``
     -
     - The character that separates each field in a record.
       Defaults to ``,``.

       Corresponds to ``FieldDelimiter`` in the S3 API ``CSVInput``.

   * - ``qc``
     -
     - The character used for escaping when the ``fd`` character is part of a value.
       Defaults to ``"``.

       Corresponds to ``QuoteCharacter`` in the S3 API ``CSVInput``.

   * - ``qec``
     -
     - The character used for escaping a quotation mark ``"`` character inside an already escaped value.

       Corresponds to ``QuoteEscapeCharacter`` in the S3 API ``CSVInput``.

   * - ``fh``
     - Yes
     - The content of the first line in the ``.csv`` file.

       Specify one of the following supported values:

       - ``NONE`` - The first line is not a header.
       - ``IGNORE`` - Ignore the first line.
       - ``USE`` - The first line is a header.

       For ``NONE`` or ``IGNORE``, you must specify column positions ``_#`` to identify a column in the :mc-cmd:`~mc sql --query` statement.

       For ``USE``, you can specify header values to identify a column in the :mc-cmd:`~mc sql --query` statement.

       Corresponds to ``FileHeaderInfo`` in the S3 API ``CSVInput``.

   * - ``cc``
     - Yes
     - The character used to indicate a record should be ignored.
       The character *must* appear at the beginning of the record.

       Corresponds to ``Comments`` in the S3 API ``CSVInput``.

   * - ``qrd``
     - Yes
     - Specify ``TRUE`` to indicate that fields may contain record delimiter values (``rd``).
       Defaults to ``FALSE``.

       Corresponds to ``AllowQuotedRecordDelimiter`` in the S3 API ``CSVInput``.

S3 Compatibility
~~~~~~~~~~~~~~~~

.. include:: /includes/common-minio-mc.rst
   :start-after: start-minio-mc-s3-compatibility
   :end-before: end-minio-mc-s3-compatibility
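
CSV Formatting Example
~~~~~~~~~~~~~~~~~~~~~~

As an illustration of the CSV formatting fields, the following sketch assembles a ``--csv-input`` string and prints the resulting :mc:`mc sql` command for review rather than running it.
The ``myminio/mydata/readings.csv`` target, the semicolon field delimiter, and the ``power`` column are hypothetical; substitute values that match your own deployment and data.

.. code-block:: shell
   :class: copyable

   # Hypothetical target: the alias, bucket, and object name are placeholders.
   TARGET="myminio/mydata/readings.csv"

   # fh=USE treats the first line as a header; fd=; sets a semicolon field delimiter.
   CSV_INPUT="fh=USE,fd=;"

   # With fh=USE, the query can refer to a column by its header value.
   QUERY="select count(s.power) from S3Object s"

   # Assemble the command as text and print it for review before running it.
   CMD="mc sql --csv-input \"${CSV_INPUT}\" --query \"${QUERY}\" ${TARGET}"
   echo "${CMD}"

With ``fh=NONE`` or ``fh=IGNORE``, the query would instead need to identify the column by its ``_#`` position.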