mirror of https://github.com/minio/docs.git synced 2025-04-19 21:02:14 +03:00
Ravind Kumar 76e5e35ab3
DOCS-1191: Updating SSE params, general fixups (#1295)
Closes #1191 

# Summary

Finally getting around to this mc release

- Added docs for enc-c, enc-s3, enc-kms
- Some docs are making assumptions around behavior that needs to be
fixed _first_
- Drive-by linker fixes

Staged: http://192.241.195.202:9000/staging/DOCS-1191/linux/index.html

---------

Co-authored-by: Andrea Longo <feorlen@users.noreply.github.com>
Co-authored-by: Daryl White <53910321+djwfyi@users.noreply.github.com>
2024-08-26 11:54:49 -04:00


mc sql


Syntax

The mc sql command provides an S3 Select interface for running SQL queries against objects in the specified MinIO deployment.

See Selecting content from objects <selecting-content-from-objects> for more information on S3 Select behavior and limitations.

EXAMPLE

The following command queries all objects in the mydata bucket on the myminio MinIO deployment:

mc sql --recursive --query "select * from S3Object" myminio/mydata

SYNTAX

The command has the following syntax:

mc [GLOBALFLAGS] sql                             \
                 --query "string"                \
                 [--csv-input "string"]          \
                 [--compression "string"]        \
                 [--csv-output "string"]         \
                 [--csv-output-header "string"]  \
                 [--enc-c "string"]              \
                 [--json-input "string"]         \
                 [--json-output "string"]        \
                 [--recursive]                   \
                 ALIAS

Parameters

ALIAS

The full path to the bucket or object to run the SQL query against. Specify the alias <alias> of a configured S3 service as the prefix to the ALIAS path. For example:

mc sql [FLAGS] play/mybucket

--query, -e

The SQL statement to execute on the directory or object specified by ALIAS. Wrap the entire SQL query in double quotes ".

Defaults to "select * from S3Object".

--csv-input

The data format for .csv input objects. Specify a string of comma-separated key=value,... pairs. See CSV Formatting Fields for more information on valid keys.

--compression

The compression type of the input object. Specify one of the following supported values:

  • GZIP
  • BZIP2
  • NONE (default)
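
As a minimal sketch of the --compression flag, the function below counts the records in a gzip-compressed CSV object. The function name count_gzip_rows and the object path in the usage note are hypothetical. The sketch also assumes the object has a header row, which it declares explicitly with --csv-input rather than relying on file extension detection.

```shell
# Hypothetical helper: count the records in a gzip-compressed CSV object.
# $1 is the ALIAS path to the object. Assumes the first line is a header.
count_gzip_rows() {
  mc sql --compression GZIP \
         --csv-input "fh=USE" \
         --query "select count(*) from S3Object" \
         "$1"
}
```

For example, count_gzip_rows myminio/mydata/readings.csv.gz runs the query against an object at that (hypothetical) path.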

Compression schemes supported by MinIO backend only:

--csv-output

The data format for .csv output. Specify a string of comma-separated key=value,... pairs. See CSV Formatting Fields for more information on valid keys.

See the S3 API CSVOutput <API_CSVOutput.html> for more information.

--csv-output-header

The header row of the .csv output file. Specify a string of comma-separated fields as field1,field2,....

Omit to output a .csv with no header row.
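
The two CSV output options can be combined. The following is a sketch only: the function name export_power_csv and the column names device and power are hypothetical, and the input object is assumed to be a CSV file whose first line is a header containing those names.

```shell
# Hypothetical helper: write two columns as semicolon-delimited CSV
# with a header row. $1 is the ALIAS path to the input object.
export_power_csv() {
  mc sql --csv-input "fh=USE" \
         --csv-output "fd=;" \
         --csv-output-header "device,power" \
         --query "select s.device, s.power from S3Object s" \
         "$1"
}
```

Redirect the output to keep the result, for example export_power_csv myminio/mydata/readings.csv > power.csv.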

--json-input

The data format for .json or .ndjson input objects. Specify the type of the JSON contents as type=<VALUE>. The value can be either:

See the S3 API JSONInput <API_JSONInput.html> for more information.

--json-output

The data format for the .json output. Supports the rd=value key, where rd is the RecordDelimiter for the JSON document.

Omit to use the default newline character \n.

See the S3 API JSONOutput <API_JSONOutput.html> for more information.

--recursive, -r

Recursively applies the --query SQL statement to every object in the directory specified by ALIAS.

Global Flags

Examples

Select all Columns in all Objects in a Bucket

Use mc sql with the --recursive and --query options to apply the query to all objects in a bucket:

mc sql --recursive --query "select * from S3Object" ALIAS/PATH
  • Replace ALIAS with the alias of the MinIO deployment.
  • Replace PATH with the path to the bucket on the MinIO deployment.

Run an Aggregation Query on an Object

Use mc sql with the --query option to query an object on a MinIO deployment:

mc sql --query "select count(s.power) from S3Object s" ALIAS/PATH
  • Replace ALIAS with the alias of the MinIO deployment.
  • Replace PATH with the path to the object on the MinIO deployment.

Behavior

Input Formats

mc sql supports the following input formats:

Input Format Types

For .csv file types, use mc sql --csv-input to specify the CSV data format. See CSV Formatting Fields for more information on CSV formatting fields.

For .json file types, use mc sql --json-input to specify the JSON data format.

For .parquet file types, mc sql automatically interprets the data format.

mc sql determines the type by the file extension of the target object. For example, an object named data.json is interpreted as a JSON file.

You can query data of a supported type but a different extension if the object has the appropriate content-type. For more information, see mc cp --attr.
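
One possible way to set the content type at upload time is sketched below. The function name upload_as_csv is hypothetical. The metadata key spelling Content-Type and the value text/csv are assumptions and are not confirmed by this page; check the mc cp --attr documentation before relying on them.

```shell
# Hypothetical helper: upload a CSV file under a name that lacks the .csv
# extension, setting its content type so that it can still be queried as CSV.
# $1 is the local file, $2 is the ALIAS path of the destination object.
# Assumption: the attribute key is "Content-Type" and the value is "text/csv".
upload_as_csv() {
  mc cp --attr "Content-Type=text/csv" "$1" "$2"
}
```

For example, upload_as_csv ./readings.data myminio/mydata/readings.data uploads a local file to a (hypothetical) object path.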

CSV Formatting Fields

The following table lists valid key-value pairs for use with mc sql --csv-input and mc sql --csv-output. Certain keys are only valid for --csv-input. See the documentation for S3 API CSVInput <API_CSVInput.html> for more information on S3 CSV formatting.

Each entry lists the key, notes whether it is valid for --csv-input only, and gives its description.

rd

The character that separates each record (row) in the input .csv file.

Corresponds to RecordDelimiter in the S3 API CSVInput.

fd

The character that separates each field in a record. Defaults to ,.

Corresponds to FieldDelimiter in the S3 API CSVInput.

qc

The character used for escaping when the fd character is part of a value. Defaults to ".

Corresponds to QuoteCharacter in the S3 API CSVInput.

qec

The character used for escaping a quotation mark " character inside an already escaped value.

Corresponds to QuoteEscapeCharacter in the S3 API CSVInput.

fh (--csv-input only)

The content of the first line in the .csv file.

Specify one of the following supported values:

  • NONE - The first line is not a header.
  • IGNORE - Ignore the first line.
  • USE - The first line is a header.

For NONE or IGNORE, you must specify column positions as _# (for example, _1) to identify a column in the --query statement.

For USE, you can specify header values to identify a column in the --query statement.

Corresponds to FileHeaderInfo in the S3 API CSVInput.

cc (--csv-input only)

The character used to indicate a record should be ignored. The character must appear at the beginning of the record.

Corresponds to Comments in the S3 API CSVInput.

qrd (--csv-input only)

Specify TRUE to indicate that quoted fields may contain record delimiter values (rd).

Defaults to FALSE.

Corresponds to AllowQuotedRecordDelimiter in the S3 API CSVInput.
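
The effect of the fh key on how a query names its columns can be sketched as follows. The function names and the column names device and power are hypothetical, and the input is assumed to be a semicolon-delimited CSV object.

```shell
# With fh=USE the first line is a header, so columns are selected by name.
# $1 is the ALIAS path to the object.
select_by_name() {
  mc sql --csv-input "fh=USE,fd=;" \
         --query "select s.device, s.power from S3Object s" \
         "$1"
}

# With fh=NONE the first line is data, so columns are selected by
# position as _1, _2, and so on.
select_by_position() {
  mc sql --csv-input "fh=NONE,fd=;" \
         --query "select s._1, s._2 from S3Object s" \
         "$1"
}
```

Call either function with the path to an object, for example select_by_name myminio/mydata/readings.csv.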

S3 Compatibility