mirror of
https://github.com/postgres/postgres.git
synced 2025-08-28 18:48:04 +03:00
Implement multivariate n-distinct coefficients
Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations than individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions.

This commit only adds support for collection of n-distinct coefficients on user-specified sets of columns in a single table. This is useful to estimate the number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve. A new special pseudo-type pg_ndistinct is used.

(num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.)

Author: Tomas Vondra. Some code rework by Álvaro.
Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi
Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
    https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
155
doc/src/sgml/ref/create_statistics.sgml
Normal file
@@ -0,0 +1,155 @@
<!--
doc/src/sgml/ref/create_statistics.sgml
PostgreSQL documentation
-->

<refentry id="SQL-CREATESTATISTICS">
 <indexterm zone="sql-createstatistics">
  <primary>CREATE STATISTICS</primary>
 </indexterm>

 <refmeta>
  <refentrytitle>CREATE STATISTICS</refentrytitle>
  <manvolnum>7</manvolnum>
  <refmiscinfo>SQL - Language Statements</refmiscinfo>
 </refmeta>

 <refnamediv>
  <refname>CREATE STATISTICS</refname>
  <refpurpose>define extended statistics</refpurpose>
 </refnamediv>

 <refsynopsisdiv>
<synopsis>
CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable> ON (
  <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable> [, ...])
  FROM <replaceable class="PARAMETER">table_name</replaceable>
</synopsis>

 </refsynopsisdiv>

 <refsect1 id="SQL-CREATESTATISTICS-description">
  <title>Description</title>

  <para>
   <command>CREATE STATISTICS</command> will create a new extended statistics
   object on the specified table.
   The statistics object will be created in the current database and
   will be owned by the user issuing the command.
  </para>

  <para>
   If a schema name is given (for example, <literal>CREATE STATISTICS
   myschema.mystat ...</>) then the statistics object is created in the
   specified schema. Otherwise it is created in the current schema. The name
   of the statistics object must be distinct from the name of any other
   statistics object in the same schema.
  </para>
 </refsect1>

 <refsect1>
  <title>Parameters</title>

  <variablelist>

   <varlistentry>
    <term><literal>IF NOT EXISTS</></term>
    <listitem>
     <para>
      Do not throw an error if a statistics object with the same name already
      exists. A notice is issued in this case. Note that only the name of the
      statistics object is considered here, not the details of its
      definition.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><replaceable class="PARAMETER">statistics_name</replaceable></term>
    <listitem>
     <para>
      The name (optionally schema-qualified) of the statistics object to be
      created.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><replaceable class="PARAMETER">column_name</replaceable></term>
    <listitem>
     <para>
      The name of a column to be included in the statistics object.
     </para>
    </listitem>
   </varlistentry>

   <varlistentry>
    <term><replaceable class="PARAMETER">table_name</replaceable></term>
    <listitem>
     <para>
      The name (optionally schema-qualified) of the table on which the
      statistics object is to be created.
     </para>
    </listitem>
   </varlistentry>

  </variablelist>

 </refsect1>

 <refsect1>
  <title>Notes</title>

  <para>
   You must be the owner of a table to create or change statistics on it.
  </para>
 </refsect1>

 <refsect1 id="SQL-CREATESTATISTICS-examples">
  <title>Examples</title>

  <para>
   Create table <structname>t1</> with two functionally dependent columns,
   i.e., knowledge of a value in the first column is sufficient to determine
   the value in the other column. Then create an extended statistics object
   on those columns and gather the statistics with
   <command>ANALYZE</command>:

<programlisting>
CREATE TABLE t1 (
    a   int,
    b   int
);

-- integer division makes b fully determined by a (b = a / 5)
INSERT INTO t1 SELECT i/100, i/500
                 FROM generate_series(1,1000000) s(i);

CREATE STATISTICS s1 ON (a, b) FROM t1;

ANALYZE t1;

-- valid combination of values: rows with a = 1 always have b = 0
EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0);

-- invalid combination of values: no row has a = 1 and b = 1
EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
</programlisting>
  </para>
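
  <para>
   As a further illustration, the following is a minimal sketch that assumes
   table <structname>t1</> and statistics object <literal>s1</> from the
   example above. The n-distinct coefficient collected for
   <literal>(a, b)</> is meant to help the planner estimate the number of
   groups produced by <literal>GROUP BY</> or <literal>DISTINCT</> over both
   columns. The actual row estimates depend on the sample taken by
   <command>ANALYZE</command>, so no output is shown:

<programlisting>
-- Without extended statistics, the group count is estimated from
-- per-column statistics only, which can be far off for correlated columns.
EXPLAIN SELECT a, b, count(*) FROM t1 GROUP BY a, b;

-- Remove the statistics object once it is no longer needed.
DROP STATISTICS s1;
</programlisting>
  </para>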

 </refsect1>

 <refsect1>
  <title>Compatibility</title>

  <para>
   There is no <command>CREATE STATISTICS</command> command in the SQL standard.
  </para>
 </refsect1>

 <refsect1>
  <title>See Also</title>

  <simplelist type="inline">
   <member><xref linkend="sql-alterstatistics"></member>
   <member><xref linkend="sql-dropstatistics"></member>
  </simplelist>
 </refsect1>
</refentry>