1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Change CREATE STATISTICS syntax

Previously, we had the WITH clause in the middle of the command, where
you'd specify both generic options as well as statistic types.  Few
people liked this, so this commit changes it to remove the WITH keyword
from that clause and makes it accept statistic types only.  (We
currently don't have any generic options, but if we invent in the
future, we will gain a new WITH clause, probably at the end of the
command).

Also, the column list is now specified without parens, which makes the
whole command look more similar to a SELECT command.  This change will
let us expand the command to supporting expressions (not just columns
names) as well as multiple tables and their join conditions.

Tom added lots of code comments and fixed some parts of the CREATE
STATISTICS reference page, too; more changes in this area are
forthcoming.  He also fixed a potential problem in the alter_generic
regression test, reducing verbosity on a cascaded drop to avoid
dependency on message ordering, as we do in other tests.

Tom also closed a security bug: we documented that table ownership was
required in order to create a statistics object on it, but didn't
actually implement it.

Implement tab-completion for statistics objects.  This can stand some
more improvement.

Authors: Alvaro Herrera, with lots of cleanup by Tom Lane
Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
This commit is contained in:
Alvaro Herrera
2017-05-12 14:59:23 -03:00
parent 46052d9ef3
commit bc085205c8
21 changed files with 325 additions and 290 deletions

View File

@ -1132,8 +1132,8 @@ WHERE tablename = 'road';
To inspect functional dependencies on a statistics
<literal>stts</literal>, you may do this:
<programlisting>
CREATE STATISTICS stts WITH (dependencies)
ON (zip, city) FROM zipcodes;
CREATE STATISTICS stts (dependencies)
ON zip, city FROM zipcodes;
ANALYZE zipcodes;
SELECT stxname, stxkeys, stxdependencies
FROM pg_statistic_ext
@ -1219,8 +1219,8 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 10;
Continuing the above example, the n-distinct coefficients in a ZIP
code table may look like the following:
<programlisting>
CREATE STATISTICS stts2 WITH (ndistinct)
ON (zip, state, city) FROM zipcodes;
CREATE STATISTICS stts2 (ndistinct)
ON zip, state, city FROM zipcodes;
ANALYZE zipcodes;
SELECT stxkeys AS k, stxndistinct AS nd
FROM pg_statistic_ext

View File

@ -526,7 +526,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
multivariate statistics on the two columns:
<programlisting>
CREATE STATISTICS stts WITH (dependencies) ON (a, b) FROM t;
CREATE STATISTICS stts (dependencies) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM t WHERE a = 1 AND b = 1;
QUERY PLAN
@ -569,7 +569,7 @@ EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
calculation, the estimate is much improved:
<programlisting>
DROP STATISTICS stts;
CREATE STATISTICS stts WITH (dependencies, ndistinct) ON (a, b) FROM t;
CREATE STATISTICS stts (dependencies, ndistinct) ON a, b FROM t;
ANALYZE t;
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM t GROUP BY a, b;
QUERY PLAN

View File

@ -17,7 +17,7 @@ PostgreSQL documentation
<refnamediv>
<refname>ALTER STATISTICS</refname>
<refpurpose>
change the definition of a extended statistics
change the definition of an extended statistics object
</refpurpose>
</refnamediv>
@ -34,19 +34,20 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
<para>
<command>ALTER STATISTICS</command> changes the parameters of an existing
extended statistics. Any parameters not specifically set in the
extended statistics object. Any parameters not specifically set in the
<command>ALTER STATISTICS</command> command retain their prior settings.
</para>
<para>
You must own the statistics to use <command>ALTER STATISTICS</>.
To change a statistics' schema, you must also have <literal>CREATE</>
privilege on the new schema.
You must own the statistics object to use <command>ALTER STATISTICS</>.
To change a statistics object's schema, you must also
have <literal>CREATE</> privilege on the new schema.
To alter the owner, you must also be a direct or indirect member of the new
owning role, and that role must have <literal>CREATE</literal> privilege on
the statistics' schema. (These restrictions enforce that altering the owner
doesn't do anything you couldn't do by dropping and recreating the statistics.
However, a superuser can alter ownership of any statistics anyway.)
the statistics object's schema. (These restrictions enforce that altering
the owner doesn't do anything you couldn't do by dropping and recreating
the statistics object. However, a superuser can alter ownership of any
statistics object anyway.)
</para>
</refsect1>
@ -59,7 +60,8 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
<term><replaceable class="parameter">name</replaceable></term>
<listitem>
<para>
The name (optionally schema-qualified) of the statistics to be altered.
The name (optionally schema-qualified) of the statistics object to be
altered.
</para>
</listitem>
</varlistentry>
@ -68,7 +70,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
<term><replaceable class="PARAMETER">new_owner</replaceable></term>
<listitem>
<para>
The user name of the new owner of the statistics.
The user name of the new owner of the statistics object.
</para>
</listitem>
</varlistentry>
@ -77,7 +79,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
<term><replaceable class="parameter">new_name</replaceable></term>
<listitem>
<para>
The new name for the statistics.
The new name for the statistics object.
</para>
</listitem>
</varlistentry>
@ -86,7 +88,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
<term><replaceable class="parameter">new_schema</replaceable></term>
<listitem>
<para>
The new schema for the statistics.
The new schema for the statistics object.
</para>
</listitem>
</varlistentry>
@ -99,7 +101,7 @@ ALTER STATISTICS <replaceable class="parameter">name</replaceable> SET SCHEMA <r
<title>Compatibility</title>
<para>
There's no <command>ALTER STATISTICS</command> command in the SQL standard.
There is no <command>ALTER STATISTICS</command> command in the SQL standard.
</para>
</refsect1>

View File

@ -22,8 +22,8 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_name</replaceable>
WITH ( <replaceable class="PARAMETER">option</replaceable> [= <replaceable class="PARAMETER">value</replaceable>] [, ... ] )
ON ( <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable> [, ...])
[ ( <replaceable class="PARAMETER">statistic_type</replaceable> [, ... ] ) ]
ON <replaceable class="PARAMETER">column_name</replaceable>, <replaceable class="PARAMETER">column_name</replaceable> [, ...]
FROM <replaceable class="PARAMETER">table_name</replaceable>
</synopsis>
@ -34,17 +34,17 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
<para>
<command>CREATE STATISTICS</command> will create a new extended statistics
object on the specified table, foreign table or materialized view.
The statistics will be created in the current database and
will be owned by the user issuing the command.
object tracking data about the specified table, foreign table or
materialized view. The statistics object will be created in the current
database and will be owned by the user issuing the command.
</para>
<para>
If a schema name is given (for example, <literal>CREATE STATISTICS
myschema.mystat ...</>) then the statistics is created in the specified
schema. Otherwise it is created in the current schema. The name of
the statistics must be distinct from the name of any other statistics in the
same schema.
myschema.mystat ...</>) then the statistics object is created in the
specified schema. Otherwise it is created in the current schema.
The name of the statistics object must be distinct from the name of any
other statistics object in the same schema.
</para>
</refsect1>
@ -57,10 +57,10 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
<term><literal>IF NOT EXISTS</></term>
<listitem>
<para>
Do not throw an error if a statistics with the same name already exists.
A notice is issued in this case. Note that only the name of the
statistics object is considered here. The definition of the statistics is
not considered.
Do not throw an error if a statistics object with the same name already
exists. A notice is issued in this case. Note that only the name of
the statistics object is considered here, not the details of its
definition.
</para>
</listitem>
</varlistentry>
@ -69,7 +69,21 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
<term><replaceable class="PARAMETER">statistics_name</replaceable></term>
<listitem>
<para>
The name (optionally schema-qualified) of the statistics to be created.
The name (optionally schema-qualified) of the statistics object to be
created.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="PARAMETER">statistic_type</replaceable></term>
<listitem>
<para>
A statistic type to be computed in this statistics object. Currently
supported types are <literal>ndistinct</literal>, which enables
n-distinct coefficient tracking,
and <literal>dependencies</literal>, which enables functional
dependencies.
</para>
</listitem>
</varlistentry>
@ -78,7 +92,7 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
<term><replaceable class="PARAMETER">column_name</replaceable></term>
<listitem>
<para>
The name of a column to be included in the statistics.
The name of a table column to be included in the statistics object.
</para>
</listitem>
</varlistentry>
@ -87,49 +101,13 @@ CREATE STATISTICS [ IF NOT EXISTS ] <replaceable class="PARAMETER">statistics_na
<term><replaceable class="PARAMETER">table_name</replaceable></term>
<listitem>
<para>
The name (optionally schema-qualified) of the table the statistics should
be created on.
The name (optionally schema-qualified) of the table containing the
column(s) the statistics are computed on.
</para>
</listitem>
</varlistentry>
</variablelist>
<refsect2 id="SQL-CREATESTATISTICS-parameters">
<title id="SQL-CREATESTATISTICS-parameters-title">Parameters</title>
<indexterm zone="sql-createstatistics-parameters">
<primary>statistics parameters</primary>
</indexterm>
<para>
The <literal>WITH</> clause can specify <firstterm>options</>
for the statistics. Available options are listed below.
</para>
<variablelist>
<varlistentry>
<term><literal>dependencies</> (<type>boolean</>)</term>
<listitem>
<para>
Enables functional dependencies for the statistics.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>ndistinct</> (<type>boolean</>)</term>
<listitem>
<para>
Enables ndistinct coefficients for the statistics.
</para>
</listitem>
</varlistentry>
</variablelist>
</refsect2>
</refsect1>
<refsect1>
@ -158,7 +136,7 @@ CREATE TABLE t1 (
INSERT INTO t1 SELECT i/100, i/500
FROM generate_series(1,1000000) s(i);
CREATE STATISTICS s1 WITH (dependencies) ON (a, b) FROM t1;
CREATE STATISTICS s1 (dependencies) ON a, b FROM t1;
ANALYZE t1;
@ -168,6 +146,11 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 0);
-- invalid combination of values
EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
</programlisting>
Without functional-dependency statistics, the planner would make the
same estimate of the number of matching rows for these two queries.
With such statistics, it is able to tell that one case has matches
and the other does not.
</para>
</refsect1>
@ -176,7 +159,7 @@ EXPLAIN ANALYZE SELECT * FROM t1 WHERE (a = 1) AND (b = 1);
<title>Compatibility</title>
<para>
There's no <command>CREATE STATISTICS</command> command in the SQL standard.
There is no <command>CREATE STATISTICS</command> command in the SQL standard.
</para>
</refsect1>

View File

@ -29,9 +29,9 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
<title>Description</title>
<para>
<command>DROP STATISTICS</command> removes statistics from the database.
Only the statistics owner, the schema owner, and superuser can drop a
statistics.
<command>DROP STATISTICS</command> removes statistics object(s) from the
database. Only the statistics object's owner, the schema owner, or a
superuser can drop a statistics object.
</para>
</refsect1>
@ -44,8 +44,8 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
<term><literal>IF EXISTS</literal></term>
<listitem>
<para>
Do not throw an error if the statistics do not exist. A notice is
issued in this case.
Do not throw an error if the statistics object does not exist. A notice
is issued in this case.
</para>
</listitem>
</varlistentry>
@ -54,7 +54,7 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
<term><replaceable class="PARAMETER">name</replaceable></term>
<listitem>
<para>
The name (optionally schema-qualified) of the statistics to drop.
The name (optionally schema-qualified) of the statistics object to drop.
</para>
</listitem>
</varlistentry>
@ -66,7 +66,7 @@ DROP STATISTICS [ IF EXISTS ] <replaceable class="PARAMETER">name</replaceable>
<title>Examples</title>
<para>
To destroy two statistics objects on different schemas, without failing
To destroy two statistics objects in different schemas, without failing
if they don't exist:
<programlisting>
@ -82,7 +82,7 @@ DROP STATISTICS IF EXISTS
<title>Compatibility</title>
<para>
There's no <command>DROP STATISTICS</command> command in the SQL standard.
There is no <command>DROP STATISTICS</command> command in the SQL standard.
</para>
</refsect1>