mirror of https://github.com/postgres/postgres.git synced 2025-10-15 05:46:52 +03:00

Doc: clarify n_distinct_inherited setting

There was some confusion around how to adjust the n_distinct estimates
for partitioned tables.  Here we try to clarify that
n_distinct_inherited needs to be adjusted rather than n_distinct.

Also fix some slightly misleading text which was talking about table
size rather than table rows, fix a grammatical error, and adjust some
text which indicated that ANALYZE was performing calculations based on
the n_distinct settings.  Really it's the query planner that does this
and ANALYZE only stores the overridden n_distinct estimate value in
pg_statistic.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com
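
The arithmetic described in the message above can be illustrated with a short sketch. It is not PostgreSQL's implementation: the function name `estimated_distinct` and its parameters are hypothetical, and it only mirrors the documented rules (a positive value is taken as an exact count, a value from -1 up to but excluding 0 is a fraction of the estimated row count, and 0 means no override is in effect).

```python
from typing import Optional


def estimated_distinct(n_distinct: float, estimated_rows: float) -> Optional[float]:
    """Illustrative only: interpret a stored n_distinct override at planning time.

    n_distinct     -- the value set via n_distinct or n_distinct_inherited
    estimated_rows -- the planner's current row estimate for the table
                      (hypothetical parameter name)
    """
    if n_distinct < -1.0:
        # The documentation requires negative values to be >= -1.
        raise ValueError("n_distinct must be greater than or equal to -1")
    if n_distinct > 0:
        # Positive: assume exactly this many distinct nonnull values.
        return float(n_distinct)
    if n_distinct < 0:
        # Negative: a fraction of the rows, applied to the current estimate.
        return -n_distinct * estimated_rows
    # Zero: no override; the normally gathered estimate applies.
    return None


if __name__ == "__main__":
    print(estimated_distinct(-0.5, 1000))   # 500.0: each value appears twice on average
    print(estimated_distinct(-1, 1000))     # 1000.0: every value distinct
    print(estimated_distinct(250, 1000))    # 250.0: fixed count, independent of row count
```

Because the multiplication uses the row estimate available at planning time, a fractional setting keeps tracking the table as its row count changes, which is why the stored value alone does not fix the final estimate.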
David Rowley
2025-10-14 09:27:38 +13:00
parent af302ff6b8
commit c3b9e56513

@@ -328,24 +328,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
<literal>n_distinct_inherited</literal>, which override the
number-of-distinct-values estimates made by subsequent
<xref linkend="sql-analyze"/>
-operations. <literal>n_distinct</literal> affects the statistics for the table
-itself, while <literal>n_distinct_inherited</literal> affects the statistics
-gathered for the table plus its inheritance children. When set to a
-positive value, <command>ANALYZE</command> will assume that the column contains
-exactly the specified number of distinct nonnull values. When set to a
-negative value, which must be greater
-than or equal to -1, <command>ANALYZE</command> will assume that the number of
-distinct nonnull values in the column is linear in the size of the
-table; the exact count is to be computed by multiplying the estimated
-table size by the absolute value of the given number. For example,
-a value of -1 implies that all values in the column are distinct, while
-a value of -0.5 implies that each value appears twice on the average.
-This can be useful when the size of the table changes over time, since
-the multiplication by the number of rows in the table is not performed
-until query planning time. Specify a value of 0 to revert to estimating
-the number of distinct values normally. For more information on the use
-of statistics by the <productname>PostgreSQL</productname> query
-planner, refer to <xref linkend="planner-stats"/>.
+operations. <literal>n_distinct</literal> affects the statistics for the
+table itself, while <literal>n_distinct_inherited</literal> affects the
+statistics gathered for the table plus its inheritance children, and for
+the statistics gathered for partitioned tables. When the value
+specified is a positive value, the query planner will assume that the
+column contains exactly the specified number of distinct nonnull values.
+Fractional values may also be specified by using values below 0 and
+above or equal to -1. This instructs the query planner to estimate the
+number of distinct values by multiplying the absolute value of the
+specified number by the estimated number of rows in the table. For
+example, a value of -1 implies that all values in the column are
+distinct, while a value of -0.5 implies that each value appears twice on
+average. This can be useful when the size of the table changes over
+time. For more information on the use of statistics by the
+<productname>PostgreSQL</productname> query planner, refer to
+<xref linkend="planner-stats"/>.
</para>
<para>
Changing per-attribute options acquires a