mirror of https://github.com/postgres/postgres.git, synced 2025-11-03 09:13:20 +03:00

	Doc: clarify n_distinct_inherited setting

There was some confusion around how to adjust the n_distinct estimates for partitioned tables. Here we try and clarify that n_distinct_inherited needs to be adjusted rather than n_distinct.

Also fix some slightly misleading text which was talking about table size rather than table rows, fix a grammatical error, and adjust some text which indicated that ANALYZE was performing calculations based on the n_distinct settings. Really it's the query planner that does this and ANALYZE only stores the overridden n_distinct estimate value in pg_statistic.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com

@@ -340,24 +340,22 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
       <literal>n_distinct_inherited</literal>, which override the
       number-of-distinct-values estimates made by subsequent
       <link linkend="sql-analyze"><command>ANALYZE</command></link>
-      operations.  <literal>n_distinct</literal> affects the statistics for the table
-      itself, while <literal>n_distinct_inherited</literal> affects the statistics
-      gathered for the table plus its inheritance children.  When set to a
-      positive value, <command>ANALYZE</command> will assume that the column contains
-      exactly the specified number of distinct nonnull values.  When set to a
-      negative value, which must be greater
-      than or equal to -1, <command>ANALYZE</command> will assume that the number of
-      distinct nonnull values in the column is linear in the size of the
-      table; the exact count is to be computed by multiplying the estimated
-      table size by the absolute value of the given number.  For example,
-      a value of -1 implies that all values in the column are distinct, while
-      a value of -0.5 implies that each value appears twice on the average.
-      This can be useful when the size of the table changes over time, since
-      the multiplication by the number of rows in the table is not performed
-      until query planning time.  Specify a value of 0 to revert to estimating
-      the number of distinct values normally.  For more information on the use
-      of statistics by the <productname>PostgreSQL</productname> query
-      planner, refer to <xref linkend="planner-stats"/>.
+      operations. <literal>n_distinct</literal> affects the statistics for the
+      table itself, while <literal>n_distinct_inherited</literal> affects the
+      statistics gathered for the table plus its inheritance children, and for
+      the statistics gathered for partitioned tables.  When the value
+      specified is a positive value, the query planner will assume that the
+      column contains exactly the specified number of distinct nonnull values.
+      Fractional values may also be specified by using values below 0 and
+      above or equal to -1.  This instructs the query planner to estimate the
+      number of distinct values by multiplying the absolute value of the
+      specified number by the estimated number of rows in the table.  For
+      example, a value of -1 implies that all values in the column are
+      distinct, while a value of -0.5 implies that each value appears twice on
+      average.  This can be useful when the size of the table changes over
+      time.  For more information on the use of statistics by the
+      <productname>PostgreSQL</productname> query planner, refer to
+      <xref linkend="planner-stats"/>.
      </para>
 
      <para>
       Changing per-attribute options acquires a
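
The arithmetic described in the documentation text can be illustrated with a small Python sketch. This is not PostgreSQL code: the function name `estimate_distinct` and its parameters are hypothetical, chosen only to show how a positive setting is taken as an exact count, how a setting between -1 and 0 is multiplied by the estimated number of rows, and how 0 means no override is in effect.

```python
from typing import Optional


def estimate_distinct(n_distinct: float, estimated_rows: float) -> Optional[float]:
    """Illustrative reading of an n_distinct / n_distinct_inherited setting.

    Hypothetical helper for explanation only; it does not mirror the
    planner's actual implementation.
    """
    if n_distinct < -1:
        # The documentation requires negative settings to be >= -1.
        raise ValueError("n_distinct must be greater than or equal to -1")
    if n_distinct == 0:
        # 0 means no override: the distinct count is estimated normally.
        return None
    if n_distinct > 0:
        # A positive setting is taken as the exact number of distinct values.
        return float(n_distinct)
    # A negative setting is a fraction of the estimated row count.
    return abs(n_distinct) * float(estimated_rows)


print(estimate_distinct(-1, 2000))     # every row distinct
print(estimate_distinct(-0.5, 1000))   # each value appears twice on average
print(estimate_distinct(250, 1000))    # fixed count, independent of row count
```

Because a negative setting scales with the row estimate, the resulting distinct count follows the table as it grows or shrinks, which is the case the documentation calls useful when the size of the table changes over time.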