1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Trigger autovacuum based on number of INSERTs

Traditionally autovacuum has only ever invoked a worker based on the
estimated number of dead tuples in a table and for anti-wraparound
purposes. For the latter, with certain classes of tables such as
insert-only tables, anti-wraparound vacuums could be the first vacuum that
the table ever receives. This could often lead to autovacuum workers being
busy for extended periods of time due to having to potentially freeze
every page in the table. This could be particularly bad for very large
tables. New clusters, or recently pg_restored clusters could suffer even
more as many large tables may have the same relfrozenxid, which could
result in large numbers of tables requiring an anti-wraparound vacuum all
at once.

Here we aim to reduce the work required by anti-wraparound and aggressive
vacuums in general, by triggering autovacuum when the table has received
enough INSERTs. This is controlled by adding two new GUCs and reloptions;
autovacuum_vacuum_insert_threshold and
autovacuum_vacuum_insert_scale_factor. These work exactly the same as the
existing scale factor and threshold controls, only base themselves off the
number of inserts since the last vacuum, rather than the number of dead
tuples. New controls were added rather than reusing the existing
controls, to allow these new vacuums to be tuned independently and perhaps
even completely disabled altogether, which can be done by setting
autovacuum_vacuum_insert_threshold to -1.

We make no attempt to skip index cleanup operations on these vacuums as
they may trigger for an insert-mostly table which continually doesn't have
enough dead tuples to trigger an autovacuum for the purpose of removing
those dead tuples. If we were to skip cleaning the indexes in this case,
then it is possible for the index(es) to become bloated over time.

There are additional benefits to triggering autovacuums based on inserts,
as tables which never contain enough dead tuples to trigger an autovacuum
are now more likely to receive a vacuum, which can mark more of the table
as "allvisible" and encourage the query planner to make use of Index Only
Scans.

Currently, we still obey vacuum_freeze_min_age when triggering these new
autovacuums based on INSERTs. For large insert-only tables, it may be
beneficial to lower the table's autovacuum_freeze_min_age so that tuples
are eligible to be frozen sooner. Here we've opted not to zero that for
these types of vacuums, since the table may just be insert-mostly and we
may otherwise freeze tuples that are still destined to be updated or
removed in the near future.

There was some debate to what exactly the new scale factor and threshold
should default to. For now, these are set to 0.2 and 1000, respectively.
There may be some motivation to adjust these before the release.

Author: Laurenz Albe, Darafei Praliaskouski
Reviewed-by: Alvaro Herrera, Masahiko Sawada, Chris Travers, Andres Freund, Justin Pryzby
Discussion: https://postgr.es/m/CAC8Q8t%2Bj36G_bLF%3D%2B0iMo6jGNWnLnWb1tujXuJr-%2Bx8ZCCTqoQ%40mail.gmail.com
This commit is contained in:
David Rowley
2020-03-28 19:20:12 +13:00
parent 9945ad6e90
commit b07642dbcd
18 changed files with 230 additions and 12 deletions

View File

@ -7313,6 +7313,28 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-vacuum-insert-threshold" xreflabel="autovacuum_vacuum_insert_threshold">
<term><varname>autovacuum_vacuum_insert_threshold</varname> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_insert_threshold</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies the number of inserted tuples needed to trigger a
<command>VACUUM</command> in any one table.
The default is 1000 tuples. If -1 is specified, autovacuum will not
trigger a <command>VACUUM</command> operation on any tables based on
the number of inserts.
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-analyze-threshold" xreflabel="autovacuum_analyze_threshold">
<term><varname>autovacuum_analyze_threshold</varname> (<type>integer</type>)
<indexterm>
@ -7354,6 +7376,27 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-vacuum-insert-scale-factor" xreflabel="autovacuum_vacuum_insert_scale_factor">
<term><varname>autovacuum_vacuum_insert_scale_factor</varname> (<type>floating point</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_insert_scale_factor</varname></primary>
<secondary>configuration parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Specifies a fraction of the table size to add to
<varname>autovacuum_vacuum_insert_threshold</varname>
when deciding whether to trigger a <command>VACUUM</command>.
The default is 0.2 (20% of table size).
This parameter can only be set in the <filename>postgresql.conf</filename>
file or on the server command line;
but the setting can be overridden for individual tables by
changing table storage parameters.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-analyze-scale-factor" xreflabel="autovacuum_analyze_scale_factor">
<term><varname>autovacuum_analyze_scale_factor</varname> (<type>floating point</type>)
<indexterm>

View File

@ -777,13 +777,33 @@ vacuum threshold = vacuum base threshold + vacuum scale factor * number of tuple
<xref linkend="guc-autovacuum-vacuum-scale-factor"/>,
and the number of tuples is
<structname>pg_class</structname>.<structfield>reltuples</structfield>.
The number of obsolete tuples is obtained from the statistics
collector; it is a semi-accurate count updated by each
<command>UPDATE</command> and <command>DELETE</command> operation. (It
is only semi-accurate because some information might be lost under heavy
load.) If the <structfield>relfrozenxid</structfield> value of the table is more
than <varname>vacuum_freeze_table_age</varname> transactions old, an aggressive
vacuum is performed to freeze old tuples and advance
</para>
<para>
The table is also vacuumed if the number of tuples inserted since the last
vacuum has exceeded the defined insert threshold, which is defined as:
<programlisting>
vacuum insert threshold = vacuum base insert threshold + vacuum insert scale factor * number of tuples
</programlisting>
where the vacuum insert base threshold is
<xref linkend="guc-autovacuum-vacuum-insert-threshold"/>,
and vacuum insert scale factor is
<xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>.
Such vacuums may allow portions of the table to be marked as
<firstterm>all visible</firstterm> and also allow tuples to be frozen, which
can reduce the work required in subsequent vacuums.
For tables which receive <command>INSERT</command> operations but no or
almost no <command>UPDATE</command>/<command>DELETE</command> operations,
it may be beneficial to lower the table's
<xref linkend="reloption-autovacuum-freeze-min-age"/> as this may allow
tuples to be frozen by earlier vacuums. The number of obsolete tuples and
the number of inserted tuples are obtained from the statistics collector;
it is a semi-accurate count updated by each <command>UPDATE</command>,
<command>DELETE</command> and <command>INSERT</command> operation. (It is
only semi-accurate because some information might be lost under heavy
load.) If the <structfield>relfrozenxid</structfield> value of the table
is more than <varname>vacuum_freeze_table_age</varname> transactions old,
an aggressive vacuum is performed to freeze old tuples and advance
<structfield>relfrozenxid</structfield>; otherwise, only pages that have been modified
since the last vacuum are scanned.
</para>

View File

@ -2861,6 +2861,11 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i
<entry><type>bigint</type></entry>
<entry>Estimated number of rows modified since this table was last analyzed</entry>
</row>
<row>
<entry><structfield>n_ins_since_vacuum</structfield></entry>
<entry><type>bigint</type></entry>
<entry>Estimated number of rows inserted since this table was last vacuumed</entry>
</row>
<row>
<entry><structfield>last_vacuum</structfield></entry>
<entry><type>timestamp with time zone</type></entry>

View File

@ -1475,6 +1475,36 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM
</listitem>
</varlistentry>
<varlistentry id="reloption-autovacuum-vacuum-insert-threshold" xreflabel="autovacuum_vacuum_insert_threshold">
<term><literal>autovacuum_vacuum_insert_threshold</literal>, <literal>toast.autovacuum_vacuum_insert_threshold</literal> (<type>integer</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_insert_threshold</varname></primary>
<secondary>storage parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Per-table value for <xref linkend="guc-autovacuum-vacuum-insert-threshold"/>
parameter. The special value of -1 may be used to disable insert vacuums on the table.
</para>
</listitem>
</varlistentry>
<varlistentry id="reloption-autovacuum-vacuum-insert-scale-factor" xreflabel="autovacuum_vacuum_insert_scale_factor">
<term><literal>autovacuum_vacuum_insert_scale_factor</literal>, <literal>toast.autovacuum_vacuum_insert_scale_factor</literal> (<type>float4</type>)
<indexterm>
<primary><varname>autovacuum_vacuum_insert_scale_factor</varname> </primary>
<secondary>storage parameter</secondary>
</indexterm>
</term>
<listitem>
<para>
Per-table value for <xref linkend="guc-autovacuum-vacuum-insert-scale-factor"/>
parameter.
</para>
</listitem>
</varlistentry>
<varlistentry id="reloption-autovacuum-analyze-threshold" xreflabel="autovacuum_analyze_threshold">
<term><literal>autovacuum_analyze_threshold</literal> (<type>integer</type>)
<indexterm>