mirror of https://github.com/postgres/postgres.git synced 2025-10-29 22:49:41 +03:00

Add options to control whether VACUUM runs vac_update_datfrozenxid.

VACUUM normally ends by running vac_update_datfrozenxid(), which
requires a scan of pg_class.  Therefore, if one attempts to vacuum a
database one table at a time --- as vacuumdb has done since v12 ---
we will spend O(N^2) time in vac_update_datfrozenxid().  That causes
serious performance problems in databases with tens of thousands of
tables, and indeed the effect is measurable with only a few hundred.
To add insult to injury, only one process per database can run
vac_update_datfrozenxid() at a time, so this behavior
largely defeats vacuumdb's -j option.
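
The quadratic growth can be illustrated with a rough cost model. This is
only a sketch: it counts pg_class rows scanned and assumes exactly one
pg_class row per table (real catalogs also hold indexes, TOAST tables,
and so on). Both function names are hypothetical and are not part of
PostgreSQL.

```python
def rows_scanned_per_table_vacuum(num_tables: int) -> int:
    """Every one of the N per-table VACUUMs ends with a scan of all N rows."""
    return num_tables * num_tables


def rows_scanned_deferred(num_tables: int) -> int:
    """Per-table VACUUMs skip the scan; a single final pass scans N rows once."""
    return num_tables


if __name__ == "__main__":
    # Compare the two strategies for a few catalog sizes.
    for n in (100, 1_000, 10_000):
        print(n, rows_scanned_per_table_vacuum(n), rows_scanned_deferred(n))
```

Under these assumptions the one-table-at-a-time approach scans N times as
many catalog rows as the deferred approach.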

Hence, invent options SKIP_DATABASE_STATS and ONLY_DATABASE_STATS
to allow applications to postpone vac_update_datfrozenxid() until the
end of a series of VACUUM requests, and teach vacuumdb to use them.
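
The intended usage pattern can be sketched as follows. This is an
illustration of the statement sequence a client might issue, not
vacuumdb's actual code; the helper names (quote_ident,
build_vacuum_statements) and the (schema, table) input shape are
assumptions made for the example.

```python
def quote_ident(name: str) -> str:
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_vacuum_statements(tables: list[tuple[str, str]]) -> list[str]:
    """Return one VACUUM per table with the database-wide update skipped,
    followed by a single statement that performs only that update."""
    statements = [
        f"VACUUM (SKIP_DATABASE_STATS) {quote_ident(schema)}.{quote_ident(table)};"
        for schema, table in tables
    ]
    # Run the pg_class scan exactly once, after all tables are done.
    statements.append("VACUUM (ONLY_DATABASE_STATS);")
    return statements


if __name__ == "__main__":
    for stmt in build_vacuum_statements([("public", "orders"), ("public", "items")]):
        print(stmt)
```

The per-table statements can be spread across several connections; only
the final ONLY_DATABASE_STATS statement needs to wait until they have all
finished.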

Per bug #17717 from Gunnar L.  Sadly, this answer doesn't seem
like something we'd consider back-patching, so the performance
problem will remain in v12-v15.

Tom Lane and Nathan Bossart

Discussion: https://postgr.es/m/17717-6c50eb1c7d23a886@postgresql.org
Tom Lane
2023-01-06 14:17:25 -05:00
parent cd4b2334db
commit a46a7011b2
10 changed files with 147 additions and 21 deletions


@@ -36,6 +36,8 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
PROCESS_TOAST [ <replaceable class="parameter">boolean</replaceable> ]
TRUNCATE [ <replaceable class="parameter">boolean</replaceable> ]
PARALLEL <replaceable class="parameter">integer</replaceable>
SKIP_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
ONLY_DATABASE_STATS [ <replaceable class="parameter">boolean</replaceable> ]
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
@@ -295,6 +297,41 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ ANALYZE ] [ <replaceable class="paramet
</listitem>
</varlistentry>
<varlistentry>
<term><literal>SKIP_DATABASE_STATS</literal></term>
<listitem>
<para>
Specifies that <command>VACUUM</command> should skip updating the
database-wide statistics about oldest unfrozen XIDs. Normally
<command>VACUUM</command> will update these statistics once at the
end of the command. However, this can take a while in a database
with a very large number of tables, and it will accomplish nothing
unless the table that had contained the oldest unfrozen XID was
among those vacuumed. Moreover, if multiple <command>VACUUM</command>
commands are issued in parallel, only one of them can update the
database-wide statistics at a time. Therefore, if an application
intends to issue a series of many <command>VACUUM</command>
commands, it can be helpful to set this option in all but the last
such command; or set it in all the commands and separately
issue <literal>VACUUM (ONLY_DATABASE_STATS)</literal> afterwards.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><literal>ONLY_DATABASE_STATS</literal></term>
<listitem>
<para>
Specifies that <command>VACUUM</command> should do nothing except
update the database-wide statistics about oldest unfrozen XIDs.
When this option is specified,
the <replaceable class="parameter">table_and_columns</replaceable>
list must be empty, and no other option may be enabled
except <literal>VERBOSE</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><replaceable class="parameter">boolean</replaceable></term>
<listitem>