1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-31 22:04:40 +03:00

Remove old-style VACUUM FULL (which was known for a little while as

VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity.
Per discussion, the use case for this method of vacuuming is no longer large
enough to justify maintaining it; not to mention that we don't wish to invest
the work that would be needed to make it play nicely with Hot Standby.

Aside from the code directly related to old-style VACUUM FULL, this commit
removes support for certain WAL record types that could only be generated
within VACUUM FULL, redirect-pointer removal in heap_page_prune, and
nontransactional generation of cache invalidation sinval messages (the last
being the sticking point for Hot Standby).

We still have to retain all code that copes with finding HEAP_MOVED_OFF and
HEAP_MOVED_IN flag bits on existing tuples.  This can't be removed as long
as we want to support in-place update from pre-9.0 databases.
This commit is contained in:
Tom Lane
2010-02-08 04:33:55 +00:00
parent 1ddc2703a9
commit 0a469c8769
41 changed files with 247 additions and 3737 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.32 2010/01/01 21:53:49 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/indexam.sgml,v 2.33 2010/02/08 04:33:51 tgl Exp $ -->
<chapter id="indexam">
<title>Index Access Method Interface Definition</title>
@ -577,11 +577,10 @@ amrestrpos (IndexScanDesc scan);
The core <productname>PostgreSQL</productname> system obtains
<literal>AccessShareLock</> on the index during an index scan, and
<literal>RowExclusiveLock</> when updating the index (including plain
<command>VACUUM</>). Since these lock
types do not conflict, the access method is responsible for handling any
fine-grained locking it might need. An exclusive lock on the index as a whole
will be taken only during index creation, destruction,
<command>REINDEX</>, or <command>VACUUM FULL</>.
<command>VACUUM</>). Since these lock types do not conflict, the access
method is responsible for handling any fine-grained locking it might need.
An exclusive lock on the index as a whole will be taken only during index
creation, destruction, or <command>REINDEX</>.
</para>
<para>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.98 2010/02/03 17:25:05 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/maintenance.sgml,v 1.99 2010/02/08 04:33:51 tgl Exp $ -->
<chapter id="maintenance">
<title>Routine Database Maintenance Tasks</title>
@ -123,9 +123,7 @@
<command>ALTER TABLE</command> while it is being vacuumed.)
<command>VACUUM FULL</> requires exclusive lock on the table it is
working on, and therefore cannot be done in parallel with other use
of the table. Another disadvantage of <command>VACUUM FULL</> is that
while it reduces table size, it does not reduce index size proportionally;
in fact it can make indexes <emphasis>larger</>. Generally, therefore,
of the table. Generally, therefore,
administrators should strive to use standard <command>VACUUM</> and
avoid <command>VACUUM FULL</>.
</para>
@ -166,13 +164,10 @@
system, except in the special case where one or more pages at the
end of a table become entirely free and an exclusive table lock can be
easily obtained. In contrast, <command>VACUUM FULL</> actively compacts
tables by moving row versions to earlier pages. It is thus able to
force pages at the end of the table to become entirely free, whereupon
it will return them to the operating system. However, if many rows
must be moved, this can take a long time. Also, moving a row requires
transiently making duplicate index entries for it (the entry pointing
to its new location must be made before the old entry can be removed);
so moving a lot of rows this way causes severe index bloat.
tables by writing a complete new version of the table file with no dead
space. This minimizes the size of the table, but can take a long time.
It also requires extra disk space for the new copy of the table, until
the operation completes.
</para>
<para>
@ -220,20 +215,19 @@
<tip>
<para>
Neither form of <command>VACUUM</> is entirely satisfactory when
Plain <command>VACUUM</> may not be satisfactory when
a table contains large numbers of dead row versions as a result of
massive update or delete activity. If you have such a table and
you need to reclaim the excess disk space it occupies, the best
way is to use <xref linkend="sql-cluster" endterm="sql-cluster-title">
you need to reclaim the excess disk space it occupies, you will need
to use <command>VACUUM FULL</>, or alternatively
<xref linkend="sql-cluster" endterm="sql-cluster-title">
or one of the table-rewriting variants of
<xref linkend="sql-altertable" endterm="sql-altertable-title">.
These commands rewrite an entire new copy of the table and build
new indexes for it. Like <command>VACUUM FULL</>, they require
exclusive lock. Note that they also temporarily use extra disk
space, since the old copies of the table and indexes can't be
released until the new ones are complete. In the worst case where
your disk is nearly full, <command>VACUUM FULL</> may be the only
workable alternative.
new indexes for it. All these options require exclusive lock. Note that
they also temporarily use extra disk space approximately equal to the size
of the table, since the old copies of the table and indexes can't be
released until the new ones are complete.
</para>
</tip>
@ -579,22 +573,22 @@ HINT: Stop the postmaster and use a standalone backend to VACUUM in "mydb".
<firstterm>autovacuum launcher</firstterm>, which is in charge of starting
<firstterm>autovacuum worker</firstterm> processes for all databases. The
launcher will distribute the work across time, attempting to start one
worker on each database every <xref linkend="guc-autovacuum-naptime">
seconds. One worker will be launched for each database, with a maximum
of <xref linkend="guc-autovacuum-max-workers"> processes running at the
same time. If there are more than
<xref linkend="guc-autovacuum-max-workers"> databases to be processed,
worker within each database every <xref linkend="guc-autovacuum-naptime">
seconds. (Therefore, if the installation has <replaceable>N</> databases,
a new worker will be launched every
<varname>autovacuum_naptime</>/<replaceable>N</> seconds.)
A maximum of <xref linkend="guc-autovacuum-max-workers"> worker processes
are allowed to run at the same time. If there are more than
<varname>autovacuum_max_workers</> databases to be processed,
the next database will be processed as soon as the first worker finishes.
Each worker process will check each table within its database and
execute <command>VACUUM</> and/or <command>ANALYZE</> as needed.
</para>
<para>
The <xref linkend="guc-autovacuum-max-workers"> setting limits how many
workers may be running at any time. If several large tables all become
eligible for vacuuming in a short amount of time, all autovacuum workers
might become occupied with vacuuming those tables for a long period.
This would result
If several large tables all become eligible for vacuuming in a short
amount of time, all autovacuum workers might become occupied with
vacuuming those tables for a long period. This would result
in other tables and databases not being vacuumed until a worker became
available. There is no limit on how many workers might be in a
single database, but workers do try to avoid repeating work that has
@ -700,8 +694,8 @@ analyze threshold = analyze base threshold + analyze scale factor * number of tu
</para>
<para>
Index pages that have become
completely empty are reclaimed for re-use. However, here is still the possibility
B-tree index pages that have become completely empty are reclaimed for
re-use. However, there is still a possibility
of inefficient use of space: if all but a few index keys on a page have
been deleted, the page remains allocated. Therefore, a usage
pattern in which most, but not all, keys in each range are eventually

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/vacuum.sgml,v 1.57 2010/01/06 05:31:13 itagaki Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/vacuum.sgml,v 1.58 2010/02/08 04:33:51 tgl Exp $
PostgreSQL documentation
-->
@ -21,7 +21,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
VACUUM [ ( { FULL [ INPLACE ] | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ <replaceable class="PARAMETER">table</replaceable> [ (<replaceable class="PARAMETER">column</replaceable> [, ...] ) ] ]
VACUUM [ ( { FULL | FREEZE | VERBOSE | ANALYZE } [, ...] ) ] [ <replaceable class="PARAMETER">table</replaceable> [ (<replaceable class="PARAMETER">column</replaceable> [, ...] ) ] ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> ]
VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">table</replaceable> [ (<replaceable class="PARAMETER">column</replaceable> [, ...] ) ] ]
</synopsis>
@ -58,11 +58,12 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">
space and makes it
available for re-use. This form of the command can operate in parallel
with normal reading and writing of the table, as an exclusive lock
is not obtained. <command>VACUUM
FULL</command> does more extensive processing, including moving of tuples
across blocks to try to compact the table to the minimum number of disk
blocks. This form is much slower and requires an exclusive lock on each
table while it is being processed.
is not obtained. However, extra space is not returned to the operating
system (in most cases); it's just kept available for re-use within the
same table. <command>VACUUM FULL</command> rewrites the entire contents
of the table into a new disk file with no extra space, allowing unused
space to be returned to the operating system. This form is much slower and
requires an exclusive lock on each table while it is being processed.
</para>
<para>
@ -85,27 +86,10 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">
<para>
Selects <quote>full</quote> vacuum, which can reclaim more
space, but takes much longer and exclusively locks the table.
</para>
<para>
For user tables, all table data and indexes are rewritten. This
method requires extra disk space in which to write the new data,
and is generally useful when a significant amount of space needs
to be reclaimed from within the table.
</para>
<para>
For system tables, all table data and indexes are modified in
place to reclaim space. This method may require less disk space
for the table data than <command>VACUUM FULL</command> on a
comparable user table, but the indexes will grow which may
counteract that benefit. Additionally, the operation is often
slower than <command>VACUUM FULL</command> on a comparable user
table.
</para>
<para>
If <literal>FULL INPLACE</literal> is specified, the space is
reclaimed in the same manner as a system table, even if it is a
user table. Specifying <literal>INPLACE</literal> explicitly is
rarely useful.
This method also requires extra disk space, since it writes a
new copy of the table and doesn't release the old copy until
the operation is complete. Usually this should only be used when a
significant amount of space needs to be reclaimed from within the table.
</para>
</listitem>
</varlistentry>
@ -217,10 +201,7 @@ VACUUM [ FULL ] [ FREEZE ] [ VERBOSE ] ANALYZE [ <replaceable class="PARAMETER">
or updated most of the rows in a table and would like the table to
physically shrink to occupy less disk space and allow faster table
scans. <command>VACUUM FULL</command> will usually shrink the table
more than a plain <command>VACUUM</command> would. The
<option>FULL</option> option does not shrink indexes; a periodic
<command>REINDEX</> is still recommended. In fact, it is often faster
to drop all indexes, <command>VACUUM FULL</>, and recreate the indexes.
more than a plain <command>VACUUM</command> would.
</para>
<para>

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/vacuumdb.sgml,v 1.49 2010/01/07 14:35:44 momjian Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/vacuumdb.sgml,v 1.50 2010/02/08 04:33:51 tgl Exp $
PostgreSQL documentation
-->
@ -24,7 +24,6 @@ PostgreSQL documentation
<command>vacuumdb</command>
<arg rep="repeat"><replaceable>connection-option</replaceable></arg>
<group><arg>--full</arg><arg>-f</arg></group>
<group><arg>--inplace</arg><arg>-i</arg></group>
<group><arg>--freeze</arg><arg>-F</arg></group>
<group><arg>--verbose</arg><arg>-v</arg></group>
<group><arg>--analyze</arg><arg>-z</arg></group>
@ -38,14 +37,12 @@ PostgreSQL documentation
<arg rep="repeat"><replaceable>connection-options</replaceable></arg>
<group><arg>--all</arg><arg>-a</arg></group>
<group><arg>--full</arg><arg>-f</arg></group>
<group><arg>--inplace</arg><arg>-i</arg></group>
<group><arg>--freeze</arg><arg>-F</arg></group>
<group><arg>--verbose</arg><arg>-v</arg></group>
<group><arg>--analyze</arg><arg>-z</arg></group>
<group><arg>--analyze-only</arg><arg>-Z</arg></group>
</cmdsynopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
@ -60,8 +57,8 @@ PostgreSQL documentation
<para>
<application>vacuumdb</application> is a wrapper around the SQL
command <xref linkend="SQL-VACUUM" endterm="SQL-VACUUM-title">.
There is no effective difference between vacuuming and analyzing
databases via this utility and via other methods for accessing the
There is no effective difference between vacuuming and analyzing
databases via this utility and via other methods for accessing the
server.
</para>
@ -73,7 +70,6 @@ PostgreSQL documentation
<para>
<application>vacuumdb</application> accepts the following command-line arguments:
<variablelist>
<varlistentry>
<term><option>-a</option></term>
@ -131,16 +127,6 @@ PostgreSQL documentation
</listitem>
</varlistentry>
<varlistentry>
<term><option>-i</option></term>
<term><option>--inplace</option></term>
<listitem>
<para>
Perform <quote>full inplace</quote> vacuuming.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-Z</option></term>
<term><option>--analyze-only</option></term>
@ -203,18 +189,16 @@ PostgreSQL documentation
</para>
<para>
<application>vacuumdb</application> also accepts
<application>vacuumdb</application> also accepts
the following command-line arguments for connection parameters:
<variablelist>
<varlistentry>
<term><option>-h <replaceable class="parameter">host</replaceable></></term>
<term><option>--host <replaceable class="parameter">host</replaceable></></term>
<listitem>
<para>
Specifies the host name of the machine on which the
server
is running. If the value begins with a slash, it is used
Specifies the host name of the machine on which the server
is running. If the value begins with a slash, it is used
as the directory for the Unix domain socket.
</para>
</listitem>
@ -225,7 +209,7 @@ PostgreSQL documentation
<term><option>--port <replaceable class="parameter">port</replaceable></></term>
<listitem>
<para>
Specifies the TCP port or local Unix domain socket file
Specifies the TCP port or local Unix domain socket file
extension on which the server
is listening for connections.
</para>
@ -263,7 +247,7 @@ PostgreSQL documentation
<listitem>
<para>
Force <application>vacuumdb</application> to prompt for a
password before connecting to a database.
password before connecting to a database.
</para>
<para>