
Add support for incremental backup.

To take an incremental backup, you use the new replication command
UPLOAD_MANIFEST to upload the manifest for the prior backup. This
prior backup could either be a full backup or another incremental
backup.  You then use BASE_BACKUP with the INCREMENTAL option to take
the backup.  pg_basebackup now has an --incremental=PATH_TO_MANIFEST
option to trigger this behavior.
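
As a rough illustration of the client-side workflow described above, the
following Python sketch only assembles pg_basebackup command lines for a
full backup followed by an incremental one; it does not run them. The
directory names and function names are hypothetical. It assumes -D is used
to name the target directory and that the prior backup's manifest is the
backup_manifest file in that backup's top-level directory.

```python
import shlex
from pathlib import PurePosixPath


def full_backup_argv(target_dir):
    """Command line for an ordinary full base backup into target_dir."""
    return ["pg_basebackup", "-D", str(target_dir)]


def incremental_backup_argv(target_dir, prior_backup_dir):
    """Command line for an incremental backup relative to prior_backup_dir.

    Assumption: the prior backup (full or incremental) still has its
    backup_manifest file in its top-level directory.
    """
    manifest = PurePosixPath(prior_backup_dir) / "backup_manifest"
    return ["pg_basebackup", f"--incremental={manifest}", "-D", str(target_dir)]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only; nothing is executed.
    print(shlex.join(full_backup_argv("/backups/full")))
    print(shlex.join(incremental_backup_argv("/backups/incr1", "/backups/full")))
```

A later incremental backup could name /backups/incr1 as its prior backup
instead, since the reference backup may itself be incremental.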

An incremental backup is like a regular full backup except that
some relation files are replaced with files with names like
INCREMENTAL.${ORIGINAL_NAME}, and the backup_label file contains
additional lines identifying it as an incremental backup. The new
pg_combinebackup tool can be used to reconstruct a data directory
from a full backup and a series of incremental backups.

Patch by me.  Reviewed by Matthias van de Meent, Dilip Kumar, Jakub
Wartak, Peter Eisentraut, and Álvaro Herrera. Thanks especially to
Jakub for incredibly helpful and extensive testing.

Discussion: http://postgr.es/m/CA+TgmoYOYZfMCyOXFyC-P+-mdrZqm5pP2N7S-r0z3_402h9rsA@mail.gmail.com
Robert Haas
2023-12-20 09:49:12 -05:00
parent 174c480508
commit dc21234005
49 changed files with 5834 additions and 52 deletions

@@ -202,6 +202,7 @@ Complete list of usable sgml source files in this directory.
<!ENTITY pgBasebackup SYSTEM "pg_basebackup.sgml">
<!ENTITY pgbench SYSTEM "pgbench.sgml">
<!ENTITY pgChecksums SYSTEM "pg_checksums.sgml">
<!ENTITY pgCombinebackup SYSTEM "pg_combinebackup.sgml">
<!ENTITY pgConfig SYSTEM "pg_config-ref.sgml">
<!ENTITY pgControldata SYSTEM "pg_controldata.sgml">
<!ENTITY pgCtl SYSTEM "pg_ctl-ref.sgml">

@@ -38,11 +38,25 @@ PostgreSQL documentation
</para>
<para>
<application>pg_basebackup</application> can take a full or incremental
base backup of the database. When used to take a full backup, it makes an
exact copy of the database cluster's files. When used to take an incremental
backup, some files that would have been part of a full backup may be
replaced with incremental versions of the same files, containing only those
blocks that have been modified since the reference backup. An incremental
backup cannot be used directly; instead,
<xref linkend="app-pgcombinebackup"/> must first
be used to combine it with the previous backups upon which it depends.
See <xref linkend="backup-incremental-backup" /> for more information
about incremental backups, and <xref linkend="backup-pitr-recovery" />
for steps to recover from a backup.
</para>
<para>
In any mode, <application>pg_basebackup</application> makes sure the server
is put into and out of backup mode automatically. Backups are always taken of
the entire database cluster; it is not possible to back up individual
databases or database objects. For selective backups, another tool such as
<xref linkend="app-pgdump"/> must be used.
</para>
@@ -197,6 +211,19 @@ PostgreSQL documentation
</listitem>
</varlistentry>
<varlistentry>
<term><option>-i <replaceable class="parameter">old_manifest_file</replaceable></option></term>
<term><option>--incremental=<replaceable class="parameter">old_manifest_file</replaceable></option></term>
<listitem>
<para>
Performs an <link linkend="backup-incremental-backup">incremental
backup</link>. The backup manifest for the reference
backup must be provided, and will be uploaded to the server, which will
respond by sending the requested incremental backup.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-R</option></term>
<term><option>--write-recovery-conf</option></term>

@@ -0,0 +1,240 @@
<!--
doc/src/sgml/ref/pg_combinebackup.sgml
PostgreSQL documentation
-->
<refentry id="app-pgcombinebackup">
<indexterm zone="app-pgcombinebackup">
<primary>pg_combinebackup</primary>
</indexterm>
<refmeta>
<refentrytitle><application>pg_combinebackup</application></refentrytitle>
<manvolnum>1</manvolnum>
<refmiscinfo>Application</refmiscinfo>
</refmeta>
<refnamediv>
<refname>pg_combinebackup</refname>
<refpurpose>reconstruct a full backup from an incremental backup and dependent backups</refpurpose>
</refnamediv>
<refsynopsisdiv>
<cmdsynopsis>
<command>pg_combinebackup</command>
<arg rep="repeat"><replaceable>option</replaceable></arg>
<arg rep="repeat"><replaceable>backup_directory</replaceable></arg>
</cmdsynopsis>
</refsynopsisdiv>
<refsect1>
<title>Description</title>
<para>
<application>pg_combinebackup</application> is used to reconstruct a
synthetic full backup from an
<link linkend="backup-incremental-backup">incremental backup</link> and the
earlier backups upon which it depends.
</para>
<para>
Specify all of the required backups on the command line from oldest to newest.
That is, the first backup directory should be the path to the full backup, and
the last should be the path to the final incremental backup
that you wish to restore. The reconstructed backup will be written to the
output directory specified by the <option>-o</option> option.
</para>
<para>
Although <application>pg_combinebackup</application> will attempt to verify
that the backups you specify form a legal backup chain from which a correct
full backup can be reconstructed, it is not designed to help you keep track
of which backups depend on which other backups. If you remove one or
more of the previous backups upon which your incremental
backup relies, you will not be able to restore it.
</para>
<para>
Since the output of <application>pg_combinebackup</application> is a
synthetic full backup, it can be used as an input to a future invocation of
<application>pg_combinebackup</application>. The synthetic full backup would
be specified on the command line in lieu of the chain of backups from which
it was reconstructed.
</para>
</refsect1>
<refsect1>
<title>Options</title>
<para>
<variablelist>
<varlistentry>
<term><option>-d</option></term>
<term><option>--debug</option></term>
<listitem>
<para>
Print lots of debug logging output on <filename>stderr</filename>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-n</option></term>
<term><option>--dry-run</option></term>
<listitem>
<para>
The <option>-n</option>/<option>--dry-run</option> option instructs
<command>pg_combinebackup</command> to figure out what would be done
without actually creating the target directory or any output files.
It is particularly useful in combination with <option>--debug</option>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-N</option></term>
<term><option>--no-sync</option></term>
<listitem>
<para>
By default, <command>pg_combinebackup</command> will wait for all files
to be written safely to disk. This option causes
<command>pg_combinebackup</command> to return without waiting, which is
faster, but means that a subsequent operating system crash can leave
the output backup corrupt. Generally, this option is useful for testing
but should not be used when creating a production installation.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-o <replaceable class="parameter">outputdir</replaceable></option></term>
<term><option>--output=<replaceable class="parameter">outputdir</replaceable></option></term>
<listitem>
<para>
Specifies the output directory to which the synthetic full backup
should be written. Currently, this argument is required.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-T <replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
<term><option>--tablespace-mapping=<replaceable class="parameter">olddir</replaceable>=<replaceable class="parameter">newdir</replaceable></option></term>
<listitem>
<para>
Relocates the tablespace in directory <replaceable>olddir</replaceable>
to <replaceable>newdir</replaceable> when writing the reconstructed backup.
<replaceable>olddir</replaceable> is the absolute path of the tablespace
as it exists in the first backup specified on the command line,
and <replaceable>newdir</replaceable> is the absolute path to use for the
tablespace in the reconstructed backup. If either path needs to contain
an equal sign (<literal>=</literal>), precede that with a backslash.
This option can be specified multiple times for multiple tablespaces.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--manifest-checksums=<replaceable class="parameter">algorithm</replaceable></option></term>
<listitem>
<para>
Like <xref linkend="app-pgbasebackup"/>,
<application>pg_combinebackup</application> writes a backup manifest
in the output directory. This option specifies the checksum algorithm
that should be applied to each file included in the backup manifest.
Currently, the available algorithms are <literal>NONE</literal>,
<literal>CRC32C</literal>, <literal>SHA224</literal>,
<literal>SHA256</literal>, <literal>SHA384</literal>,
and <literal>SHA512</literal>. The default is <literal>CRC32C</literal>.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--no-manifest</option></term>
<listitem>
<para>
Disables generation of a backup manifest. If this option is not
specified, a backup manifest for the reconstructed backup will be
written to the output directory.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--sync-method=<replaceable class="parameter">method</replaceable></option></term>
<listitem>
<para>
When set to <literal>fsync</literal>, which is the default,
<command>pg_combinebackup</command> will recursively open and synchronize
all files in the backup directory. When the plain format is used, the
search for files will follow symbolic links for the WAL directory and
each configured tablespace.
</para>
<para>
On Linux, <literal>syncfs</literal> may be used instead to ask the
operating system to synchronize the whole file system that contains the
backup directory. When the plain format is used,
<command>pg_combinebackup</command> will also synchronize the file systems
that contain the WAL files and each tablespace. See
<xref linkend="syncfs"/> for more information about using
<function>syncfs()</function>.
</para>
<para>
This option has no effect when <option>--no-sync</option> is used.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-V</option></term>
<term><option>--version</option></term>
<listitem>
<para>
Prints the <application>pg_combinebackup</application> version and
exits.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>-?</option></term>
<term><option>--help</option></term>
<listitem>
<para>
Shows help about <application>pg_combinebackup</application> command
line arguments, and exits.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</refsect1>
<refsect1>
<title>Environment</title>
<para>
This utility, like most other <productname>PostgreSQL</productname> utilities,
uses the environment variables supported by <application>libpq</application>
(see <xref linkend="libpq-envars"/>).
</para>
<para>
The environment variable <envar>PG_COLOR</envar> specifies whether to use
color in diagnostic messages. Possible values are
<literal>always</literal>, <literal>auto</literal> and
<literal>never</literal>.
</para>
</refsect1>
<refsect1>
<title>See Also</title>
<simplelist type="inline">
<member><xref linkend="app-pgbasebackup"/></member>
</simplelist>
</refsect1>
</refentry>
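
The invocation rules described in this reference page (backups listed oldest to newest, a required <option>-o</option> output directory, and backslash-escaped equal signs in <option>-T</option> mappings) can be sketched as a small Python helper. This is only an illustration under stated assumptions: the function names and paths are hypothetical, and the code builds an argument list without running pg_combinebackup.

```python
import shlex


def escape_equals(path):
    """Precede each '=' in a tablespace path with a backslash."""
    return path.replace("=", "\\=")


def combine_argv(backups, output_dir, tablespace_map=None,
                 dry_run=False, debug=False):
    """Build a pg_combinebackup argument list.

    backups must be ordered oldest to newest: the full backup first,
    the incremental backup to restore last.
    """
    if not backups:
        raise ValueError("at least one backup directory is required")
    argv = ["pg_combinebackup"]
    if debug:
        argv.append("--debug")
    if dry_run:
        argv.append("--dry-run")
    # One -T option per relocated tablespace (olddir -> newdir).
    for olddir, newdir in (tablespace_map or {}).items():
        argv += ["-T", f"{escape_equals(olddir)}={escape_equals(newdir)}"]
    argv += ["-o", output_dir]
    argv += list(backups)
    return argv


if __name__ == "__main__":
    # Hypothetical paths, for illustration only; nothing is executed.
    argv = combine_argv(
        ["/backups/full", "/backups/incr1", "/backups/incr2"],
        "/restore/data",
        tablespace_map={"/srv/ts=old": "/srv/ts_new"},
        dry_run=True,
        debug=True,
    )
    print(shlex.join(argv))
```

Combining <option>--dry-run</option> with <option>--debug</option>, as in the usage above, is a way to preview what would be done before anything is written to the output directory.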