mirror of https://github.com/postgres/postgres.git synced 2025-08-27 07:42:10 +03:00

pg_combinebackup: Add -k, --link option.

This is similar to pg_upgrade's --link option, except that here we won't
typically be able to use it for every input file: sometimes we will need
to reconstruct a complete backup from blocks stored in different files.
However, when a whole file does need to be copied, we can use an
optimized copying strategy: see the existing --clone and
--copy-file-range options and the code to use CopyFile() on Windows.
This commit adds a new strategy: add a hard link to an existing file.
Making a hard link doesn't actually copy anything, but it makes sense
for the code to treat it as doing so.

This is useful when the input directories are merely staging directories
that will be removed once the restore is complete. In such cases, there
is no need to actually copy the data, and making a bunch of new hard
links can be very quick. However, it would be quite dangerous to use it
if the input directories might later be reused for any other purpose,
since starting postgres on the output directory would destructively
modify the input directories. For that reason, using this new option
causes pg_combinebackup to emit a warning about the danger involved.
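
The aliasing hazard described above can be sketched with plain POSIX calls. This is only an illustration, not pg_combinebackup's code: the helper names (write_text, same_file, hard_link_aliasing_demo) and the file names passed to them are hypothetical. It creates an "input" file, hard-links it as an "output" file, overwrites the output in place, and reads the input back.

```c
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite `path` in place with `text`. */
static int write_text(const char *path, const char *text)
{
    /* "w" truncates the existing inode; it does not create a new one. */
    FILE *fp = fopen(path, "w");
    int rc;

    if (fp == NULL)
        return -1;
    rc = (fputs(text, fp) == EOF) ? -1 : 0;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}

/* Returns 1 if both names refer to the same inode, 0 if not, -1 on error. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
        return -1;
    return (sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino) ? 1 : 0;
}

/*
 * Hypothetical demo: hard-link `input` to `output`, modify the file through
 * `output`, then read it back through `input` into `buf`.
 * Returns 0 on success, -1 on any failure. Both files are removed afterwards.
 */
static int hard_link_aliasing_demo(const char *input, const char *output,
                                   char *buf, size_t buflen)
{
    FILE *fp;
    int rc = -1;

    if (buflen == 0)
        return -1;
    buf[0] = '\0';

    /* Start from a clean slate; errors here (e.g. ENOENT) are ignored. */
    unlink(input);
    unlink(output);

    if (write_text(input, "original") != 0)
        goto cleanup;
    if (link(input, output) != 0)       /* no data is copied here */
        goto cleanup;
    if (same_file(input, output) != 1)  /* two names, one inode */
        goto cleanup;
    if (write_text(output, "modified") != 0)
        goto cleanup;

    fp = fopen(input, "r");
    if (fp == NULL)
        goto cleanup;
    if (fgets(buf, (int) buflen, fp) != NULL)
        rc = 0;
    fclose(fp);

cleanup:
    unlink(input);
    unlink(output);
    return rc;
}
```

After the call, buf holds the text written through the output name, even though only the input name was read. Note also that link() fails with EXDEV when the two paths are on different file systems, which matches the documented requirement that the input backups and the output directory share a file system.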

Author: Israel Barth Rubio <barthisrael@gmail.com>
Co-authored-by: Robert Haas <robertmhaas@gmail.com> (cosmetic changes)
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: http://postgr.es/m/CA+TgmoaEFsYHsMefNaNkU=2SnMRufKE3eVJxvAaX=OWgcnPmPg@mail.gmail.com
Robert Haas
2025-03-17 14:03:14 -04:00
parent ed762e9425
commit 99aeb84703
6 changed files with 245 additions and 3 deletions

@@ -137,6 +137,35 @@ PostgreSQL documentation
</listitem>
</varlistentry>
<varlistentry>
<term><option>-k</option></term>
<term><option>--link</option></term>
<listitem>
<para>
Use hard links instead of copying files to the synthetic backup.
Reconstruction of the synthetic backup might be faster (no file copying)
and use less disk space, but care must be taken when using the output
directory, because any modifications to that directory (for example,
starting the server) can also affect the input directories. Likewise,
changes to the input directories (for example, starting the server on
the full backup) could affect the output directory. Thus, this option
is best used when the input directories are only copies that will be
removed after <application>pg_combinebackup</application> has completed.
</para>
<para>
Requires that the input backups and the output directory are in the
same file system.
</para>
<para>
If a backup manifest is not available or does not contain checksums of
the right type, hard links will still be created, but each file will
also be read block-by-block to calculate its checksum.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term><option>--clone</option></term>
<listitem>
@@ -167,7 +196,8 @@ PostgreSQL documentation
<listitem>
<para>
Perform regular file copy. This is the default. (See also
-<option>--copy-file-range</option> and <option>--clone</option>.)
+<option>--copy-file-range</option>, <option>--clone</option>, and
+<option>-k</option>/<option>--link</option>.)
</para>
</listitem>
</varlistentry>