diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
index 07c49e47190..391e5db2e2f 100644
--- a/doc/src/sgml/ref/pg_rewind.sgml
+++ b/doc/src/sgml/ref/pg_rewind.sgml
@@ -48,14 +48,16 @@ PostgreSQL documentation
- The result is equivalent to replacing the target data directory with the
- source one. Only changed blocks from relation files are copied;
- all other files are copied in full, including configuration files. The
- advantage of pg_rewind over taking a new base backup, or
- tools like rsync, is that pg_rewind does
- not require reading through unchanged blocks in the cluster. This makes
- it a lot faster when the database is large and only a small
- fraction of blocks differ between the clusters.
+ After a successful rewind, the state of the target data directory is
+ analogous to a base backup of the source data directory. Unlike taking
+ a new base backup or using a tool like rsync,
+ pg_rewind does not require comparing or copying
+ unchanged relation blocks in the cluster. Only changed blocks from existing
+ relation files are copied; all other files, including new relation files,
+ configuration files, and WAL segments, are copied in full. As a result, the
+ rewind operation is significantly faster than other approaches when the
+ database is large and only a small fraction of blocks differ between the
+ clusters.
@@ -77,16 +79,18 @@ PostgreSQL documentation
- When the target server is started for the first time after running
- pg_rewind, it will go into recovery mode and replay all
- WAL generated in the source server after the point of divergence.
- If some of the WAL was no longer available in the source server when
- pg_rewind was run, and therefore could not be copied by the
- pg_rewind session, it must be made available when the
- target server is started. This can be done by creating a
- recovery.signal file in the target data directory
- and configuring suitable restore_command
- in postgresql.conf.
+ After running pg_rewind, WAL replay needs to
+ complete for the data directory to be in a consistent state. When the
+ target server is started again, it will enter archive recovery and replay
+ all WAL generated in the source server from the last checkpoint before
+ the point of divergence. If some of the WAL was no longer available in the
+ source server when pg_rewind was run, and
+ therefore could not be copied by the pg_rewind
+ session, it must be made available when the target server is started.
+ This can be done by creating a recovery.signal file
+ in the target data directory and by configuring a suitable
+ restore_command in
+ postgresql.conf.
@@ -105,6 +109,15 @@ PostgreSQL documentation
recovered. In such a case, taking a new fresh backup is recommended.
+
+ As pg_rewind copies configuration files
+ entirely from the source, it may be necessary to correct the configuration
+ used for recovery before restarting the target server, especially if
+ the target is reintroduced as a standby of the source. If you restart
+ the server after the rewind operation has finished but without configuring
+ recovery, the target may again diverge from the primary.
+
+
pg_rewind will fail immediately if it finds
files it cannot write directly to. This can happen for example when
@@ -342,34 +355,45 @@ GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text, bigint, bigint, b
Copy all those changed blocks from the source cluster to
the target cluster, either using direct file system access
(--source-pgdata) or SQL (--source-server).
+ Relation files are now in a state equivalent to the moment of the last
+ completed checkpoint prior to the point at which the WAL timelines of the
+ source and target diverged, plus the current state on the source of any
+ blocks changed on the target after that divergence.
- Copy all other files such as pg_xact and
- configuration files from the source cluster to the target cluster
- (everything except the relation files). Similarly to base backups,
- the contents of the directories pg_dynshmem/,
+ Copy all other files, including new relation files, WAL segments,
+ pg_xact, and configuration files from the source
+ cluster to the target cluster. Similarly to base backups, the contents
+ of the directories pg_dynshmem/,
pg_notify/, pg_replslot/,
pg_serial/, pg_snapshots/,
- pg_stat_tmp/, and
- pg_subtrans/ are omitted from the data copied
- from the source cluster. Any file or directory beginning with
- pgsql_tmp is omitted, as well as are
+ pg_stat_tmp/, and pg_subtrans/
+ are omitted from the data copied from the source cluster. The files
backup_label,
tablespace_map,
pg_internal.init,
- postmaster.opts and
- postmaster.pid.
+ postmaster.opts, and
+ postmaster.pid, as well as any file or directory
+ beginning with pgsql_tmp, are omitted.
- Apply the WAL from the source cluster, starting from the checkpoint
- created at failover. (Strictly speaking, pg_rewind
- doesn't apply the WAL, it just creates a backup label file that
- makes PostgreSQL start by replaying all WAL from
- that checkpoint forward.)
+ Create a backup_label file to begin WAL replay at
+ the checkpoint created at failover, and configure the
+ pg_control file with a minimum consistency LSN,
+ defined as the result of pg_current_wal_insert_lsn()
+ when rewinding from a live source, or as the last checkpoint LSN when
+ rewinding from a stopped source.
+
+
+
+
+ When starting the target, PostgreSQL replays
+ all the required WAL, resulting in a data directory in a consistent
+ state.
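
Note for reviewers, not part of the patch: the recovery setup that the second hunk describes (create recovery.signal in the target data directory and set a restore command in postgresql.conf) can be sketched as below. This is a minimal illustration under stated assumptions, not something pg_rewind does for you. The helper name `configure_archive_recovery` and the archive path `/mnt/archive` are hypothetical, and the parameter name `restore_command` is assumed to be the setting the stripped cross-reference points at.

```python
import tempfile
from pathlib import Path


def configure_archive_recovery(pgdata: str, restore_command: str) -> None:
    """Prepare a rewound data directory for archive recovery (hypothetical helper)."""
    data_dir = Path(pgdata)

    # An empty recovery.signal file asks the server to enter archive
    # recovery the next time it is started.
    (data_dir / "recovery.signal").touch()

    # String values in postgresql.conf are single-quoted; an embedded
    # single quote is written as two single quotes.
    quoted = restore_command.replace("'", "''")

    # Append rather than rewrite, so the settings copied from the source
    # stay visible; a later entry for the same parameter overrides an earlier one.
    with open(data_dir / "postgresql.conf", "a", encoding="utf-8") as conf:
        conf.write(f"\nrestore_command = '{quoted}'\n")


if __name__ == "__main__":
    # Demonstrate against a throwaway directory, not a real cluster.
    with tempfile.TemporaryDirectory() as tmp:
        # /mnt/archive is a hypothetical WAL archive location.
        configure_archive_recovery(tmp, "cp /mnt/archive/%f %p")
        print((Path(tmp) / "postgresql.conf").read_text())
```

Because pg_rewind copies configuration files from the source in full, a step like this would have to run after the rewind and before the target is started.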