Mirror of https://github.com/postgres/postgres.git (synced 2025-07-28 23:42:10 +03:00)
Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm that all changes made by a transaction had been transferred to at most one synchronous standby server.

This commit extends synchronous replication so that it supports multiple synchronous standby servers. It enables users to consider one or more standby servers as synchronous, and increase the level of transaction durability by ensuring that transaction commits wait for replies from all of those synchronous standbys.

Multiple synchronous standby servers are configured in synchronous_standby_names which is extended to support new syntax of 'num_sync ( standby_name [ , ... ] )', where num_sync specifies the number of synchronous standbys that transaction commits need to wait for replies from and standby_name is the name of a standby server.

The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before is also still supported. It's the same as new syntax with num_sync=1.

This commit doesn't include "quorum commit" feature which was discussed in pgsql-hackers. Synchronous standbys are chosen based on their priorities. synchronous_standby_names determines the priority of each standby for being chosen as a synchronous standby. The standbys whose names appear earlier in the list are given higher priority and will be considered as synchronous. Other standby servers appearing later in this list represent potential synchronous standbys.

The regression test for multiple synchronous standbys is not included in this commit. It should come later.

Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs, Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen, Rajeev Rastogi

Many thanks to the various individuals who were involved in discussing and developing this feature.
@@ -2906,34 +2906,69 @@ include_dir 'conf.d'
 </term>
 <listitem>
 <para>
-Specifies a comma-separated list of standby names that can support
+Specifies a list of standby names that can support
 <firstterm>synchronous replication</>, as described in
 <xref linkend="synchronous-replication">.
-At any one time there will be at most one active synchronous standby;
+There will be one or more active synchronous standbys;
 transactions waiting for commit will be allowed to proceed after
-this standby server confirms receipt of their data.
-The synchronous standby will be the first standby named in this list
+these standby servers confirm receipt of their data.
+The synchronous standbys will be those whose names appear
+earlier in this list, and
 that is both currently connected and streaming data in real-time
 (as shown by a state of <literal>streaming</literal> in the
 <link linkend="monitoring-stats-views-table">
 <literal>pg_stat_replication</></link> view).
 Other standby servers appearing later in this list represent potential
-synchronous standbys.
-If the current synchronous standby disconnects for whatever reason,
+synchronous standbys. If any of the current synchronous
+standbys disconnects for whatever reason,
 it will be replaced immediately with the next-highest-priority standby.
 Specifying more than one standby name can allow very high availability.
 </para>
+<para>
+This parameter specifies a list of standby servers by using
+either of the following syntaxes:
+<synopsis>
+<replaceable class="parameter">num_sync</replaceable> ( <replaceable class="parameter">standby_name</replaceable> [, ...] )
+<replaceable class="parameter">standby_name</replaceable> [, ...]
+</synopsis>
+where <replaceable class="parameter">num_sync</replaceable> is
+the number of synchronous standbys that transactions need to
+wait for replies from,
+and <replaceable class="parameter">standby_name</replaceable>
+is the name of a standby server. For example, a setting of
+<literal>'3 (s1, s2, s3, s4)'</> makes transaction commits wait
+until their WAL records are received by three higher priority standbys
+chosen from standby servers <literal>s1</>, <literal>s2</>,
+<literal>s3</> and <literal>s4</>.
+</para>
+<para>
+The second syntax was used before <productname>PostgreSQL</>
+version 9.6 and is still supported. It's the same as the first syntax
+with <replaceable class="parameter">num_sync</replaceable>=1.
+For example, both settings of <literal>'1 (s1, s2)'</> and
+<literal>'s1, s2'</> have the same meaning; either <literal>s1</>
+or <literal>s2</> is chosen as a synchronous standby.
+</para>
 <para>
 The name of a standby server for this purpose is the
 <varname>application_name</> setting of the standby, as set in the
 <varname>primary_conninfo</> of the standby's WAL receiver. There is
 no mechanism to enforce uniqueness. In case of duplicates one of the
-matching standbys will be chosen to be the synchronous standby, though
+matching standbys will be considered as higher priority, though
 exactly which one is indeterminate.
 The special entry <literal>*</> matches any
 <varname>application_name</>, including the default application name
 of <literal>walreceiver</>.
 </para>
+<note>
+<para>
+The <replaceable class="parameter">standby_name</replaceable>
+must be enclosed in double quotes if a comma (<literal>,</>),
+a double quote (<literal>"</>), <!-- " font-lock sanity -->
+a left parentheses (<literal>(</>), a right parentheses (<literal>)</>)
+or a space is used in the name of a standby server.
+</para>
+</note>
 <para>
 If no synchronous standby names are specified here, then synchronous
 replication is not enabled and transaction commits will not wait for

@@ -1027,10 +1027,12 @@ primary_slot_name = 'node_a_slot'
 
 <para>
 Synchronous replication offers the ability to confirm that all changes
-made by a transaction have been transferred to one synchronous standby
-server. This extends the standard level of durability
+made by a transaction have been transferred to one or more synchronous
+standby servers. This extends that standard level of durability
 offered by a transaction commit. This level of protection is referred
-to as 2-safe replication in computer science theory.
+to as 2-safe replication in computer science theory, and group-1-safe
+(group-safe and 1-safe) when <varname>synchronous_commit</> is set to
+<literal>remote_write</>.
 </para>
 
 <para>
@@ -1084,8 +1086,8 @@ primary_slot_name = 'node_a_slot'
 In the case that <varname>synchronous_commit</> is set to
 <literal>remote_apply</>, the standby sends reply messages when the commit
 record is replayed, making the transaction visible.
-If the standby is the first matching standby, as specified in
-<varname>synchronous_standby_names</> on the primary, the reply
+If the standby is chosen as the synchronous standby, from a priority
+list of <varname>synchronous_standby_names</> on the primary, the reply
 messages from that standby will be used to wake users waiting for
 confirmation that the commit record has been received. These parameters
 allow the administrator to specify which standby servers should be
@@ -1126,6 +1128,40 @@ primary_slot_name = 'node_a_slot'
 
 </sect3>
 
+<sect3 id="synchronous-replication-multiple-standbys">
+<title>Multiple Synchronous Standbys</title>
+
+<para>
+Synchronous replication supports one or more synchronous standby servers;
+transactions will wait until all the standby servers which are considered
+as synchronous confirm receipt of their data. The number of synchronous
+standbys that transactions must wait for replies from is specified in
+<varname>synchronous_standby_names</>. This parameter also specifies
+a list of standby names, which determines the priority of each standby
+for being chosen as a synchronous standby. The standbys whose names
+appear earlier in the list are given higher priority and will be considered
+as synchronous. Other standby servers appearing later in this list
+represent potential synchronous standbys. If any of the current
+synchronous standbys disconnects for whatever reason, it will be replaced
+immediately with the next-highest-priority standby.
+</para>
+<para>
+An example of <varname>synchronous_standby_names</> for multiple
+synchronous standbys is:
+<programlisting>
+synchronous_standby_names = '2 (s1, s2, s3)'
+</programlisting>
+In this example, if four standby servers <literal>s1</>, <literal>s2</>,
+<literal>s3</> and <literal>s4</> are running, the two standbys
+<literal>s1</> and <literal>s2</> will be chosen as synchronous standbys
+because their names appear early in the list of standby names.
+<literal>s3</> is a potential synchronous standby and will take over
+the role of synchronous standby when either of <literal>s1</> or
+<literal>s2</> fails. <literal>s4</> is an asynchronous standby since
+its name is not in the list.
+</para>
+</sect3>
+
 <sect3 id="synchronous-replication-performance">
 <title>Planning for Performance</title>
 
@@ -1171,19 +1207,21 @@ primary_slot_name = 'node_a_slot'
 <title>Planning for High Availability</title>
 
 <para>
-Commits made when <varname>synchronous_commit</> is set to <literal>on</>,
-<literal>remote_apply</> or <literal>remote_write</> will wait until the
-synchronous standby responds. The response may never occur if the last, or
-only, standby should crash.
+<varname>synchronous_standby_names</> specifies the number and
+names of synchronous standbys that transaction commits made when
+<varname>synchronous_commit</> is set to <literal>on</>,
+<literal>remote_apply</> or <literal>remote_write</> will wait for
+responses from. Such transaction commits may never be completed
+if any one of synchronous standbys should crash.
 </para>
 
 <para>
-The best solution for avoiding data loss is to ensure you don't lose
-your last remaining synchronous standby. This can be achieved by naming multiple
+The best solution for high availability is to ensure you keep as many
+synchronous standbys as requested. This can be achieved by naming multiple
 potential synchronous standbys using <varname>synchronous_standby_names</>.
-The first named standby will be used as the synchronous standby. Standbys
-listed after this will take over the role of synchronous standby if the
-first one should fail.
+The standbys whose names appear earlier in the list will be used as
+synchronous standbys. Standbys listed after these will take over
+the role of synchronous standby if one of current ones should fail.
 </para>
 
 <para>
@@ -1208,13 +1246,15 @@ primary_slot_name = 'node_a_slot'
 they show as committed on the primary. The guarantee we offer is that
 the application will not receive explicit acknowledgement of the
 successful commit of a transaction until the WAL data is known to be
-safely received by the standby.
+safely received by all the synchronous standbys.
 </para>
 
 <para>
-If you really do lose your last standby server then you should disable
-<varname>synchronous_standby_names</> and reload the configuration file
-on the primary server.
+If you really cannot keep as many synchronous standbys as requested
+then you should decrease the number of synchronous standbys that
+transaction commits must wait for responses from
+in <varname>synchronous_standby_names</> (or disable it) and
+reload the configuration file on the primary server.
 </para>
 
 <para>
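
As an illustration only, the priority-based choice of synchronous standbys described in the commit message and in the new "Multiple Synchronous Standbys" section can be sketched in Python. This is not the server's implementation. The function names parse_sync_names and choose_sync_standbys are hypothetical, double-quoted names and the special entry * are not handled, and the set of streaming standbys is passed in by hand rather than read from pg_stat_replication.

```python
import re


def parse_sync_names(setting):
    """Split a synchronous_standby_names value into (num_sync, names).

    Accepts 'num_sync ( name [, ...] )' and the older 'name [, ...]' form.
    Simplification (assumption of this sketch): quoted names and the
    special entry '*' are not handled.
    """
    setting = setting.strip()
    match = re.fullmatch(r"(\d+)\s*\((.*)\)", setting)
    if match:
        num_sync = int(match.group(1))
        names_part = match.group(2)
    else:
        # The older syntax means the same as num_sync = 1.
        num_sync = 1
        names_part = setting
    names = [n.strip() for n in names_part.split(",") if n.strip()]
    return num_sync, names


def choose_sync_standbys(setting, streaming_names):
    """Return the standby names chosen as synchronous, highest priority first.

    streaming_names is a hypothetical input: the application_name of every
    standby that is currently connected and in the 'streaming' state.
    """
    num_sync, priority_list = parse_sync_names(setting)
    connected = set(streaming_names)
    # Earlier position in the list means higher priority.
    candidates = [name for name in priority_list if name in connected]
    return candidates[:num_sync]


# All four standbys streaming: s1 and s2 are synchronous, s3 is a
# potential synchronous standby, s4 is asynchronous (not listed).
print(choose_sync_standbys("2 (s1, s2, s3)", ["s1", "s2", "s3", "s4"]))
# ['s1', 's2']

# s1 disconnects: s3 takes over the vacated role.
print(choose_sync_standbys("2 (s1, s2, s3)", ["s2", "s3", "s4"]))
# ['s2', 's3']
```

A result shorter than num_sync corresponds to the situation the "Planning for High Availability" text warns about, where waiting commits may never complete until num_sync is lowered or enough listed standbys reconnect.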