mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Improve handling of parameter differences in physical replication

When certain parameters are changed on a physical replication primary,
this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL
record.  The standby then checks whether its own settings are at least
as big as the ones on the primary.  If not, the standby shuts down
with a fatal error.

The correspondence of settings between primary and standby is required
because those settings influence certain shared memory sizings that
are required for processing WAL records that the primary might send.
For example, if the primary sends a prepared transaction, the standby
must have had max_prepared_transactions set appropriately or it won't
be able to process those WAL records.
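
The comparison described above can be illustrated with a minimal sketch. It
is not PostgreSQL source code: the function name, the dictionary
representation of settings, and the example values are hypothetical and
only model the "standby must be at least as big as the primary" rule.

```python
# Hypothetical model of the standby-side check; not taken from PostgreSQL.
def insufficient_settings(primary: dict, standby: dict) -> list:
    """Return the names of parameters whose standby value is below the primary's."""
    # Assumption: a parameter missing on the standby is treated as 0.
    return [name for name, value in primary.items()
            if standby.get(name, 0) < value]


# Illustrative values only.
primary = {"max_connections": 100, "max_prepared_transactions": 10}
standby = {"max_connections": 80, "max_prepared_transactions": 10}
print(insufficient_settings(primary, standby))
```

Before this patch, any non-empty result would have caused the standby to
shut down with a fatal error.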

However, fatally shutting down the standby immediately upon receipt of
the parameter change record might be a bit of an overreaction.  The
resources related to those settings are not required immediately at
that point, and might never be required if the activity on the primary
does not exhaust all those resources.  If we just let the standby roll
on with recovery, it will eventually produce an appropriate error when
those resources are used.

So this patch relaxes this a bit.  Upon receipt of
XLOG_PARAMETER_CHANGE, we still check the settings but only issue a
warning and set a global flag if there is a problem.  Then when we
actually hit the resource issue and the flag was set, we issue another
warning message with relevant information.  At that point we pause
recovery, so a hot standby remains usable.  We also repeat the last
warning message once a minute so it is harder to miss or ignore.
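
The relaxed behavior can be sketched as a small state model. This is a
simplified illustration, not the actual implementation: the class, method,
and attribute names are hypothetical, the once-a-minute repetition of the
warning is not modeled, and the error raised when the flag is not set is an
assumption standing in for whatever error recovery would otherwise produce.

```python
# Hypothetical model of the warn-then-pause flow; not taken from PostgreSQL.
class StandbyModel:
    def __init__(self, settings):
        self.settings = dict(settings)
        self.insufficient = False  # stands in for the global flag
        self.paused = False
        self.log = []

    def on_parameter_change(self, primary_settings):
        """Handle a parameter change record: warn and set the flag, but keep going."""
        for name, value in primary_settings.items():
            if self.settings.get(name, 0) < value:
                self.insufficient = True
                self.log.append(
                    f"WARNING: insufficient setting for parameter {name}")

    def on_resource_exhausted(self):
        """Called when replay actually runs out of a shared memory resource."""
        if self.insufficient:
            # Pause instead of shutting down, so a hot standby stays usable.
            self.paused = True
            self.log.append(
                "WARNING: recovery paused because of insufficient parameter settings")
        else:
            # Assumption: without the flag, the failure is reported as an error.
            raise RuntimeError("resource exhausted during recovery")


standby = StandbyModel({"max_prepared_transactions": 0})
standby.on_parameter_change({"max_prepared_transactions": 5})  # warning only
standby.on_resource_exhausted()                                # recovery pauses
```

The point of the split is that the first step never stops recovery; only the
second step, which may never happen, does.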

Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/4ad69a4c-cc9b-0dfe-0352-8b1b0cd36c7b@2ndquadrant.com
Peter Eisentraut
2020-03-30 09:19:40 +02:00
parent a01e1b8b9d
commit 246f136e76
6 changed files with 122 additions and 23 deletions


@@ -2148,18 +2148,14 @@ LOG: database system is ready to accept read only connections
</para>
<para>
-The setting of some parameters on the standby will need reconfiguration
-if they have been changed on the primary. For these parameters,
-the value on the standby must
-be equal to or greater than the value on the primary.
-Therefore, if you want to increase these values, you should do so on all
-standby servers first, before applying the changes to the primary server.
-Conversely, if you want to decrease these values, you should do so on the
-primary server first, before applying the changes to all standby servers.
-If these parameters
-are not set high enough then the standby will refuse to start.
-Higher values can then be supplied and the server
-restarted to begin recovery again. These parameters are:
+The settings of some parameters determine the size of shared memory for
+tracking transaction IDs, locks, and prepared transactions. These shared
+memory structures should be no smaller on a standby than on the primary.
+Otherwise, it could happen that the standby runs out of shared memory
+during recovery. For example, if the primary uses a prepared transaction
+but the standby did not allocate any shared memory for tracking prepared
+transactions, then recovery will abort and cannot continue until the
+standby's configuration is changed. The parameters affected are:
<itemizedlist>
<listitem>
@@ -2188,6 +2184,34 @@ LOG: database system is ready to accept read only connections
</para>
</listitem>
</itemizedlist>
+The easiest way to ensure this does not become a problem is to have these
+parameters set on the standbys to values equal to or greater than on the
+primary. Therefore, if you want to increase these values, you should do
+so on all standby servers first, before applying the changes to the
+primary server. Conversely, if you want to decrease these values, you
+should do so on the primary server first, before applying the changes to
+all standby servers. The WAL tracks changes to these parameters on the
+primary, and if a standby processes WAL that indicates that the current
+value on the primary is higher than its own value, it will log a warning, for example:
+<screen>
+WARNING: insufficient setting for parameter max_connections
+DETAIL: max_connections = 80 is a lower setting than on the master server (where its value was 100).
+HINT: Change parameters and restart the server, or there may be resource exhaustion errors sooner or later.
+</screen>
+Recovery will continue but could abort at any time thereafter. (It could
+also never end up failing if the activity on the primary does not actually
+require the full extent of the allocated shared memory resources.) If
+recovery reaches a point where it cannot continue due to lack of shared
+memory, recovery will pause and another warning will be logged, for example:
+<screen>
+WARNING: recovery paused because of insufficient parameter settings
+DETAIL: See earlier in the log about which settings are insufficient.
+HINT: Recovery cannot continue unless the configuration is changed and the server restarted.
+</screen>
+This warning will be repeated once a minute. At that point, the settings on
+the standby need to be updated and the instance restarted before recovery
+can continue.
</para>
<para>