This commit adds a new property called last_inactive_time for slots. It is
set to 0 whenever a slot is made active/acquired and set to the current
timestamp whenever the slot is inactive/released or restored from the disk.
Note that we don't set the last_inactive_time for the slots currently being
synced from the primary to the standby because such slots are typically
inactive as decoding is not allowed on those.
The 'last_inactive_time' will be useful on production servers to debug and
analyze inactive replication slots. It will also help to know the lifetime
of a replication slot - one can know how long a streaming standby, logical
subscriber, or replication slot consumer is down.
The 'last_inactive_time' will also be useful to implement inactive
timeout-based replication slot invalidation in a future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Till now, the reason for replication slot invalidation is not tracked
directly in pg_replication_slots. A recent commit 007693f2a3 added
'conflict_reason' to show the reasons for slot conflict/invalidation, but
only for logical slots.
This commit adds a new column 'invalidation_reason' to show invalidation
reasons for both physical and logical slots. And, this commit also turns
'conflict_reason' text column to 'conflicting' boolean column (effectively
reverting commit 007693f2a3). The 'conflicting' column is true for
invalidation reasons 'rows_removed' and 'wal_level_insufficient' because
those make the slot conflict with recovery. When 'conflicting' is true,
one can now look at the new 'invalidation_reason' column for the reason
for the logical slot's conflict with recovery.
The new 'invalidation_reason' column will also be useful to track other
invalidation reasons in the future commit.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik
Discussion: https://www.postgresql.org/message-id/ZfR7HuzFEswakt/a%40ip-10-97-1-34.eu-west-3.compute.internal
Discussion: https://www.postgresql.org/message-id/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
This patch provides a way to ensure that physical standbys that are
potential failover candidates have received and flushed changes before
the primary server making them visible to subscribers. Doing so guarantees
that the promoted standby server is not lagging behind the subscribers
when a failover is necessary.
The logical walsender now guarantees that all local changes are sent and
flushed to the standby servers corresponding to the replication slots
specified in 'standby_slot_names' before sending those changes to the
subscriber.
Additionally, the SQL functions pg_logical_slot_get_changes,
pg_logical_slot_peek_changes and pg_replication_slot_advance are modified
to ensure that they process changes for failover slots only after physical
slots specified in 'standby_slot_names' have confirmed WAL receipt for those.
Author: Hou Zhijie and Shveta Malik
Reviewed-by: Masahiko Sawada, Peter Smith, Bertrand Drouvot, Ajin Cherian, Nisha Moond, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
By enabling slot synchronization, all the failover logical replication
slots on the primary (assuming configurations are appropriate) are
automatically created on the physical standbys and are synced
periodically. The slot sync worker on the standby server pings the primary
server at regular intervals to get the necessary failover logical slots
information and create/update the slots locally. The slots that no longer
require synchronization are automatically dropped by the worker.
The nap time of the worker is tuned according to the activity on the
primary. The slot sync worker waits for some time before the next
synchronization, with the duration varying based on whether any slots were
updated during the last cycle.
A new parameter sync_replication_slots enables or disables this new
process.
On promotion, the slot sync worker is shut down by the startup process to
drop any temporary slots acquired by the slot sync worker and to prevent
the worker from trying to fetch the failover slots.
A functionality to allow logical walsenders to wait for the physical will
be done in a subsequent commit.
Author: Shveta Malik, Hou Zhijie based on design inputs by Masahiko Sawada and Amit Kapila
Reviewed-by: Masahiko Sawada, Bertrand Drouvot, Peter Smith, Dilip Kumar, Ajin Cherian, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
This commit switches the handling of the conflict cause strings for
replication slots to use a table rather than being explicitly listed,
using a C99-designated initializer syntax for the array elements. This
makes the whole more readable while easing future maintenance with less
areas to update when adding a new conflict reason.
This is similar to 74a7306310, but the scale of the change is smaller
as there are less conflict causes than LWLock builtin tranche names.
Author: Bharath Rupireddy
Reviewed-by: Jelte Fennema-Nio
Discussion: https://postgr.es/m/CALj2ACUxSLA91QGFrJsWNKs58KXb1C03mbuwKmzqqmoAKLwJaw@mail.gmail.com
This commit introduces a new SQL function pg_sync_replication_slots()
which is used to synchronize the logical replication slots from the
primary server to the physical standby so that logical replication can be
resumed after a failover or planned switchover.
A new 'synced' flag is introduced in pg_replication_slots view, indicating
whether the slot has been synchronized from the primary server. On a
standby, synced slots cannot be dropped or consumed, and any attempt to
perform logical decoding on them will result in an error.
The logical replication slots on the primary can be synchronized to the
hot standby by using the 'failover' parameter of
pg-create-logical-replication-slot(), or by using the 'failover' option of
CREATE SUBSCRIPTION during slot creation, and then calling
pg_sync_replication_slots() on standby. For the synchronization to work,
it is mandatory to have a physical replication slot between the primary
and the standby aka 'primary_slot_name' should be configured on the
standby, and 'hot_standby_feedback' must be enabled on the standby. It is
also necessary to specify a valid 'dbname' in the 'primary_conninfo'.
If a logical slot is invalidated on the primary, then that slot on the
standby is also invalidated.
If a logical slot on the primary is valid but is invalidated on the
standby, then that slot is dropped but will be recreated on the standby in
the next pg_sync_replication_slots() call provided the slot still exists
on the primary server. It is okay to recreate such slots as long as these
are not consumable on standby (which is the case currently). This
situation may occur due to the following reasons:
- The 'max_slot_wal_keep_size' on the standby is insufficient to retain
WAL records from the restart_lsn of the slot.
- 'primary_slot_name' is temporarily reset to null and the physical slot
is removed.
The slot synchronization status on the standby can be monitored using the
'synced' column of pg_replication_slots view.
A functionality to automatically synchronize slots by a background worker
and allow logical walsenders to wait for the physical will be done in
subsequent commits.
Author: Hou Zhijie, Shveta Malik, Ajin Cherian based on an earlier version by Peter Eisentraut
Reviewed-by: Masahiko Sawada, Bertrand Drouvot, Peter Smith, Dilip Kumar, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
This commit adds the failover property to the replication slot. The
failover property indicates whether the slot will be synced to the standby
servers, enabling the resumption of corresponding logical replication
after failover. But note that this commit does not yet include the
capability to sync the replication slot; the subsequent commits will add
that capability.
A new optional parameter 'failover' is added to the
pg_create_logical_replication_slot() function. We will also enable to set
'failover' option for slots via the subscription commands in the
subsequent commits.
The value of the 'failover' flag is displayed as part of
pg_replication_slots view.
Author: Hou Zhijie, Shveta Malik, Ajin Cherian
Reviewed-by: Peter Smith, Bertrand Drouvot, Dilip Kumar, Masahiko Sawada, Nisha Moond, Kuroda, Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
This patch changes the existing 'conflicting' field to 'conflict_reason'
in pg_replication_slots. This new field indicates the reason for the
logical slot's conflict with recovery. It is always NULL for physical
slots, as well as for logical slots which are not invalidated. The
non-NULL values indicate that the slot is marked as invalidated. Possible
values are:
wal_removed = required WAL has been removed.
rows_removed = required rows have been removed.
wal_level_insufficient = the primary doesn't have a wal_level sufficient
to perform logical decoding.
The existing users of 'conflicting' column can get the same answer by
using 'conflict_reason' IS NOT NULL.
Author: Shveta Malik
Reviewed-by: Amit Kapila, Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/ZYOE8IguqTbp-seF@paquier.xyz
pg_logical_replication_slot_advance() included a break condition to stop
when a targeted LSN is reached, when processing a series of WAL records
with XLogReadRecord(). Since 38a957316d, it matched with the check of
its main while loop. This condition saved from an extra CFI check,
actually pointless, so let's remove this condition and simplify the
code.
In passing, fix an incorrect comment.
Author: Bharath Rupireddy
Reviewed-by: Tom Lane, Gurjeet Singh
Discussion: https://postgr.es/m/CALj2ACWfGDLQ2cy7ZKwxnJqbDkO6Yvqqrqxne5ZN4HYm=PRTGg@mail.gmail.com
This is mainly useful because the upcoming logical-decoding-on-standby feature
adds further reasons for invalidating slots, and we don't want to end up with
multiple invalidated_* fields, or check different attributes.
Eventually we should consider not resetting restart_lsn when invalidating a
slot due to max_slot_wal_keep_size. But that's a user visible change, so left
for later.
Increases SLOT_VERSION, due to the changed field (with a different alignment,
no less).
Reviewed-by: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20230407075009.igg7be27ha2htkbt@awork3.anarazel.de
Per discussion, the existing routine name able to initialize a SRF
function with materialize mode is unpopular, so rename it. Equally, the
flags of this function are renamed, as of:
- SRF_SINGLE_USE_EXPECTED -> MAT_SRF_USE_EXPECTED_DESC
- SRF_SINGLE_BLESS -> MAT_SRF_BLESS
The previous function and flags introduced in 9e98583 are kept around
for compatibility purposes, so as any extension code already compiled
with v15 continues to work as-is. The declarations introduced here for
compatibility will be removed from HEAD in a follow-up commit.
The new names have been suggested by Andres Freund and Melanie
Plageman.
Discussion: https://postgr.es/m/20221013194820.ciktb2sbbpw7cljm@awork3.anarazel.de
Backpatch-through: 15
Set-returning functions that use the Materialize mode, creating a
tuplestore to include all the tuples returned in a set rather than doing
so in multiple calls, use roughly the same set of steps to prepare
ReturnSetInfo for this job:
- Check if ReturnSetInfo supports returning a tuplestore and if the
materialize mode is enabled.
- Create a tuplestore for all the tuples part of the returned set in the
per-query memory context, stored in ReturnSetInfo->setResult.
- Build a tuple descriptor mostly from get_call_result_type(), then
stored in ReturnSetInfo->setDesc. Note that there are some cases where
the SRF's tuple descriptor has to be the one specified by the function
caller.
This refactoring is done so as there are (well, should be) no behavior
changes in any of the in-core functions refactored, and the centralized
function that checks and sets up the function's ReturnSetInfo can be
controlled with a set of bits32 options. Two of them prove to be
necessary now:
- SRF_SINGLE_USE_EXPECTED to use expectedDesc as tuple descriptor, as
expected by the function's caller.
- SRF_SINGLE_BLESS to validate the tuple descriptor for the SRF.
The same initialization pattern is simplified in 28 places per my
count as of src/backend/, shaving up to ~900 lines of code. These
mostly come from the removal of the per-query initializations and the
sanity checks now grouped in a single location. There are more
locations that could be simplified in contrib/, that are left for a
follow-up cleanup.
fcc2817, 07daca5 and d61a361 have prepared the areas of the code related
to this change, to ease this refactoring.
Author: Melanie Plageman, Michael Paquier
Reviewed-by: Álvaro Herrera, Justin Pryzby
Discussion: https://postgr.es/m/CAAKRu_azyd1Z3W_r7Ou4sorTjRCs+PxeHw1CWJeXKofkE6TuZg@mail.gmail.com
This routine is a no-op since dd04e95 from 2003, with a macro kept
around for compatibility purposes. This has led to the same code
patterns being copy-pasted around for no effect, sometimes in confusing
ways like in pg_logical_slot_get_changes_guts() from logical.c where the
code was actually incorrect.
This issue has been discussed on two different threads recently, so
rather than living with this legacy, remove any uses of this routine in
the C code to simplify things. The compatibility macro is kept to avoid
breaking any out-of-core modules that depend on it.
Reported-by: Tatsuhito Kasahara, Justin Pryzby
Author: Tatsuhito Kasahara
Discussion: https://postgr.es/m/20211217200419.GQ17618@telsasoft.com
Discussion: https://postgr.es/m/CAP0=ZVJeeYfAeRfmzqAF2Lumdiv4S4FewyBnZd4DPTrsSQKJKw@mail.gmail.com
This moves the functions related to performing WAL recovery into the new
xlogrecovery.c source file, leaving xlog.c responsible for maintaining
the WAL buffers, coordinating the startup and switch from recovery to
normal operations, and other miscellaneous stuff that have always been in
xlog.c.
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Robert Haas
Discussion: https://www.postgresql.org/message-id/a31f27b4-a31d-f976-6217-2b03be646ffa%40iki.fi
A couple of code paths related to logical decoding (WAL sender, slot
advancing, etc.) use XLogReadRecord(), feeding on error messages
generated by walreader.c on a failure. All those messages have no
context, making it harder to spot from where an error could come even if
these should not happen. All the other callers of XLogReadRecord() do
that already.
Reviewed-by: Kyotaro Horiguchi
Discussion: https://postgr.es/m/YYnTH6OyOwQcAdkw@paquier.xyz
All such code deals with this global variable in one of three ways.
Sometimes the same functions use it in more than one of these ways
at the same time.
First, sometimes it's an implicit argument to one or more functions
being called in xlog.c or elsewhere, and must be set to the
appropriate value before calling those functions lest they
misbehave. In those cases, it is now passed as an explicit argument
instead.
Second, sometimes it's used to obtain the current timeline after
the end of recovery, i.e. the timeline to which WAL is being
written and flushed. Such code now calls GetWALInsertionTimeLine()
or relies on the new out parameter added to GetFlushRecPtr().
Third, sometimes it's used during recovery to store the current
replay timeline. That can change, so such code must generally
update the value before each use. It can still do that, but must
now use a local variable instead.
The net effect of these changes is to reduce by a fair amount the
amount of code that is directly accessing this global variable.
That's good, because history has shown that we don't always think
clearly about which timeline ID it's supposed to contain at any
given point in time, or indeed, whether it has been or needs to
be initialized at any given point in the code.
Patch by me, reviewed and tested by Michael Paquier, Amul Sul, and
Álvaro Herrera.
Discussion: https://postgr.es/m/CA+TgmobfAAqhfWa1kaFBBFvX+5CjM=7TE=n4r4Q1o2bjbGYBpA@mail.gmail.com
In slotfuncs.c, pg_replication_slot_advance() needs to determine
the LSN up to which the slot should be advanced, but that doesn't
require us to update ThisTimeLineID, because none of the code called
from here depends on it. If the replication slot is logical,
pg_logical_replication_slot_advance will call read_local_xlog_page,
which does use ThisTimeLineID, but also takes care of making sure
it's up to date. If the replication slot is physical, the timeline
isn't used for anything at all.
In logicalfuncs.c, pg_logical_slot_get_changes_guts() has the same
issue: the only code we're going to run that cares about timelines
is in or downstream of read_local_xlog_page, which already makes
sure that the correct value gets set. Hence, don't do it here.
Patch by me, reviewed and tested by Michael Paquier, Amul Sul, and
Álvaro Herrera.
Discussion: https://postgr.es/m/CA+TgmobfAAqhfWa1kaFBBFvX+5CjM=7TE=n4r4Q1o2bjbGYBpA@mail.gmail.com
Two functions, both named check_permissions(), used the same checks to
verify if a user had required privileges to work on replication slots.
This commit removes the duplication, and moves the function doing the
checks to slot.c to be centralized.
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart, Euler Taveira
Discussion: https://postgr.es/m/CALj2ACUPpVw1u7sQocFVWrSs0n10pt_G_4NPZKSxXK6cW1dErw@mail.gmail.com
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
This set of commits has some bugs with known fixes, but at this late
stage in the release cycle it seems best to revert and resubmit next
time, along with some new automated test coverage for this whole area.
Commits reverted:
dc88460c: Doc: Review for "Optionally prefetch referenced data in recovery."
1d257577: Optionally prefetch referenced data in recovery.
f003d9f8: Add circular WAL decoding buffer.
323cbe7c: Remove read_page callback from XLogReader.
Remove the new GUC group WAL_RECOVERY recently added by a55a9847, as the
corresponding section of config.sgml is now reverted.
Discussion: https://postgr.es/m/CAOuzzgrn7iKnFRsB4MHp3UisEQAGgZMbk_ViTN4HV4-Ksq8zCg%40mail.gmail.com
Previously, the XLogReader module would fetch new input data using a
callback function. Redesign the interface so that it tells the caller
to insert more data with a special return value instead. This API suits
later patches for prefetching, encryption and maybe other future
projects that would otherwise require continually extending the callback
interface.
As incidental cleanup work, move global variables readOff, readLen and
readSegNo inside XlogReaderState.
Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Author: Heikki Linnakangas <hlinnaka@iki.fi> (parts of earlier version)
Reviewed-by: Antonin Houska <ah@cybertec.at>
Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Takashi Menjo <takashi.menjo@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20190418.210257.43726183.horiguchi.kyotaro%40lab.ntt.co.jp
Commit 0aa8a01d04 extends the output plugin API to allow decoding of
prepared xacts and allowed the user to enable/disable the two-phase option
via pg_logical_slot_get_changes(). This can lead to a problem such that
the first time when it gets changes via pg_logical_slot_get_changes()
without two_phase option enabled it will not get the prepared even though
prepare is after consistent snapshot. Now next time during getting changes,
if the two_phase option is enabled it can skip prepare because by that
time start decoding point has been moved. So the user will only get commit
prepared.
Allow to enable/disable this option at the create slot time and default
will be false. It will break the existing slots which is fine in a major
release.
Author: Ajin Cherian
Reviewed-by: Amit Kapila and Vignesh C
Discussion: https://postgr.es/m/d0f60d60-133d-bf8d-bd70-47784d8fabf3@enterprisedb.com
max_slot_wal_keep_size that was added in v13 and wal_keep_segments are
the GUC parameters to specify how much WAL files to retain for
the standby servers. While max_slot_wal_keep_size accepts the number of
bytes of WAL files, wal_keep_segments accepts the number of WAL files.
This difference of setting units between those similar parameters could
be confusing to users.
To alleviate this situation, this commit renames wal_keep_segments to
wal_keep_size, and make users specify the WAL size in it instead of
the number of WAL files.
There was also the idea to rename max_slot_wal_keep_size to
max_slot_wal_keep_segments, in the discussion. But we have been moving
away from measuring in segments, for example, checkpoint_segments was
replaced by max_wal_size. So we concluded to rename wal_keep_segments
to wal_keep_size.
Back-patch to v13 where max_slot_wal_keep_size was added.
Author: Fujii Masao
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
The previous definition of the column was almost universally disliked,
so provide this updated definition which is more useful for monitoring
purposes: a large positive value is good, while zero or a negative value
means danger. This should be operationally more convenient.
Backpatch to 13, where the new column to pg_replication_slots (and the
feature it represents) were added.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/9ddfbf8c-2f67-904d-44ed-cf8bc5916228@oss.nttdata.com
If restart_lsn of logical replication slot gets behind more than
max_slot_wal_keep_size from the current LSN, the logical replication slot
would be invalidated and its restart_lsn is reset to an invalid LSN.
If this logical replication slot with an invalid restart_lsn was specified as
the source slot in pg_copy_logical_replication_slot(), the function caused
the assertion failure unexpectedly.
This assertion was added because restart_lsn should not be invalid before.
But in v13, it can be invalid thanks to max_slot_wal_keep_size. So since this
assertion is no longer useful, this commit removes it.
This commit also changes the errcode in the error message that
pg_copy_logical_replication_slot() emits when the slot with an invalid
restart_lsn is specified, to more appropriate one.
Back-patch to v13 where max_slot_wal_keep_size was added and
the assertion was no longer valid.
Author: Fujii Masao
Reviewed-by: Alvaro Herrera, Kyotaro Horiguchi
Discussion: https://postgr.es/m/f91de4fb-a7ab-b90e-8132-74796e049d51@oss.nttdata.com
In pg_replication_slot, change output from normal/reserved/lost to
reserved/extended/unreserved/ lost, which better expresses the possible
states particularly near the time where segments are no longer safe but
checkpoint has not run yet.
Under the new definition, reserved means the slot is consuming WAL
that's still under the normal WAL size constraints; extended means it's
consuming WAL that's being protected by wal_keep_segments or the slot
itself, whose size is below max_slot_wal_keep_size; unreserved means the
WAL is no longer safe, but checkpoint has not yet removed those files.
Such as slot is in imminent danger, but can still continue for a little
while and may catch up to the reserved WAL space.
Also, there were some bugs in the calculations used to report the
status; fixed those.
Backpatch to 13.
Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200616.120236.1809496990963386593.horikyota.ntt@gmail.com
Advancing a replication slot did not recompute the oldest xmin and LSN
values across replication slots, preventing resource removal like
segments not recycled at checkpoint time. The original commit that
introduced the slot advancing in 9c7d06d never did the update of those
oldest values, and b0afdca removed this code.
This commit adds a TAP test to check segment recycling with advancing
for physical slots, enforcing an extra segment switch before advancing
to check if the segment gets correctly recycled after a checkpoint.
Reported-by: Andres Freund
Reviewed-by: Alexey Kondratov, Kyptaro Horiguchi
Discussion: https://postgr.es/m/20200609171904.kpltxxvjzislidks@alap3.anarazel.de
Backpatch-through: 11
Fix some more violations of the "only straight-line code inside a
spinlock" rule. These are hazardous not only because they risk
holding the lock for an excessively long time, but because it's
possible for palloc to throw elog(ERROR), leaving a stuck spinlock
behind.
copy_replication_slot() had two separate places that did pallocs
while holding a spinlock. We can make the code simpler and safer
by copying the whole ReplicationSlot struct into a local variable
while holding the spinlock, and then referencing that copy.
(While that's arguably more cycles than we really need to spend
holding the lock, the struct isn't all that big, and this way seems
far more maintainable than copying fields piecemeal. Anyway this
is surely much cheaper than a palloc.) That bug goes back to v12.
InvalidateObsoleteReplicationSlots() not only did a palloc while
holding a spinlock, but for extra sloppiness then leaked the memory
--- probably for the lifetime of the checkpointer process, though
I didn't try to verify that. Fortunately that silliness is new
in HEAD.
pg_get_replication_slots() had a cosmetic violation of the rule,
in that it only assumed it's safe to call namecpy() while holding
a spinlock. Still, that's a hazard waiting to bite somebody, and
there were some other cosmetic coding-rule violations in the same
function, so clean it up. I back-patched this as far as v10; the
code exists before that but it looks different, and this didn't
seem important enough to adapt the patch further back.
Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.
Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
Code review for 0dc8ead463, prompted by a bug closed by 91c40548d5.
XLogReader's system for opening and closing segments had gotten too
complicated, with callbacks being passed at both the XLogReaderAllocate
level (read_page) as well as at the WALRead level (segment_open). This
was confusing and hard to follow, so restructure things so that these
callbacks are passed together at XLogReaderAllocate time, and add
another callback to the set (segment_close) to make it a coherent whole.
Also, ensure XLogReaderState is an argument to all the callbacks, so
that they can grab at the ->private data if necessary.
Document the whole arrangement more clearly.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200422175754.GA19858@alvherre.pgsql
Replication slots are useful to retain data that may be needed by a
replication system. But experience has shown that allowing them to
retain excessive data can lead to the primary failing because of running
out of space. This new feature allows the user to configure a maximum
amount of space to be reserved using the new option
max_slot_wal_keep_size. Slots that overrun that space are invalidated
at checkpoint time, enabling the storage to be released.
Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20170228.122736.123383594.horiguchi.kyotaro@lab.ntt.co.jp
It devolved into a content-less wrapper over read_local_xlog_page, with
nothing to add, plus it's easily confused with walsender's
logical_read_xlog_page. There doesn't seem to be any reason for it to
stay.
src/include/replication/logicalfuncs.h becomes empty, so remove it too.
The prototypes it initially had were absorbed by generated fmgrprotos.h.
Discussion: https://postgr.es/m/20191115214102.GA15616@alvherre.pgsql
Commit 9f06d79ef831's replication slot copying failed to
properly reserve the WAL that the slot is expecting to see
during DecodingContextFindStartpoint (to set the confirmed_flush
LSN), so concurrent activity could remove that WAL and cause the
copy process to error out. But it doesn't actually *need* that
WAL anyway: instead of running decode to find confirmed_flush, it
can be copied from the source slot. Fix this by rearranging things
to avoid DecodingContextFindStartpoint() (leaving the target slot's
confirmed_flush_lsn to invalid), and set that up afterwards by copying
from the target slot's value.
Also ensure the source slot's confirmed_flush_lsn is valid.
Reported-by: Arseny Sher
Author: Masahiko Sawada, Arseny Sher
Discussion: https://postgr.es/m/871rr3ohbo.fsf@ars-thinkpad
Advancing a physical replication slot with pg_replication_slot_advance()
did not mark the slot as dirty if any advancing was done, preventing the
follow-up checkpoint to flush the slot data to disk. This caused the
advancing to be lost even on clean restarts. This does not happen for
logical slots as any advancing marked the slot as dirty. Per
discussion, the original feature has been implemented so as in the event
of a crash the slot may move backwards to a past LSN. This property is
kept and more documentation is added about that.
This commit adds some new TAP tests to check the persistency of physical
and logical slots after advancing across clean restarts.
Author: Alexey Kondratov, Michael Paquier
Reviewed-by: Andres Freund, Kyotaro Horiguchi, Craig Ringer
Discussion: https://postgr.es/m/059cc53a-8b14-653a-a24d-5f867503b0ee@postgrespro.ru
Backpatch-through: 11