1
0
mirror of https://github.com/postgres/postgres.git synced 2025-08-30 06:01:21 +03:00
Commit Graph

920 Commits

Author SHA1 Message Date
Michael Paquier
aaf8c99050 Fix typos and some format mistakes in comments
Author: Justin Pryzby
Discussion: https://postgr.es/m/20200612023709.GC14879@telsasoft.com
2020-06-12 21:05:10 +09:00
Amit Kapila
c5c000b103 Fix ReorderBuffer memory overflow check.
Commit cec2edfa78 introduced logical_decoding_work_mem to limit
ReorderBuffer memory usage. We spill the changes once the memory occupied
by changes exceeds logical_decoding_work_mem.  There was an assumption
in the code that by evicting the largest (sub)transaction we will come
under the memory limit as the selected transaction will be at least as
large as the most recent change (which caused us to go over the memory
limit).  However, that is not true because a user can reduce the
logical_decoding_work_mem to a smaller value before the most recent
change.

We fix it by allowing to evict the transactions until we reach under the
memory limit.

Reported-by: Fujii Masao
Author: Amit Kapila
Reviewed-by: Fujii Masao
Backpatch-through: 13, where it was introduced
Discussion: https://postgr.es/m/2b7ba291-22e0-a187-d167-9e5309a3458d@oss.nttdata.com
2020-06-10 10:20:10 +05:30
Michael Paquier
879ad9f90e Fix crash in WAL sender when starting physical replication
Since database connections can be used with WAL senders in 9.4, it is
possible to use physical replication.  This commit fixes a crash when
starting physical replication with a WAL sender using a database
connection, caused by the refactoring done in 850196b.

There have been discussions about forbidding the use of physical
replication in a database connection, but this is left for later,
taking care only of the crash new to 13.

While on it, add a test to check for a failure when attempting logical
replication if the WAL sender does not have a database connection.  This
part is extracted from a larger patch by Kyotaro Horiguchi.

Reported-by: Vladimir Sitnikov
Author: Michael Paquier, Kyotaro Horiguchi
Reviewed-by: Kyotaro Horiguchi, Álvaro Herrera
Discussion: https://postgr.es/m/CAB=Je-GOWMj1PTPkeUhjqQp-4W3=nW-pXe2Hjax6rJFffB5_Aw@mail.gmail.com
Backpatch-through: 13
2020-06-08 10:12:24 +09:00
Peter Eisentraut
0fd2a79a63 Spelling adjustments 2020-06-07 15:06:51 +02:00
Michael Paquier
c1669fd581 Fix instance of elog() called while holding a spinlock
This broke the project rule to not call any complex code while a
spinlock is held.  Issue introduced by b89e151.

Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com
Backpatch-through: 9.5
2020-06-04 10:17:49 +09:00
Tom Lane
f88bd3139f Don't call palloc() while holding a spinlock, either.
Fix some more violations of the "only straight-line code inside a
spinlock" rule.  These are hazardous not only because they risk
holding the lock for an excessively long time, but because it's
possible for palloc to throw elog(ERROR), leaving a stuck spinlock
behind.

copy_replication_slot() had two separate places that did pallocs
while holding a spinlock.  We can make the code simpler and safer
by copying the whole ReplicationSlot struct into a local variable
while holding the spinlock, and then referencing that copy.
(While that's arguably more cycles than we really need to spend
holding the lock, the struct isn't all that big, and this way seems
far more maintainable than copying fields piecemeal.  Anyway this
is surely much cheaper than a palloc.)  That bug goes back to v12.

InvalidateObsoleteReplicationSlots() not only did a palloc while
holding a spinlock, but for extra sloppiness then leaked the memory
--- probably for the lifetime of the checkpointer process, though
I didn't try to verify that.  Fortunately that silliness is new
in HEAD.

pg_get_replication_slots() had a cosmetic violation of the rule,
in that it only assumed it's safe to call namecpy() while holding
a spinlock.  Still, that's a hazard waiting to bite somebody, and
there were some other cosmetic coding-rule violations in the same
function, so clean it up.  I back-patched this as far as v10; the
code exists before that but it looks different, and this didn't
seem important enough to adapt the patch further back.

Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com
2020-06-03 12:36:23 -04:00
Fujii Masao
caa3c4242c Don't call elog() while holding spinlock.
Previously UpdateSpillStats() called elog(DEBUG2) while holding
the spinlock even though the local variables that the elog() accesses
don't need to be protected by the lock. Since spinlocks are intended
for very short-term locks, they should not be used when calling
elog(DEBUG2). So this commit moves that elog() out of spinlock period.

Author: Kyotaro Horiguchi
Reviewed-by: Amit Kapila and Fujii Masao
Discussion: https://postgr.es/m/20200602.161518.1399689010416646074.horikyota.ntt@gmail.com
2020-06-02 19:21:04 +09:00
Peter Eisentraut
574925bfd0 Remove unnecessary cast
Probably copied from nearby calls where it is necessary.  But this one
also casts away constness, so it was doubly annoying.
2020-05-22 10:36:49 +02:00
Amit Kapila
7e041b0c1d Fix comment in slot.c.
Reported-by: Sawada Masahiko
Author: Sawada Masahiko
Reviewed-by: Amit Kapila
Backpatch-through: 9.5
Discussion: https://postgr.es/m/CA+fd4k4Ws7M7YQ8PqSym5WB1y75dZeBTd1sZJUQdfe0KJQ-iSA@mail.gmail.com
2020-05-18 07:53:26 +05:30
Michael Paquier
2c8dd05d6c Make pg_stat_wal_receiver consistent with the WAL receiver's shmem info
d140f2f3 has renamed receivedUpto to flushedUpto, and has added
writtenUpto to the WAL receiver's shared memory information, but
pg_stat_wal_receiver was not consistent with that.  This commit renames
received_lsn to flushed_lsn, and adds a new column called written_lsn.

Bump catalog version.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20200515090817.GA212736@paquier.xyz
2020-05-17 09:22:07 +09:00
Alvaro Herrera
1d3743023e Fix walsender error cleanup code
In commit 850196b610 I (Álvaro) failed to handle the case of walsender
shutting down on an error before setting up its 'xlogreader' pointer;
the error handling code dereferences the pointer, causing a crash.
Fix by testing the pointer before trying to dereference it.

Kyotaro authored the code fix; I adopted Nathan's test case to be used
by the TAP tests and added the necessary PostgresNode change.

Reported-by: Nathan Bossart <bossartn@amazon.com>
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/C04FC24E-903D-4423-B312-6910E4D846E5@amazon.com
2020-05-15 20:00:52 -04:00
Tom Lane
8048404939 Fix bogus initialization of replication origin shared memory state.
The previous coding zeroed out offsetof(ReplicationStateCtl, states)
more bytes than it was entitled to, as a consequence of starting the
zeroing from the wrong pointer (or, if you prefer, using the wrong
calculation of how much to zero).

It's unsurprising that this has not caused any reported problems,
since it can be expected that the newly-allocated block is at the end
of what we've used in shared memory, and we always make the shmem
block substantially bigger than minimally necessary.  Nonetheless,
this is wrong and it could bite us someday; plus it's a dangerous
model for somebody to copy.

This dates back to the introduction of this code (commit 5aa235042),
so back-patch to all supported branches.
2020-05-15 19:05:39 -04:00
Tom Lane
36ac359d36 Rename assorted LWLock tranches.
Choose names that fit into the conventions for wait event names
(particularly, that multi-word names are in the style MultiWordName)
and hopefully convey more information to non-hacker users than the
previous names did.

Also rename SerializablePredicateLockListLock to
SerializablePredicateListLock; the old name was long enough to cause
table formatting problems, plus the double occurrence of "Lock" seems
confusing/error-prone.

Also change a couple of particularly opaque LWLock field names.

Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us
2020-05-15 18:11:07 -04:00
Tom Lane
5da14938f7 Rename SLRU structures and associated LWLocks.
Originally, the names assigned to SLRUs had no purpose other than
being shmem lookup keys, so not a lot of thought went into them.
As of v13, though, we're exposing them in the pg_stat_slru view and
the pg_stat_reset_slru function, so it seems advisable to take a bit
more care.  Rename them to names based on the associated on-disk
storage directories (which fortunately we *did* think about, to some
extent; since those are also visible to DBAs, consistency seems like
a good thing).  Also rename the associated LWLocks, since those names
are likewise user-exposed now as wait event names.

For the most part I only touched symbols used in the respective modules'
SimpleLruInit() calls, not the names of other related objects.  This
renaming could have been taken further, and maybe someday we will do so.
But for now it seems undesirable to change the names of any globally
visible functions or structs, so some inconsistency is unavoidable.

(But I *did* terminate "oldserxid" with prejudice, as I found that
name both unreadable and not descriptive of the SLRU's contents.)

Table 27.12 needs re-alphabetization now, but I'll leave that till
after the other LWLock renamings I have in mind.

Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us
2020-05-15 14:28:25 -04:00
Tom Lane
5cbfce562f Initial pgindent and pgperltidy run for v13.
Includes some manual cleanup of places that pgindent messed up,
most of which weren't per project style anyway.

Notably, it seems some people didn't absorb the style rules of
commit c9d297751, because there were a bunch of new occurrences
of function calls with a newline just after the left paren, all
with faulty expectations about how the rest of the call would get
indented.
2020-05-14 13:06:50 -04:00
Tom Lane
29c3e2dd5a Collect built-in LWLock tranche names statically, not dynamically.
There is little point in using the LWLockRegisterTranche mechanism for
built-in tranche names.  It wastes cycles, it creates opportunities for
bugs (since failing to register a tranche name is a very hard-to-detect
problem), and the lack of any centralized list of names encourages
sloppy nonconformity in name choices.  Moreover, since we have a
centralized list of the tranches anyway in enum BuiltinTrancheIds, we're
certainly not buying any flexibility in return for these disadvantages.

Hence, nuke all the backend-internal LWLockRegisterTranche calls,
and instead provide a const array of the builtin tranche names.

(I have in mind to change a bunch of these names shortly, but this
patch is just about getting them into one place.)

Discussion: https://postgr.es/m/9056.1589419765@sss.pgh.pa.us
2020-05-14 11:10:31 -04:00
Alvaro Herrera
17cc133f01 Dial back -Wimplicit-fallthrough to level 3
The additional pain from level 4 is excessive for the gain.

Also revert all the source annotation changes to their original
wordings, to avoid back-patching pain.

Discussion: https://postgr.es/m/31166.1589378554@sss.pgh.pa.us
2020-05-13 15:31:14 -04:00
Alvaro Herrera
850196b610 Adjust walsender usage of xlogreader, simplify APIs
* Have both physical and logical walsender share a 'xlogreader' state
  struct for tracking state.  This replaces the existing globals sendSeg
  and sendCxt.

* Change WALRead not to receive XLogReaderState->seg and ->segcxt as
  separate arguments anymore; just use the ones from 'state'.  This is
  made possible by the above change.

* have the XLogReader segment_open contract require the callbacks to
  install the file descriptor in the state struct themselves instead of
  returning it.  xlogreader was already ignoring any possible failed
  return from the callbacks, relying solely on them never returning.

  (This point is not altogether excellent, as it means the callbacks
  have to know more of XLogReaderState; but to really improve on that
  we would have to pass back error info from the callbacks to
  xlogreader.  And the complexity would not be saved but instead just
  transferred to the callers of WALRead, which would have to learn how
  to throw errors from the open_segment callback in addition of, as
  currently, from pg_pread.)

* segment_open no longer receives the 'segcxt' as a separate argument,
  since it's part of the XLogReaderState argument.

Per comments from Kyotaro Horiguchi.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200511203336.GA9913@alvherre.pgsql
2020-05-13 12:17:08 -04:00
Alvaro Herrera
3e9744465d Add -Wimplicit-fallthrough to CFLAGS and CXXFLAGS
Use it at level 4, a bit more restrictive than the default level, and
tweak our commanding comments to FALLTHROUGH.

(However, leave zic.c alone, since it's external code; to avoid the
warnings that would appear there, change CFLAGS for that file in the
Makefile.)

Author: Julien Rouhaud <rjuju123@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20200412081825.qyo5vwwco3fv4gdo@nol
Discussion: https://postgr.es/m/flat/E1fDenm-0000C8-IJ@gemulon.postgresql.org
2020-05-12 16:07:30 -04:00
Alvaro Herrera
a8be5364ac Fix obsolete references to "XLogRead"
The one in xlogreader.h was pointed out by Antonin Houska; I (Álvaro) noticed the
others by grepping.

Author: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/28250.1589186654@antos
2020-05-11 12:46:41 -04:00
Alvaro Herrera
b060dbe000 Rework XLogReader callback system
Code review for 0dc8ead463, prompted by a bug closed by 91c40548d5.

XLogReader's system for opening and closing segments had gotten too
complicated, with callbacks being passed at both the XLogReaderAllocate
level (read_page) as well as at the WALRead level (segment_open).  This
was confusing and hard to follow, so restructure things so that these
callbacks are passed together at XLogReaderAllocate time, and add
another callback to the set (segment_close) to make it a coherent whole.
Also, ensure XLogReaderState is an argument to all the callbacks, so
that they can grab at the ->private data if necessary.

Document the whole arrangement more clearly.

Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200422175754.GA19858@alvherre.pgsql
2020-05-08 15:40:11 -04:00
Alvaro Herrera
d0abe78d84 Check slot->restart_lsn validity in a few more places
Lack of these checks could cause visible misbehavior, including
assertion failures.  This was missed in commit c655077639, whereby
restart_lsn becomes invalid when the size limit is exceeded.

Also reword some existing error messages, and add errdetail(), so that
the reported errors all match in spirit.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200408.093710.447591748588426656.horikyota.ntt@gmail.com
2020-04-28 20:39:04 -04:00
Noah Misch
f246ea3b2a In caught-up logical walsender, sleep only in WalSndWaitForWal().
Before sleeping, WalSndWaitForWal() sends a keepalive if MyWalSnd->write
< sentPtr.  When the latest physical LSN yields no logical replication
messages (a common case), that keepalive elicits a reply.  Processing
the reply updates pg_stat_replication.replay_lsn.  WalSndLoop() lacks
that; when WalSndLoop() slept, replay_lsn advancement could stall until
wal_receiver_status_interval elapsed.  This sometimes stalled
src/test/subscription/t/001_rep_changes.pl for up to 10s.

Reviewed by Fujii Masao and Michael Paquier.

Discussion: https://postgr.es/m/20200418070142.GA1075445@rfd.leadboat.com
2020-04-25 10:18:12 -07:00
Noah Misch
72a3dc321d Revert "When WalSndCaughtUp, sleep only in WalSndWaitForWal()."
This reverts commit 4216858122.  It caused
idle physical walsenders to busy-wait, as reported by Fujii Masao.

Discussion: https://postgr.es/m/20200417054146.GA1061007@rfd.leadboat.com
2020-04-25 10:17:26 -07:00
Robert Haas
3989dbdf12 Rename exposed identifiers to say "backup manifest".
Function names declared "extern" now use BackupManifest in the name
rather than just Manifest, and data types use backup_manifest
rather than just manifest.

Per note from Michael Paquier.

Discussion: http://postgr.es/m/20200418125713.GG350229@paquier.xyz
2020-04-23 08:44:06 -04:00
Robert Haas
079ac29d4d Move the server's backup manifest code to a separate file.
basebackup.c is already a pretty big and complicated file, so it
makes more sense to keep the backup manifest support routines
in a separate file, for clarity and ease of maintenance.

Discussion: http://postgr.es/m/CA+TgmoavRak5OdP76P8eJExDYhPEKWjMb0sxW7dF01dWFgE=uA@mail.gmail.com
2020-04-20 14:38:15 -04:00
Tom Lane
f332241a60 Fix race conditions in synchronous standby management.
We have repeatedly seen the buildfarm reach the Assert(false) in
SyncRepGetSyncStandbysPriority.  This apparently is due to failing to
consider the possibility that the sync_standby_priority values in
shared memory might be inconsistent; but they will be whenever only
some of the walsenders have updated their values after a change in
the synchronous_standby_names setting.  That function is vastly too
complex for what it does, anyway, so rewriting it seems better than
trying to apply a band-aid fix.

Furthermore, the API of SyncRepGetSyncStandbys is broken by design:
it returns a list of WalSnd array indexes, but there is nothing
guaranteeing that the contents of the WalSnd array remain stable.
Thus, if some walsender exits and then a new walsender process
takes over that WalSnd array slot, a caller might make use of
WAL position data that it should not, potentially leading to
incorrect decisions about whether to release transactions that
are waiting for synchronous commit.

To fix, replace SyncRepGetSyncStandbys with a new function
SyncRepGetCandidateStandbys that copies all the required data
from shared memory while holding the relevant mutexes.  If the
associated walsender process then exits, this data is still safe to
make release decisions with, since we know that that much WAL *was*
sent to a valid standby server.  This incidentally means that we no
longer need to treat sync_standby_priority as protected by the
SyncRepLock rather than the per-walsender mutex.

SyncRepGetSyncStandbys is no longer used by the core code, so remove
it entirely in HEAD.  However, it seems possible that external code is
relying on that function, so do not remove it from the back branches.
Instead, just remove the known-incorrect Assert.  When the bug occurs,
the function will return a too-short list, which callers should treat
as meaning there are not enough sync standbys, which seems like a
reasonably safe fallback until the inconsistent state is resolved.
Moreover it's bug-compatible with what has been happening in non-assert
builds.  We cannot do anything about the walsender-replacement race
condition without an API/ABI break.

The bogus assertion exists back to 9.6, but 9.6 is sufficiently
different from the later branches that the patch doesn't apply at all.
I chose to just remove the bogus assertion in 9.6, feeling that the
probability of a bad outcome from the walsender-replacement race
condition is too low to justify rewriting the whole patch for 9.6.

Discussion: https://postgr.es/m/21519.1585272409@sss.pgh.pa.us
2020-04-18 14:02:44 -04:00
Michael Paquier
8128b0c152 Fix collection of typos and grammar mistakes in the tree, volume 2
This fixes some comments and documentation new as of Postgres 13, and is
a follow-up of the work done in dd0f37e.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com
2020-04-14 14:45:43 +09:00
Noah Misch
4216858122 When WalSndCaughtUp, sleep only in WalSndWaitForWal().
Before sleeping, WalSndWaitForWal() sends a keepalive if MyWalSnd->write
< sentPtr.  That is important in logical replication.  When the latest
physical LSN yields no logical replication messages (a common case),
that keepalive elicits a reply, and processing the reply updates
pg_stat_replication.replay_lsn.  WalSndLoop() lacks that; when
WalSndLoop() slept, replay_lsn advancement could stall until
wal_receiver_status_interval elapsed.  This sometimes stalled
src/test/subscription/t/001_rep_changes.pl for up to 10s.

Discussion: https://postgr.es/m/20200406063649.GA3738151@rfd.leadboat.com
2020-04-11 10:30:00 -07:00
Peter Eisentraut
12fb189bfe Fix RELCACHE_FORCE_RELEASE issue
Introduced by 83fd4532a7.  To fix, the
tuple descriptors need to be copied into the current memory context.

Discussion: https://www.postgresql.org/message-id/04d78603-edae-9243-9dde-fe3037176a7d@2ndquadrant.com
2020-04-11 15:07:25 +02:00
Michael Paquier
dd0f37ecce Fix collection of typos and grammar mistakes in the tree
This fixes some comments and documentation new as of Postgres 13.

Author: Justin Pryzby
Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com
2020-04-10 11:18:39 +09:00
Fujii Masao
1ec50a81ec Exclude backup_manifest file that existed in database, from BASE_BACKUP.
If there is already a backup_manifest file in the database cluster,
it belongs to the past backup that was used to start this server.
It is not correct for the backup being taken now. So this commit
changes pg_basebackup so that it always skips such backup_manifest
file. The backup_manifest file for the current backup will be injected
separately if users want it.

Author: Fujii Masao
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/78f76a3d-1a28-a97d-0394-5c96985dd1c0@oss.nttdata.com
2020-04-09 22:37:11 +09:00
Thomas Munro
d140f2f3e2 Rationalize GetWalRcv{Write,Flush}RecPtr().
GetWalRcvWriteRecPtr() previously reported the latest *flushed*
location.  Adopt the conventional terminology used elsewhere in the tree
by renaming it to GetWalRcvFlushRecPtr(), and likewise for some related
variables that used the term "received".

Add a new definition of GetWalRcvWriteRecPtr(), which returns the latest
*written* value.  This will allow later patches to use the value for
non-data-integrity purposes, without having to wait for the flush
pointer to advance.

Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGJ4VJN8ttxScUFM8dOKX0BrBiboo5uz1cq%3DAovOddfHpA%40mail.gmail.com
2020-04-08 23:45:09 +12:00
Peter Eisentraut
83fd4532a7 Allow publishing partition changes via ancestors
To control whether partition changes are replicated using their own
identity and schema or an ancestor's, add a new parameter that can be
set per publication named 'publish_via_partition_root'.

This allows replicating a partitioned table into a different partition
structure on the subscriber.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-04-08 11:19:23 +02:00
Alvaro Herrera
c655077639 Allow users to limit storage reserved by replication slots
Replication slots are useful to retain data that may be needed by a
replication system.  But experience has shown that allowing them to
retain excessive data can lead to the primary failing because of running
out of space.  This new feature allows the user to configure a maximum
amount of space to be reserved using the new option
max_slot_wal_keep_size.  Slots that overrun that space are invalidated
at checkpoint time, enabling the storage to be released.

Author: Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20170228.122736.123383594.horiguchi.kyotaro@lab.ntt.co.jp
2020-04-07 18:35:00 -04:00
Peter Eisentraut
f1ac27bfda Add logical replication support to replicate into partitioned tables
Mainly, this adds support code in logical/worker.c for applying
replicated operations whose target is a partitioned table to its
relevant partitions.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Reviewed-by: Petr Jelinek <petr@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-04-06 15:15:52 +02:00
Peter Eisentraut
a9d9bdd3ad Save errno across LWLockRelease() calls
Fixup for "Drop slot's LWLock before returning from SaveSlotToPath()"

Reported-by: Michael Paquier <michael@paquier.xyz>
2020-04-05 10:02:00 +02:00
Robert Haas
3e0d80fd8d Fix resource management bug with replication=database.
Commit 0d8c9c1210 allowed BASE_BACKUP to
acquire a ResourceOwner without a transaction so that the backup
manifest functionality could use a BufFile, but it overlooked the fact
that when a walsender is used with replication=database, it might have
a transaction in progress, because in that mode, SQL and replication
commands can be mixed.  Try to fix things up so that the two cleanup
mechanisms don't conflict.

Per buildfarm member serinus, which triggered the problem when
CREATE_REPLICATION_SLOT failed from inside a transaction.  It passed
on the subsequent run, so evidently the failure doesn't happen every
time.
2020-04-03 22:28:37 -04:00
Robert Haas
db1531cae0 Be more careful about time_t vs. pg_time_t in basebackup.c.
lapwing is complaining that about a call to pg_gmtime, saying that
it "expected 'const pg_time_t *' but argument is of type 'time_t *'".
I at first thought that the problem had someting to do with const,
but Thomas Munro suggested that it might be just because time_t
and pg_time_t are different identifers. lapwing is i686 rather than
x86_64, and pg_time_t is always int64, so that seems like a good
guess.

There is other code that just casts time_t to pg_time_t without
any conversion function, so try that approach here.

Introduced in commit 0d8c9c1210.
2020-04-03 20:18:47 -04:00
Robert Haas
0d8c9c1210 Generate backup manifests for base backups, and validate them.
A manifest is a JSON document which includes (1) the file name, size,
last modification time, and an optional checksum for each file backed
up, (2) timelines and LSNs for whatever WAL will need to be replayed
to make the backup consistent, and (3) a checksum for the manifest
itself. By default, we use CRC-32C when checksumming data files,
because we are trying to detect corruption and user error, not foil an
adversary. However, pg_basebackup and the server-side BASE_BACKUP
command now have options to select a different algorithm, so users
wanting a cryptographic hash function can select SHA-224, SHA-256,
SHA-384, or SHA-512. Users not wanting file checksums at all can
disable them, or disable generating of the backup manifest altogether.
Using a cryptographic hash function in place of CRC-32C consumes
significantly more CPU cycles, which may slow down backups in some
cases.

A new tool called pg_validatebackup can validate a backup against the
manifest. If no checksums are present, it can still check that the
right files exist and that they have the expected sizes. If checksums
are present, it can also verify that each file has the expected
checksum. Additionally, it calls pg_waldump to verify that the
expected WAL files are present and parseable. Only plain format
backups can be validated directly, but tar format backups can be
validated after extracting them.

Robert Haas, with help, ideas, review, and testing from David Steele,
Stephen Frost, Andrew Dunstan, Rushabh Lathia, Suraj Kharage, Tushar
Ahuja, Rajkumar Raghuwanshi, Mark Dilger, Davinder Singh, Jeevan
Chalke, Amit Kapila, Andres Freund, and Noah Misch.

Discussion: http://postgr.es/m/CA+TgmoZV8dw1H2bzZ9xkKwdrk8+XYa+DC9H=F7heO2zna5T6qg@mail.gmail.com
2020-04-03 15:05:59 -04:00
Peter Eisentraut
d8653f4687 Refactor code to look up local replication tuple
This unifies some duplicate code.

Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://www.postgresql.org/message-id/CA+HiwqFjYE5anArxvkjr37AQMd52L-LZtz9Ld2QrLQ3YfcYhTw@mail.gmail.com
2020-04-01 15:34:41 +02:00
Michael Paquier
616ae3d2b0 Move routine definitions of xlogarchive.c to a new header file
The definitions of the routines defined in xlogarchive.c have been part
of xlog_internal.h which is included by several frontend tools, but all
those routines are only called by the backend.  More cleanup could be
done within xlog_internal.h, but that's already a nice cut.

This will help a follow-up patch for pg_rewind where handling of
restore_command is added for frontends.

Author: Alexey Kondratov, Michael Paquier
Reviewed-by: Álvaro Herrera, Alexander Korotkov
Discussion: https://postgr.es/m/a3acff50-5a0d-9a2c-b3b2-ee36168955c1@postgrespro.ru
2020-03-31 15:33:04 +09:00
Alvaro Herrera
1e6148032e Allow walreceiver configuration to change on reload
The parameters primary_conninfo, primary_slot_name and
wal_receiver_create_temp_slot can now be changed with a simple "reload"
signal, no longer requiring a server restart.  This is achieved by
signalling the walreceiver process to terminate and having it start
again with the new values.

Thanks to Andres Freund, Kyotaro Horiguchi, Fujii Masao for discussion.

Author: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/19513901543181143@sas1-19a94364928d.qloud-c.yandex.net
2020-03-27 19:51:37 -03:00
Alvaro Herrera
092c6936de Set wal_receiver_create_temp_slot PGC_POSTMASTER
Commit 3297308278 gave walreceiver the ability to create and use a
temporary replication slot, and made it controllable by a GUC (enabled
by default) that can be changed with SIGHUP.  That's useful but has two
problems: one, it's possible to cause the origin server to fill its disk
if the slot doesn't advance in time; and also there's a disconnect
between state passed down via the startup process and GUCs that
walreceiver reads directly.

We handle the first problem by setting the option to disabled by
default.  If the user enables it, its on their head to make sure that
disk doesn't fill up.

We handle the second problem by passing the flag via startup rather than
having walreceiver acquire it directly, and making it PGC_POSTMASTER
(which ensures a walreceiver always has the fresh value).  A future
commit can relax this (to PGC_SIGHUP again) by having the startup
process signal walreceiver to shutdown whenever the value changes.

Author: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://postgr.es/m/20200122055510.GH174860@paquier.xyz
2020-03-27 16:20:33 -03:00
Peter Eisentraut
49bf81536e Drop slot's LWLock before returning from SaveSlotToPath()
When SaveSlotToPath() is called with elevel=LOG, the early exits didn't
release the slot's io_in_progress_lock.

This could result in a walsender being stuck on the lock forever.  A
possible way to get into this situation is if the offending code paths
are triggered in a low disk space situation.

Author: Pavan Deolasee <pavan.deolasee@2ndquadrant.com>
Reported-by: Craig Ringer <craig@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/56a138c5-de61-f553-7e8f-6789296de785%402ndquadrant.com
2020-03-26 13:29:20 +01:00
Andres Freund
112b006fe7 logical decoding: Remove TODO about unnecessary optimization.
Measurements show, and intuition agrees, that there's currently no
known cases where adding a fastpath to avoid allocating / ordering a
heap for a single transaction is worthwhile.

Author: Dilip Kumar
Discussion: https://postgr.es/m/CAFiTN-sp701wvzvnLQJGk7JDqrFM8f--97-ihbwkU8qvn=p8nw@mail.gmail.com
2020-03-24 12:15:03 -07:00
Peter Eisentraut
cef27ae01a Fix compiler warning
A variable was unused in non-assert builds.  Simplify the code to
avoid the issue.

Reported-by: Erik Rijkers <er@xs4all.nl>
2020-03-24 16:02:01 +01:00
Peter Eisentraut
97ee604d9b Some refactoring of logical/worker.c
This moves the main operations of apply_handle_{insert|update|delete},
that of inserting, updating, deleting a tuple into/from a given
relation, into corresponding
apply_handle_{insert|update|delete}_internal functions.  This allows
performing those operations on relations that are not directly the
targets of replication, which is something a later patch will use for
targeting partitioned tables.

Author: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Peter Eisentraut <peter.eisentraut@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-24 15:00:54 +01:00
Fujii Masao
67e0adfb3f Report NULL as total backup size if it's not estimated.
Previously 0 was reported in pg_stat_progress_basebackup.total_backup
if the total backup size was not estimated. Per discussion, our consensus
is that NULL is better choise as the value in total_backup in that case.
So this commit makes pg_stat_progress_basebackup view report NULL
in total_backup column if the estimation is disabled.

Bump catversion.

Author: Fujii Masao
Reviewed-by: Amit Langote, Magnus Hagander, Alvaro Herrera
Discussion: https://postgr.es/m/CABUevExnhOD89zBDuPvfAAh243RzNpwCPEWNLtMYpKHMB8gbAQ@mail.gmail.com
2020-03-24 10:43:41 +09:00
Peter Eisentraut
c314c147c0 Prepare to support non-tables in publications
This by itself doesn't change any functionality but prepares the way
for having relations other than base tables in publications.

Make arrangements for COPY handling the initial table sync.  For
non-tables we have to use COPY (SELECT ...) instead of directly
copying from the table, but then we have to take care to omit
generated columns from the column list.

Also, remove a hardcoded reference to relkind = 'r' and rely on the
publisher to send only what it can actually publish, which will be
correct even in future cross-version scenarios.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+HiwqH=Y85vRK3mOdjEkqFK+E=ST=eQiHdpj43L=_eJMOOznQ@mail.gmail.com
2020-03-19 08:25:07 +01:00