When ALTER SUBSCRIPTION DISABLE is run in the same transaction before
DROP SUBSCRIPTION, the latter will hang because workers will still be
running, not having seen the DISABLE committed, and DROP SUBSCRIPTION
will wait until the workers have vacated the replication origin slots.
Previously, DROP SUBSCRIPTION killed the logical replication workers
immediately only if it was going to drop the replication slot, otherwise
it scheduled the worker killing for the end of the transaction, as a
result of 7e174fa793a2df89fe03d002a5087ef67abcdde8. This, however,
causes the present problem. To fix, kill the workers immediately in all
cases. This covers all cases: A subscription that doesn't have a
replication slot must be disabled. It was either disabled in the same
transaction, or it was already disabled before the current transaction,
but then there shouldn't be any workers left and this won't make a
difference.
Reported-by: Arseny Sher <a.sher@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/flat/87mv6av84w.fsf%40ars-thinkpad
Add item about number of times statement-level triggers will be fired.
Rearrange the compatibility items into (what seems to me) a less
random ordering.
AfterTriggerEndQuery correctly notes that the query_stack could get
repalloc'd during a trigger firing, but it nonetheless passes the address
of a query_stack entry to afterTriggerInvokeEvents, so that if such a
repalloc occurs, afterTriggerInvokeEvents is already working with an
obsolete dangling pointer while it scans the rest of the events. Oops.
The only code at risk is its "delete_ok" cleanup code, so we can
prevent unsafe behavior by passing delete_ok = false instead of true.
However, that could have a significant performance penalty, because the
point of passing delete_ok = true is to not have to re-scan possibly
a large number of dead trigger events on the next time through the loop.
There's more than one way to skin that cat, though. What we can do is
delete all the "chunks" in the event list except the last one, since
we know all events in them must be dead. Deleting the chunks is work
we'd have had to do later in AfterTriggerEndQuery anyway, and it ends
up saving rescanning of just about the same events we'd have gotten
rid of with delete_ok = true.
In v10 and HEAD, we also have to be careful to mop up any per-table
after_trig_events pointers that would become dangling. This is slightly
annoying, but I don't think that normal use-cases will traverse this code
path often enough for it to be a performance problem.
It's pretty hard to hit this in practice because of the unlikelihood
of the query_stack getting resized at just the wrong time. Nonetheless,
it's definitely a live bug of ancient standing, so back-patch to all
supported branches.
Discussion: https://postgr.es/m/2891.1505419542@sss.pgh.pa.us
Commit 0f79440fb introduced mechanism to keep AFTER STATEMENT triggers
from firing more than once per statement, which was formerly possible
if more than one FK enforcement action had to be applied to a given
table. Add a similar mechanism for BEFORE STATEMENT triggers, so that
we don't have the unexpected situation of firing BEFORE STATEMENT
triggers more often than AFTER STATEMENT.
As with the previous patch, back-patch to v10.
Discussion: https://postgr.es/m/22315.1505584992@sss.pgh.pa.us
I noticed that there were exactly no complete examples of use of
a transition table in a trigger function, and no clear description
of just how you'd do it either. Improve that.
The standard says that all changes of the same kind (insert, update, or
delete) caused in one table by a single SQL statement should be reported
in a single transition table; and by that, they mean to include foreign key
enforcement actions cascading from the statement's direct effects. It's
also reasonable to conclude that if the standard had wCTEs, they would say
that effects of wCTEs applying to the same table as each other or the outer
statement should be merged into one transition table. We weren't doing it
like that.
Hence, arrange to merge tuples from multiple update actions into a single
transition table as much as we can. There is a problem, which is that if
the firing of FK enforcement triggers and after-row triggers with
transition tables is interspersed, we might need to report more tuples
after some triggers have already seen the transition table. It seems like
a bad idea for the transition table to be mutable between trigger calls.
There's no good way around this without a major redesign of the FK logic,
so for now, resolve it by opening a new transition table each time this
happens.
Also, ensure that AFTER STATEMENT triggers fire just once per statement,
or once per transition table when we're forced to make more than one.
Previous versions of Postgres have allowed each FK enforcement query
to cause an additional firing of the AFTER STATEMENT triggers for the
referencing table, but that's certainly not per spec. (We're still
doing multiple firings of BEFORE STATEMENT triggers, though; is that
something worth changing?)
Also, forbid using transition tables with column-specific UPDATE triggers.
The spec requires such transition tables to show only the tuples for which
the UPDATE trigger would have fired, which means maintaining multiple
transition tables or else somehow filtering the contents at readout.
Maybe someday we'll bother to support that option, but it looks like a
lot of trouble for a marginal feature.
The transition tables are now managed by the AfterTriggers data structures,
rather than being directly the responsibility of ModifyTable nodes. This
removes a subtransaction-lifespan memory leak introduced by my previous
band-aid patch 3c4359521.
In passing, refactor the AfterTriggers data structures to reduce the
management overhead for them, by using arrays of structs rather than
several parallel arrays for per-query-level and per-subtransaction state.
I failed to resist the temptation to do some copy-editing on the SGML
docs about triggers, above and beyond merely documenting the effects
of this patch.
Back-patch to v10, because we don't want the semantics of transition
tables to change post-release.
Patch by me, with help and review from Thomas Munro.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
Document that rsync is an _optional_ way to upgrade standbys, suggest
rsync option --dry-run, and mention a way of upgrading one standby from
another using rsync. Also clarify some instructions by specifying if
they operate on the old or new clusters.
Reported-by: Stephen Frost, Magnus Hagander
Discussion: https://postgr.es/m/20170914191250.GB6595@momjian.us
Backpatch-through: 9.5
During the development of d47cfef711 the CFI()s in ExecScan() were
moved back and forth, ending up in the wrong place. Thus queries that
largely spend their time in ExecScan(), and have neither projection
nor a qual, can't be cancelled in a timely manner.
Reported-By: Jeff Janes
Author: Andres Freund
Discussion: https://postgr.es/m/CAMkU=1weDXp8eLLPt9SO1LEUsJYYK9cScaGhLKpuN+WbYo9b5g@mail.gmail.com
Backpatch: 10, as d47cfef711
The order in which GRANTs are output is important as GRANTs which have
been GRANT'd by individuals via WITH GRANT OPTION GRANTs have to come
after the GRANT which included the WITH GRANT OPTION. This happens
naturally in the backend during normal operation as we only change
existing ACLs in-place, only add new ACLs to the end, and when removing
an ACL we remove any which depend on it also.
Also, adjust the comments in acl.h to make this clear.
Unfortunately, the updates to pg_dump to handle initial privileges
involved pulling apart ACLs and then combining them back together and
could end up putting them back together in an invalid order, leading to
dumps which wouldn't restore.
Fix this by adjusting the queries used by pg_dump to ensure that the
ACLs are rebuilt in the same order in which they were originally.
Back-patch to 9.6 where the changes for initial privileges were done.
The documentation claimed that one should send
"pg_same_as_startup_message" as the user name in the SCRAM messages, but
this did not match the actual implementation, so remove it.
This makes it clear that pg_upgrade standby upgrade instructions should
only be used in link mode, adds examples, and explains how rsync works
with links.
Reported-by: Andreas Joseph Krogh
Discussion: https://postgr.es/m/VisenaEmail.6c.c0e592c5af4ef0a2.15e785dcb61@tc7-visena
Backpatch-through: 9.5
The previous error message when attempting to run a general SQL command
in a physical replication WAL sender was a bit sloppy.
Reported-by: Fujii Masao <masao.fujii@gmail.com>
When copying from an active database tree, it's possible for files to be
deleted after we see them in a readdir() scan but before we can open them.
(Once we've got a file open, we don't expect any further errors from it
getting unlinked, though.) Tweak RecursiveCopy so it can cope with this
case, so as to avoid irreproducible test failures.
Back-patch to 9.6 where this code was added. In v10 and HEAD, also
remove unused "use RecursiveCopy" in one recovery test script.
Michael Paquier and Tom Lane
Discussion: https://postgr.es/m/24621.1504924323@sss.pgh.pa.us
AFTER triggers using transition tables crashed if they were fired due
to a foreign key ON CASCADE update. This is because ExecEndModifyTable
flushes the transition tables, on the assumption that any trigger that
could need them was already fired during ExecutorFinish. Normally
that's true, because we don't allow transition-table-using triggers
to be deferred. However, foreign key CASCADE updates force any
triggers on the referencing table to be deferred to the outer query
level, by means of the EXEC_FLAG_SKIP_TRIGGERS flag. I don't recall
all the details of why it's like that and am pretty loath to redesign
it right now. Instead, just teach ExecEndModifyTable to skip destroying
the TransitionCaptureState when that flag is set. This will allow the
transition table data to survive until end of the current subtransaction.
This isn't a terribly satisfactory solution, because (1) we might be
leaking the transition tables for much longer than really necessary,
and (2) as things stand, an AFTER STATEMENT trigger will fire once per
RI updating query, ie once per row updated or deleted in the referenced
table. I suspect that is not per SQL spec. But redesigning this is a
research project that we're certainly not going to get done for v10.
So let's go with this hackish answer for now.
In passing, tweak AfterTriggerSaveEvent to not save the transition_capture
pointer into the event record for a deferrable trigger. This is not
necessary to fix the current bug, but it avoids letting dangling pointers
to long-gone transition tables persist in the trigger event queue. That's
at least a safety feature. It might also allow merging shared trigger
states in more cases than before.
I added a regression test that demonstrates the crash on unpatched code,
and also exposes the behavior of firing the AFTER STATEMENT triggers
once per row update.
Per bug #14808 from Philippe Beaudoin. Back-patch to v10.
Discussion: https://postgr.es/m/20170909064853.25630.12825@wrigleys.postgresql.org
map_partition_varattnos() failed to set its found_whole_row output
parameter if the given expression list was NIL. This seems to be
a pre-existing bug that chanced to be exposed by commit 6f6b99d13.
It might be unreachable in v10, but I have little faith in that
proposition, so back-patch.
Per buildfarm.
This doesn't allow routing tuple to the foreign partitions themselves,
but it permits tuples to be routed to regular partitions despite the
presence of foreign partitions in the same inheritance hierarchy.
Etsuro Fujita, reviewed by Amit Langote and by me.
Discussion: http://postgr.es/m/bc3db4c1-1693-3b8a-559f-33ad2b50b7ad@lab.ntt.co.jp
We already had a psql variable VERSION that shows the verbose form of
psql's own version. Add VERSION_NAME to show the short form (e.g.,
"11devel") and VERSION_NUM to show the numeric form (e.g., 110000).
Also add SERVER_VERSION_NAME and SERVER_VERSION_NUM to show the short and
numeric forms of the server's version. (We'd probably add SERVER_VERSION
with the verbose string if it were readily available; but adding another
network round trip to get it seems too expensive.)
The numeric forms, in particular, are expected to be useful for scripting
purposes, now that psql can do conditional tests.
Back-patch of commit 9ae9d8c1549c384dbdb8363e1d932b7311d25c56.
Fabien Coelho, reviewed by Pavel Stehule
Discussion: https://postgr.es/m/alpine.DEB.2.20.1704020917220.4632@lancre
The NAMEDTUPLESTORE patch piggybacked on the infrastructure for
TABLEFUNC/VALUES/CTE RTEs, none of which can ever have dropped columns,
so the possibility was ignored most places. Fix that, including adding a
specification to parsenodes.h about what it's supposed to look like.
In passing, clean up assorted comments that hadn't been maintained
properly by said patch.
Per bug #14799 from Philippe Beaudoin. Back-patch to v10.
Discussion: https://postgr.es/m/20170906120005.25630.84360@wrigleys.postgresql.org
Evidently somebody neglected to update this sometime in the v10 cycle.
Patching REL_10_STABLE only; this value is about to be obsolete in
HEAD anyway. Noted while examining \gdesc patch.
Throttling for sending a base backup in walsender is broken for the case
where there is a lot of WAL traffic, because the latch used to put the
walsender to sleep is also signalled by regular WAL traffic (and each
signal causes an additional batch of data to be sent); the net effect is
that there is no or little actual throttling. This is undesirable, so
rewrite the sleep into a loop to achieve the desired effeect.
Author: Jeff Janes, small tweaks by me
Reviewed-by: Antonin Houska
Discussion: https://postgr.es/m/CAMkU=1xH6mde-yL-Eo1TKBGNd0PB1-TMxvrNvqcAkN-qr2E9mw@mail.gmail.com
In commit 9d6b160d7, I tweaked pg_config.h.win32 to use
"#define HAVE_LONG_LONG_INT_64 1" rather than defining it as empty,
for consistency with what happens in an autoconf'd build.
But Solution.pm injects another definition of that macro into
ecpg_config.h, leading to justifiable (though harmless) compiler whining.
Make that one consistent too. Back-patch, like the previous patch.
Discussion: https://postgr.es/m/CAEepm=1dWsXROuSbRg8PbKLh0S=8Ou-V8sr05DxmJOF5chBxqQ@mail.gmail.com
Move the responsibility for creating/destroying TupleQueueReaders into
execParallel.c, to avoid duplicative coding in nodeGather.c and
nodeGatherMerge.c. Also, instead of having DestroyTupleQueueReader do
shm_mq_detach, do it in the caller (which is now only ExecParallelFinish).
This means execParallel.c does both the attaching and detaching of the
tuple-queue-reader shm_mqs, which seems less weird than the previous
arrangement.
These changes also eliminate a vestigial memory leak (of the pei->tqueue
array). It's now demonstrable that rescans of Gather or GatherMerge don't
leak memory.
Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us
Instead of using a cast to force the constant to be the right width,
assume we can plaster on an L, UL, LL, or ULL suffix as appropriate.
The old approach to this is very hoary, dating from before we were
willing to require compilers to have working int64 types.
This fix makes the PG_INT64_MIN, PG_INT64_MAX, and PG_UINT64_MAX
constants safe to use in preprocessor conditions, where a cast
doesn't work. Other symbolic constants that might be defined using
[U]INT64CONST are likewise safer than before.
Also fix the SIZE_MAX macro to be similarly safe, if we are forced
to provide a definition for that. The test added in commit 2e70d6b5e
happens to do what we want even with the hack "(size_t) -1" definition,
but we could easily get burnt on other tests in future.
Back-patch to all supported branches, like the previous commits.
Discussion: https://postgr.es/m/15883.1504278595@sss.pgh.pa.us
Pre-C99 platforms may lack <stdint.h> and thereby SIZE_MAX. We have
a couple of places using the hack "(size_t) -1" as a fallback, but
it wasn't universally available; which means the code added in commit
2e70d6b5e fails to compile everywhere. Move that hack to c.h so that
we can rely on having SIZE_MAX everywhere.
Per discussion, it'd be a good idea to make the macro's value safe
for use in #if-tests, but that will take a bit more work. This is
just a quick expedient to get the buildfarm green again.
Back-patch to all supported branches, like the previous commit.
Discussion: https://postgr.es/m/15883.1504278595@sss.pgh.pa.us
Do for replication origins what the previous commit did for replication
slots: restore the original behavior of replication origin drop to raise
an error rather than blocking, because users might be depending on the
original behavior. Maintain the blocking behavior when invoked
internally from logical replication subscription handling.
Discussion: https://postgr.es/m/20170830133922.tlpo3lgfejm4n2cs@alvherre.pgsql
Commit 9915de6c1cb2 changed the default behavior of
DROP_REPLICATION_SLOT so that it would wait until any session holding
the slot active would release it, instead of raising an error. But
users are already depending on the original behavior, so revert to it by
default and add a WAIT option to invoke the new behavior.
Per complaint from Simone Gotti, in
Discussion: https://postgr.es/m/CAEvsy6Wgdf90O6pUvg2wSVXL2omH5OPC-38OD4Zzgk-FXavj3Q@mail.gmail.com
Rescanning a GatherMerge led to leaking some memory in the executor's
query-lifespan context, because most of the node's working data structures
were simply abandoned and rebuilt from scratch. In practice, this might
never amount to much, given the cost of relaunching worker processes ---
but it's still pretty messy, so let's fix it.
We can rearrange things so that the tuple arrays are simply cleared and
reused, and we don't need to rebuild the TupleTableSlots either, just
clear them. One small complication is that because we might get a
different number of workers on each iteration, we can't keep the old
convention that the leader's gm_slots[] entry is the last one; the leader
might clobber a TupleTableSlot that we need for a worker in a future
iteration. Hence, adjust the logic so that the leader has slot 0 always,
while the active workers have slots 1..n.
Back-patch to v10 to keep all the existing versions of nodeGatherMerge.c
in sync --- because of the renumbering of the slots, there would otherwise
be a very large risk that any future backpatches in this module would
introduce bugs.
Discussion: https://postgr.es/m/8670.1504192177@sss.pgh.pa.us