In v13 and v14, the ENABLE/DISABLE TRIGGER USER variant malfunctioned
on cloned triggers, failing to find the clones because it thought they
were system triggers. Other variants of ENABLE/DISABLE TRIGGER would
improperly apply a superuserness check. Fix by adjusting the is-it-
a-system-trigger check to match reality in those branches. (As far
as I can find, this is the only place that got it wrong.)
There's no such bug in v15/HEAD, because we revised the catalog
representation of system triggers to be what this code was expecting.
However, add the test case to these branches anyway, because this area
is visibly pretty fragile. Also remove an obsoleted comment.
The recent v15/HEAD commit 6949b921d fixed a nearby bug. I now see
that my commit message for that was inaccurate: the behavior of
recursing to clone triggers is older than v15, but it didn't apply
to the case in v13/v14 because in those branches parent partitioned
tables have no pg_trigger entries for foreign-key triggers. But add
the test case from that commit to v13/v14, just to show what is
happening there.
Per bug #17886 from DzmitryH.
Discussion: https://postgr.es/m/17886-5406d5d828aa4aa3@postgresql.org
Up through v11 it was sensible to use the "oid" system column as
a foreign key column, but since that was removed there's no visible
usefulness in making any of the remaining system columns a foreign
key. Moreover, since the TupleTableSlot rewrites in v12, such cases
actively fail because of implicit assumptions that only user columns
appear in foreign keys. The lack of complaints about that seems
like good evidence that no one is trying to do it. Hence, rather
than trying to repair those assumptions (of which there are at least
two, maybe more), let's just forbid the case up front.
Per this patch, a system column in either the referenced or
referencing side of a foreign key will draw this error; however,
putting one in the referenced side would have failed later anyway,
since we don't allow unique indexes to be made on system columns.
Per bug #17877 from Alexander Lakhin. Back-patch to v12; the
case still appears to work in v11, so we shouldn't break it there.
Discussion: https://postgr.es/m/17877-4bcc658e33df6de1@postgresql.org
The totalrows/totaldeadrows outputs were left uninitialized in cases
where we find no analyzable child tables of a partitioned table. This
could lead to setting the partitioned table's pg_class.reltuples value
to garbage. It's not clear that that would have any very bad effects
in practice, but fix it anyway because it's making valgrind unhappy.
Reported and diagnosed by Alexander Lakhin (bug #17880).
Discussion: https://postgr.es/m/17880-9282037c923d856e@postgresql.org
find_composite_type_dependencies() ignored indexes, which is a poor
decision because an expression index could have a stored column of
a composite (or other container) type even when the underlying table
does not. Teach it to detect such cases and error out. We have to
work a bit harder than for other relations because the pg_depend entry
won't identify the specific index column of concern, but it's not much
new code.
This does not address bug #17872's original complaint that dropping
a column in such a type might lead to violations of the uniqueness
property that a unique index is supposed to ensure. That seems of
much less concern to me because it won't lead to crashes.
Per bug #17872 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17872-d0fbb799dc3fd85d@postgresql.org
Commit 4753ef37 changed vacuum_delay_point() to use the WaitLatch() API,
to fix the problem that vacuum could keep running for a very long time
after the postmaster died.
Unfortunately, that broke commit caf626b2's support for fractional
vacuum_cost_delay, which shipped in PostgreSQL 12. WaitLatch() works in
whole milliseconds.
For now, revert the change from commit 4753ef37, but add an explicit
check for postmaster death. That's an extra system call on systems
other than Linux and FreeBSD, but that overhead doesn't matter much
considering that we willingly went to sleep and woke up again. (In
later work, we might add higher resolution timeouts to the latch API so
that we could do this with our standard programming pattern, but that
wouldn't be back-patched.)
Back-patch to 14, where commit 4753ef37 arrived.
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/CAAKRu_b-q0hXCBUCAATh0Z4Zi6UkiC0k2DFgoD3nC-r3SkR3tg%40mail.gmail.com
If MERGE attempts an UPDATE or DELETE on a table with BEFORE ROW
triggers, or a cross-partition UPDATE (with or without triggers), and
a concurrent UPDATE or DELETE happens, the merge code would fail.
In some cases this would lead to a crash, while in others it would
cause the wrong merge action to be executed, or no action at all. The
immediate cause of the crash was the trigger code calling
ExecGetUpdateNewTuple() as part of the EPQ mechanism, which fails
because during a merge ri_projectNew is NULL, since merge has its own
per-action projection information, which ExecGetUpdateNewTuple() knows
nothing about.
Fix by arranging for the trigger code to exit early, returning the
TM_Result and TM_FailureData information, if a concurrent modification
is detected, allowing the merge code to do the necessary EPQ handling
in its own way. Similarly, prevent the cross-partition update code
from doing any EPQ processing for a merge, allowing the merge code to
work out what it needs to do.
This leads to a number of simplifications in nodeModifyTable.c. Most
notably, the ModifyTableContext->GetUpdateNewTuple() callback is no
longer needed, and mergeGetUpdateNewTuple() can be deleted, since
there is no longer any requirement for get-update-new-tuple during a
merge. Similarly, ModifyTableContext->cpUpdateRetrySlot is no longer
needed. Thus ExecGetUpdateNewTuple() and the retry_slot handling of
ExecCrossPartitionUpdate() can be restored to how they were in v14,
before the merge code was added, and ExecMergeMatched() no longer
needs any special-case handling for cross-partition updates.
While at it, tidy up ExecUpdateEpilogue() a bit, making it handle
recheckIndexes locally, rather than passing it in as a parameter,
ensuring that it is freed properly. This dates back to when it was
split off from ExecUpdate() to support merge.
Per bug #17809 from Alexander Lakhin, and follow-up investigation of
bug #17792, also from Alexander Lakhin.
Back-patch to v15, where MERGE was introduced, taking care to preserve
backwards-compatibility of the trigger API in v15 for any extensions
that might use it.
Discussion:
https://postgr.es/m/17809-9e6650bef133f0fe%40postgresql.orghttps://postgr.es/m/17792-0f89452029662c36%40postgresql.org
The COPY documentation is quite clear that "COPY relation TO" copies
rows from only the named table, not any inheritance children it may
have. However, if you enabled row-level security on the table then
this stopped being true, because the code forgot to apply the ONLY
modifier in the "SELECT ... FROM relation" query that it constructs
in order to allow RLS predicates to be attached. Fix that.
Report and patch by Antonin Houska (comment adjustments and test case
by me). Back-patch to all supported branches.
Discussion: https://postgr.es/m/3472.1675251957@antos
Beginning in v15, if you apply ALTER TABLE ENABLE/DISABLE TRIGGER to
a partitioned table, it also affects the partitions' cloned versions
of the affected trigger(s). The initial implementation of this
located the clones by name, but that fails on foreign-key triggers
which have names incorporating their own OIDs. We can fix that, and
also make the behavior more bulletproof in the face of user-initiated
trigger renames, by identifying the cloned triggers by tgparentid.
Following the lead of earlier commits in this area, I took care not
to break ABI in the v15 branch, even though I rather doubt there
are any external callers of EnableDisableTrigger.
While here, update the documentation, which was not touched when
the semantics were changed.
Per bug #17817 from Alan Hodgson. Back-patch to v15; older versions
do not have this behavior.
Discussion: https://postgr.es/m/17817-31dfb7c2100d9f3d@postgresql.org
WAL_LOG does a scan of the template's pg_class to determine the set of
relations that need to be copied from a template database to the new
one. However, as coded in 9c08aea, this copy strategy would load the
pages of pg_class without considering it as a permanent relation,
causing the loaded pages to never be flushed when they should. Any
modification of the template's pg_class, mostly through DDLs, would then
be missed, causing corruptions.
STRATEGY = WAL_LOG is the default over FILE_COPY since it has been
introduced, so any changes done to pg_class on a database template would
be gone. Updates of database templates should be a rare thing, so the
impact of this bug should be hopefully limited. The pre-14 default
strategy FILE_COPY is safe, and can be used as a workaround.
Ryo Matsumura has found and analyzed the issue, and Nathan has written a
test able to reproduce the failure (with few tweaks from me).
Backpatch down to 15, where STRATEGY = WAL_LOG has been introduced.
Author: Nathan Bossart, Ryo Matsumura
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/TYCPR01MB6868677E499C9AD5123084B5E8A39@TYCPR01MB6868.jpnprd01.prod.outlook.com
Backpatch-through: 15
The motivation for this change is that when pg_dump dumps a
partitioned index that's marked REPLICA IDENTITY, it generates a
command sequence that applies REPLICA IDENTITY before the partitioned
index has been marked valid, causing restore to fail. We could
perhaps change pg_dump to not do it like that, but that would be
difficult and would not fix existing dump files with the problem.
There seems to be very little reason for the backend to disallow
this anyway --- the code ignores indisreplident when the index
isn't valid --- so instead let's fix it by allowing the case.
Commit 9511fb37a previously expressed a concern that allowing
indisreplident to be set on invalid indexes might allow us to
wind up in a situation where a table could have indisreplident
set on multiple indexes. I'm not sure I follow that concern
exactly, but in any case the only way that could happen is because
relation_mark_replica_identity is too trusting about the existing set
of markings being valid. Let's just rip out its early-exit code path
(which sure looks like premature optimization anyway; what are we
doing expending code to make redundant ALTER TABLE ... REPLICA
IDENTITY commands marginally faster and not-redundant ones marginally
slower?) and fix it to positively guarantee that no more than one
index is marked indisreplident.
The pg_dump failure can be demonstrated in all supported branches,
so back-patch all the way. I chose to back-patch 9511fb37a as well,
just to keep indisreplident handling the same in all branches.
Per bug #17756 from Sergey Belyashov.
Discussion: https://postgr.es/m/17756-dd50e8e0c8dd4a40@postgresql.org
Commit aa0105141 repeated one of the oldest mistakes in our book:
thinking that OID is the same as int32. It isn't of course, and
unsurprisingly the first person who came along with a database
OID above 2 billion broke it. Repair.
Per bug #17677 from Sergey Pankov. Back-patch to v15.
Discussion: https://postgr.es/m/17677-a99fa067d7ed71c9@postgresql.org
"Triggers on partitioned tables cannot have transition tables." is
incorrect as we allow statement-level triggers on partitioned tables to
have transition tables.
This has been wrong since commit 86f575948; back-patch to v11 where that
commit came in.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAPmGK17gk4vXLzz2iG%2BG4LWRWCoVyam70nZ3OuGm1hMJwDrhcg%40mail.gmail.com
Commit f56f8f8da6 added some code in CloneFkReferencing that's way too
lax about a Constraint node it manufactures, not initializing enough
struct members -- initially_valid in particular was forgotten. This
causes some FKs in partitions added by ALTER TABLE ATTACH PARTITION to
be marked as not validated. Set initially_valid true, which fixes the
bug.
While at it, make the struct initialization more complete. Very similar
code was added in two other places by the same commit; make them all
follow the same pattern for consistency, though no bugs are apparent
there.
This bug has never been reported: I only happened to notice while
working on commit 614a406b4f. The test case that was added there with
the improper result is repaired.
Backpatch to 12.
Discussion: https://postgr.es/m/20221005105523.bhuhkdx4olajboof@alvherre.pgsql
Per discussion, the existing routine name able to initialize a SRF
function with materialize mode is unpopular, so rename it. Equally, the
flags of this function are renamed, as of:
- SRF_SINGLE_USE_EXPECTED -> MAT_SRF_USE_EXPECTED_DESC
- SRF_SINGLE_BLESS -> MAT_SRF_BLESS
The previous function and flags introduced in 9e98583 are kept around
for compatibility purposes, so as any extension code already compiled
with v15 continues to work as-is. The declarations introduced here for
compatibility will be removed from HEAD in a follow-up commit.
The new names have been suggested by Andres Freund and Melanie
Plageman.
Discussion: https://postgr.es/m/20221013194820.ciktb2sbbpw7cljm@awork3.anarazel.de
Backpatch-through: 15
There are a number of bugs in this area. Two of them are fixed here,
namely:
1. get_relation_idx_constraint_oid does not restrict the type of
constraint that's returned, so with sufficient bad luck it can
return the OID of a foreign key constraint. This has the effect that
a primary key in a partition can end up as a child of a foreign key,
which makes no sense (it needs to be the child of the equivalent
primary key.)
Change the API contract so that only index-backed constraints are
returned, mimicking get_constraint_index().
2. Both CloneFkReferenced and CloneFkReferencing clone a
self-referencing foreign key, so the partition ends up with
a duplicate foreign key. Change the former function to ignore such
constraints.
Add some tests to verify that things are better now. (However, these
new tests show some additional misbehavior that will be fixed later --
namely that there's a constraint marked NOT VALID.)
Backpatch to 12, where these constraints are possible at all.
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220603154232.1715b14c@karst
This prevents marking the argument string for translation for gettext,
and it also prevents the given string (which is already translated) from
being translated at runtime.
Also, mark the strings used as arguments to check_rolespec_name for
translation.
Backpatch all the way back as appropriate. None of this is caught by
any tests (necessarily so), so I verified it manually.
Commit 25936fd46 adjusted things so that the "storeslot" we use
for remapping trigger tuples would have adequate lifespan, but it
neglected to consider the lifespan of the tuple descriptor that
the slot depends on. It turns out that in at least some cases, the
tupdesc we are passing is a refcounted tupdesc, and the refcount for
the slot's reference can get assigned to a resource owner having
different lifespan than the slot does. That leads to an error like
"tupdesc reference 0x7fdef236a1b8 is not owned by resource owner
SubTransaction". Worse, because of a second oversight in the same
commit, we'd try to free the same tupdesc refcount again while
cleaning up after that error, leading to recursive errors and an
"ERRORDATA_STACK_SIZE exceeded" PANIC.
To fix the initial problem, let's just make a non-refcounted copy
of the tupdesc we're supposed to use. That seems likely to guard
against additional problems, since there's no strong reason for
this code to assume that what it's given is a refcounted tupdesc;
in which case there's an independent hazard of the tupdesc having
shorter lifespan than the slot does. (I didn't bother trying to
free said copy, since it should go away anyway when the (sub)
transaction context is cleaned up.)
The other issue can be fixed by making the code added to
AfterTriggerFreeQuery work like the rest of that function, ie be
sure that it doesn't try to free the same slot twice in the event
of recursive error cleanup.
While here, also clean up minor stylistic issues in the test case
added by 25936fd46: don't use "create or replace function", as any
name collision within the tests is likely to have ill effects
that that won't mask; and don't use function names as generic as
trigger_function1, especially if you're not going to drop them
at the end of the test stanza.
Per bug #17607 from Thomas Mc Kay. Back-patch to v12, as the
previous fix was.
Discussion: https://postgr.es/m/17607-bd8ccc81226f7f80@postgresql.org
We previously thought that allowing such cases can confuse users when they
specify DROP TABLES IN SCHEMA but that doesn't seem to be the case based
on discussion. This helps to uplift the restriction during
ALTER TABLE ... SET SCHEMA which used to ensure that we couldn't end up
with a publication having both a schema and the same schema's table.
To allow this, we need to forbid having any schema on a publication if
column lists on a table are specified (and vice versa). This is because
otherwise we still need a restriction during ALTER TABLE ... SET SCHEMA to
forbid cases where it could lead to a publication having both a schema and
the same schema's table with column list.
Based on suggestions by Peter Eisentraut.
Author: Hou Zhijie and Vignesh C
Reviewed-By: Peter Smith, Amit Kapila
Backpatch-through: 15, where it was introduced
Discussion: https://postgr.es/m/2729c9e2-9aac-8cda-f2f4-34f2bcc18f4e@enterprisedb.com
This may be a bit too subtle, but removing that word from there makes
this clause no longer a perfect parallel of the GRANT variant "ALL
TABLES IN SCHEMA": indeed, for publications what we record is the schema
itself, not the tables therein, which means that any tables added to the
schema in the future are also published. This is completely different
to what GRANT does, which is affect only the tables that exist when the
command is executed.
There isn't resounding support for this change, but there are a few
positive votes and no opposition. Because the time to 15 RC1 is very
short, let's get this out now.
Backpatch to 15.
Discussion: https://postgr.es/m/2729c9e2-9aac-8cda-f2f4-34f2bcc18f4e
Because of inadequate filtering, the check triggers were confusing the
search for action triggers in GetForeignKeyActionTriggers and vice-versa
in GetForeignKeyCheckTriggers; this confusion results in seemingly
random assertion failures, and can have real impact in non-asserting
builds depending on catalog order. Change these functions so that they
correctly ignore triggers that are not relevant to each side.
To reduce the odds of further problems, do not break out of the
searching loop in assertion builds. This break is likely to hide bugs;
without it, we would have detected this bug immediately.
This problem was introduced by f4566345cf, so backpatch to 15 where
that commit first appeared.
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/20220908172029.sejft2ppckbo6oh5@awork3.anarazel.de
Discussion: https://postgr.es/m/4104619.1662663056@sss.pgh.pa.us
During ALTER TABLE ATTACH PARTITION, if the name of a parent's foreign
key constraint is already used on the partition, the code tries to
choose another one before the FK attributes list has been populated,
so the resulting constraint name was "<relname>__fkey" instead of
"<relname>_<attrs>_fkey". Repair, and add a test case.
Backpatch to 12. In 11, the code to attach a partition was not smart
enough to cope with conflicting constraint names, so the problem doesn't
exist there.
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Discussion: https://postgr.es/m/20220901184156.738ebee5@karst
The reverts the following and makes some associated cleanups:
commit f79b803dc: Common SQL/JSON clauses
commit f4fb45d15: SQL/JSON constructors
commit 5f0adec25: Make STRING an unreserved_keyword.
commit 33a377608: IS JSON predicate
commit 1a36bc9db: SQL/JSON query functions
commit 606948b05: SQL JSON functions
commit 49082c2cc: RETURNING clause for JSON() and JSON_SCALAR()
commit 4e34747c8: JSON_TABLE
commit fadb48b00: PLAN clauses for JSON_TABLE
commit 2ef6f11b0: Reduce running time of jsonb_sqljson test
commit 14d3f24fa: Further improve jsonb_sqljson parallel test
commit a6baa4bad: Documentation for SQL/JSON features
commit b46bcf7a4: Improve readability of SQL/JSON documentation.
commit 112fdb352: Fix finalization for json_objectagg and friends
commit fcdb35c32: Fix transformJsonBehavior
commit 4cd8717af: Improve a couple of sql/json error messages
commit f7a605f63: Small cleanups in SQL/JSON code
commit 9c3d25e17: Fix JSON_OBJECTAGG uniquefying bug
commit a79153b7a: Claim SQL standard compliance for SQL/JSON features
commit a1e7616d6: Rework SQL/JSON documentation
commit 8d9f9634e: Fix errors in copyfuncs/equalfuncs support for JSON node types.
commit 3c633f32b: Only allow returning string types or bytea from json_serialize
commit 67b26703b: expression eval: Fix EEOP_JSON_CONSTRUCTOR and EEOP_JSONEXPR size.
The release notes are also adjusted.
Backpatch to release 15.
Discussion: https://postgr.es/m/40d2c882-bcac-19a9-754d-4299e1d87ac7@postgresql.org
As such the current usage of & won't produce incorrect results but it
would be better to use && to short-circuit the evaluation of second
condition when the same is not required.
Author: Ranier Vilela
Reviewed-by: Tom Lane, Bharath Rupireddy
Backpatch-through: 15, where it was introduced
Discussion: https://postgr.es/m/CAEudQApL8QcoYwQuutkWKY_h7gBY8F0Xs34YKfc7-G0i83K_pw@mail.gmail.com
Consistently avoid trusting a sample of only one page at the point that
VACUUM determines a new reltuples for the target table (though only when
the table is larger than a single page). This is follow-up work to
commit 74388a1a, which added a heuristic to prevent reltuples from
becoming distorted by successive VACUUM operations that each scan only a
single heap page (which was itself more or less a bugfix for an issue in
commit 44fa8488, which simplified VACUUM's handling of scanned pages).
The original bugfix commit did not account for certain remaining cases
that where not affected by its "2% of total relpages" heuristic. This
happened with relations that are small enough that just one of its pages
exceeded the 2% threshold, yet still big enough for VACUUM to deem
skipping most of its pages via the visibility map worthwhile. reltuples
could still become distorted over time with such a table, at least in
scenarios where the VACUUM command is run repeatedly and without the
table itself ever changing.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wzk7d4m3oEbEWkWQKd+gz-eD_peBvdXVk1a_KBygXadFeg@mail.gmail.com
Backpatch: 15-, where the rules for scanned pages changed.
Initialize shared memory allocated for index stats to avoid a hard
crash. This was possible when parallel VACUUM became confused about the
current phase of index processing.
Oversight in commit 8e1fae1938, which refactored parallel VACUUM.
Author: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-By: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20220818133406.GL26426@telsasoft.com
Backpatch: 15-, the first version with the refactoring commit.
When creating a partitioned index, DefineIndex tries to identify
any existing indexes on the partitions that match the partitioned
index, so that it can absorb those as child indexes instead of
building new ones. Part of the matching is to compare IndexInfo
structs --- but that wasn't done quite right. We're comparing
the IndexInfo built within DefineIndex itself to one made from
existing catalog contents by BuildIndexInfo. Notably, while
BuildIndexInfo will run index expressions and predicates through
expression preprocessing, that has not happened to DefineIndex's
struct. The result is failure to match and subsequent creation
of duplicate indexes.
The easiest and most bulletproof fix is to build a new IndexInfo
using BuildIndexInfo, thereby guaranteeing that the processing done
is identical.
While here, let's also extract the opfamily and collation data
from the new partitioned index, removing ad-hoc logic that
duplicated knowledge about how those are constructed.
Per report from Christophe Pettus. Back-patch to v11 where
we invented partitioned indexes.
Richard Guo and Tom Lane
Discussion: https://postgr.es/m/8864BFAA-81FD-4BF9-8E06-7DEB8D4164ED@thebuild.com
If an error occurs before we close the fake relcache entry, the the
fake relcache entry will be destroyed by the SmgrRelation will
survive until end of transaction. Its smgr_owner pointer ends up
pointing to already-freed memory.
The original reason for using a fake relcache entry here was to try
to avoid reusing an SMgrRelation across a relevant invalidation. To
avoid that problem, just call smgropen() again each time we need a
reference to it. Hopefully someday we will come up with a more
elegant approach, but accessing uninitialized memory is bad so let's
do this for now.
Dilip Kumar, reviewed by Andres Freund and Tom Lane. Report by
Justin Pryzby.
Discussion: http://postgr.es/m/20220802175043.GA13682@telsasoft.com
Discussion: http://postgr.es/m/CAFiTN-vSFeE6_W9z698XNtFROOA_nSqUXWqLcG0emob_kJ+dEQ@mail.gmail.com
The grammar added for MERGE inadvertently made it accepted syntax in
places that were not prepared to deal with it -- namely COPY and inside
CTEs, but invoking these things with MERGE currently causes assertion
failures or weird misbehavior in non-assertion builds. Protect those
places by checking for it explicitly until somebody decides to implement
it.
Reported-by: Alexey Borzov <borz_off@cs.msu.su>
Discussion: https://postgr.es/m/17579-82482cd7b267b862@postgresql.org
Previously, if an extension script did CREATE OR REPLACE and there was
an existing object not belonging to the extension, it would overwrite
the object and adopt it into the extension. This is problematic, first
because the overwrite is probably unintentional, and second because we
didn't change the object's ownership. Thus a hostile user could create
an object in advance of an expected CREATE EXTENSION command, and would
then have ownership rights on an extension object, which could be
modified for trojan-horse-type attacks.
Hence, forbid CREATE OR REPLACE of an existing object unless it already
belongs to the extension. (Note that we've always forbidden replacing
an object that belongs to some other extension; only the behavior for
previously-free-standing objects changes here.)
For the same reason, also fail CREATE IF NOT EXISTS when there is
an existing object that doesn't belong to the extension.
Our thanks to Sven Klemm for reporting this problem.
Security: CVE-2022-2625
Using ATSimpleRecursion() in ATPrepCmd() to do so as bbb927b4db did is
not correct, because ATPrepCmd() can't distinguish between triggers that
may be cloned and those that may not, so would wrongly try to recurse
for the latter category of triggers.
So this commit restores the code in EnableDisableTrigger() that
86f575948c had added to do the recursion, which would do it only for
triggers that may be cloned, that is, row-level triggers. This also
changes tablecmds.c such that ATExecCmd() is able to pass the value of
ONLY flag down to EnableDisableTrigger() using its new 'recurse'
parameter.
This also fixes what seems like an oversight of 86f575948c that the
recursion to partition triggers would only occur if EnableDisableTrigger()
had actually changed the trigger. It is more apt to recurse to inspect
partition triggers even if the parent's trigger didn't need to be
changed: only then can we be certain that all descendants share the same
state afterwards.
Backpatch all the way back to 11, like bbb927b4db. Care is taken not
to break ABI compatibility (and that no catversion bump is needed.)
Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Dmitry Koval <d.koval@postgrespro.ru>
Discussion: https://postgr.es/m/CA+HiwqG-cZT3XzGAnEgZQLoQbyfJApVwOTQaCaas1mhpf+4V5A@mail.gmail.com
Commit 9a974cbcba arranged to preserve
the relfilenode of user tables across pg_upgrade, but failed to notice
that pg_upgrade treats pg_largeobject as a user table and thus it needs
the same treatment. Otherwise, large objects will appear to vanish
after a pg_upgrade.
Commit d498e052b4 fixed this problem
by teaching pg_dump to UPDATE pg_class.relfilenode for pg_largeobject
and its index. However, because an UPDATE on the catalog rows doesn't
change anything on disk, this can leave stray files behind in the new
cluster. They will normally be empty, but it's a little bit untidy.
Hence, this commit arranges to do the same thing using DDL. Specifically,
it makes TRUNCATE work for the pg_largeobject catalog when in
binary-upgrade mode, and it then uses that command in binary-upgrade
dumps as a way of setting pg_class.relfilenode for pg_largeobject and
its index. That way, the old files are removed from the new cluster.
Discussion: http://postgr.es/m/CA+TgmoYYMXGUJO5GZk1-MByJGu_bB8CbOL6GJQC8=Bzt6x6vDg@mail.gmail.com
Crash recovery on standby may encounter missing directories
when replaying database-creation WAL records. Prior to this
patch, the standby would fail to recover in such a case;
however, the directories could be legitimately missing.
Consider the following sequence of commands:
CREATE DATABASE
DROP DATABASE
DROP TABLESPACE
If, after replaying the last WAL record and removing the
tablespace directory, the standby crashes and has to replay the
create database record again, crash recovery must be able to continue.
A fix for this problem was already attempted in 49d9cfc68b, but it
was reverted because of design issues. This new version is based
on Robert Haas' proposal: any missing tablespaces are created
during recovery before reaching consistency. Tablespaces
are created as real directories, and should be deleted
by later replay. CheckRecoveryConsistency ensures
they have disappeared.
The problems detected by this new code are reported as PANIC,
except when allow_in_place_tablespaces is set to ON, in which
case they are WARNING. Apart from making tests possible, this
gives users an escape hatch in case things don't go as planned.
Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Author: Asim R Praveen <apraveen@pivotal.io>
Author: Paul Guo <paulguo@gmail.com>
Reviewed-by: Anastasia Lubennikova <lubennikovaav@gmail.com> (older versions)
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> (older versions)
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Diagnosed-by: Paul Guo <paulguo@gmail.com>
Discussion: https://postgr.es/m/CAEET0ZGx9AvioViLf7nbR_8tH9-=27DN5xWJ2P9-ROH16e4JUA@mail.gmail.com
Detail and hint messages should be full sentences and should end with a
period, but some of the messages newly-introduced in v15 did not follow
that.
Author: Justin Pryzby
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/20220719120948.GF12702@telsasoft.com
Backpatch-through: 15
We allow users to set the values of not-yet-loaded extension GUCs,
remembering those values in "placeholder" GUC entries. When/if
the extension is loaded later in the session, we need to verify that
the user had permissions to set the GUC. That was done correctly
before commit a0ffa885e, but as of that commit, we'd check the
permissions of the active role when the LOAD happens, not the role
that had set the value. (This'd be a security bug if it had made it
into a released version.)
In principle this is simple enough to fix: we just need to remember
the exact role OID that set each GUC value, and use that not
GetUserID() when verifying permissions. Maintaining that data in
the guc.c data structures is slightly tedious, but fortunately it's
all basically just copy-n-paste of the logic for tracking the
GucSource of each setting, as we were already doing.
Another oversight is that validate_option_array_item() hadn't
been taught to check for granted GUC privileges. This appears
to manifest only in that ALTER ROLE/DATABASE RESET ALL will
fail to reset settings that the user should be allowed to reset.
Patch by myself and Nathan Bossart, per report from Nathan Bossart.
Back-patch to v15 where the faulty code came in.
Discussion: https://postgr.es/m/20220706224727.GA2158260@nathanxps13
Justin Pryzby reported that some scenarios could cause gathering
of extended statistics to spend many seconds in an un-cancelable
qsort() operation. To fix, invent qsort_interruptible(), which is
just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS
every so often. This bloats the backend by a couple of kB, which
seems like a good investment. (We considered just enabling
CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions,
but there are some callers for which that'd demonstrably be unsafe.
Opt-in seems like a better way.)
For now, just apply qsort_interruptible() in statistics collection.
There's probably more places where it could be useful, but we can
always change other call sites as we find problems.
Back-patch to v14. Before that we didn't have extended stats on
expressions, so that the problem was less severe. Also, this patch
depends on the sort_template infrastructure introduced in v14.
Tom Lane and Justin Pryzby
Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com
Commit a117cebd63 used the original userid
for ACL checks located directly in DefineIndex(), but it still adopted
the table owner userid for more ACL checks than intended. That broke
dump/reload of indexes that refer to an operator class, collation, or
exclusion operator in a schema other than "public" or "pg_catalog".
Back-patch to v10 (all supported versions), like the earlier commit.
Nathan Bossart and Noah Misch
Discussion: https://postgr.es/m/f8a4105f076544c180a87ef0c4822352@stmuk.bayern.de