Without this, an additional change to the same pg_attribute row
within the same command will fail. This is possible at least with
ALTER TABLE ADD COLUMN on a multiple-inheritance-pathway structure.
(Another potential hazard is that immediately-following operations
might not see the missingval.)
Introduced by 95f650674, which split the former coding that
used a single pg_attribute update to change both atthasdef and
atthasmissing/attmissingval into two updates, but missed that
this should entail two CommandCounterIncrements as well. Like
that fix, back-patch through v13.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/025a3ffa-5eff-4a88-97fb-8f583b015965@gmail.com
Backpatch-through: 13
If a domain type has a default, adding a column of that type (without
any explicit DEFAULT clause) failed to install the domain's default
value in existing rows, instead leaving the new column null. This
is unexpected, and it used to work correctly before v11. The cause
is confusion in the atthasmissing mechanism about which default value
to install: we'd only consider installing an explicitly-specified
default, and then we'd decide that no table rewrite is needed.
To fix, take the responsibility for filling attmissingval out of
StoreAttrDefault, and instead put it into ATExecAddColumn's existing
logic that derives the correct value to fill the new column with.
Also, centralize the logic that determines the need for
default-related table rewriting there, instead of spreading it over
four or five places.
In the back branches, we'll leave the attmissingval-filling code
in StoreAttrDefault even though it's now dead, for fear that some
extension may be depending on that functionality to exist there.
A separate HEAD-only patch will clean up the now-useless code.
Reported-by: jian he <jian.universality@gmail.com>
Author: jian he <jian.universality@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxHFssPvkP1we7WMhPD_1kwgbG52o=kQgL+TnVoX5LOyCQ@mail.gmail.com
Backpatch-through: 13
When an UPDATE trigger referencing a new table and a DELETE trigger
referencing an old table are both present, MakeTransitionCaptureState()
returns an inconsistent result for UPDATE commands in its set of flags
and tuplestores holding the TransitionCaptureState for transition
tables.
As proved by the test added here, this issue causes a crash in v14 and
earlier versions (down to 11, actually, older versions do not support
triggers on partitioned tables) during cross-partition updates on a
partitioned table. v15 and newer versions are safe thanks to
7103ebb7aa.
This commit fixes the function so that it returns a consistent state
by using portions of the changes made in commit 7103ebb7aa for v13 and
v14. v15 and newer versions are slightly tweaked to match with the
older versions, mainly for consistency across branches.
Author: Kyotaro Horiguchi
Discussion: https://postgr.es/m/20250207.150238.968446820828052276.horikyota.ntt@gmail.com
Backpatch-through: 13
This patch fixes two distinct errors that both ultimately trace
to commit 71d60e2aa, which added the ats_modifiedcols field.
The more severe error is that ats_modifiedcols wasn't accounted for
in afterTriggerAddEvent's scanning loop that looks for a pre-existing
duplicate AfterTriggerSharedData. Thus, a new event could be
incorrectly matched to an AfterTriggerSharedData that has a different
value of ats_modifiedcols, resulting in the wrong tg_updatedcols
bitmap getting passed to the trigger whenever it finally gets fired.
We'd not noticed because (a) few triggers consult tg_updatedcols,
and (b) we had no tests exercising a case where such a trigger was
called as an AFTER trigger. In the test case added by this commit,
contrib/lo's trigger fails to remove a large object when expected
because (without this fix) it thinks the LO OID column hasn't changed.
The other problem was introduced by commit ce5aaea8c, which copied the
modified-columns bitmap into trigger-related storage. It made a copy
for every trigger event, whereas what we really want is to make a new
copy only when we make a new AfterTriggerSharedData entry. (We could
imagine adding extra logic to reduce the number of bitmap copies still
more, but it doesn't look worthwhile at the moment.) In a simple test
of an UPDATE of 10000000 rows with a single AFTER trigger, this thinko
roughly tripled the amount of memory consumed by the pending-triggers
data structures, from 160446744 to 480443440 bytes.
Fixing the first problem requires introducing a bms_equal() call into
afterTriggerAddEvent's scanning loop, which is slightly annoying from
a speed perspective. However, getting rid of the excessive bms_copy()
calls from the second problem balances that out; overall speed of
trigger operations is the same or slightly better, in my tests.
Discussion: https://postgr.es/m/3496294.1737501591@sss.pgh.pa.us
Backpatch-through: 13
Commit aac2c9b4fd mandated such locking
and attempted to fulfill that mandate, but it missed REASSIGN OWNED.
Hence, it remained possible to lose VACUUM's inplace update of
datfrozenxid if a REASSIGN OWNED processed that database at the same
time. This didn't affect the other inplace-updated catalog, pg_class.
For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the
generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills
the locking mandate.
Like in GRANT, implement this by following the locking protocol for any
catalog subject to the generic AlterObjectOwner_internal(). It would
suffice to do this for IsInplaceUpdateOid() catalogs only. Back-patch
to v13 (all supported versions).
Kirill Reshke. Reported by Alexander Kukushkin.
Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com
Usually an entry in pg_amop or pg_amproc does not need a dependency on
its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types,
because there is an indirect dependency via the argument types of its
referenced operator or procedure, or via the opclass it belongs to.
However, for some support procedures in some index AMs, the argument
types of the support procedure might not mention the column data type
at all. Also, the amop/amproc entry might be treated as "loose" in
the opfamily, in which case it lacks a dependency on any particular
opclass; or it might be a cross-type entry having a reference to a
datatype that is not its opclass' opcintype.
The upshot of all this is that there are cases where a datatype can
be dropped while leaving behind amop/amproc entries that mention it,
because there is no path in pg_depend showing that those entries
depend on that type. Such entries are harmless in normal activity,
because they won't get used, but they cause problems for maintenance
actions such as dropping the operator family. They also cause pg_dump
to produce bogus output. The previous commit put a band-aid on the
DROP OPERATOR FAMILY failure, but a real fix is needed.
To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry
depends on its lefttype/righttype. To avoid bloating pg_depend too
much, skip this if the referenced operator or function has that type
as an input type. (I did not bother with considering the possible
indirect dependency via the opclass' opcintype; at least in the
reported case, that wouldn't help anyway.)
Probably, the reason this has escaped notice for so long is that
add-on datatypes and relevant opclasses/opfamilies are usually
packaged as extensions nowadays, so that there's no way to drop
a type without dropping the referencing opclasses/opfamilies too.
Still, in the absence of pg_depend entries there's nothing that
constrains DROP EXTENSION to drop the opfamily entries before the
datatype, so it seems possible for a DROP failure to occur anyway.
The specific case that was reported doesn't fail in v13, because
v13 prefers to attach the support procedure to the opclass not the
opfamily. But it's surely possible to construct other edge cases
that do fail in v13, so patch that too.
Per report from Yoran Heling. Back-patch to all supported branches.
Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
The current code calls array_eq() and does not provide FmgrInfo. This commit
provides initialization of FmgrInfo and uses C collation as the safe option
for text comparison because we don't know anything about the semantics of
opclass options.
Backpatch to 13, where opclass options were introduced.
Reported-by: Nicolas Maus
Discussion: https://postgr.es/m/18692-72ea398df3ec6712%40postgresql.org
Backpatch-through: 13
The SQL spec mandates that SET SESSION AUTHORIZATION implies
SET ROLE NONE. We tried to implement that within the lowest-level
functions that manipulate these settings, but that was a bad idea.
In particular, guc.c assumes that it doesn't matter in what order
it applies GUC variable updates, but that was not the case for these
two variables. This problem, compounded by some hackish attempts to
work around it, led to some security-grade issues:
* Rolling back a transaction that had done SET SESSION AUTHORIZATION
would revert to SET ROLE NONE, even if that had not been the previous
state, so that the effective user ID might now be different from what
it had been.
* The same for SET SESSION AUTHORIZATION in a function SET clause.
* If a parallel worker inspected current_setting('role'), it saw
"none" even when it should see something else.
Also, although the parallel worker startup code intended to cope
with the current role's pg_authid row having disappeared, its
implementation of that was incomplete so it would still fail.
Fix by fully separating the miscinit.c functions that assign
session_authorization from those that assign role. To implement the
spec's requirement, teach set_config_option itself to perform "SET
ROLE NONE" when it sets session_authorization. (This is undoubtedly
ugly, but the alternatives seem worse. In particular, there's no way
to do it within assign_session_authorization without incompatible
changes in the API for GUC assign hooks.) Also, improve
ParallelWorkerMain to directly set all the relevant user-ID variables
instead of relying on some of them to get set indirectly. That
allows us to survive not finding the pg_authid row during worker
startup.
In v16 and earlier, this includes back-patching 9987a7bf3 which
fixed a violation of GUC coding rules: SetSessionAuthorization
is not an appropriate place to be throwing errors from.
Security: CVE-2024-10978
... to fix bugs when the referenced table is partitioned.
The catalog representation we chose for foreign keys connecting
partitioned tables (in commit f56f8f8da6) is inconvenient, in the
sense that a standalone table has a different way to represent the
constraint when referencing a partitioned table, than when the same
table becomes a partition (and vice versa). Because of this, we need to
create additional catalog rows on detach (pg_constraint and pg_trigger),
and remove them on attach. We were doing some of those things, but not
all of them, leading to missing catalog rows in certain cases.
The worst problem seems to be that we are missing action triggers after
detaching a partition, which means that you could update/delete rows
from the referenced partitioned table that still had referencing rows on
that table, the server failing to throw the required errors.
!!!
Note that this means existing databases with FKs that reference
partitioned tables might have rows that break relational integrity, on
tables that were once partitions on the referencing side of the FK.
Another possible problem is that trying to reattach a table
that had been detached would fail indicating that internal triggers
cannot be found, which from the user's point of view is nonsensical.
In branches 15 and above, we fix this by creating a new helper function
addFkConstraint() which is in charge of creating a standalone
pg_constraint row, and repurposing addFkRecurseReferencing() and
addFkRecurseReferenced() so that they're only the recursive routine for
each side of the FK, and they call addFkConstraint() to create
pg_constraint at each partitioning level and add the necessary triggers.
These new routines can be used during partition creation, partition
attach and detach, and foreign key creation. This reduces redundant
code and simplifies the flow.
In branches 14 and 13, we have a much simpler fix that consists on
simply removing the constraint on detach. The reason is that those
branches are missing commit f4566345cf, which reworked the way this
works in a way that we didn't consider back-patchable at the time.
We opted to leave branch 12 alone, because it's different from branch 13
enough that the fix doesn't apply; and because it is going in EOL mode
very soon, patching it now might be worse since there's no way to undo
the damage if it goes wrong.
Existing databases might need to be repaired.
In the future we might want to rethink the catalog representation to
avoid this problem, but for now the code seems to do what's required to
make the constraints operate correctly.
Co-authored-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Guillaume Lelarge <guillaume@lelarge.info>
Reported-by: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reported-by: Thomas Baehler (SBB CFF FFS) <thomas.baehler2@sbb.ch>
Discussion: https://postgr.es/m/20230420144344.40744130@karst
Discussion: https://postgr.es/m/20230705233028.2f554f73@karst
Discussion: https://postgr.es/m/GVAP278MB02787E7134FD691861635A8BC9032@GVAP278MB0278.CHEP278.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
If the query is rewritten into a NOTIFY command by a DO INSTEAD
rule, we'd get an assertion failure, or in non-assert builds
issue a rather confusing error message. Improve that.
Also fix a longstanding grammar mistake in a nearby error message.
Per bug #18664 from Alexander Lakhin. Back-patch to all supported
branches.
Tender Wang and Tom Lane
Discussion: https://postgr.es/m/18664-ffd0ebc2386598df@postgresql.org
The previous commit fixed some ways of losing an inplace update. It
remained possible to lose one when a backend working toward a
heap_update() copied a tuple into memory just before inplace update of
that tuple. In catalogs eligible for inplace update, use LOCKTAG_TUPLE
to govern admission to the steps of copying an old tuple, modifying it,
and issuing heap_update(). This includes MERGE commands. To avoid
changing most of the pg_class DDL, don't require LOCKTAG_TUPLE when
holding a relation lock sufficient to exclude inplace updaters.
Back-patch to v12 (all supported versions). In v13 and v12, "UPDATE
pg_class" or "UPDATE pg_database" can still lose an inplace update. The
v14+ UPDATE fix needs commit 86dc90056d,
and it wasn't worth reimplementing that fix without such infrastructure.
Reviewed by Nitin Motiani and (in earlier versions) Heikki Linnakangas.
Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
As previously-added tests demonstrated, heap_inplace_update() could
instead update an unrelated tuple of the same catalog. It could lose
the update. Losing relhasindex=t was a source of index corruption.
Inplace-updating commands like VACUUM will now wait for heap_update()
commands like GRANT TABLE and GRANT DATABASE. That isn't ideal, but a
long-running GRANT already hurts VACUUM progress more just by keeping an
XID running. The VACUUM will behave like a DELETE or UPDATE waiting for
the uncommitted change.
For implementation details, start at the systable_inplace_update_begin()
header comment and README.tuplock. Back-patch to v12 (all supported
versions). In back branches, retain a deprecated heap_inplace_update(),
for extensions.
Reported by Smolkin Grigory. Reviewed by Nitin Motiani, (in earlier
versions) Heikki Linnakangas, and (in earlier versions) Alexander
Lakhin.
Discussion: https://postgr.es/m/CAMp+ueZQz3yDk7qg42hk6-9gxniYbp-=bG2mgqecErqR5gGGOA@mail.gmail.com
Back-patch commits 4c9c359d38 and
24843297a9 to v13 and v12. Before those
commits, we held the modifiable copy of the relation's pg_class row
throughout a table_relation_copy_data(). That can last long enough to
copy MaxBlockNumber of data. A subsequent fix will hold LockTuple() for
the lifespan of that modifiable copy. By back-patching this first, we
avoid a needless long-duration LOCKTAG_TUPLE.
Discussion: https://postgr.es/m/20231027214946.79.nmisch@google.com
Add a comment explaining dropdb() can't rely on syscache. The issue with
flattened rows was fixed by commit 0f92b230f8, but better to have
a clear explanation why the systable scan is necessary. The other places
doing in-place updates on pg_database have the same comment.
Suggestion and patch by Yugo Nagata. Backpatch to 12, same as the fix.
Author: Yugo Nagata
Backpatch-through: 12
Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com
Commit c66a7d75e6 modified DROP DATABASE so that if interrupted, the
database is known to be in an invalid state and can only be dropped.
This is done by setting a flag using an in-place update, so that it's
not lost in case of rollback.
For databases with many ACLs, this may however fail like this:
ERROR: wrong tuple length
This happens because with many ACLs, the pg_database.datacl attribute
gets TOASTed. The dropdb() code reads the tuple from the syscache, which
means it's detoasted. But the in-place update expects the tuple length
to match the on-disk tuple.
Fixed by reading the tuple from the catalog directly, not from syscache.
Report and fix by Ayush Tiwari. Backpatch to 12. The DROP DATABASE fix
was backpatched to 11, but 11 is EOL at this point.
Reported-by: Ayush Tiwari
Author: Ayush Tiwari
Reviewed-by: Tomas Vondra
Backpatch-through: 12
Discussion: https://postgr.es/m/CAJTYsWWNkCt+-UnMhg=BiCD3Mh8c2JdHLofPxsW3m2dkDFw8RA@mail.gmail.com
Trying to attach a table as a partition which is already on the
referenced side of a foreign key on the partitioned table that it is
being attached to, leads to strange behavior: we try to clone the
foreign key from the parent to the partition, but this new FK points to
the partition itself, and the mix of pg_constraint rows and triggers
doesn't behave well.
Rather than trying to untangle the mess (which might be possible given
sufficient time), I opted to forbid the ATTACH. This doesn't seem a
problematic restriction, given that we already fail to create the
foreign key if you do it the other way around, that is, having the
partition first and the FK second.
Backpatch to all supported branches.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
This reverts commit 216201027d.
Some buildfarm animals are failing with "cannot change
"client_encoding" during a parallel operation". It looks like
assign_client_encoding is unhappy at being asked to roll back a
client_encoding setting after a parallel worker encounters a
failure. There must be more to it though: why didn't I see this
during local testing? In any case, it's clear that moving the
RestoreGUCState() call is not as side-effect-free as I thought.
Given that the bug f5f30c22e intended to fix has gone unreported
for years, it's not something that's urgent to fix; I'm not
willing to risk messing with it further with only days to our
next release wrap.
Parallel workers failed after a sequence like
BEGIN;
CREATE USER foo;
SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo". This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.
To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's. This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start. Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.
Per bug #18545 from Andrey Rachitskiy. Back-patch to all
supported branches, because this has been wrong for awhile.
Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
When a partitioned table has an index that doesn't support a constraint,
but a partition has an equivalent index that does, then a DETACH
operation would misbehave: a crash in assertion-enabled systems (because
we fail to find the constraint in the parent that we expect to), or a
broken coninhcount value (-1) in production systems (because we blindly
believe that we've successfully detached the parent).
While we should reject an ATTACH of a partition with such an index, we
have failed to do so in existing releases, so adding an error in stable
releases might break the (unlikely) existing applications that rely on
this behavior. At this point I don't even want to reject them in
master, because it'd break pg_upgrade if such databases exist, and there
would be no easy way to fix existing databases without expensive index
rebuilds.
(Later on we could add ALTER TABLE ... ADD CONSTRAINT USING INDEX to
partitioned tables, which would allow the user to fix such patterns. At
that point we could add more restrictions to prevent the problem from
its root.)
Also, add a test case that leaves one table in this condition, so that
we can verify that pg_upgrade continues to work if we later decide to
change the policy on the master branch.
Backpatch to all supported branches.
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18500-62948b6fe5522f56@postgresql.org
ProcessIncomingNotify is called from the main processing loop that
normally runs in MessageContext. That outer-loop code assumes that
whatever it allocates will be cleaned up when we're done processing
the current client message --- but if we service a notify interrupt,
then whatever gets allocated before the next switch into
MessageContext will be permanently leaked in TopMemoryContext,
because CommitTransactionCommand sets CurrentMemoryContext to
TopMemoryContext. There are observable leaks associated with
(at least) encoding conversion of incoming queries and parameters
attached to Bind messages.
sinval catchup interrupts have a similar problem. There might be
others, but I've not identified any other clear cases.
To fix, take care to save and restore CurrentMemoryContext across
the Start/CommitTransactionCommand calls in these functions.
Per bug #18512 from wizardbrony. Commit to back branches only;
in HEAD, this was dealt with by the riskier but more thoroughgoing
approach in commit 1afe31f03.
Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
Commit 5b562644fe added a comment that
SetRelationHasSubclass() callers must hold this lock. When commit
17f206fbc8 extended use of this column to
partitioned indexes, it didn't take the lock. As the latter commit
message mentioned, we currently never reset a partitioned index to
relhassubclass=f. That largely avoids harm from the lock omission. The
cause for fixing this now is to unblock introducing a rule about locks
required to heap_update() a pg_class row. This might cause more
deadlocks. It gives minor user-visible benefits:
- If an ALTER INDEX SET TABLESPACE runs concurrently with ALTER TABLE
ATTACH PARTITION or CREATE PARTITION OF, one transaction blocks
instead of failing with "tuple concurrently updated". (Many cases of
DDL concurrency still fail that way.)
- Match ALTER INDEX ATTACH PARTITION in choosing to lock the index.
While not user-visible today, we'll need this if we ever make something
set the flag to false for a partitioned index, like ANALYZE does today
for tables. Back-patch to v12 (all supported versions), the plan for
the commit relying on the new rule. In back branches, add
LockOrStrongerHeldByMe() instead of adding a LockHeldByMe() parameter.
Reviewed (in an earlier version) by Robert Haas.
Discussion: https://postgr.es/m/20240611024525.9f.nmisch@google.com
afterTriggerInvokeEvents and AfterTriggerExecute have always
treated it as an error if the trigger OID mentioned in a queued
after-trigger event can't be found. However, that fails to
account for the edge case where the trigger's been dropped in
the current transaction since queueing the event. There seems
no very good reason to disallow that case, so instead silently
do nothing if the trigger OID can't be found.
This does give up a little bit of bug-detection ability, but I don't
recall that these error messages have ever actually revealed a bug,
so it seems mostly theoretical. Alternatives such as marking
pending events DONE at the time of dropping a trigger would be
complicated and perhaps introduce bugs of their own.
Per bug #18517 from Alexander Lakhin. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/18517-af2d19882240902c@postgresql.org
Normally this case isn't even reachable by non-superusers, since
permissions checks prevent naming such a table. However, it is
possible to make it happen by altering a parent table whose child
is another session's temp table.
We definitely can't support any such ALTER that requires modifying
the contents of such a table, since we lack access to the other
session's temporary-buffer pool. But there seems no good reason
to allow it even if it'd only require changing catalog contents.
One reason not to allow it is that we'd rather not expose the
implementation-dependent behavior of whether a specific ALTER
requires touching the table contents. Another is that there may
be (in future, even if not today) optimizations that assume that
a session's own temp tables won't be modified by other sessions.
Hence, add a RELATION_IS_OTHER_TEMP() check to all the places
where ALTER TABLE currently does CheckTableNotInUse(). (I looked
through all other callers of CheckTableNotInUse(), and they seem
OK already.)
Per bug #18492 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18492-c7a2634bf4968763@postgresql.org
Most of the infrastructure for procedure arguments was already
okay with polymorphic output arguments, but it turns out that
CallStmtResultDesc() was a few bricks shy of a load here. It thought
all it needed to do was call build_function_result_tupdesc_t, but
that function specifically disclaims responsibility for resolving
polymorphic arguments. Failing to handle that doesn't seem to be
a problem for CALL in plpgsql, but CALL from plain SQL would get
errors like "cannot display a value of type anyelement", or even
crash outright.
In v14 and later we can simply examine the exposed types of the
CallStmt.outargs nodes to get the right type OIDs. But it's a lot
more complicated to fix in v12/v13, because those versions don't
have CallStmt.outargs, nor do they do expand_function_arguments
until ExecuteCallStmt runs. We have to duplicatively run
expand_function_arguments, and then re-determine which elements
of the args list are output arguments.
Per bug #18463 from Drew Kimball. Back-patch to all supported
versions, since it's busted in all of them.
Discussion: https://postgr.es/m/18463-f8cd77e12564d8a2@postgresql.org
Presently, when this function is called for an unlogged sequence on
a standby server, it will error out with a message like
ERROR: could not open file "base/5/16388": No such file or directory
Since the pg_sequences system view uses pg_sequence_last_value(),
it can error similarly. To fix, modify the function to return NULL
for unlogged sequences on standby servers. Since this bug is
present on all versions since v15, this approach is preferable to
making the ERROR nicer because we need to repair the pg_sequences
view without modifying its definition on released versions. For
consistency, this commit also modifies the function to return NULL
for other sessions' temporary sequences. The pg_sequences view
already appropriately filters out such sequences, so there's no bug
there, but we might as well offer some defense in case someone
invokes this function directly.
Unlogged sequences were first introduced in v15, but temporary
sequences are much older, so while the fix for unlogged sequences
is only back-patched to v15, the temporary sequence portion is
back-patched to all supported versions.
We could also remove the privilege check in the pg_sequences view
definition in v18 if we modify this function to return NULL for
sequences for which the current user lacks privileges, but that is
left as a future exercise for when v18 development begins.
Reviewed-by: Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/20240501005730.GA594666%40nathanxps13
Backpatch-through: 12
vac_update_datfrozenxid() did multiple loads of relfrozenxid and
relminmxid from buffer memory, and it assumed each would get the same
value. Not so if a concurrent vac_update_relstats() did an inplace
update. Commit 2d2e40e3be fixed the same
kind of bug in vac_truncate_clog(). Today's bug could cause the
rel-level field and XIDs in the rel's rows to precede the db-level
field. A cluster having such values should VACUUM affected tables.
Back-patch to v12 (all supported versions).
Discussion: https://postgr.es/m/20240423003956.e7.nmisch@google.com
Ordinary ALTER TABLE SET SCHEMA will also move any owned sequences
into the new schema. We failed to do likewise for foreign tables,
because AlterTableNamespaceInternal believed that only certain
relkinds could have indexes, owned sequences, or constraints.
We could simply add foreign tables to that relkind list, but it
seems likely that the same oversight could be made again in
future. Instead let's remove the relkind filter altogether.
These functions shouldn't cost much when there are no objects
that they need to process, and surely this isn't an especially
performance-critical case anyway.
Per bug #18407 from Vidushi Gupta. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18407-4fd07373d252c6a0@postgresql.org
When assertions are disabled, the built SQL statement is invalid and
you get a "syntax error". So this isn't a serious problem, but let's
avoid the assertion failure.
Backpatch to all supported versions.
Reviewed-by: Noah Misch
The internal commands in REFRESH MATERIALIZED VIEW CONCURRENTLY are
correctly executed in SECURITY_RESTRICTED_OPERATION mode, except for
creating the temporary "diff" table, because you cannot create
temporary tables in SRO mode. But creating the temporary "diff" table
is a pretty complex CTAS command that selects from another temporary
table created earlier in the command. If you can cajole that CTAS
command to execute code defined by the table owner, the table owner
can run code with the privileges of the user running the REFRESH
command.
The proof-of-concept reported to the security team relied on CREATE
RULE to convert the internally-built temp table to a view. That's not
possible since commit b23cd185fd, and I was not able to find a
different way to turn the SELECT on the temp table into code
execution, so as far as I know this is only exploitable in v15 and
below. That's a fiddly assumption though, so apply this patch to
master and all stable versions.
Thanks to Pedro Gallegos for the report.
Security: CVE-2023-5869
Reviewed-by: Noah Misch
This commit addresses a set of issues when changing token type mappings
in a text search configuration when using duplicated token names:
- ADD MAPPING would fail on insertion because of a constraint failure
after inserting the same mapping.
- ALTER MAPPING with an "overridden" configuration failed with "tuple
already updated by self" when the token mappings are removed.
- DROP MAPPING failed with "tuple already updated by self", like
previously, but in a different code path.
The code is refactored so the token names (with their numbers) are
handled as a List with unique members rather than an array with numbers,
ensuring that no duplicates mess up with the catalog inserts, updates
and deletes. The list is generated by getTokenTypes(), with the same
error handling as previously while duplicated tokens are discarded from
the list used to work on the catalogs.
Regression tests are expanded to cover much more ground for the cases
fixed by this commit, as there was no coverage for the code touched in
this commit. A bit more is done regarding the fact that a token name
not supported by a configuration's parser should result in an error even
if IF EXISTS is used in a DROP MAPPING clause. This is implied in the
code but there was no coverage for that, and it was very easy to miss.
These issues exist since at least their introduction in core with
140d4ebcb4, so backpatch all the way down.
Reported-by: Alexander Lakhin
Author: Tender Wang, Michael Paquier
Discussion: https://postgr.es/m/18310-1eb233c5908189c8@postgresql.org
Backpatch-through: 12
We seem to have only documented a foreign key can reference the columns of
a primary key or unique constraint. Here we adjust the documentation
to mention columns in a non-partial unique index can be mentioned too.
The header comment for transformFkeyCheckAttrs() also didn't mention
unique indexes, so fix that too. In passing make that header comment
reflect reality in the various other aspects where it deviated from it.
Bug: 18295
Reported-by: Gilles PARC
Author: Laurenz Albe, David Rowley
Discussion: https://www.postgresql.org/message-id/18295-0ed0fac5c9f7b17b%40postgresql.org
Backpatch-through: 12
This command, when used to add a column on a parent table with a complex
inheritance tree, tried to update multiple times the same tuple in
pg_attribute for a child table when incrementing attinhcount, causing
failures with "tuple already updated by self" because of a missing
CommandCounterIncrement() between two updates.
This exists for a rather long time, so backpatch all the way down.
Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Richard Guo
Discussion: https://postgr.es/m/18297-b04cd83a55b51e35@postgresql.org
Backpatch-through: 12
In v16 and up (since commit afbfc0298), large object ownership
checking has been broken because object_ownercheck() didn't take care
of the discrepancy between our object-address representation of large
objects (classId == LargeObjectRelationId) and the catalog where their
ownership info is actually stored (LargeObjectMetadataRelationId).
This resulted in failures such as "unrecognized class ID: 2613"
when trying to update blob properties as a non-superuser.
Poking around for related bugs, I found that AlterObjectOwner_internal
would pass the wrong classId to the PostAlterHook in the no-op code
path where the large object already has the desired owner. Also,
recordExtObjInitPriv checked for the wrong classId; that bug is only
latent because the stanza is dead code anyway, but as long as we're
carrying it around it should be less wrong. These bugs are quite old.
In HEAD, we can reduce the scope for future bugs of this ilk by
changing AlterObjectOwner_internal's API to let the translation happen
inside that function, rather than requiring callers to know about it.
A more bulletproof fix, perhaps, would be to start using
LargeObjectMetadataRelationId as the dependency and object-address
classId for blobs. However that has substantial risk of breaking
third-party code; even within our own code, it'd create hassles
for pg_dump which would have to cope with a version-dependent
representation. For now, keep the status quo.
Discussion: https://postgr.es/m/2650449.1702497209@sss.pgh.pa.us
When creating a partitioned index, the partition key must be a subset
of the index's columns. But this currently doesn't check that the
collations between the partition key and the index definition match.
So you can construct a unique index that fails to enforce uniqueness.
(This would most likely involve a nondeterministic collation, so it
would have to be crafted explicitly and is not something that would
just happen by accident.)
This patch adds the required collation check. As a result, any
previously allowed unique index that has a collation mismatch would no
longer be allowed to be created.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/3327cb54-f7f1-413b-8fdb-7a9dceebb938%40eisentraut.org
The DROP STATISTICS code failed to properly lock the table, leading to
ERROR: tuple concurrently deleted
when executed concurrently with ANALYZE.
Fixed by modifying RemoveStatisticsById() to acquire the same lock as
ANALYZE. This function is called only by DROP STATISTICS, as ANALYZE
calls RemoveStatisticsDataById() directly.
Reported by Justin Pryzby, fix by me. Backpatch through 12. The code was
like this since it was introduced in 10, but older releases are EOL.
Reported-by: Justin Pryzby
Reviewed-by: Tom Lane
Backpatch-through: 12
Discussion: https://postgr.es/m/ZUuk-8CfbYeq6g_u@pryzbyj2023
contain_mutable_functions and contain_volatile_functions give
reliable answers only after expression preprocessing (specifically
eval_const_expressions). Some places understand this, but some did
not get the memo --- which is not entirely their fault, because the
problem is documented only in places far away from those functions.
Introduce wrapper functions that allow doing the right thing easily,
and add commentary in hopes of preventing future mistakes from
copy-and-paste of code that's only conditionally safe.
Two actual bugs of this ilk are fixed here. We failed to preprocess
column GENERATED expressions before checking mutability, so that the
code could fail to detect the use of a volatile function
default-argument expression, or it could reject a polymorphic function
that is actually immutable on the datatype of interest. Likewise,
column DEFAULT expressions weren't preprocessed before determining if
it's safe to apply the attmissingval mechanism. A false negative
would just result in an unnecessary table rewrite, but a false
positive could allow the attmissingval mechanism to be used in a case
where it should not be, resulting in unexpected initial values in a
new column.
In passing, re-order the steps in ComputePartitionAttrs so that its
checks for invalid column references are done before applying
expression_planner, rather than after. The previous coding would
not complain if a partition expression contains a disallowed column
reference that gets optimized away by constant folding, which seems
to me to be a behavior we do not want.
Per bug #18097 from Jim Keener. Back-patch to all supported versions.
Discussion: https://postgr.es/m/18097-ebb179674f22932f@postgresql.org
Dropping a temp table could entail TOAST table access to clean out
toasted catalog entries, such as large pg_constraint.conbin strings
for complex CHECK constraints. If we did that via ON COMMIT DROP,
we triggered the assertion in init_toast_snapshot(), because
there was no provision for setting up a snapshot for the drop
actions. Fix that.
(I assume here that the adjacent truncation actions for ON COMMIT
DELETE ROWS don't have a similar problem: it doesn't seem like
nontransactional truncations would need to touch any toasted fields.
If that proves wrong, we could refactor a bit to have the same
snapshot acquisition cover that too.)
The test case added here does not fail before v15, because that
assertion was added in 277692220 which was not back-patched.
However, the race condition the assertion warns of surely
exists further back, so back-patch to all supported branches.
Per report from Richard Guo.
Discussion: https://postgr.es/m/CAMbWs4-x26=_QxxgdJyNbiCDzvtr2WV5ZDso_v-CukKEe6cBZw@mail.gmail.com
ANALYZE on a table with inheritance children analyzes all the child
tables in a loop. When stepping to next child table, it updated the
child rel ID value in the command progress stats, but did not reset
the 'sample_blks_total' and 'sample_blks_scanned' counters.
acquire_sample_rows() updates 'sample_blks_total' as soon as the scan
starts and 'sample_blks_scanned' after processing the first block, but
until then, pg_stat_progress_analyze would display a bogus combination
of the new child table relid with old counter values from the
previously processed child table. Fix by resetting 'sample_blks_total'
and 'sample_blks_scanned' to zero at the same time that
'current_child_table_relid' is updated.
Backpatch to v13, where pg_stat_progress_analyze view was introduced.
Reported-by: Justin Pryzby
Discussion: https://www.postgresql.org/message-id/20230122162345.GP13860%40telsasoft.com
Substituting such values in extension scripts facilitated SQL injection
when @extowner@, @extschema@, or @extschema:...@ appeared inside a
quoting construct (dollar quoting, '', or ""). No bundled extension was
vulnerable. Vulnerable uses do appear in a documentation example and in
non-bundled extensions. Hence, the attack prerequisite was an
administrator having installed files of a vulnerable, trusted,
non-bundled extension. Subject to that prerequisite, this enabled an
attacker having database-level CREATE privilege to execute arbitrary
code as the bootstrap superuser. By blocking this attack in the core
server, there's no need to modify individual extensions. Back-patch to
v11 (all supported versions).
Reported by Micah Gate, Valerie Woolard, Tim Carey-Smith, and Christoph
Berg.
Security: CVE-2023-39417
indisvalid is switched to true for partitioned indexes when all its
partitions have valid indexes when attaching a new partition, up to the
top-most parent if all its leaves are themselves valid when dealing with
multiple layers of partitions.
The copy of the tuple from pg_index used to switch indisvalid to true
came from the relation cache, which is incorrect. Particularly, in the
case reported by Shruthi Gowda, executing a series of commands in a
single transaction would cause the validation of partitioned indexes to
use an incorrect version of a pg_index tuple, as indexes are reloaded
after an invalidation request with RelationReloadIndexInfo(), a much
faster version than a full index cache rebuild. In this case, the
limited information updated in the cache leads to an incorrect version
of the tuple used. One of the symptoms reported was the following
error, with a replica identity update, for instance:
"ERROR: attempted to update invisible tuple"
This is incorrect since 8b08f7d, so backpatch all the way down.
Reported-by: Shruthi Gowda
Author: Michael Paquier
Reviewed-by: Shruthi Gowda, Dilip Kumar
Discussion: https://postgr.es/m/CAASxf_PBcxax0wW-3gErUyftZ0XrCs3Lrpuhq4-Z3Fak1DoW7Q@mail.gmail.com
Backpatch-through: 11
Until now, when DROP DATABASE got interrupted in the wrong moment, the removal
of the pg_database row would also roll back, even though some irreversible
steps have already been taken. E.g. DropDatabaseBuffers() might have thrown
out dirty buffers, or files could have been unlinked. But we continued to
allow connections to such a corrupted database.
To fix this, mark databases invalid with an in-place update, just before
starting to perform irreversible steps. As we can't add a new column in the
back branches, we use pg_database.datconnlimit = -2 for this purpose.
An invalid database cannot be connected to anymore, but can still be
dropped.
Unfortunately we can't easily add output to psql's \l to indicate that some
database is invalid, it doesn't fit in any of the existing columns.
Add tests verifying that a interrupted DROP DATABASE is handled correctly in
the backend and in various tools.
Reported-by: Evgeny Morozov <postgresql3@realityexists.net>
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20230509004637.cgvmfwrbht7xm7p6@awork3.anarazel.de
Discussion: https://postgr.es/m/20230314174521.74jl6ffqsee5mtug@awork3.anarazel.de
Backpatch: 11-, bug present in all supported versions
When vac_truncate_clog() encounters bogus datfrozenxid / datminmxid values, it
returns early. Unfortunately, until now, it did not release
WrapLimitsVacuumLock. If the backend later tries to acquire
WrapLimitsVacuumLock, the session / autovacuum worker hangs in an
uncancellable way. Similarly, other sessions will hang waiting for the
lock. However, if the backend holding the lock exited or errored out for some
reason, the lock was released.
The bug was introduced as a side effect of 566372b3d6.
It is interesting that there are no production reports of this problem. That
is likely due to a mix of bugs leading to bogus values having gotten less
common, process exit releasing locks and instances of hangs being hard to
debug for "normal" users.
Discussion: https://postgr.es/m/20230621221208.vhsqgduwfpzwxnpg@awork3.anarazel.de
As coded, the code would use as a base comparison the namespace OID from
the first object scanned in pg_depend when switching its namespace
dependency entry to the new one, and use it as a base of comparison for
any follow-up checks. It would also be used as the old namespace OID to
switch *from* for the extension's pg_depend entry. Hence, if the first
object scanned has a namespace different than the one stored in the
extension, we would finish by:
- Not checking that the extension objects map with the extension's
schema.
- Not switching the extension -> namespace dependency entry to the new
namespace provided by the user, making ALTER EXTENSION ineffective.
This issue exists since this command has been introduced in d9572c4 for
relocatable extension, so backpatch all the way down to 11. The test
case has been provided by Heikki, that I have tweaked a bit to show the
effects on pg_depend for the extension.
Reported-by: Heikki Linnakangas
Author: Michael Paquier, Heikki Linnakangas
Discussion: https://postgr.es/m/20eea594-a05b-4c31-491b-007b6fceef28@iki.fi
Backpatch-through: 11
Commit f24523672d fixed a memory leak by moving the modifiedCols bitmap
into the per-row memory context. In the case of AFTER UPDATE triggers,
the bitmap is however referenced from an event kept until the end of the
query, resulting in a use-after-free bug.
Fixed by copying the bitmap into the AfterTriggerEvents memory context,
which is the one where we keep the trigger events. There's only one
place that needs to do the copy, but the memory context may not exist
yet. Doing that in a separate function seems more readable.
Report by Alexander Pyhalov, fix by me. Backpatch to 13, where the
bitmap was added to the event by commit 71d60e2aa0.
Reported-by: Alexander Pyhalov
Backpatch-through: 13
Discussion: https://postgr.es/m/acddb17c89b0d6cb940eaeda18c08bbe@postgrespro.ru
The logic that introduced partitioned indexes missed a few things when
invalidating a partitioned index when these are created, still the code
is written to handle recursions:
1) If created from scratch because a mapping index could not be found,
the new index created could be itself invalid, if for example it was a
partitioned index with one of its leaves invalid.
2) A CCI was missing when indisvalid is set for a parent index, leading
to inconsistent trees when recursing across more than one level for a
partitioned index creation if an invalidation of the parent was
required.
This could lead to the creation of a partition index tree where some of
the partitioned indexes are marked as invalid, but some of the parents
are marked valid, which is not something that should happen (as
validatePartitionedIndex() defines, indisvalid is switched to true for a
partitioned index iff all its partitions are themselves valid).
This patch makes sure that indisvalid is set to false on a partitioned
index if at least one of its partition is invalid. The flag is set to
true if *all* its partitions are valid.
The regression test added in this commit abuses of a failed concurrent
index creation, marked as invalid, that maps with an index created on
its partitioned table afterwards.
Reported-by: Alexander Lakhin
Reviewed-by: Alexander Lakhin
Discussion: https://postgr.es/m/14987634-43c0-0cb3-e075-94d423607e08@gmail.com
Backpatch-through: 11