DROP INDEX needs to lock the index's table before the index itself,
else it will deadlock against ordinary queries that acquire the
relation locks in that order. This is correctly mechanized for
plain indexes by RangeVarCallbackForDropRelation; but in the case of
a partitioned index, we neglected to lock the child tables in advance
of locking the child indexes. We can fix that by traversing the
inheritance tree and acquiring the needed locks in RemoveRelations,
after we have acquired our locks on the parent partitioned table and
index.
While at it, do some refactoring to eliminate confusion between
the actual and expected relkind in RangeVarCallbackForDropRelation.
We can save a couple of syscache lookups too, by having that function
pass back info that RemoveRelations will need.
Back-patch to v11 where partitioned indexes were added.
Jimmy Yih, Gaurab Dey, Tom Lane
Discussion: https://postgr.es/m/BYAPR05MB645402330042E17D91A70C12BD5F9@BYAPR05MB6454.namprd05.prod.outlook.com
The syscache lookup may return NULL even for valid OID, for example due
to a concurrent DROP STATISTICS, so a HeapTupleIsValid is necessary.
Without it, it may fail with a segfault.
Reported by Alexander Lakhin, patch by me. Backpatch to 13, where ALTER
STATISTICS ... SET STATISTICS was introduced.
Backpatch-through: 13
Discussion: https://postgr.es/m/17372-bf3b6e947e35ae77%40postgresql.org
Commit 859b3003de disabled building of extended stats for inheritance
trees, to prevent updating the same catalog row twice. While that
resolved the issue, it also means there are no extended stats for
declaratively partitioned tables, because there are no data in the
non-leaf relations.
That also means declaratively partitioned tables were not affected by
the issue 859b3003de addressed, which means this is a regression
affecting queries that calculate estimates for the whole inheritance
tree as a whole (which includes e.g. GROUP BY queries).
But because partitioned tables are empty, we can invert the condition
and build statistics only for the case with inheritance, without losing
anything. And we can consider them when calculating estimates.
It may be necessary to run ANALYZE on partitioned tables, to collect
proper statistics. For declarative partitioning there should no prior
statistics, and it might take time before autoanalyze is triggered. For
tables partitioned by inheritance the statistics may include data from
child relations (if built 859b3003de), contradicting the current code.
Report and patch by Justin Pryzby, minor fixes and cleanup by me.
Backpatch all the way back to PostgreSQL 10, where extended statistics
were introduced (same as 859b3003de).
Author: Justin Pryzby
Reported-by: Justin Pryzby
Backpatch-through: 10
Discussion: https://postgr.es/m/20210923212624.GI831%40telsasoft.com
We disallow altering a column datatype within a regular table,
if the table's rowtype is used as a column type elsewhere,
because we lack code to go around and rewrite the other tables.
This restriction should apply to partitioned tables as well, but it
was not checked because ATRewriteTables and ATPrepAlterColumnType
were not on the same page about who should do it for which relkinds.
Per bug #17351 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/17351-6db1870f3f4f612a@postgresql.org
Commit 4ace45677 failed to fix the problem fully, because the
same issue of attempting to fetch a non-returnable index column
can occur when rechecking the indexqual after using a lossy index
operator. Moreover, it broke EXPLAIN for such indexquals (which
indicates a gap in our test cases :-().
Revert the code changes of 4ace45677 in favor of adding a new field
to struct IndexOnlyScan, containing a version of the indexqual that
can be executed against the index-returned tuple without using any
non-returnable columns. (The restrictions imposed by check_index_only
guarantee this is possible, although we may have to recompute indexed
expressions.) Support construction of that during setrefs.c
processing by marking IndexOnlyScan.indextlist entries as resjunk
if they can't be returned, rather than removing them entirely.
(We could alternatively require setrefs.c to look up the IndexOptInfo
again, but abusing resjunk this way seems like a reasonably safe way
to avoid needing to do that.)
This solution isn't great from an API-stability standpoint: if there
are any extensions out there that build IndexOnlyScan structs directly,
they'll be broken in the next minor releases. However, only a very
invasive extension would be likely to do such a thing. There's no
change in the Path representation, so typical planner extensions
shouldn't have a problem.
As before, back-patch to all supported branches.
Discussion: https://postgr.es/m/3179992.1641150853@sss.pgh.pa.us
Discussion: https://postgr.es/m/17350-b5bdcf476e5badbb@postgresql.org
Replica identities that depend directly on an index rely on a set of
properties, one of them being that all the columns defined in this index
have to be marked as NOT NULL. There was a hole in the logic with ALTER
TABLE DROP NOT NULL, where it was possible to remove the NOT NULL
property of a column part of an index used as replica identity, so block
it to avoid problems with logical decoding down the road.
The same check was already done columns part of a primary key, so the
fix is straight-forward.
Author: Haiying Tang, Hou Zhijie
Reviewed-by: Dilip Kumar, Michael Paquier
Discussion: https://postgr.es/m/OS0PR01MB6113338C102BEE8B2FFC5BD9FB619@OS0PR01MB6113.jpnprd01.prod.outlook.com
Backpatch-through: 10
When changing REPLICA IDENTITY INDEX to another one, the target table's
relcache was not being invalidated. This leads to skipping update/delete
operations during apply on the subscriber side as the columns required to
search corresponding rows won't get logged.
Author: Tang Haiying, Hou Zhijie
Reviewed-by: Euler Taveira, Amit Kapila
Backpatch-through: 10
Discussion: https://postgr.es/m/OS0PR01MB61133CA11630DAE45BC6AD95FB939@OS0PR01MB6113.jpnprd01.prod.outlook.com
Commit 1b5d797cd4f7 intended to relax the lock level used to rename
indexes, but inadvertently allowed *any* relation to be renamed with a
lowered lock level, as long as the command is spelled ALTER INDEX.
That's undesirable for other relation types, so retry the operation with
the higher lock if the relation turns out not to be an index.
After this fix, ALTER INDEX <sometable> RENAME will require access
exclusive lock, which it didn't before.
Author: Nathan Bossart <bossartn@amazon.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Onder Kalaci <onderk@microsoft.com>
Discussion: https://postgr.es/m/PH0PR21MB1328189E2821CDEC646F8178D8AE9@PH0PR21MB1328.namprd21.prod.outlook.com
The grammar of this command run on indexes with column names has always
been authorized by the parser, and it has never been documented.
Since 911e702, it is possible to define opclass parameters as of CREATE
INDEX, which actually broke the old case of ALTER INDEX/TABLE where
relation-level parameters n_distinct and n_distinct_inherited could be
defined for an index (see 76a47c0 and its thread where this point has
been touched, still remained unused). Attempting to do that in v13~
would cause the index to become unusable, as there is a new dedicated
code path to load opclass parameters instead of the relation-level ones
previously available. Note that it is possible to fix things with a
manual catalog update to bring the relation back online.
This commit disables this command for now as the use of column names for
indexes does not make sense anyway, particularly when it comes to index
expressions where names are automatically computed. One way to properly
support this case properly in the future would be to use column numbers
when it comes to indexes, in the same way as ALTER INDEX .. ALTER COLUMN
.. SET STATISTICS.
Partitioned indexes were already blocked, but not indexes. Some tests
are added for both cases.
There was some code in ANALYZE to enforce n_distinct to be used for an
index expression if the parameter was defined, but just remove it for
now until/if there is support for this (note that index-level parameters
never had support in pg_dump either, previously), so this was just dead
code.
Reported-by: Matthijs van der Vleuten
Author: Nathan Bossart, Michael Paquier
Reviewed-by: Vik Fearing, Dilip Kumar
Discussion: https://postgr.es/m/17220-15d684c6c2171a83@postgresql.org
Backpatch-through: 13
Failing to do that, any direct inserts/updates of those partitions
would fail to enforce the correct constraint, that is, one that
considers the new partition constraint of their parent table.
Backpatch to 10.
Reported by: Hou Zhijie <houzj.fnst@fujitsu.com>
Author: Amit Langote <amitlangote09@gmail.com>
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Nitin Jadhav <nitinjadhavpostgres@gmail.com>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/OS3PR01MB5718DA1C4609A25186D1FBF194089%40OS3PR01MB5718.jpnprd01.prod.outlook.com
An extension may want to call GetSecurityLabel() on a shared object
before the shared relcaches are fully initialized. For instance, a
ClientAuthentication_hook might want to retrieve the security label on
a role.
Discussion: https://postgr.es/m/ecb7af0b26e3be1d96d291c8453a86f1f82d9061.camel@j-davis.com
Backpatch-through: 9.6
Updates/Deletes on a partition were allowed even without replica identity
after the parent table was added to a publication. This would later lead
to an error on subscribers. The reason was that we were not invalidating
the partition's relcache and the publication information for partitions
was not getting rebuilt. Similarly, we were not invalidating the
partitions' relcache after dropping a partitioned table from a publication
which will prohibit Updates/Deletes on its partition without replica
identity even without any publication.
Reported-by: Haiying Tang
Author: Hou Zhijie and Vignesh C
Reviewed-by: Vignesh C and Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/OS0PR01MB6113D77F583C922F1CEAA1C3FBD29@OS0PR01MB6113.jpnprd01.prod.outlook.com
Before commit 84f5c2908, a STABLE function in a plpgsql CALL
statement's argument list would see an up-to-date snapshot,
because exec_stmt_call would push a new snapshot. I got rid of
that because the possibility of the snapshot disappearing within
COMMIT made it too hard to manage a snapshot across the CALL
statement. That's fine so far as the procedure itself goes,
but I forgot to think about the possibility of STABLE functions
within the CALL argument list. As things now stand, those'll
be executed with the Portal's snapshot as ActiveSnapshot,
keeping them from seeing updates more recent than Portal startup.
(VOLATILE functions don't have a problem because they take their
own snapshots; which indeed is also why the procedure itself
doesn't have a problem. There are no STABLE procedures.)
We can fix this by pushing a new snapshot transiently within
ExecuteCallStmt itself. Popping the snapshot before we get
into the procedure proper eliminates the management problem.
The possibly-useless extra snapshot-grab is slightly annoying,
but it's no worse than what happened before 84f5c2908.
Per bug #17199 from Alexander Nawratil. Back-patch to v11,
like the previous patch.
Discussion: https://postgr.es/m/17199-1ab2561f0d94af92@postgresql.org
Formerly, we sent signals for outgoing NOTIFY messages within
ProcessCompletedNotifies, which was also responsible for sending
relevant ones of those messages to our connected client. It therefore
had to run during the main-loop processing that occurs just before
going idle. This arrangement had two big disadvantages:
* Now that procedures allow intra-command COMMITs, it would be
useful to send NOTIFYs to other sessions immediately at COMMIT
(though, for reasons of wire-protocol stability, we still shouldn't
forward them to our client until end of command).
* Background processes such as replication workers would not send
NOTIFYs at all, since they never execute the client communication
loop. We've had requests to allow triggers running in replication
workers to send NOTIFYs, so that's a problem.
To fix these things, move transmission of outgoing NOTIFY signals
into AtCommit_Notify, where it will happen during CommitTransaction.
Also move the possible call of asyncQueueAdvanceTail there, to
ensure we don't bloat the async SLRU if a background worker sends
many NOTIFYs with no one listening.
We can also drop the call of asyncQueueReadAllNotifications,
allowing ProcessCompletedNotifies to go away entirely. That's
because commit 790026972 added a call of ProcessNotifyInterrupt
adjacent to PostgresMain's call of ProcessCompletedNotifies,
and that does its own call of asyncQueueReadAllNotifications,
meaning that we were uselessly doing two such calls (inside two
separate transactions) whenever inbound notify signals coincided
with an outbound notify. We need only set notifyInterruptPending
to ensure that ProcessNotifyInterrupt runs, and we're done.
The existing documentation suggests that custom background workers
should call ProcessCompletedNotifies if they want to send NOTIFY
messages. To avoid an ABI break in the back branches, reduce it
to an empty routine rather than removing it entirely. Removal
will occur in v15.
Although the problems mentioned above have existed for awhile,
I don't feel comfortable back-patching this any further than v13.
There was quite a bit of churn in adjacent code between 12 and 13.
At minimum we'd have to also backpatch 51004c717, and a good deal
of other adjustment would also be needed, so the benefit-to-risk
ratio doesn't look attractive.
Per bug #15293 from Michael Powers (and similar gripes from others).
Artur Zakirov and Tom Lane
Discussion: https://postgr.es/m/153243441449.1404.2274116228506175596@wrigleys.postgresql.org
We have long forbidden fetching backwards from a NO SCROLL cursor,
but the prohibition didn't extend to cases in which we rewind the
query altogether and then re-fetch forwards. I think the reason is
that this logic was mainly meant to protect plan nodes that can't
be run in the reverse direction. However, re-reading the query output
is problematic if the query is volatile (which includes SELECT FOR
UPDATE, not just queries with volatile functions): the re-read can
produce different results, which confuses the cursor navigation logic
completely. Another reason for disliking this approach is that some
code paths will either fetch backwards or rewind-and-fetch-forwards
depending on the distance to the target row; so that seemingly
identical use-cases may or may not draw the "cursor can only scan
forward" error. Hence, let's clean things up by disallowing rewind
as well as fetch-backwards in a NO SCROLL cursor.
Ordinarily we'd only make such a definitional change in HEAD, but
there is a third reason to consider this change now. Commit ba2c6d6ce
created some new user-visible anomalies for non-scrollable cursors
WITH HOLD, in that navigation in the cursor result got confused if the
cursor had been partially read before committing. The only good way
to resolve those anomalies is to forbid rewinding such a cursor, which
allows removal of the incorrect cursor state manipulations that
ba2c6d6ce added to PersistHoldablePortal.
To minimize the behavioral change in the back branches (including
v14), refuse to rewind a NO SCROLL cursor only when it has a holdStore,
ie has been held over from a previous transaction due to WITH HOLD.
This should avoid breaking most applications that have been sloppy
about whether to declare cursors as scrollable. We'll enforce the
prohibition across-the-board beginning in v15.
Back-patch to v11, as ba2c6d6ce was.
Discussion: https://postgr.es/m/3712911.1631207435@sss.pgh.pa.us
Some plan node types don't react well to being called again after
they've already returned NULL. PortalRunSelect() has long dealt
with this by calling the executor with NoMovementScanDirection
if it sees that we've already run the portal to the end. However,
commit ba2c6d6ce overlooked this point, so that persisting an
already-fully-fetched cursor would fail if it had such a plan.
Per report from Tomas Barton. Back-patch to v11, as the faulty
commit was. (I've omitted a test case because the type of plan
that causes a problem isn't all that stable.)
Discussion: https://postgr.es/m/CAPV2KRjd=ErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw@mail.gmail.com
Updates/Deletes on a relation were allowed even without replica identity
after we define the publication for all tables. This would later lead to
an error on subscribers. The reason was that for such publications we were
not invalidating the relcache and the publication information for
relations was not getting rebuilt. Similarly, we were not invalidating the
relcache after dropping of such publications which will prohibit
Updates/Deletes without replica identity even without any publication.
Author: Vignesh C and Hou Zhijie
Reviewed-by: Hou Zhijie, Kyotaro Horiguchi, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/CALDaNm0pF6zeWqCA8TCe2sDuwFAy8fCqba=nHampCKag-qLixg@mail.gmail.com
Previously this was allowed, but the collation effectively vanished
into the ether because of the way lookup_collation() works: you could
not use the collation, nor even drop it. Seems better to give an
error up front than to leave the user wondering why it doesn't work.
(Because this test is in DefineCollation not CreateCollation, it does
not prevent pg_import_system_collations from creating ICU collations,
regardless of the initially-chosen encoding.)
Per bug #17170 from Andrew Bille. Back-patch to v10 where ICU support
was added.
Discussion: https://postgr.es/m/17170-95845cf3f0a9c36d@postgresql.org
If the system crashed between CREATE TABLESPACE and the next checkpoint,
the result could be some files in the tablespace unexpectedly containing
no rows. Affected files would be those for which the system did not
write WAL; see the wal_skip_threshold documentation. Before v13, a
different set of conditions governed the writing of WAL; see v12's
<sect2 id="populate-pitr">. (The v12 conditions were broader in some
ways and narrower in others.) Users may want to audit non-default
tablespaces for unexpected short files. The bug could have truncated an
index without affecting the associated table, and reindexing the index
would fix that particular problem.
This fixes the bug by making create_tablespace_directories() more like
TablespaceCreateDbspace(). create_tablespace_directories() was
recursively removing tablespace contents, reasoning that WAL redo would
recreate everything removed that way. That assumption holds for other
wal_level values. Under wal_level=minimal, the old approach could
delete files for which no other copy existed. Back-patch to 9.6 (all
supported versions).
Reviewed by Robert Haas and Prabhat Sahu. Reported by Robert Haas.
Discussion: https://postgr.es/m/CA+TgmoaLO9ncuwvr2nN-J4VEP5XyAcy=zKiHxQzBbFRxxGxm0w@mail.gmail.com
Commit 325f2ec555 introduced pg_class.relwrite to skip operations on
tables created as part of a heap rewrite during DDL. It links such
transient heaps to the original relation OID via this new field in
pg_class but forgot to do anything about toast tables. So, logical
decoding was not able to skip operations on internally created toast
tables. This leads to an error when we tried to decode the WAL for the
next operation for which it appeared that there is a toast data where
actually it didn't have any toast data.
To fix this, we set pg_class.relwrite for internally created toast tables
as well which allowed skipping operations on them during logical decoding.
Author: Bertrand Drouvot
Reviewed-by: David Zhang, Amit Kapila
Backpatch-through: 11, where it was introduced
Discussion: https://postgr.es/m/b5146fb1-ad9e-7d6e-f980-98ed68744a7c@amazon.com
If recordDependencyOnCurrentExtension is invoked on a pre-existing,
free-standing object during an extension update script, that object
will become owned by the extension. In our current code this is
possible in three cases:
* Replacing a "shell" type or operator.
* CREATE OR REPLACE overwriting an existing object.
* ALTER TYPE SET, ALTER DOMAIN SET, and ALTER OPERATOR SET.
The first of these cases is intentional behavior, as noted by the
existing comments for GenerateTypeDependencies. It seems like
appropriate behavior for CREATE OR REPLACE too; at least, the obvious
alternatives are not better. However, the fact that it happens during
ALTER is an artifact of trying to share code (GenerateTypeDependencies
and makeOperatorDependencies) between the CREATE and ALTER cases.
Since an extension script would be unlikely to ALTER an object that
didn't already belong to the extension, this behavior is not very
troubling for the direct target object ... but ALTER TYPE SET will
recurse to dependent domains, and it is very uncool for those to
become owned by the extension if they were not already.
Let's fix this by redefining the ALTER cases to never change extension
membership, full stop. We could minimize the behavioral change by
only changing the behavior when ALTER TYPE SET is recursing to a
domain, but that would complicate the code and it does not seem like
a better definition.
Per bug #17144 from Alex Kozhemyakin. Back-patch to v13 where ALTER
TYPE SET was added. (The other cases are older, but since they only
affect the directly-named object, there's not enough of a problem to
justify changing the behavior further back.)
Discussion: https://postgr.es/m/17144-e67d7a8f049de9af@postgresql.org
Rather than trying to pick table aliases that won't conflict with
any possible user-defined matview column name, adjust the queries'
syntax so that the aliases are only used in places where they can't be
mistaken for column names. Mostly this consists of writing "alias.*"
not just "alias", which adds clarity for humans as well as machines.
We do have the issue that "SELECT alias.*" acts differently from
"SELECT alias", but we can use the same hack ruleutils.c uses for
whole-row variables in SELECT lists: write "alias.*::compositetype".
We might as well revert to the original aliases after doing this;
they're a bit easier to read.
Like 75d66d10e, back-patch to all supported branches.
Discussion: https://postgr.es/m/2488325.1628261320@sss.pgh.pa.us
We don't allow to create replication slot_name as an empty string ('') via
SQL API pg_create_logical_replication_slot() but it is allowed to be set
via Alter Subscription command. This will lead to apply worker repeatedly
keep trying to stream data via slot_name '' and the user is not allowed to
create the slot with that name.
Author: Japin Li
Reviewed-By: Ranier Vilela, Amit Kapila
Backpatch-through: 10, where it was introduced
Discussion: https://postgr.es/m/MEYP282MB1669CBD98E721C77CA696499B61A9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM
When triggers are cloned from partitioned tables to their partitions,
the 'tgenabled' flag (origin/replica/always/disable) was not propagated.
Make it so that the flag on the trigger on partition is initially set to
the same value as on the partitioned table.
Add a test case to verify the behavior.
Backpatch to 11, where this appeared in commit 86f575948c77.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/20200930223450.GA14848@telsasoft.com
Some commands of ALTER TABLE could fail with the following error:
ERROR: "tab" is of the wrong type
This error is unexpected, as all the code paths leading to
ATWrongRelkindError() should use a supported set of relkinds to generate
correct error messages. This commit closes the gap with such mistakes,
by adding all the missing relkind combinations. Tests are added to
check all the problems found. Note that some combinations are not used,
but these are left around as it could have an impact on applications
relying on this code.
2ed532e has done a much larger refactoring on HEAD to make such error
messages easier to manage in the long-term, so nothing is needed there.
Author: Kyotaro Horiguchi
Reviewed-by: Peter Eisentraut, Ahsan Hadi, Michael Paquier
Discussion: https://postgr.es/m/20210216.181415.368926598204753659.horikyota.ntt@gmail.com
Backpatch-through: 11
Although we were careful to lock the object being added or dropped,
we failed to get any sort of lock on the extension itself. This
allowed the ALTER to proceed in parallel with a DROP EXTENSION,
which is problematic for a couple of reasons. If both commands
succeeded we'd be left with a dangling link in pg_depend, which
would cause problems later. Also, if the ALTER failed for some
reason, it might try to print the extension's name, and that could
result in a crash or (in older branches) a silly error message
complaining about extension "(null)".
Per bug #17098 from Alexander Lakhin. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/17098-b960f3616c861f83@postgresql.org
It's not really necessary for this function to open or lock the
relation associated with the pg_policy entry it's modifying. The
error checks it's making on the rel are if anything counterproductive
(e.g., if we don't want to allow installation of policies on system
catalogs, here is not the place to prevent that). In particular, it
seems just wrong to insist on an ownership check. That has the net
effect of forcing people to use superuser for DROP OWNED BY, which
surely is not an effect we want. Also there is no point in rebuilding
the dependencies of the policy expressions, which aren't being
changed. Lastly, locking the table also seems counterproductive; it's
not helping to prevent race conditions, since we failed to re-read the
pg_policy row after acquiring the lock. That means that concurrent
DDL would likely result in "tuple concurrently updated/deleted"
errors; which is the same behavior this code will produce, with less
overhead.
Per discussion of bug #17062. Back-patch to all supported versions,
as the failure cases this eliminates seem just as undesirable in 9.6
as in HEAD.
Discussion: https://postgr.es/m/1573181.1624220108@sss.pgh.pa.us
Ordinarily, a pg_policy.polroles array wouldn't list the same role
more than once; but CREATE POLICY does not prevent that. If we
perform DROP OWNED BY on a role that is listed more than once,
RemoveRoleFromObjectPolicy either suffered an assertion failure
or encountered a tuple-updated-by-self error. Rewrite it to cope
correctly with duplicate entries, and add a CommandCounterIncrement
call to prevent the other problem.
Per discussion, there's other cleanup that ought to happen here,
but this seems like the minimum essential fix.
Per bug #17062 from Alexander Lakhin. It's been broken all along,
so back-patch to all supported branches.
Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org
This works fine in the "simple Query" code path; but if the
statement is in the plan cache then it's corrupted for future
re-execution. Apply copyObject() to protect the original
tree from modification, as we've done elsewhere.
This narrow fix is applied only to the back branches. In HEAD,
the problem was fixed more generally by commit 7c337b6b5; but
that changed ProcessUtility's API, so it's infeasible to
back-patch.
Per bug #17053 from Charles Samborski.
Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us
Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org
The fast default code added in Release 11 omitted to check that the
table a fast default was being added to was a plain table. Thus one
could be added to a foreign table, which predicably blows up. Here we
perform that check.
In addition, on the back branches, since some of these might have
escaped into the wild, if we encounter a missing value for
an attribute of something other than a plain table we ignore it.
Fixes bug #17056
Backpatch to release 11,
Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane
When stuffing a plan from the plancache into a Portal, one is
not supposed to risk throwing an error between GetCachedPlan and
PortalDefineQuery; if that happens, the plan refcount incremented
by GetCachedPlan will be leaked. I managed to break this rule
while refactoring code in 9dbf2b7d7. There is no visible
consequence other than some memory leakage, and since nobody is
very likely to trigger the relevant error conditions many times
in a row, it's not surprising we haven't noticed. Nonetheless,
it's a bug, so rearrange the order of operations to remove the
hazard.
Noted on the way to looking for a better fix for bug #17053.
This mistake is pretty old, so back-patch to all supported
branches.
PersistHoldablePortal has long assumed that it should store the
entire output of the query-to-be-persisted, which requires rewinding
and re-reading the output. This is problematic if the query is not
stable: we might get different row contents, or even a different
number of rows, which'd confuse the cursor state mightily.
In the case where the cursor is NO SCROLL, this is very easy to
solve: just store the remaining query output, without any rewinding,
and tweak the portal's cursor state to match. Aside from removing
the semantic problem, this could be significantly more efficient
than storing the whole output.
If the cursor is scrollable, there's not much we can do, but it
was already the case that scrolling a volatile query's result was
pretty unsafe. We can just document more clearly that getting
correct results from that is not guaranteed.
There are already prohibitions in place on using SCROLL with
FOR UPDATE/SHARE, which is one way for a SELECT query to have
non-stable results. We could imagine prohibiting SCROLL when
the query contains volatile functions, but that would be
expensive to enforce. Moreover, it could break applications
that work just fine, if they have functions that are in fact
stable but the user neglected to mark them so. So settle for
documenting the hazard.
While this problem has existed in some guise for a long time,
it got a lot worse in v11, which introduced the possibility
of persisting plpgsql cursors (perhaps implicit ones) even
when they violate the rules for what can be marked WITH HOLD.
Hence, I've chosen to back-patch to v11 but not further.
Per bug #17050 from Алексей Булгаков.
Discussion: https://postgr.es/m/17050-f77aa827dc85247c@postgresql.org
The internal SQL queries used by REFRESH MATERIALIZED VIEW CONCURRENTLY
include some aliases for its diff and temporary relations with
rather-generic names: diff, newdata, newdata2 and mv. Depending on the
queries used for the materialized view, using CONCURRENTLY could lead to
some internal failures if the query and those internal aliases conflict.
Those names have been chosen in 841c29c8. This commit switches instead
to a naming pattern which is less likely going to cause conflicts, based
on an idea from Thomas Munro, by appending _$ to those aliases. This is
not perfect as those new names could still conflict, but at least it has
the advantage to keep the code readable and simple while reducing the
likelihood of conflicts to be close to zero.
Reported-by: Mathis Rudolf
Author: Bharath Rupireddy
Reviewed-by: Bernd Helmle, Thomas Munro, Michael Paquier
Discussion: https://postgr.es/m/109c267a-10d2-3c53-b60e-720fcf44d9e8@credativ.de
Backpatch-through: 9.6
The error messages, docs, and one of the options were using
'parallel degree' to indicate parallelism used by vacuum command. We
normally use 'parallel workers' at other places so change it for parallel
vacuum accordingly.
Author: Bharath Rupireddy
Reviewed-by: Dilip Kumar, Amit Kapila
Backpatch-through: 13
Discussion: https://postgr.es/m/CALj2ACWz=PYrrFXVsEKb9J1aiX4raA+UBe02hdRp_zqDkrWUiw@mail.gmail.com
If we redirected a replicated tuple operation into a partition child
table, and then tried to fire AFTER triggers for that event, the
relation cache entry for the child table was already closed. This has
no visible ill effects as long as the entry is still there and still
valid, but an unluckily-timed cache flush could result in a crash or
other misbehavior.
To fix, postpone the ExecCleanupTupleRouting call (which is what
closes the child table) until after we've fired triggers. This
requires a bit of refactoring so that the cleanup function can
have access to the necessary state.
In HEAD, I took the opportunity to simplify some of worker.c's
function APIs based on use of the new ApplyExecutionData struct.
However, it doesn't seem safe/practical to back-patch that aspect,
at least not without a lot of analysis of possible interactions
with a04daa97a.
In passing, add an Assert to afterTriggerInvokeEvents to catch
such cases. This seems worthwhile because we've grown a number
of fairly unstructured ways of calling AfterTriggerEndQuery.
Back-patch to v13, where worker.c grew the ability to deal with
partitioned target tables.
Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us
We consider this supported (though I've got my doubts that it's a
good idea, because tableoid is not immutable). However, several
code paths failed to fill the field in soon enough, causing such
a GENERATED expression to see zero or the wrong value. This
occurred when ALTER TABLE adds a new GENERATED column to a table
with existing rows, and during regular INSERT or UPDATE on a
foreign table with GENERATED columns.
Noted during investigation of a report from Vitaly Ustinov.
Back-patch to v12 where GENERATED came in.
Discussion: https://postgr.es/m/CAM_DEiWR2DPT6U4xb-Ehigozzd3n3G37ZB1+867zbsEVtYoJww@mail.gmail.com
COMMIT/ROLLBACK necessarily destroys all snapshots within the session.
The original implementation of intra-procedure transactions just
cavalierly did that, ignoring the fact that this left us executing in
a rather different environment than normal. In particular, it turns
out that handling of toasted datums depends rather critically on there
being an outer ActiveSnapshot: otherwise, when SPI or the core
executor pop whatever snapshot they used and return, it's unsafe to
dereference any toasted datums that may appear in the query result.
It's possible to demonstrate "no known snapshots" and "missing chunk
number N for toast value" errors as a result of this oversight.
Historically this outer snapshot has been held by the Portal code,
and that seems like a good plan to preserve. So add infrastructure
to pquery.c to allow re-establishing the Portal-owned snapshot if it's
not there anymore, and add enough bookkeeping support that we can tell
whether it is or not.
We can't, however, just re-establish the Portal snapshot as part of
COMMIT/ROLLBACK. As in normal transaction start, acquiring the first
snapshot should wait until after SET and LOCK commands. Hence, teach
spi.c about doing this at the right time. (Note that this patch
doesn't fix the problem for any PLs that try to run intra-procedure
transactions without using SPI to execute SQL commands.)
This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD,
rename that to allow_nonatomic.
replication/logical/worker.c also needs some fixes, because it wasn't
careful to hold a snapshot open around AFTER trigger execution.
That code doesn't use a Portal, which I suspect someday we're gonna
have to fix. But for now, just rearrange the order of operations.
This includes back-patching the recent addition of finish_estate()
to centralize the cleanup logic there.
This also back-patches commit 2ecfeda3e into v13, to improve the
test coverage for worker.c (it was that test that exposed that
worker.c's snapshot management is wrong).
Per bug #15990 from Andreas Wicht. Back-patch to v11 where
intra-procedure COMMIT was added.
Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org
This patch replaces use of the global "wrconn" variable in
AlterSubscription_refresh with a local variable of the same name, making
it consistent with other functions in subscriptioncmds.c (e.g.
DropSubscription).
The global wrconn is only meant to be used for logical apply/tablesync worker.
Abusing it this way is known to cause trouble if an apply worker
manages to do a subscription refresh, such as reported by Jeremy Finzel
and diagnosed by Andres Freund back in November 2020, at
https://www.postgresql.org/message-id/20201111215820.qihhrz7fayu6myfi@alap3.anarazel.de
Backpatch to 10. In branch master, also move the connection establishment
to occur outside the PG_TRY block; this way we can remove a test for NULL in
PG_FINALLY, and it also makes the code more consistent with similar code in
the same file.
Author: Peter Smith <peter.b.smith@fujitsu.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/CAHut+Pu7Jv9L2BOEx_Z0UtJxfDevQSAUW2mJqWU+CtmDrEZVAg@mail.gmail.com
When ALTER TABLE .. ALTER CONSTRAINT changes deferrability properties
changed in a partitioned table, we failed to propagate those changes
correctly to partitions and to triggers. Repair by adding a recursion
mechanism to affect all derived constraints and all derived triggers.
(In particular, recurse to partitions even if their respective parents
are already in the desired state: it is possible for the partitions to
have been altered individually.) Because foreign keys involve tables in
two sides, we cannot use the standard ALTER TABLE recursion mechanism,
so we invent our own by following pg_constraint.conparentid down.
When ALTER TABLE .. ALTER CONSTRAINT is invoked on the derived
pg_constraint object that's automaticaly created in a partition as a
result of a constraint added to its parent, raise an error instead of
pretending to work and then failing to modify all the affected triggers.
Before this commit such a command would be allowed but failed to affect
all triggers, so it would silently misbehave. (Restoring dumps of
existing databases is not affected, because pg_dump does not produce
anything for such a derived constraint anyway.)
Add some tests for the case.
Backpatch to 11, where foreign key support was added to partitioned
tables by commit 3de241dba86f. (A related change is commit f56f8f8da6af
in pg12 which added support for FKs *referencing* partitioned tables;
this is what forces us to use an ad-hoc recursion mechanism for this.)
Diagnosed by Tom Lane from bug report from Ron L Johnson. As of this
writing, no reviews were offered.
Discussion: https://postgr.es/m/75fe0761-a291-86a9-c8d8-4906da077469@gmail.com
Discussion: https://postgr.es/m/3144850.1607369633@sss.pgh.pa.us
The OID of the constraint is used instead of the OID of the trigger --
an easy mistake to make. Apparently the object-alter hooks are not very
well tested :-(
Backpatch to 12, where this typo was introduced by 578b229718e8
Discussion: https://postgr.es/m/20210503231633.GA6994@alvherre.pgsql
When running ALTER TABLE t2 INHERIT t1, we must check that columns in
t2 that correspond to a generated column in t1 are also generated and
have the same generation expression. Otherwise, this would allow
creating setups that a normal CREATE TABLE sequence would not allow.
Discussion: https://www.postgresql.org/message-id/22de27f6-7096-8d96-4619-7b882932ca25@2ndquadrant.com
Most GUC check hooks that inspect database state have special checks
that prevent them from throwing hard errors for state-dependent issues
when source == PGC_S_TEST. This allows, for example,
"ALTER DATABASE d SET default_text_search_config = foo" when the "foo"
configuration hasn't been created yet. Without this, we have problems
during dump/reload or pg_upgrade, because pg_dump has no idea about
possible dependencies of GUC values and can't ensure a safe restore
ordering.
However, check_role() and check_session_authorization() hadn't gotten
the memo about that, and would throw hard errors anyway. It's not
entirely clear what is the use-case for "ALTER ROLE x SET role = y",
but we've now heard two independent complaints about that bollixing
an upgrade, so apparently some people are doing it.
Hence, fix these two functions to act more like other check hooks
with similar needs. (But I did not change their insistence on
being inside a transaction, as it's still not apparent that setting
either GUC from the configuration file would be wise.)
Also fix check_temp_buffers, which had a different form of the disease
of making state-dependent checks without any exception for PGC_S_TEST.
A cursory survey of other GUC check hooks did not find any more issues
of this ilk. (There are a lot of interdependencies among
PGC_POSTMASTER and PGC_SIGHUP GUCs, which may be a bad idea, but
they're not relevant to the immediate concern because they can't be
set via ALTER ROLE/DATABASE.)
Per reports from Charlie Hornsby and Nathan Bossart. Back-patch
to all supported branches.
Discussion: https://postgr.es/m/HE1P189MB0523B31598B0C772C908088DB7709@HE1P189MB0523.EURP189.PROD.OUTLOOK.COM
Discussion: https://postgr.es/m/20160711223641.1426.86096@wrigleys.postgresql.org
I introduced this duplicate code in commit 8b08f7d4820f for no good
reason. Remove it, and backpatch to 11 where it was introduced.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
vacuumlazy.c sometimes fails to update pg_class entries for each index
(to ensure that pg_class.reltuples is current), even though analyze.c
assumed that that must have happened during VACUUM ANALYZE. There are
at least a couple of reasons for this. For example, vacuumlazy.c could
fail to update pg_class when the index AM indicated that its statistics
are merely an estimate, per the contract for amvacuumcleanup() routines
established by commit e57345975cf back in 2006.
Stop assuming that pg_class must have been updated with accurate
statistics within VACUUM ANALYZE -- update pg_class for indexes at the
same time as the table relation in all cases. That way VACUUM ANALYZE
will never fail to keep pg_class.reltuples reasonably accurate.
The only downside of this approach (compared to the old approach) is
that it might inaccurately set pg_class.reltuples for indexes whose heap
relation ends up with the same inaccurate value anyway. This doesn't
seem too bad. We already consistently called vac_update_relstats() (to
update pg_class) for the heap/table relation twice during any VACUUM
ANALYZE -- once in vacuumlazy.c, and once in analyze.c. We now make
sure that we call vac_update_relstats() at least once (though often
twice) for each index.
This is follow up work to commit 9f3665fb, which dealt with issues in
btvacuumcleanup(). Technically this fixes an unrelated issue, though.
btvacuumcleanup() no longer provides an accurate num_index_tuples value
following commit 9f3665fb (when there was no btbulkdelete() call during
the VACUUM operation in question), but hashvacuumcleanup() has worked in
the same way for many years now.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAH2-WzknxdComjhqo4SUxVFk_Q1171GJO2ZgHZ1Y6pion6u8rA@mail.gmail.com
Backpatch: 13-, just like commit 9f3665fb.
"SELECT pg_import_system_collations(0)" caused an assertion failure.
With a random nonzero argument --- or indeed with zero, in non-assert
builds --- it would happily make pg_collation entries with garbage
values of collnamespace. These are harmless as far as I can tell
(unless maybe the OID happens to become used for a schema, later on?).
In any case this isn't a security issue, since the function is
superuser-only. But it seems like a gotcha for unwary DBAs, so let's
add a check that the given OID belongs to some schema.
Back-patch to v10 where this function was introduced.
AfterTriggerSaveEvent() wrongly allocates the slot in execution-span
memory context, whereas the correct thing is to allocate it in
a transaction-span context, because that's where the enclosing
AfterTriggersTableData instance belongs into.
Backpatch to 12 (the test back to 11, where it works well with no code
changes, and it's good to have to confirm that the case was previously
well supported); this bug seems introduced by commit ff11e7f4b9ae.
Reported-by: Bertrand Drouvot <bdrouvot@amazon.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/39a71864-b120-5a5c-8cc5-c632b6f16761@amazon.com
If a cross-partition UPDATE violates a constraint on the target partition,
and the columns in the new partition are in different physical order than
in the parent, the error message can reveal columns that the user does not
have SELECT permission on. A similar bug was fixed earlier in commit
804b6b6db4.
The cause of the bug is that the callers of the
ExecBuildSlotValueDescription() function got confused when constructing
the list of modified columns. If the tuple was routed from a parent, we
converted the tuple to the parent's format, but the list of modified
columns was grabbed directly from the child's RTE entry.
ExecUpdateLockMode() had a similar issue. That lead to confusion on which
columns are key columns, leading to wrong tuple lock being taken on tables
referenced by foreign keys, when a row is updated with INSERT ON CONFLICT
UPDATE. A new isolation test is added for that corner case.
With this patch, the ri_RangeTableIndex field is no longer set for
partitions that don't have an entry in the range table. Previously, it was
set to the RTE entry of the parent relation, but that was confusing.
NOTE: This modifies the ResultRelInfo struct, replacing the
ri_PartitionRoot field with ri_RootResultRelInfo. That's a bit risky to
backpatch, because it breaks any extensions accessing the field. The
change that ri_RangeTableIndex is not set for partitions could potentially
break extensions, too. The ResultRelInfos are visible to FDWs at least,
and this patch required small changes to postgres_fdw. Nevertheless, this
seem like the least bad option. I don't think these fields widely used in
extensions; I don't think there are FDWs out there that uses the FDW
"direct update" API, other than postgres_fdw. If there is, you will get a
compilation error, so hopefully it is caught quickly.
Backpatch to 11, where support for both cross-partition UPDATEs, and unique
indexes on partitioned tables, were added.
Reviewed-by: Amit Langote
Security: CVE-2021-3393
If a multi-byte character is escaped with a backslash in TEXT mode input,
and the encoding is one of the client-only encodings where the bytes after
the first one can have an ASCII byte "embedded" in the char, we didn't
skip the character correctly. After a backslash, we only skipped the first
byte of the next character, so if it was a multi-byte character, we would
try to process its second byte as if it was a separate character. If it
was one of the characters with special meaning, like '\n', '\r', or
another '\\', that would cause trouble.
One such exmple is the byte sequence '\x5ca45c2e666f6f' in Big5 encoding.
That's supposed to be [backslash][two-byte character][.][f][o][o], but
because the second byte of the two-byte character is 0x5c, we incorrectly
treat it as another backslash. And because the next character is a dot, we
parse it as end-of-copy marker, and throw an "end-of-copy marker corrupt"
error.
Backpatch to all supported versions.
Reviewed-by: John Naylor, Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/a897f84f-8dca-8798-3139-07da5bb38728%40iki.fi
In trying to protect the user from inconsistent behavior, commit
487e9861d0cf "Enable BEFORE row-level triggers for partitioned tables"
tried to prevent BEFORE UPDATE FOR EACH ROW triggers from moving the row
from one partition to another. However, it turns out that the
restriction is wrong in two ways: first, it fails spuriously, preventing
valid situations from working, as in bug #16794; and second, they don't
protect from any misbehavior, because tuple routing would cope anyway.
Fix by removing that restriction.
We keep the same restriction on BEFORE INSERT FOR EACH ROW triggers,
though. It is valid and useful there. In the future we could remove it
by having tuple reroute work for inserts as it does for updates.
Backpatch to 13.
Author: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reported-by: Phillip Menke <pg@pmenke.de>
Discussion: https://postgr.es/m/16794-350a655580fbb9ae@postgresql.org