We should run the expression subtrees of PartitionedRelPruneInfo
structs through fix_scan_expr. Failure to do so means that
AlternativeSubPlans within those expressions won't be cleaned up
properly, resulting in "unrecognized node type" errors since v14.
It seems fairly likely that at least some of the other steps done
by fix_scan_expr are important here as well, resulting in as-yet-
undetected bugs. Therefore, I've chosen to back-patch this to
all supported branches including v13, even though the known
symptom doesn't manifest in v13.
Per bug #18778 from Alexander Lakhin.
Discussion: https://postgr.es/m/18778-24cd399df6c806af@postgresql.org
These facilities were originally in the recovery TAP test
039_end_of_wal.pl. A follow-up bug fix with a TAP test doing similar
WAL manipulations requires them, and all these had better not be
duplicated due to their complexity. The routine names are tweaked to
use "wal" more consistently, similarly to the existing "advance_wal".
In v14 and v13, the new routines are moved to PostgresNode.pm.
039_end_of_wal.pl is updated to use the refactored routines, without
changing its coverage.
Reviewed-by: Alexander Kukushkin
Discussion: https://postgr.es/m/CAFh8B=mozC+e1wGJq0H=0O65goZju+6ab5AU7DEWCSUA2OtwDg@mail.gmail.com
Backpatch-through: 13
In the name of ABI stability (that is, to avoid a library major
version bump for libpq), libpq still exports a version of pqsignal()
that we no longer want to use ourselves. However, since that has
the same link name as the function exported by src/port/pqsignal.c,
there is a link ordering dependency determining which version will
actually get used by code that uses libpq as well as libpgport.a.
It now emerges that the wrong version has been used by pgbench and
psql since commit 06843df4a rearranged their link commands. This
can result in odd failures in pgbench with the -T switch, since its
SIGALRM handler will now not be marked SA_RESTART. psql may have
some edge-case problems in \watch, too.
Since we don't want to depend on link ordering effects anymore,
let's fix this in the same spirit as b6c7cfac8: use macros to change
the actual link names of the competing functions. We cannot change
legacy-pqsignal.c's exported name of course, so the victim has to be
src/port/pqsignal.c.
In master, rename its exported name to be pqsignal_fe in frontend or
pqsignal_be in backend. (We could perhaps have gotten away with using
the same symbol in both cases, but since the FE and BE versions now
work a little differently, it seems advisable to use different names.)
In back branches, rename to pqsignal_fe in frontend but keep it as
pqsignal in backend. The frontend change could affect third-party
code that is calling pqsignal from libpgport.a or libpgport_shlib.a,
but only if the code is compiled against port.h from a different minor
release than libpgport. Since we don't support using libpgport as a
shared library, it seems unlikely that there will be such a problem.
I left the backend symbol unchanged to avoid an ABI break for
extensions. This means that the link ordering hazard still exists
for any extension that links against libpq. However, none of our own
extensions use both pqsignal() and libpq, and we're not making things
any worse for third-party extensions that do.
Report from Andy Fan, diagnosis by Fujii Masao, patch by me.
Back-patch to all supported branches, as 06843df4a was.
Discussion: https://postgr.es/m/87msfz5qv2.fsf@163.com
The ecpg command includes code to warn about unsupported COPY FROM STDIN
statements in input files. However, since commit 3d009e45bd,
this functionality has been broken due to a bug introduced in that commit,
causing ecpg to fail to detect the statement.
This commit resolves the issue, restoring ecpg's ability to detect
COPY FROM STDIN and issue a warning as intended.
Back-patch to all supported versions.
Author: Ryo Kanbayashi
Reviewed-by: Hayato Kuroda, Tom Lane
Discussion: https://postgr.es/m/CANOn0Ez_t5uDCUEV8c1YORMisJiU5wu681eEVZzgKwOeiKhkqQ@mail.gmail.com
If a new catalog tuple is inserted that belongs to a catcache list
entry, and cache invalidation happens while the list entry is being
built, the list entry might miss the newly inserted tuple.
To fix, change the way we detect concurrent invalidations while a
catcache entry is being built. Keep a stack of entries that are being
built, and apply cache invalidation to those entries in addition to
the real catcache entries. This is similar to the in-progress list in
relcache.c.
Back-patch to all supported versions.
Reviewed-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi
This function expects an "int64" as result and stores the number of
pages to add to the index scan bitmap as an "int", multiplying its final
result by 10. For a relation large enough, this can theoretically
overflow if counting more than (INT32_MAX / 10) pages, knowing that the
number of pages is upper-bounded by MaxBlockNumber.
To avoid the overflow, this commit redefines "totalpages", used to
calculate the result, to be an "int64" rather than an "int".
Reported-by: Evgeniy Gorbanyov
Author: James Hunter
Discussion: https://www.postgresql.org/message-id/07704817-6fa0-460c-b1cf-cd18f7647041@basealt.ru
Backpatch-through: 13
Commit 27a1f8d108 missed updating the max HBA option count to
account for the new option added. Fix by bumping the counter
and adjust the relevant comment to match. Backpatch down to
all supported branches like the erroneous commit.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/286764.1736697356@sss.pgh.pa.us
Backpatch-through: v13
When deparsing an XMLTABLE() expression, XML namespace names were not
quoted. However, since they are parsed as ColLabel tokens, some names
require double quotes to ensure that they are properly interpreted.
Fix by using quote_identifier() in the deparsing code.
Back-patch to all supported versions.
Dean Rasheed, reviewed by Tom Lane.
Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com
PLy_spi_execute_plan (PLyPlan.execute) and PLy_cursor_plan
(plpy.cursor) use PLy_output_convert to convert Python values
into Datums that can be passed to the query-to-execute. But they
failed to pay much attention to its warning that it can leave "cruft
generated along the way" behind. Repeated use of these methods can
result in a substantial memory leak for the duration of the calling
plpython function.
To fix, make a temporary memory context to invoke PLy_output_convert
in. This also lets us get rid of the rather fragile code that was
here for retail pfree's of the converted Datums. Indeed, we don't
need the PLyPlanObject.values field anymore at all, though I left it
in place in the back branches in the name of ABI stability.
Mat Arye and Tom Lane, per report from Mat Arye. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/CADsUR0DvVgnZYWwnmKRK65MZg7YLUSTDLV61qdnrwtrAJgU6xw@mail.gmail.com
The ldapscheme option was missed when inspecing the HbaLine for
assembling rows for the pg_hba_file_rules function. Backpatch
to all supported versions.
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Bug: 18769
Discussion: https://postgr.es/m/18769-dd8610cbc0405172@postgresql.org
Backpatch-through: v13
This error message stated the privileges required to add a member
to a group even if the user was trying to drop a member:
postgres=> alter group a drop user b;
ERROR: permission denied to alter role
DETAIL: Only roles with the ADMIN option on role "a" may add members.
Since the required privileges for both operations are the same, we
can fix this by modifying the message to mention both adding and
dropping members:
postgres=> alter group a drop user b;
ERROR: permission denied to alter role
DETAIL: Only roles with the ADMIN option on role "a" may add or drop members.
Author: ChangAo Chen
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/tencent_FAA0D00E3514AAF0BBB6322542A6094FEF05%40qq.com
Backpatch-through: 16
walmethods.c used off_t to navigate around a pg_wal.tar file that could
exceed 2GB, which doesn't work on Windows and would fail with misleading
errors. Use pgoff_t instead.
Back-patch to all supported branches.
Author: Davinder Singh <davinder.singh@enterprisedb.com>
Reported-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
Change our ftruncate() macro to use the 64-bit variant of chsize(), and
add a new macro to redirect lseek() to _lseeki64().
Back-patch to all supported releases, in preparation for a bug fix.
Tested-by: Davinder Singh <davinder.singh@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
Commit 66aaabe7 (branches 13 - 17 only) was not acceptable to the Oracle
Developer Studio compiler on build farm animal wrasse. It accidentally
used a C++ style return statement to wrap a void function. None of the
usual compilers complained, but it is right, that is not allowed in C.
Fix.
Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z33vgfVgvOnbFLN9%40paquier.xyz
It's possible that external code is calling smgrtruncate(). Any
external callers might like to consider the recent changes to
RelationTruncate(), but commit 38c579b0 should not have changed the
function prototype in the back-branches, per ABI stability policy.
Restore smgrtruncate()'s traditional argument list in the back-branches,
but make it a wrapper for a new function smgrtruncate2(). The three
callers in core can use smgrtruncate2() directly. In master (18-to-be),
smgrtruncate2() is effectively renamed to smgrtruncate(), so this wart
is cleaned up.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKG%2BThae6x6%2BjmQiuALQBT2Ae1ChjMh1%3DkMvJ8y_SBJZrvA%40mail.gmail.com
Slightly faulty logic in the original jsonb code (commit d9134d0a355)
results in an empty top level array sorting less than a json null. We
can't change the sort order now since it would affect btree indexes over
jsonb, so document the anomaly.
Backpatch to all live branches (13 .. 17)
In master, also add a code comment noting the anomaly.
Reported-by: Yan Chengpen
Reviewed-by: Jian He
Discussion: https://postgr.es/m/OSBPR01MB45199DD8DA2D1CECD50518188E272@OSBPR01MB4519.jpnprd01.prod.outlook.com
When looking up statistical data about an expression, we do not need
to concern ourselves with the outer joins that could null the
Vars/PHVs contained in the expression. Accounting for nullingrels in
the expression could cause estimate_num_groups to count the same Var
multiple times if it's marked with different nullingrels. This is
incorrect, and could lead to "ERROR: corrupt MVNDistinct entry" when
searching for multivariate n-distinct.
Furthermore, the nullingrels could prevent us from matching an
expression to expressional index columns or to the expressions in
extended statistics, leading to inaccurate estimates.
To fix, strip out all the nullingrels from the expression before we
look up statistical data about it. There is one ensuing plan change
in the regression tests, but it looks reasonable and does not
compromise its original purpose.
This patch could result in plan changes, but it fixes an actual bug,
so back-patch to v16 where the outer-join-aware-Var infrastructure was
introduced.
Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-2Z4k+nFTiZe0Qbu5n8juUWenDAtMzi98bAZQtwHx0-w@mail.gmail.com
pgoutput caches the attribute map of a relation, that is free()'d only
when validating a RelationSyncEntry. However, this code path is not
taken when calling any of the SQL functions able to do some logical
decoding, like pg_logical_slot_{get,peek}_changes(), leaking some memory
into CacheMemoryContext on repeated calls.
To address this, a relation's attribute map is allocated in
PGOutputData's cachectx, free()'d at the end of the execution of these
SQL functions when logical decoding ends. This is available down to 15.
v13 and v14 have a similar leak, which will be dealt with later.
Reported-by: Masahiko Sawada
Author: Vignesh C
Reviewed-by: Hou Zhijie
Discussion: https://postgr.es/m/CAD21AoDkAhQVSukOfH3_reuF-j4EU0-HxMqU3dU+bSTxsqT14Q@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm1hewNAsZ_e6FF52a=9drmkRJxtEPrzCB6-9mkJyeBBqA@mail.gmail.com
Backpatch-through: 15
Before 728bd991c3c4, that has improved the support for 2PC files during
recovery, the initial logic scanning files in pg_twophase was done so as
files in the future of the transaction ID horizon were checked first,
followed by a check if a transaction ID is aborted or committed which
could involve a pg_xact lookup. After this commit, these checks have
been done in reverse order.
Files detected as in the future do not have a state that can be checked
in pg_xact, hence this caused recovery to fail abruptly should an
orphaned 2PC file in the future of the transaction ID horizon exist in
pg_twophase at the beginning of recovery.
A test is added to check for this scenario, using an empty 2PC with a
transaction ID large enough to be in the future when running the test.
This test is added in 16 and older versions for now. 17 and newer
versions are impacted by a second bug caused by the addition of the
epoch in the 2PC file names. An equivalent test will be added in these
branches in a follow-up commit, once the second set of issues reported
are fixed.
Author: Vitaly Davydov, Michael Paquier
Discussion: https://postgr.es/m/11e597-676ab680-8d-374f23c0@145466129
Backpatch-through: 13
Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges. The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role). Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized. We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.
Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached. Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well. The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends. Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.
While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make some of these
checks using the leader's current role where before we had used its
AuthenticatedUserId, thus allowing parallel queries to fail after
SET ROLE. The previous behavior was fairly accidental and I have
no desire to return to it.
This patch includes reverting 73c9f91a1, which was an emergency hack
to suppress these same checks in some cases. It wasn't complete,
as shown by a recent bug report from Laurenz Albe. We can also revert
fd4d93d26 and 492217301, which hacked around the same problems in one
regression test.
In passing, remove the special case for autovac workers in
CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass
the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's
needed.
Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).
Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
Commit aac2c9b4fde889d13f859c233c2523345e72d32b mandated such locking
and attempted to fulfill that mandate, but it missed REASSIGN OWNED.
Hence, it remained possible to lose VACUUM's inplace update of
datfrozenxid if a REASSIGN OWNED processed that database at the same
time. This didn't affect the other inplace-updated catalog, pg_class.
For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the
generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills
the locking mandate.
Like in GRANT, implement this by following the locking protocol for any
catalog subject to the generic AlterObjectOwner_internal(). It would
suffice to do this for IsInplaceUpdateOid() catalogs only. Back-patch
to v13 (all supported versions).
Kirill Reshke. Reported by Alexander Kukushkin.
Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com
This fixes "unresolved external symbol" errors with extensions that
use functions from libpgport that need special CFLAGS to
compile. Currently, that includes the CRC-32 functions.
Commit 2571c1d5cc did this for libcommon, but I missed that libpqport
has the same issue.
Reported-by: Tom Lane
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
The pgoutput module caches publication names in a list and frees it upon
invalidation. However, the code forgot to free the actual publication
names within the list elements, as publication names are pstrdup()'d in
GetPublication(). This would cause memory to leak in
CacheMemoryContext, bloating it over time as this context is not
cleaned.
This is a problem for WAL senders running for a long time, as an
accumulation of invalidation requests would bloat its cache memory
usage. A second case, where this leak is easier to see, involves a
backend calling SQL functions like pg_logical_slot_{get,peek}_changes()
which create a new decoding context with each execution. More
publications create more bloat.
To address this, this commit adds a new memory context within the
logical decoding context and resets it each time the publication names
cache is invalidated, based on a suggestion from Amit Kapila. This
ensures that the lifespan of the publication names aligns with that of
the logical decoding context.
Contrary to the HEAD-only commit f0c569d71515 that has changed
PGOutputData to track this new child memory context, the context is
tracked with a static variable whose state is reset with a MemoryContext
reset callback attached to PGOutputData->context, so as ABI
compatibility is preserved in stable branches. This approach is based
on an suggestion from Amit Kapila.
Analyzed-by: Michael Paquier, Jeff Davis
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Michael Paquier, Euler Taveira, Hou Zhijie
Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz
Backpatch-through: 13
GetSnapshotData() set TransactionXmin = MyProc->xmin, but when
SnapshotResetXmin() advanced MyProc->xmin, it did not advance
TransactionXmin correspondingly. That meant that TransactionXmin could
be older than MyProc->xmin, and XIDs between than TransactionXmin and
the real MyProc->xmin could be vacuumed away. One known consequence is
in pg_subtrans lookups: we might try to look up the status of an XID
that was already truncated away.
Back-patch to all supported versions.
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/d27a046d-a1e4-47d1-a95c-fbabe41debb4@iki.fi
RelationTruncate() does three things, while holding an
AccessExclusiveLock and preventing checkpoints:
1. Logs the truncation.
2. Drops buffers, even if they're dirty.
3. Truncates some number of files.
Step 2 could previously be canceled if it had to wait for I/O, and step
3 could and still can fail in file APIs. All orderings of these
operations have data corruption hazards if interrupted, so we can't give
up until the whole operation is done. When dirty pages were discarded
but the corresponding blocks were left on disk due to ERROR, old page
versions could come back from disk, reviving deleted data (see
pgsql-bugs #18146 and several like it). When primary and standby were
allowed to disagree on relation size, standbys could panic (see
pgsql-bugs #18426) or revive data unknown to visibility management on
the primary (theorized).
Changes:
* WAL is now unconditionally flushed first
* smgrtruncate() is now called in a critical section, preventing
interrupts and causing PANIC on file API failure
* smgrtruncate() has a new parameter for existing fork sizes,
because it can't call smgrnblocks() itself inside a critical section
The changes apply to RelationTruncate(), smgr_redo() and
pg_truncate_visibility_map(). That last is also brought up to date with
other evolutions of the truncation protocol.
The VACUUM FileTruncate() failure mode had been discussed in older
reports than the ones referenced below, with independent analysis from
many people, but earlier theories on how to fix it were too complicated
to back-patch. The more recently invented cancellation bug was
diagnosed by Alexander Lakhin. Other corruption scenarios were spotted
by me while iterating on this patch and earlier commit 75818b3a.
Back-patch to all supported releases.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reported-by: rootcause000@gmail.com
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/18146-04e908c662113ad5%40postgresql.org
Discussion: https://postgr.es/m/18426-2d18da6586f152d6%40postgresql.org
If the non-recursive part of a recursive CTE ended up using
TTSOpsBufferHeapTuple as the table slot type, then a duplicate value
could cause an Assert failure in CheckOpSlotCompatibility() when
checking the hash table for the duplicate value. The expected slot type
for the deform step was TTSOpsMinimalTuple so the Assert failed when the
TTSOpsBufferHeapTuple slot was used.
This is a long-standing bug which we likely didn't notice because it
seems much more likely that the non-recursive term would have required
projection and used a TTSOpsVirtual slot, which CheckOpSlotCompatibility
is ok with.
There doesn't seem to be any harm done here other than the Assert
failure. Both TTSOpsMinimalTuple and TTSOpsBufferHeapTuple slot types
require tuple deformation, so the EEOP_*_FETCHSOME ExprState step would
have properly existed in the ExprState.
The solution is to pass NULL for the ExecBuildGroupingEqual's 'lops'
parameter. This means the ExprState's EEOP_*_FETCHSOME step won't
expect a fixed slot type. This makes CheckOpSlotCompatibility() happy as
no checking is performed when the ExprEvalStep is not expecting a fixed
slot type.
Reported-by: Richard Guo
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAMbWs4-8U9q2LAtf8+ghV11zeUReA3AmrYkxzBEv0vKnDxwkKA@mail.gmail.com
Backpatch-through: 13, all supported versions
If a dshash table grows very large (e.g., the dshash table for
cumulative statistics when there are millions of tables), resizing
it may fail with an error like:
ERROR: invalid DSA memory alloc request size 1073741824
To fix, permit dshash resizing to allocate more than 1 GB by
providing the DSA_ALLOC_HUGE flag.
Reported-by: Andreas Scherbaum
Author: Matthias van de Meent
Reviewed-by: Cédric Villemain, Michael Paquier, Andres Freund
Discussion: https://postgr.es/m/80a12d59-0d5e-4c54-866c-e69cd6536471%40pgug.de
Backpatch-through: 13
The test failed if you ran the regression tests with TEMP_CONFIG with
recovery_min_apply_delay = '500ms'. Fix the race condition by waiting
for transaction to be applied in the replica, like in a few other
tests.
The failing test was introduced in commit cbfbda7841. Backpatch to all
supported versions like that commit (except v12, which is no longer
supported).
Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/09e2a70a-a6c2-4b5c-aeae-040a7449c9f2@gmail.com
An \if command appearing within a false (not-to-be-executed) \if
branch was incorrectly treated the same as \elif. This could allow
statements within the inner \if to be executed when they should
not be. Also the missing inner \if stack entry would result in an
assertion failure (in assert-enabled builds) when the final \endif
is reached.
Report and patch by Michail Nikolaev. Back-patch to all
supported branches.
Discussion: https://postgr.es/m/CANtu0oiA1ke=SP6tauhNqkUdv5QFsJtS1p=aOOf_iU+EhyKkjQ@mail.gmail.com
The @extschema:name@ feature added by 72a5b1fc8 allows us to
make earthdistance's references to the cube extension fully
search-path-secure, so long as all those references are
resolved at extension installation time not runtime.
To do that, we must convert earthdistance's SQL functions to
the new SQL-standard style; but we wanted to do that anyway.
The functions can be updated in our customary style by running
CREATE OR REPLACE FUNCTION in an extension update script.
However, there's still problems in the "CREATE DOMAIN earth"
command: its references to cube functions could be captured
by hostile objects in earthdistance's installation schema,
if that's not where the cube extension is. Worse, the reference
to the cube type itself as the domain's base could be captured,
and that's not something we could fix after-the-fact in the
update script.
What I've done about that is to change the "CREATE DOMAIN earth"
command in the base script earthdistance--1.1.sql. Ordinarily,
changing a released extension script is forbidden; but I think
it's okay here since the results of successful (non-trojaned)
script execution will be identical to before.
A good deal of care is still needed to make the extension's scripts
proof against search-path-based attacks. We have to make sure that
all the function and operator invocations have exact argument-type
matches, to forestall attacks based on supplying a better match.
Fortunately earthdistance isn't very big, so I've just gone through
it and inspected each call to be sure of that. The only actual code
changes needed were to spell all floating-point constants in the style
'-1'::float8, rather than depending on runtime type conversions and/or
negations. (I'm not sure that the shortcuts previously used were
attackable, but removing run-time effort is a good thing anyway.)
I believe that this fixes earthdistance enough that we could
mark it trusted and remove the warnings about it that were
added by 7eeb1d986; but I've not done that here.
The primary reason for dealing with this now is that we've
received reports of pg_upgrade failing for databases that use
earthdistance functions in contexts like generated columns.
That's a consequence of 2af07e2f7 having restricted the search_path
used while evaluating such expressions. The only way to fix that
is to make the earthdistance functions independent of run-time
search_path. This patch is very much nicer than the alternative of
attaching "SET search_path" clauses to earthdistance's functions:
it is more secure and doesn't create a run-time penalty. Therefore,
I've chosen to back-patch this to v16 where @extschema:name@
was added. It won't help unless users update to 16.7 and issue
"ALTER EXTENSION earthdistance UPDATE" before upgrading to 17,
but at least there's now a way to deal with the problem without
manual intervention in the dump/restore process.
Tom Lane and Ronan Dunklau
Discussion: https://postgr.es/m/3316564.aeNJFYEL58@aivenlaptop
Discussion: https://postgr.es/m/6a6439f1-8039-44e2-8fb9-59028f7f2014@mailbox.org
If an owned sequence is considered interesting, force its owning
table to be marked interesting too. This ensures, in particular,
that we'll fetch the owning table's column names so we have the
data needed for ALTER TABLE ... ADD GENERATED. Previously there were
edge cases where pg_dump could get SIGSEGV due to not having filled in
the column names. (The known case is where the owning table has been
made part of an extension while its identity sequence is not a member;
but there may be others.)
Also, if it's an identity sequence, force its dumped-components mask
to exactly match the owning table: dump definition only if we're
dumping the table's definition, dump data only if we're dumping the
table's data, etc. This generalizes the code introduced in commit
b965f2617 that set the sequence's dump mask to NONE if the owning
table's mask is NONE. That's insufficient to prevent failures,
because for example the table's mask might only request dumping ACLs,
which would lead us to still emit ALTER TABLE ADD GENERATED even
though we didn't create the table. It seems better to treat an
identity sequence as though it were an inseparable aspect of the
table, matching the treatment used in the backend's dependency logic.
Perhaps this policy needs additional refinement, but let's wait to
see some field use-cases before changing it further.
While here, add a comment in pg_dump.h warning against writing tests
like "if (dobj->dump == DUMP_COMPONENT_NONE)", which was a bug in this
case. There is one other example in getPublicationNamespaces, which
if it's not a bug is at least remarkably unclear and under-documented.
Changing that requires a separate discussion, however.
Per report from Artur Zakirov. Back-patch to all supported branches.
Discussion: https://postgr.es/m/CAKNkYnwXFBf136=u9UqUxFUVagevLQJ=zGd5BsLhCsatDvQsKQ@mail.gmail.com
On failure, the pg_upgrade log files are automatically appended to the
test log file, but the information reported was inconsistent.
A header, with the log file name, was reported with note(), while the
log contents and a footer used print(), making it harder to diagnose
failures when these are split into console output and test log file
because the pg_upgrade log file path in the header may not be included
in the test log file.
The output is now consolidated so as the header uses print() rather than
note(). An extra note() is added to inform that the contents of a
pg_upgrade log file are appended to the test log file.
The diffs from the regression test suite and dump files all use print()
to show their contents on failure.
Author: Joel Jacobson
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/49f7e64a-b9be-4a90-a9fe-210a7740405e@app.fastmail.com
Backpatch-through: 15
Since commit 97550c0711972a9856b5db751539bbaf2f88884c, these failed with
"PANIC: proc_exit() called in child process" due to uninitialized or
stale MyProcPid. That was reachable if close() failed in
ClosePostmasterPorts() or setlocale(category, "C") failed, both
unlikely. Back-patch to v13 (all supported versions).
Discussion: https://postgr.es/m/20241208034614.45.nmisch@google.com
The GUC assign and check hooks used "assign_timezone_abbreviations",
which was incorrect.
Issue noticed while browsing this area of the code, introduced in
0a20ff54f5e6.
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/Z1eV6Y8yk77GZhZI@paquier.xyz
Backpatch-through: 16
This routine documented that "iterations" would use a default value if
set to 0 by the caller. However, the iteration should always be set by
the caller to a value strictly more than 0, as documented by an
assertion.
Oversight in b577743000cd, that has made the iteration count of SCRAM
configurable.
Author: Matheus Alcantara
Discussion: https://postgr.es/m/ac858943-4743-44cd-b4ad-08a0c10cbbc8@gmail.com
Backpatch-through: 16
It looks like the example case was once modified to increase the number
of rows but the EXPLAIN ANALYZE output wasn't updated to reflect that.
Also adjust the text which discusses the index sizes. With the example
table size, the bloom index isn't quite 8 times more space efficient
than the btree indexes.
Discussion: https://postgr.es/m/CAApHDvovx8kQ0=HTt85gFDAwmTJHpCgiSvRmQZ_6u_g-vQYM_w@mail.gmail.com
Backpatch-through: 13, all supported versions
Follow-up commit to a9d58bfe8a3a. Backpatch down to v16 where
this was added in order to keep the code consistent for future
backpatches.
Author: Tofig Aliev <t.aliev@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/bba4313fdde9db46db279f96f3b748b1@postgrespro.ru
Backpatch-through: 16
Our parallel-mode code only works when we are executing a query
in full, so ExecutePlan must disable parallel mode when it is
asked to do partial execution. The previous logic for this
involved passing down a flag (variously named execute_once or
run_once) from callers of ExecutorRun or PortalRun. This is
overcomplicated, and unsurprisingly some of the callers didn't
get it right, since it requires keeping state that not all of
them have handy; not to mention that the requirements for it were
undocumented. That led to assertion failures in some corner
cases. The only state we really need for this is the existing
QueryDesc.already_executed flag, so let's just put all the
responsibility in ExecutePlan. (It could have been done in
ExecutorRun too, leading to a slightly shorter patch -- but if
there's ever more than one caller of ExecutePlan, it seems better
to have this logic in the subroutine than the callers.)
This makes those ExecutorRun/PortalRun parameters unnecessary.
In master it seems okay to just remove them, returning the
API for those functions to what it was before parallelism.
Such an API break is clearly not okay in stable branches,
but for them we can just leave the parameters in place after
documenting that they do nothing.
Per report from Yugo Nagata, who also reviewed and tested
this patch. Back-patch to all supported branches.
Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
pgstat_write_statsfile() discards any entries marked as dropped from
being written to the stats file at shutdown, and also included an
assertion based on the same condition.
The intention of the assertion is to track that no pgstats entries
should be left around as terminating backends should drop any entries
they still hold references on before the stats file is written by the
checkpointer, and it not worth taking down the server in this case if
there is a bug making that possible.
Let's improve the comment of this area to document clearly what's
intended.
Based on a discussion with Bertrand Drouvot and Anton A. Melnikov.
Author: Bertrand Drouvot
Discussion: https://postgr.es/m/a13e8cdf-b97a-4ecb-8f42-aaa367974e29@postgrespro.ru
Backpatch-through: 15
818119afccd3 has introduced the "generation" concept in pgstats entries,
incremented a counter when a pgstats entry is reinitialized, but it did
not count on the fact that backends still holding local references to
such entries need to be refreshed if the cache age is outdated. The
previous logic only updated local references when an entry was dropped,
but it needs also to consider entries that are reinitialized.
This matters for replication slot stats (as well as custom pgstats kinds
in 18~), where concurrent drops and creates of a slot could cause
incorrect stats to be locally referenced. This would lead to an
assertion failure at shutdown when writing out the stats file, as the
backend holding an outdated local reference would not be able to drop
during its shutdown sequence the stats entry that should be dropped, as
the last process holding a reference to the stats entry. The
checkpointer was then complaining about such an entry late in the
shutdown sequence, after the shutdown checkpoint is finished with the
control file updated, causing the stats file to not be generated. In
non-assert builds, the entry would just be skipped with the stats file
written.
Note that only logical replication slots use statistics.
A test case based on TAP is added to test_decoding, where a persistent
connection peeking at a slot's data is kept with concurrent drops and
creates of the same slot. This is based on the isolation test case that
Anton has sent. As it requires a node shutdown with a check to make
sure that the stats file is written with this specific sequence of
events, TAP is used instead.
Reported-by: Anton A. Melnikov
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru
Backpatch-through: 15
When short-circuiting WindowAgg node evaluation on the top-level
WindowAgg node using quals on monotonic window functions, because the
WindowAgg run condition can mean there's no need to evaluate subsequent
window function results in the same partition once the run condition
becomes false, it was possible that the executor would use stale results
from the previous invocation of the window function in some cases.
A fix for this was partially done by a5832722, but that commit only
fixed the issue for non-top-level WindowAgg nodes. I mistakenly thought
that the top-level WindowAgg didn't have this issue, but Jayesh's example
case clearly shows that's incorrect. At the time, I also thought that
this only affected 32-bit systems as all window functions which then
supported run conditions returned BIGINT, however, that's wrong as
ExecProject is still called and that could cause evaluation of any other
window function belonging to the same WindowAgg node, one of which may
return a byref type.
The only queries affected by this are WindowAggs with a "Run Condition"
which contains at least one window function with a byref result type,
such as lead() or lag() on a byref column. The window clause must also
contain a PARTITION BY clause (without a PARTITION BY, execution of the
WindowAgg stops immediately when the run condition becomes false and
there's no risk of using the stale results).
Reported-by: Jayesh Dehankar
Discussion: https://postgr.es/m/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com
Backpatch-through: 15, where WindowAgg run conditions were added
Usually an entry in pg_amop or pg_amproc does not need a dependency on
its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types,
because there is an indirect dependency via the argument types of its
referenced operator or procedure, or via the opclass it belongs to.
However, for some support procedures in some index AMs, the argument
types of the support procedure might not mention the column data type
at all. Also, the amop/amproc entry might be treated as "loose" in
the opfamily, in which case it lacks a dependency on any particular
opclass; or it might be a cross-type entry having a reference to a
datatype that is not its opclass' opcintype.
The upshot of all this is that there are cases where a datatype can
be dropped while leaving behind amop/amproc entries that mention it,
because there is no path in pg_depend showing that those entries
depend on that type. Such entries are harmless in normal activity,
because they won't get used, but they cause problems for maintenance
actions such as dropping the operator family. They also cause pg_dump
to produce bogus output. The previous commit put a band-aid on the
DROP OPERATOR FAMILY failure, but a real fix is needed.
To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry
depends on its lefttype/righttype. To avoid bloating pg_depend too
much, skip this if the referenced operator or function has that type
as an input type. (I did not bother with considering the possible
indirect dependency via the opclass' opcintype; at least in the
reported case, that wouldn't help anyway.)
Probably, the reason this has escaped notice for so long is that
add-on datatypes and relevant opclasses/opfamilies are usually
packaged as extensions nowadays, so that there's no way to drop
a type without dropping the referencing opclasses/opfamilies too.
Still, in the absence of pg_depend entries there's nothing that
constrains DROP EXTENSION to drop the opfamily entries before the
datatype, so it seems possible for a DROP failure to occur anyway.
The specific case that was reported doesn't fail in v13, because
v13 prefers to attach the support procedure to the opclass not the
opfamily. But it's surely possible to construct other edge cases
that do fail in v13, so patch that too.
Per report from Yoran Heling. Back-patch to all supported branches.
Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
Yoran Heling reported a case where a data type could be dropped
while references to its OID remain behind in pg_amproc. This
causes getObjectDescription to fail, which blocks dropping the
operator family (since our DROP code likes to construct descriptions
of everything it's dropping). The proper fix for this requires
adding more pg_depend entries. But to allow DROP to go through with
already-corrupt catalogs, tweak getObjectDescription to print "???"
for the type instead of failing when it processes such an entry.
I changed the logic for pg_amop similarly, for consistency,
although it is not known that the problem can manifest in pg_amop.
Per report from Yoran Heling. Back-patch to all supported
branches (although the problem may be unreachable in v13).
Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
These format codes produce or consume strings of digits, so they
should be labeled with is_digit = true, but they were not.
This has effect in only one place, where is_next_separator()
is checked to see if the preceding format code should slurp up
all the available digits. Thus, with a format such as '...SSFF3'
with remaining input '12345', the 'SS' code would consume all
five digits (and then complain about seconds being out of range)
when it should eat only two digits.
Per report from Nick Davies. This bug goes back to d589f9446
where the FFn codes were introduced, so back-patch to v13.
Discussion: https://postgr.es/m/AM8PR08MB6356AC979252CFEA78B56678B6312@AM8PR08MB6356.eurprd08.prod.outlook.com
Previously, it set only DELAY_CHKPT_COMPLETE. That was important,
because it meant that if the XLOG_SMGR_TRUNCATE record preceded a
XLOG_CHECKPOINT_ONLINE record in the WAL, then the truncation would also
happen on disk before the XLOG_CHECKPOINT_ONLINE record was
written.
However, it didn't guarantee that the sync request for the truncation
was processed before the XLOG_CHECKPOINT_ONLINE record was written. By
setting DELAY_CHKPT_START, we guarantee that if an XLOG_SMGR_TRUNCATE
record is written to WAL before the redo pointer of a concurrent
checkpoint, the sync request queued by that operation must be processed
by that checkpoint, rather than being left for the following one.
This is a refinement of commit 412ad7a5563. Back-patch to all supported
releases, like that commit.
Author: Robert Haas <robertmhaas@gmail.com>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2B-2rjGZC2kwqr2NMLBcEBp4uf59QT1advbWYF_uc%2B0Aw%40mail.gmail.com
The loops over cursor argument variables neglected to ever advance
"prevvar". The code would accidentally do the right thing anyway
when removing the first or second list entry, but if it had to
remove the third or later entry then it would also remove all
entries between there and the first entry. AFAICS this would
only matter for cursors that reference out-of-scope variables,
which is a weird Informix compatibility hack; between that and
the lack of impact for short lists, it's not so surprising that
nobody has complained. Nonetheless it's a pretty obvious bug.
It would have been more obvious if these loops used a more standard
coding style for chasing the linked lists --- this business with the
"prev" pointer sometimes pointing at the current list entry is
confusing and overcomplicated. So rather than just add a minimal
band-aid, I chose to rewrite the loops in the same style we use
elsewhere, where the "prev" pointer is NULL until we are dealing with
a non-first entry and we save the "next" pointer at the top of the
loop. (Two of the four loops touched here are not actually buggy,
but it seems better to make them all look alike.)
Coverity discovered this problem, but not until 2b41de4a5 added code
to free no-longer-needed arguments structs. With that, the incorrect
link updates are possibly touching freed memory, and it complained
about that. Nonetheless the list corruption hazard is ancient, so
back-patch to all supported branches.