1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-29 16:21:20 +03:00

43764 Commits

Author SHA1 Message Date
Thomas Munro
af109e3399 Provide 64-bit ftruncate() and lseek() on Windows.
Change our ftruncate() macro to use the 64-bit variant of chsize(), and
add a new macro to redirect lseek() to _lseeki64().

Back-patch to all supported releases, in preparation for a bug fix.

Tested-by: Davinder Singh <davinder.singh@enterprisedb.com>
Discussion: https://postgr.es/m/CAKZiRmyM4YnokK6Oenw5JKwAQ3rhP0YTz2T-tiw5dAQjGRXE3Q%40mail.gmail.com
2025-01-09 15:00:23 +13:00
Thomas Munro
45aef9f6bb Fix C error reported by Oracle compiler.
Commit 66aaabe7 (branches 13 - 17 only) was not acceptable to the Oracle
Developer Studio compiler on build farm animal wrasse.  It accidentally
used a C++ style return statement to wrap a void function.  None of the
usual compilers complained, but it is right, that is not allowed in C.
Fix.

Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/Z33vgfVgvOnbFLN9%40paquier.xyz
2025-01-08 17:02:30 +13:00
Thomas Munro
66aaabe7a1 Restore smgrtruncate() prototype in back-branches.
It's possible that external code is calling smgrtruncate().  Any
external callers might like to consider the recent changes to
RelationTruncate(), but commit 38c579b0 should not have changed the
function prototype in the back-branches, per ABI stability policy.

Restore smgrtruncate()'s traditional argument list in the back-branches,
but make it a wrapper for a new function smgrtruncate2().  The three
callers in core can use smgrtruncate2() directly.  In master (18-to-be),
smgrtruncate2() is effectively renamed to smgrtruncate(), so this wart
is cleaned up.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA%2BhUKG%2BThae6x6%2BjmQiuALQBT2Ae1ChjMh1%3DkMvJ8y_SBJZrvA%40mail.gmail.com
2025-01-08 10:43:40 +13:00
Nathan Bossart
e43537cdc3 Use PqMsg_* macros in postgres.c.
Commit f4b54e1ed9, which introduced macros for protocol characters,
missed updating a couple of places in postgres.c.

Author: Dave Cramer
Reviewed-by: Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CADK3HHJUVBPoVOmFesPB-fN8_dYt%2BQELV2UB6jxOW2Z40qF-qw%40mail.gmail.com
Backpatch-through: 17
2025-01-07 15:34:19 -06:00
Álvaro Herrera
d7905aa1da
Fix error message wording
The originals are ambiguous and a bit out of style.

Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/202412141243.efesjyyvzxsz@alvherre.pgsql
2025-01-07 20:07:32 +01:00
Heikki Linnakangas
e998901b88 Remove duplicate definitions in proc.h
These are also present in procnumber.h

Reported-by: Peter Eisentraut
Discussion: https://www.postgresql.org/message-id/bd04d675-4672-4f87-800a-eb5d470c15fc@eisentraut.org
2025-01-06 11:56:28 +02:00
Richard Guo
297b280abd Ignore nullingrels when looking up statistics
When looking up statistical data about an expression, we do not need
to concern ourselves with the outer joins that could null the
Vars/PHVs contained in the expression.  Accounting for nullingrels in
the expression could cause estimate_num_groups to count the same Var
multiple times if it's marked with different nullingrels.  This is
incorrect, and could lead to "ERROR:  corrupt MVNDistinct entry" when
searching for multivariate n-distinct.

Furthermore, the nullingrels could prevent us from matching an
expression to expressional index columns or to the expressions in
extended statistics, leading to inaccurate estimates.

To fix, strip out all the nullingrels from the expression before we
look up statistical data about it.  There is one ensuing plan change
in the regression tests, but it looks reasonable and does not
compromise its original purpose.

This patch could result in plan changes, but it fixes an actual bug,
so back-patch to v16 where the outer-join-aware-Var infrastructure was
introduced.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-2Z4k+nFTiZe0Qbu5n8juUWenDAtMzi98bAZQtwHx0-w@mail.gmail.com
2025-01-02 17:59:32 +09:00
Michael Paquier
836435424b Fix memory leak in pgoutput with relation attribute map
pgoutput caches the attribute map of a relation, that is free()'d only
when validating a RelationSyncEntry.  However, this code path is not
taken when calling any of the SQL functions able to do some logical
decoding, like pg_logical_slot_{get,peek}_changes(), leaking some memory
into CacheMemoryContext on repeated calls.

To address this, a relation's attribute map is allocated in
PGOutputData's cachectx, free()'d at the end of the execution of these
SQL functions when logical decoding ends.  This is available down to 15.
v13 and v14 have a similar leak, which will be dealt with later.

Reported-by: Masahiko Sawada
Author: Vignesh C
Reviewed-by: Hou Zhijie
Discussion: https://postgr.es/m/CAD21AoDkAhQVSukOfH3_reuF-j4EU0-HxMqU3dU+bSTxsqT14Q@mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm1hewNAsZ_e6FF52a=9drmkRJxtEPrzCB6-9mkJyeBBqA@mail.gmail.com
Backpatch-through: 15
2024-12-30 13:33:58 +09:00
Michael Paquier
c3de0f9eed Fix failures with incorrect epoch handling for 2PC files at recovery
At the beginning of recovery, an orphaned two-phase file in an epoch
different than the one defined in the checkpoint record could not be
removed based on the assumptions that AdjustToFullTransactionId() relies
on, assuming that all files would be either from the current epoch or
from the previous epoch.

If the checkpoint epoch was 0 while the 2PC file was orphaned and in the
future, AdjustToFullTransactionId() would underflow the epoch used to
build the 2PC file path.  In non-assert builds, this would create a
WARNING message referring to a 2PC file with an epoch of "FFFFFFFF" (or
UINT32_MAX), as an effect of the underflow calculation, leaving the
orphaned file around.

Some tests are added with dummy 2PC files in the past and the future,
checking that these are properly removed.

Issue introduced by 5a1dfde8334b, that has switched two-phase state
files to use FullTransactionIds.

Reported-by: Vitaly Davydov
Author: Michael Paquier
Reviewed-by: Vitaly Davydov
Discussion: https://postgr.es/m/13b5b6-676c3080-4d-531db900@47931709
Backpatch-through: 17
2024-12-30 09:58:09 +09:00
Michael Paquier
03c46e1f26 Fix handling of orphaned 2PC files in the future at recovery
Before 728bd991c3c4, that has improved the support for 2PC files during
recovery, the initial logic scanning files in pg_twophase was done so as
files in the future of the transaction ID horizon were checked first,
followed by a check if a transaction ID is aborted or committed which
could involve a pg_xact lookup.  After this commit, these checks have
been done in reverse order.

Files detected as in the future do not have a state that can be checked
in pg_xact, hence this caused recovery to fail abruptly should an
orphaned 2PC file in the future of the transaction ID horizon exist in
pg_twophase at the beginning of recovery.

A test is added to check for this scenario, using an empty 2PC with a
transaction ID large enough to be in the future when running the test.
This test is added in 16 and older versions for now.  17 and newer
versions are impacted by a second bug caused by the addition of the
epoch in the 2PC file names.  An equivalent test will be added in these
branches in a follow-up commit, once the second set of issues reported
are fixed.

Author: Vitaly Davydov, Michael Paquier
Discussion: https://postgr.es/m/11e597-676ab680-8d-374f23c0@145466129
Backpatch-through: 13
2024-12-30 08:06:40 +09:00
Tom Lane
15b4c46c32 Exclude parallel workers from connection privilege/limit checks.
Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges.  The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role).  Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized.  We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.

Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached.  Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well.  The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends.  Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.

While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make some of these
checks using the leader's current role where before we had used its
AuthenticatedUserId, thus allowing parallel queries to fail after
SET ROLE.  The previous behavior was fairly accidental and I have
no desire to return to it.

This patch includes reverting 73c9f91a1, which was an emergency hack
to suppress these same checks in some cases.  It wasn't complete,
as shown by a recent bug report from Laurenz Albe.  We can also revert
fd4d93d26 and 492217301, which hacked around the same problems in one
regression test.

In passing, remove the special case for autovac workers in
CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass
the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's
needed.

Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 16:08:50 -05:00
Tom Lane
14141bbbc1 Reserve a PGPROC slot and semaphore for the slotsync worker process.
The need for this was missed in commit 93db6cbda, with the result
being that if we launch a slotsync worker it would consume one of
the PGPROCs in the max_connections pool.  That could lead to inability
to launch the worker, or to subsequent failures of connection requests
that should have succeeded according to the configured settings.

Rather than create some one-off infrastructure to support this,
let's group the slotsync worker with the existing autovac launcher
in a new category of "special worker" processes.  These are kind of
like auxiliary processes, but they cannot use that infrastructure
because they need to be able to run transactions.

For the moment, make these processes share the PGPROC freelist
used for autovac workers (which previously supplied the autovac
launcher too).  This is partly to avoid an ABI change in v17,
and partly because it seems silly to have a freelist with
at most two members.  This might be worth revisiting if we grow
enough workers in this category.

Tom Lane and Hou Zhijie.  Back-patch to v17.

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 12:30:42 -05:00
Tom Lane
a46311ed72 Try to avoid semaphore-related test failures on NetBSD/OpenBSD.
These two platforms have a remarkably tight default limit on the
number of SysV semaphores in the system: SEMMNS is only 60
out-of-the-box.  Unless manual action is taken to raise that,
we'll only be able to allocate 3 sets of 16 usable semaphores
each, leading to initdb setting max_connections to just 20.
That's problematic because the core regression tests expect
to be able to launch 20 concurrent sessions, leaving us with
no headroom.  This seems to be the cause of intermittent
buildfarm failures on some machines.

While there's no getting around the fact that you'd better raise
SEMMNS for production use on these platforms, it does seem desirable
for "make check" to pass reliably without that.  We can make that
happen, at least for awhile longer, with two small changes:

* Change sysv_sema.c's SEMAS_PER_SET to 19, so that we can eat up
all of the available semas not just most of them.

* Change initdb to make the smallest max_connections value it will
consider be 25 not 20.

This is a back-patch of recent HEAD commit 38da05346 into v17.
The motivation for doing this now is that an upcoming bug-fix
patch will give the new-in-17 slotsync worker process its own
reserved PGPROC and hence also semaphore.  With that patch but
without this change, v17 would fail to start at all under the
default SEMMNS on these platforms.

Discussion: https://postgr.es/m/db2773a2-aca0-43d0-99c1-060efcd9954e@gmail.com
Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 11:48:13 -05:00
Noah Misch
fa61313770 In REASSIGN OWNED of a database, lock the tuple as mandated.
Commit aac2c9b4fde889d13f859c233c2523345e72d32b mandated such locking
and attempted to fulfill that mandate, but it missed REASSIGN OWNED.
Hence, it remained possible to lose VACUUM's inplace update of
datfrozenxid if a REASSIGN OWNED processed that database at the same
time.  This didn't affect the other inplace-updated catalog, pg_class.
For pg_class, REASSIGN OWNED calls ATExecChangeOwner() instead of the
generic AlterObjectOwner_internal(), and ATExecChangeOwner() fulfills
the locking mandate.

Like in GRANT, implement this by following the locking protocol for any
catalog subject to the generic AlterObjectOwner_internal().  It would
suffice to do this for IsInplaceUpdateOid() catalogs only.  Back-patch
to v13 (all supported versions).

Kirill Reshke.  Reported by Alexander Kukushkin.

Discussion: https://postgr.es/m/CAFh8B=mpKjAy4Cuun-HP-f_vRzh2HSvYFG3rhVfYbfEBUhBAGg@mail.gmail.com
2024-12-28 07:16:26 -08:00
Heikki Linnakangas
d8b0c64116 meson: Export all libcommon functions in Windows builds
This fixes "unresolved external symbol" errors with extensions that
use functions from libpgport that need special CFLAGS to
compile. Currently, that includes the CRC-32 functions.

Commit 2571c1d5cc did this for libcommon, but I missed that libpqport
has the same issue.

Reported-by: Tom Lane
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
2024-12-25 19:22:33 +02:00
Heikki Linnakangas
c80acbc6fa meson: Export all libcommon functions in Windows builds
This fixes "unresolved external symbol" errors with extensions that
use functions from libcommon. This was reported with pgvector.

Reported-by: Andrew Kane
Author: Vladlen Popolitov
Backpatch-through: 16, where Meson was introduced
Discussion: https://www.postgresql.org/message-id/CAOdR5yF0krWrxycA04rgUKCgKugRvGWzzGLAhDZ9bzNv8g0Lag@mail.gmail.com
2024-12-25 18:14:26 +02:00
Michael Paquier
bbe68c13ab Fix memory leak in pgoutput with publication list cache
The pgoutput module caches publication names in a list and frees it upon
invalidation.  However, the code forgot to free the actual publication
names within the list elements, as publication names are pstrdup()'d in
GetPublication().  This would cause memory to leak in
CacheMemoryContext, bloating it over time as this context is not
cleaned.

This is a problem for WAL senders running for a long time, as an
accumulation of invalidation requests would bloat its cache memory
usage.  A second case, where this leak is easier to see, involves a
backend calling SQL functions like pg_logical_slot_{get,peek}_changes()
which create a new decoding context with each execution.  More
publications create more bloat.

To address this, this commit adds a new memory context within the
logical decoding context and resets it each time the publication names
cache is invalidated, based on a suggestion from Amit Kapila.  This
ensures that the lifespan of the publication names aligns with that of
the logical decoding context.

Contrary to the HEAD-only commit f0c569d71515 that has changed
PGOutputData to track this new child memory context, the context is
tracked with a static variable whose state is reset with a MemoryContext
reset callback attached to PGOutputData->context, so as ABI
compatibility is preserved in stable branches.  This approach is based
on an suggestion from Amit Kapila.

Analyzed-by: Michael Paquier, Jeff Davis
Author: Masahiko Sawada
Reviewed-by: Amit Kapila, Michael Paquier, Euler Taveira, Hou Zhijie
Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz
Backpatch-through: 13
2024-12-23 12:48:06 +09:00
Heikki Linnakangas
7cfdb4d1e7 Update TransactionXmin when MyProc->xmin is updated
GetSnapshotData() set TransactionXmin = MyProc->xmin, but when
SnapshotResetXmin() advanced MyProc->xmin, it did not advance
TransactionXmin correspondingly. That meant that TransactionXmin could
be older than MyProc->xmin, and XIDs between than TransactionXmin and
the real MyProc->xmin could be vacuumed away. One known consequence is
in pg_subtrans lookups: we might try to look up the status of an XID
that was already truncated away.

Back-patch to all supported versions.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/d27a046d-a1e4-47d1-a95c-fbabe41debb4@iki.fi
2024-12-21 23:42:52 +02:00
Thomas Munro
0350b876b0 Fix corruption when relation truncation fails.
RelationTruncate() does three things, while holding an
AccessExclusiveLock and preventing checkpoints:

1. Logs the truncation.
2. Drops buffers, even if they're dirty.
3. Truncates some number of files.

Step 2 could previously be canceled if it had to wait for I/O, and step
3 could and still can fail in file APIs.  All orderings of these
operations have data corruption hazards if interrupted, so we can't give
up until the whole operation is done.  When dirty pages were discarded
but the corresponding blocks were left on disk due to ERROR, old page
versions could come back from disk, reviving deleted data (see
pgsql-bugs #18146 and several like it).  When primary and standby were
allowed to disagree on relation size, standbys could panic (see
pgsql-bugs #18426) or revive data unknown to visibility management on
the primary (theorized).

Changes:

 * WAL is now unconditionally flushed first
 * smgrtruncate() is now called in a critical section, preventing
   interrupts and causing PANIC on file API failure
 * smgrtruncate() has a new parameter for existing fork sizes,
   because it can't call smgrnblocks() itself inside a critical section

The changes apply to RelationTruncate(), smgr_redo() and
pg_truncate_visibility_map().  That last is also brought up to date with
other evolutions of the truncation protocol.

The VACUUM FileTruncate() failure mode had been discussed in older
reports than the ones referenced below, with independent analysis from
many people, but earlier theories on how to fix it were too complicated
to back-patch.  The more recently invented cancellation bug was
diagnosed by Alexander Lakhin.  Other corruption scenarios were spotted
by me while iterating on this patch and earlier commit 75818b3a.

Back-patch to all supported releases.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reported-by: rootcause000@gmail.com
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/18146-04e908c662113ad5%40postgresql.org
Discussion: https://postgr.es/m/18426-2d18da6586f152d6%40postgresql.org
2024-12-20 23:57:18 +13:00
Peter Geoghegan
9e85b20da7 Avoid nbtree index scan SAOP scanBehind confusion.
Consistently reset so->scanBehind at the beginning of nbtree array
advancement, even during sktrig_required=false calls (calls where array
advancement is triggered by an unsatisfied non-required array scan key).
Otherwise, it's possible for queries to fail to return all relevant
tuples to the scan given a low-order required scan key that was
previously deemed "satisfied" by a truncated high key attribute value.
This only happened at the point where a later non-required array scan
key needed to be "advanced" once on the next leaf page (that is, once
the right sibling of the truncated high key page was reached).

The underlying issue was that later code within _bt_advance_array_keys
assumed that the so->scanBehind flag must have been set using the
current page's high key (not the previous page's high key).  Any later
successful recheck call to _bt_check_compare would therefore spuriously
be prevented from making _bt_advance_array_keys return true, based on
the faulty belief that the truncated attribute must be from the scan's
current tuple (i.e. the non-pivot tuple at the start of the next page).
_bt_advance_array_keys would return false for the tuple, ultimately
resulting in _bt_checkkeys failing to return a matching tuple.

Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkJKncfqyAUTeuB5GgRhT1vhsWO2q11dbZNqKmvjopP_g@mail.gmail.com
Backpatch: 17-, where commit 5bf748b8 first appears.
2024-12-19 11:08:53 -05:00
David Rowley
7b8d45d278 Fix Assert failure in WITH RECURSIVE UNION queries
If the non-recursive part of a recursive CTE ended up using
TTSOpsBufferHeapTuple as the table slot type, then a duplicate value
could cause an Assert failure in CheckOpSlotCompatibility() when
checking the hash table for the duplicate value.  The expected slot type
for the deform step was TTSOpsMinimalTuple so the Assert failed when the
TTSOpsBufferHeapTuple slot was used.

This is a long-standing bug which we likely didn't notice because it
seems much more likely that the non-recursive term would have required
projection and used a TTSOpsVirtual slot, which CheckOpSlotCompatibility
is ok with.

There doesn't seem to be any harm done here other than the Assert
failure.  Both TTSOpsMinimalTuple and TTSOpsBufferHeapTuple slot types
require tuple deformation, so the EEOP_*_FETCHSOME ExprState step would
have properly existed in the ExprState.

The solution is to pass NULL for the ExecBuildGroupingEqual's 'lops'
parameter.  This means the ExprState's EEOP_*_FETCHSOME step won't
expect a fixed slot type.  This makes CheckOpSlotCompatibility() happy as
no checking is performed when the ExprEvalStep is not expecting a fixed
slot type.

Reported-by: Richard Guo
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/CAMbWs4-8U9q2LAtf8+ghV11zeUReA3AmrYkxzBEv0vKnDxwkKA@mail.gmail.com
Backpatch-through: 13, all supported versions
2024-12-19 13:12:18 +13:00
Tom Lane
04b8601981 Fix memory leak in pg_restore with zstd-compressed data.
EndCompressorZstd() neglected to free everything.  This was
most visible with a lot of large objects in the dump.

Per report from Tomasz Szypowski.  Back-patch to v16
where this code came in.

Discussion: https://postgr.es/m/DU0PR04MB94193D038A128EF989F922D199042@DU0PR04MB9419.eurprd04.prod.outlook.com
2024-12-17 22:31:26 -05:00
Nathan Bossart
18452b70ac Accommodate very large dshash tables.
If a dshash table grows very large (e.g., the dshash table for
cumulative statistics when there are millions of tables), resizing
it may fail with an error like:

	ERROR: invalid DSA memory alloc request size 1073741824

To fix, permit dshash resizing to allocate more than 1 GB by
providing the DSA_ALLOC_HUGE flag.

Reported-by: Andreas Scherbaum
Author: Matthias van de Meent
Reviewed-by: Cédric Villemain, Michael Paquier, Andres Freund
Discussion: https://postgr.es/m/80a12d59-0d5e-4c54-866c-e69cd6536471%40pgug.de
Backpatch-through: 13
2024-12-17 15:24:45 -06:00
Tomas Vondra
42eae257cf Update comments about index parallel builds
Commit b43757171470 allowed parallel builds for BRIN, but left behind
two comments claiming only btree indexes support parallel builds.

Reported by Egor Rogov, along with similar issues in SGML docs.
Backpatch to 17, where parallel builds for BRIN were introduced.

Reported-by: Egor Rogov
Backpatch-through: 17
Discussion: https://postgr.es/m/114e2d5d-125e-07d8-94aa-5ad175fb7443@postgrespro.ru
2024-12-17 15:48:29 +01:00
Michael Paquier
94e7e77a95 pg_combinebackup: Fix PITR comparison test in 002_compare_backups
The test was creating both the dumps to compare from the same database
on the same node, so it would never detect any mismatches when comparing
the logical dumps of the two servers.

Fixing this issue has revealed that there is a difference in the dumps:
the tablespaces paths are different.  This commit uses compare_text()
with a custom comparison function to erase the difference (slightly
tweaked to be able to work with WIN32 and non-WIN32 paths).  This way,
the non-relevant parts of the tablespace path are ignored from the check
with the basic structure of the query string still compared.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87h67653ns.fsf@wibble.ilmari.org
Backpatch-through: 17
2024-12-17 09:24:18 +09:00
Heikki Linnakangas
d1253c67fa Make 009_twophase.pl test pass with recovery_min_apply_delay set
The test failed if you ran the regression tests with TEMP_CONFIG with
recovery_min_apply_delay = '500ms'. Fix the race condition by waiting
for transaction to be applied in the replica, like in a few other
tests.

The failing test was introduced in commit cbfbda7841. Backpatch to all
supported versions like that commit (except v12, which is no longer
supported).

Reported-by: Alexander Lakhin
Discussion: https://www.postgresql.org/message-id/09e2a70a-a6c2-4b5c-aeae-040a7449c9f2@gmail.com
2024-12-16 15:59:27 +02:00
Tom Lane
ff9dc96f37 pgbench: fix misprocessing of some nested \if constructs.
An \if command appearing within a false (not-to-be-executed) \if
branch was incorrectly treated the same as \elif.  This could allow
statements within the inner \if to be executed when they should
not be.  Also the missing inner \if stack entry would result in an
assertion failure (in assert-enabled builds) when the final \endif
is reached.

Report and patch by Michail Nikolaev.  Back-patch to all
supported branches.

Discussion: https://postgr.es/m/CANtu0oiA1ke=SP6tauhNqkUdv5QFsJtS1p=aOOf_iU+EhyKkjQ@mail.gmail.com
2024-12-15 14:14:15 -05:00
Tom Lane
ad950ea98e Fix possible crash in pg_dump with identity sequences.
If an owned sequence is considered interesting, force its owning
table to be marked interesting too.  This ensures, in particular,
that we'll fetch the owning table's column names so we have the
data needed for ALTER TABLE ... ADD GENERATED.  Previously there were
edge cases where pg_dump could get SIGSEGV due to not having filled in
the column names.  (The known case is where the owning table has been
made part of an extension while its identity sequence is not a member;
but there may be others.)

Also, if it's an identity sequence, force its dumped-components mask
to exactly match the owning table: dump definition only if we're
dumping the table's definition, dump data only if we're dumping the
table's data, etc.  This generalizes the code introduced in commit
b965f2617 that set the sequence's dump mask to NONE if the owning
table's mask is NONE.  That's insufficient to prevent failures,
because for example the table's mask might only request dumping ACLs,
which would lead us to still emit ALTER TABLE ADD GENERATED even
though we didn't create the table.  It seems better to treat an
identity sequence as though it were an inseparable aspect of the
table, matching the treatment used in the backend's dependency logic.
Perhaps this policy needs additional refinement, but let's wait to
see some field use-cases before changing it further.

While here, add a comment in pg_dump.h warning against writing tests
like "if (dobj->dump == DUMP_COMPONENT_NONE)", which was a bug in this
case.  There is one other example in getPublicationNamespaces, which
if it's not a bug is at least remarkably unclear and under-documented.
Changing that requires a separate discussion, however.

Per report from Artur Zakirov.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/CAKNkYnwXFBf136=u9UqUxFUVagevLQJ=zGd5BsLhCsatDvQsKQ@mail.gmail.com
2024-12-13 14:21:36 -05:00
Nathan Bossart
d09fbf645e Revert "Don't truncate database and user names in startup packets."
This reverts commit 562bee0fc13dc95710b8db6a48edad2f3d052f2e.

We received a report from the field about this change in behavior,
so it seems best to revert this commit and to add proper
multibyte-aware truncation as a follow-up exercise.

Fixes bug #18711.

Reported-by: Adam Rauch
Reviewed-by: Tom Lane, Bertrand Drouvot, Bruce Momjian, Thomas Munro
Discussion: https://postgr.es/m/18711-7503ee3e449d2c47%40postgresql.org
Backpatch-through: 17
2024-12-12 15:52:04 -06:00
Michael Paquier
8afff7d6db Improve reporting of pg_upgrade log files on test failure
On failure, the pg_upgrade log files are automatically appended to the
test log file, but the information reported was inconsistent.

A header, with the log file name, was reported with note(), while the
log contents and a footer used print(), making it harder to diagnose
failures when these are split into console output and test log file
because the pg_upgrade log file path in the header may not be included
in the test log file.

The output is now consolidated so as the header uses print() rather than
note().  An extra note() is added to inform that the contents of a
pg_upgrade log file are appended to the test log file.

The diffs from the regression test suite and dump files all use print()
to show their contents on failure.

Author: Joel Jacobson
Reviewed-by: Daniel Gustafsson
Discussion: https://postgr.es/m/49f7e64a-b9be-4a90-a9fe-210a7740405e@app.fastmail.com
Backpatch-through: 15
2024-12-11 08:48:52 +09:00
Noah Misch
4bd9de3f41 Fix elog(FATAL) before PostmasterMain() or just after fork().
Since commit 97550c0711972a9856b5db751539bbaf2f88884c, these failed with
"PANIC:  proc_exit() called in child process" due to uninitialized or
stale MyProcPid.  That was reachable if close() failed in
ClosePostmasterPorts() or setlocale(category, "C") failed, both
unlikely.  Back-patch to v13 (all supported versions).

Discussion: https://postgr.es/m/20241208034614.45.nmisch@google.com
2024-12-10 13:52:02 -08:00
Michael Paquier
67ef403d0e Fix comments of GUC hooks for timezone_abbreviations
The GUC assign and check hooks used "assign_timezone_abbreviations",
which was incorrect.

Issue noticed while browsing this area of the code, introduced in
0a20ff54f5e6.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/Z1eV6Y8yk77GZhZI@paquier.xyz
Backpatch-through: 16
2024-12-10 13:02:24 +09:00
Michael Paquier
416d1bd66c Fix outdated comment of scram_build_secret()
This routine documented that "iterations" would use a default value if
set to 0 by the caller.  However, the iteration should always be set by
the caller to a value strictly more than 0, as documented by an
assertion.

Oversight in b577743000cd, that has made the iteration count of SCRAM
configurable.

Author: Matheus Alcantara
Discussion: https://postgr.es/m/ac858943-4743-44cd-b4ad-08a0c10cbbc8@gmail.com
Backpatch-through: 16
2024-12-10 12:54:14 +09:00
Masahiko Sawada
a12b0f3a9d Include necessary header files in radixtree.h.
When #include'ing radixtree.h with RT_SHMEM, it could happen to raise
compiler errors due to missing some declarations of types and
functions.

This commit also removes the inclusion of postgres.h since it's
against our usual convention.

Backpatch to v17, where radixtree.h was introduced.

Reviewed-by: Heikki Linnakangas, Álvaro Herrera
Discussion: https://postgr.es/m/CAD21AoCU9YH%2Bb9Rr8YRw7UjmB%3D1zh8GKQkWNiuN9mVhMvkyrRg%40mail.gmail.com
Backpatch-through: 17
2024-12-09 13:07:03 -08:00
Daniel Gustafsson
9add1bbfa6 Fix small memory leaks in GUC checks
Follow-up commit to a9d58bfe8a3a.  Backpatch down to v16 where
this was added in order to keep the code consistent for future
backpatches.

Author: Tofig Aliev <t.aliev@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/bba4313fdde9db46db279f96f3b748b1@postgrespro.ru
Backpatch-through: 16
2024-12-09 20:58:23 +01:00
Tom Lane
556f7b7bc1 Simplify executor's determination of whether to use parallelism.
Our parallel-mode code only works when we are executing a query
in full, so ExecutePlan must disable parallel mode when it is
asked to do partial execution.  The previous logic for this
involved passing down a flag (variously named execute_once or
run_once) from callers of ExecutorRun or PortalRun.  This is
overcomplicated, and unsurprisingly some of the callers didn't
get it right, since it requires keeping state that not all of
them have handy; not to mention that the requirements for it were
undocumented.  That led to assertion failures in some corner
cases.  The only state we really need for this is the existing
QueryDesc.already_executed flag, so let's just put all the
responsibility in ExecutePlan.  (It could have been done in
ExecutorRun too, leading to a slightly shorter patch -- but if
there's ever more than one caller of ExecutePlan, it seems better
to have this logic in the subroutine than the callers.)

This makes those ExecutorRun/PortalRun parameters unnecessary.
In master it seems okay to just remove them, returning the
API for those functions to what it was before parallelism.
Such an API break is clearly not okay in stable branches,
but for them we can just leave the parameters in place after
documenting that they do nothing.

Per report from Yugo Nagata, who also reviewed and tested
this patch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
2024-12-09 14:38:19 -05:00
Michael Paquier
bb93b33d7e Improve comment about dropped entries in pgstat.c
pgstat_write_statsfile() discards any entries marked as dropped from
being written to the stats file at shutdown, and also included an
assertion based on the same condition.

The intention of the assertion is to track that no pgstats entries
should be left around as terminating backends should drop any entries
they still hold references on before the stats file is written by the
checkpointer, and it not worth taking down the server in this case if
there is a bug making that possible.

Let's improve the comment of this area to document clearly what's
intended.

Based on a discussion with Bertrand Drouvot and Anton A. Melnikov.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/a13e8cdf-b97a-4ecb-8f42-aaa367974e29@postgrespro.ru
Backpatch-through: 15
2024-12-09 14:35:44 +09:00
Michael Paquier
dc5f905418 Fix invalidation of local pgstats references for entry reinitialization
818119afccd3 has introduced the "generation" concept in pgstats entries,
incremented a counter when a pgstats entry is reinitialized, but it did
not count on the fact that backends still holding local references to
such entries need to be refreshed if the cache age is outdated.  The
previous logic only updated local references when an entry was dropped,
but it needs also to consider entries that are reinitialized.

This matters for replication slot stats (as well as custom pgstats kinds
in 18~), where concurrent drops and creates of a slot could cause
incorrect stats to be locally referenced.  This would lead to an
assertion failure at shutdown when writing out the stats file, as the
backend holding an outdated local reference would not be able to drop
during its shutdown sequence the stats entry that should be dropped, as
the last process holding a reference to the stats entry.  The
checkpointer was then complaining about such an entry late in the
shutdown sequence, after the shutdown checkpoint is finished with the
control file updated, causing the stats file to not be generated.  In
non-assert builds, the entry would just be skipped with the stats file
written.

Note that only logical replication slots use statistics.

A test case based on TAP is added to test_decoding, where a persistent
connection peeking at a slot's data is kept with concurrent drops and
creates of the same slot.  This is based on the isolation test case that
Anton has sent.  As it requires a node shutdown with a check to make
sure that the stats file is written with this specific sequence of
events, TAP is used instead.

Reported-by: Anton A. Melnikov
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru
Backpatch-through: 15
2024-12-09 10:46:03 +09:00
David Rowley
9d5ce4f1a0 Fix possible crash during WindowAgg evaluation
When short-circuiting WindowAgg node evaluation on the top-level
WindowAgg node using quals on monotonic window functions, because the
WindowAgg run condition can mean there's no need to evaluate subsequent
window function results in the same partition once the run condition
becomes false, it was possible that the executor would use stale results
from the previous invocation of the window function in some cases.

A fix for this was partially done by a5832722, but that commit only
fixed the issue for non-top-level WindowAgg nodes.  I mistakenly thought
that the top-level WindowAgg didn't have this issue, but Jayesh's example
case clearly shows that's incorrect.  At the time, I also thought that
this only affected 32-bit systems as all window functions which then
supported run conditions returned BIGINT, however, that's wrong as
ExecProject is still called and that could cause evaluation of any other
window function belonging to the same WindowAgg node, one of which may
return a byref type.

The only queries affected by this are WindowAggs with a "Run Condition"
which contains at least one window function with a byref result type,
such as lead() or lag() on a byref column.  The window clause must also
contain a PARTITION BY clause (without a PARTITION BY, execution of the
WindowAgg stops immediately when the run condition becomes false and
there's no risk of using the stale results).

Reported-by: Jayesh Dehankar
Discussion: https://postgr.es/m/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com
Backpatch-through: 15, where WindowAgg run conditions were added
2024-12-09 14:24:07 +13:00
Tom Lane
ec7b89cc53 Ensure that pg_amop/amproc entries depend on their lefttype/righttype.
Usually an entry in pg_amop or pg_amproc does not need a dependency on
its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types,
because there is an indirect dependency via the argument types of its
referenced operator or procedure, or via the opclass it belongs to.
However, for some support procedures in some index AMs, the argument
types of the support procedure might not mention the column data type
at all.  Also, the amop/amproc entry might be treated as "loose" in
the opfamily, in which case it lacks a dependency on any particular
opclass; or it might be a cross-type entry having a reference to a
datatype that is not its opclass' opcintype.

The upshot of all this is that there are cases where a datatype can
be dropped while leaving behind amop/amproc entries that mention it,
because there is no path in pg_depend showing that those entries
depend on that type.  Such entries are harmless in normal activity,
because they won't get used, but they cause problems for maintenance
actions such as dropping the operator family.  They also cause pg_dump
to produce bogus output.  The previous commit put a band-aid on the
DROP OPERATOR FAMILY failure, but a real fix is needed.

To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry
depends on its lefttype/righttype.  To avoid bloating pg_depend too
much, skip this if the referenced operator or function has that type
as an input type.  (I did not bother with considering the possible
indirect dependency via the opclass' opcintype; at least in the
reported case, that wouldn't help anyway.)

Probably, the reason this has escaped notice for so long is that
add-on datatypes and relevant opclasses/opfamilies are usually
packaged as extensions nowadays, so that there's no way to drop
a type without dropping the referencing opclasses/opfamilies too.
Still, in the absence of pg_depend entries there's nothing that
constrains DROP EXTENSION to drop the opfamily entries before the
datatype, so it seems possible for a DROP failure to occur anyway.

The specific case that was reported doesn't fail in v13, because
v13 prefers to attach the support procedure to the opclass not the
opfamily.  But it's surely possible to construct other edge cases
that do fail in v13, so patch that too.

Per report from Yoran Heling.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 15:56:28 -05:00
Tom Lane
5b44a317ae Make getObjectDescription robust against dangling amproc type links.
Yoran Heling reported a case where a data type could be dropped
while references to its OID remain behind in pg_amproc.  This
causes getObjectDescription to fail, which blocks dropping the
operator family (since our DROP code likes to construct descriptions
of everything it's dropping).  The proper fix for this requires
adding more pg_depend entries.  But to allow DROP to go through with
already-corrupt catalogs, tweak getObjectDescription to print "???"
for the type instead of failing when it processes such an entry.

I changed the logic for pg_amop similarly, for consistency,
although it is not known that the problem can manifest in pg_amop.

Per report from Yoran Heling.  Back-patch to all supported
branches (although the problem may be unreachable in v13).

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 14:28:16 -05:00
Tom Lane
765f76d8cd Fix is_digit labeling of to_timestamp's FFn format codes.
These format codes produce or consume strings of digits, so they
should be labeled with is_digit = true, but they were not.
This has effect in only one place, where is_next_separator()
is checked to see if the preceding format code should slurp up
all the available digits.  Thus, with a format such as '...SSFF3'
with remaining input '12345', the 'SS' code would consume all
five digits (and then complain about seconds being out of range)
when it should eat only two digits.

Per report from Nick Davies.  This bug goes back to d589f9446
where the FFn codes were introduced, so back-patch to v13.

Discussion: https://postgr.es/m/AM8PR08MB6356AC979252CFEA78B56678B6312@AM8PR08MB6356.eurprd08.prod.outlook.com
2024-12-07 13:12:32 -05:00
John Naylor
83ce20d671 Fix use-after-free in parallel_vacuum_reset_dead_items
parallel_vacuum_reset_dead_items used a local variable to hold a
pointer from the passed vacrel, purely as a shorthand. This pointer
was later freed and a new allocation was made and stored to the
struct. Then the local pointer was mistakenly referenced again.

This apparently happened not to break anything since the freed chunk
would have been put on the context's freelist, so it was accidentally
the same pointer anyway, in which case the DSA handle was correctly
updated. The minimal fix is to change two places so they access
dead_items through the vacrel. This coding style is a maintenance
hazard, so while at it get rid of most other similar usages, which
were inconsistently used anyway.

Analysis and patch by Vallimaharajan G, with further defensive coding
by me

Backpath to v17, when TidStore came in

Discussion: https://postgr.es/m/1936493cc38.68cb2ef27266.7456585136086197135@zohocorp.com
2024-12-04 16:59:12 +07:00
Álvaro Herrera
9abdc1841e
Fix synchronized_standby_slots GUC check hook
The validate_sync_standby_slots subroutine requires an LWLock, so it
cannot run in processes without PGPROC; skip it there to avoid a crash.

This replaces the current test for ReplicationSlotCtl being not null,
which appears to be a solution for the same problem but less general.
I also rewrote a related comment that mentioned ReplicationSlotCtl in
StandbySlotsHaveCaughtup.

This code came in with commit bf279ddd1c28; backpatch to 17.

Reported-by: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Discussion: https://postgr.es/m/202411281216.sutbxtr6idnn@alvherre.pgsql
2024-12-03 17:50:57 +01:00
Álvaro Herrera
5ffbbcfa16
Drop "Lock" suffix from LWLock wait event names
Commit da952b415f44 unintentially reverted the SQL-visible part of
commit 14a910109126, which breaks queries joining pg_wait_events with
pg_stat_acivity.  Remove the suffix again.

Backpatch to 17.

Reported-by: Christophe Courtois <christophe.courtois@dalibo.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/18728-450924477056a339%40postgresql.org
Discussion: https://postgr.es/m/Z01w1+LihtRiS0Te@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-03 15:50:03 +01:00
Thomas Munro
d4ffbf47b2 RelationTruncate() must set DELAY_CHKPT_START.
Previously, it set only DELAY_CHKPT_COMPLETE. That was important,
because it meant that if the XLOG_SMGR_TRUNCATE record preceded a
XLOG_CHECKPOINT_ONLINE record in the WAL, then the truncation would also
happen on disk before the XLOG_CHECKPOINT_ONLINE record was
written.

However, it didn't guarantee that the sync request for the truncation
was processed before the XLOG_CHECKPOINT_ONLINE record was written. By
setting DELAY_CHKPT_START, we guarantee that if an XLOG_SMGR_TRUNCATE
record is written to WAL before the redo pointer of a concurrent
checkpoint, the sync request queued by that operation must be processed
by that checkpoint, rather than being left for the following one.

This is a refinement of commit 412ad7a5563.  Back-patch to all supported
releases, like that commit.

Author: Robert Haas <robertmhaas@gmail.com>
Reported-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKG%2B-2rjGZC2kwqr2NMLBcEBp4uf59QT1advbWYF_uc%2B0Aw%40mail.gmail.com
2024-12-03 10:19:47 +13:00
Tom Lane
a963abd549 Fix broken list-munging in ecpg's remove_variables().
The loops over cursor argument variables neglected to ever advance
"prevvar".  The code would accidentally do the right thing anyway
when removing the first or second list entry, but if it had to
remove the third or later entry then it would also remove all
entries between there and the first entry.  AFAICS this would
only matter for cursors that reference out-of-scope variables,
which is a weird Informix compatibility hack; between that and
the lack of impact for short lists, it's not so surprising that
nobody has complained.  Nonetheless it's a pretty obvious bug.

It would have been more obvious if these loops used a more standard
coding style for chasing the linked lists --- this business with the
"prev" pointer sometimes pointing at the current list entry is
confusing and overcomplicated.  So rather than just add a minimal
band-aid, I chose to rewrite the loops in the same style we use
elsewhere, where the "prev" pointer is NULL until we are dealing with
a non-first entry and we save the "next" pointer at the top of the
loop.  (Two of the four loops touched here are not actually buggy,
but it seems better to make them all look alike.)

Coverity discovered this problem, but not until 2b41de4a5 added code
to free no-longer-needed arguments structs.  With that, the incorrect
link updates are possibly touching freed memory, and it complained
about that.  Nonetheless the list corruption hazard is ancient, so
back-patch to all supported branches.
2024-12-01 14:15:37 -05:00
Tom Lane
78883cd905 Avoid mislabeling of lateral references, redux.
As I'd feared, commit 5c9d8636d was still a few bricks shy of a load.
We can't just leave pulled-up lateral-reference Vars with no new
nullingrels: we have to carefully compute what subset of the
to-be-replaced Var's nullingrels apply to them, else we still get
"wrong varnullingrels" errors.  This is a bit tedious, but it looks
like we can use the nullingrel data this patch computes for other
purposes, enabling better optimization.  We don't want to inject
unnecessary plan changes into stable branches though, so leave that
idea for a later HEAD-only patch.

Patch by me, but thanks to Richard Guo for devising a test case that
broke 5c9d8636d, and for preliminary investigation about how to fix
it.  As before, back-patch to v16.

Discussion: https://postgr.es/m/E1tGn4j-0003zi-MP@gemulon.postgresql.org
2024-11-30 12:42:20 -05:00
Tom Lane
72822a99d4 Avoid mislabeling of lateral references when pulling up a subquery.
If we are pulling up a subquery that's under an outer join, and
the subquery's target list contains a strict expression that uses
both a subquery variable and a lateral-reference variable, it's okay
to pull up the expression without wrapping it in a PlaceHolderVar.
That's safe because if the subquery variable is forced to NULL
by the outer join, the expression result will come out as NULL too,
so we don't have to force that outcome by evaluating the expression
below the outer join.  It'd be correct to wrap in a PHV, but that can
lead to very significantly worse plans, since we'd then have to use
a nestloop plan to pass down the lateral reference to where the
expression will be evaluated.

However, when we do that, we should not mark the lateral reference
variable as being nulled by the outer join, because it isn't after
we pull up the expression in this way.  So the marking logic added
by cb8e50a4a was incorrect in this detail, leading to "wrong
varnullingrels" errors from the consistency-checking logic in
setrefs.c.  It seems to be sufficient to just not mark lateral
references at all in this case.  (I have a nagging feeling that more
complexity may be needed in cases where there are several levels of
outer join, but some attempts to break it with that didn't succeed.)

Per report from Bertrand Mamasam.  Back-patch to v16, as the previous
patch was.

Discussion: https://postgr.es/m/CACZ67_UA_EVrqiFXJu9XK50baEpH=ofEPJswa2kFxg6xuSw-ww@mail.gmail.com
2024-11-28 17:33:16 -05:00
Peter Eisentraut
4527b9e26d psql: Add tab completion for COPY (MERGE ...
The underlying feature for this was added in PostgreSQL 17.

Author: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxEmNjxvf1deR1zBrJbjAMeCooooLRzZ+yaaBuqDKh_6-Q@mail.gmail.com
2024-11-28 12:09:53 +01:00