1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00
Commit Graph

45582 Commits

Author SHA1 Message Date
eb124c3d6d Add TAP tests to check replication slot advance during the checkpoint
The new tests verify that logical and physical replication slots are still
valid after an immediate restart on checkpoint completion when the slot was
advanced during the checkpoint.

This commit introduces two new injection points to make these tests possible:

* checkpoint-before-old-wal-removal - triggered in the checkpointer process
  just before old WAL segments cleanup;
* logical-replication-slot-advance-segment - triggered in
  LogicalConfirmReceivedLocation() when restart_lsn was changed enough to
  point to the next WAL segment.

Discussion: https://postgr.es/m/flat/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov <v.davydov@postgrespro.ru>
Author: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17
2025-06-14 03:55:21 +03:00
ca307d5cec Keep WAL segments by slot's last saved restart LSN
The patch fixes the issue with the unexpected removal of old WAL segments
after checkpoint, followed by an immediate restart.  The issue occurs when
a slot is advanced after the start of the checkpoint and before old WAL
segments are removed at the end of the checkpoint.

The patch introduces a new in-memory state for slots: last_saved_restart_lsn,
which is used to calculate the oldest LSN for removing WAL segments. This
state is updated every time with the current restart_lsn at the moment when
the slot is saved to disk.

This fix changes the shared memory layout.  It's applied to HEAD only because
we don't have to preserve ABI compatibility during the beta stage.  Another
fix that doesn't affect the ABI is committed to back branches.

Discussion: https://postgr.es/m/1d12d2-67235980-35-19a406a0%4063439497
Author: Vitaly Davydov <v.davydov@postgrespro.ru>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
2025-06-14 03:36:04 +03:00
c45a1dba0d nbtree: _bt_readnextpage doesn't affect markPos.
_bt_readnextpage expects so->currPos.buf to be InvalidBuffer (and for
the position's page to be unlocked) when called.  However, it does not
expect there to be no pins held on any page.  In particular, so->markPos
might hold a separate pin, both before and after the call.  Fix some
comments that seemed to suggest otherwise.

Follow-up commit to commit 7c319f54, which made _bt_killitems drop pins
it acquired itself.
2025-06-13 19:58:47 -04:00
a0c7b76537 Comment fixups from 626df47ad9.
Reported-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CAHut+PspbHQmRCBL1c-opoJeTUKUaFFfUQJd2rhDZqwUrWCi7w@mail.gmail.com
2025-06-13 10:02:24 -07:00
29aaeceee2 psql: Reword help message and docs for WATCH_INTERVAL
Reword the documentation around the default value to make interaction
between WATCH_INTERVAL and the \watch command clearer.  While there,
also remove a stray parenthesis left over from a previous version of
the patch.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/c34a650b-6f8b-4da7-9ebb-b6df03ce009d@eisentraut.org
2025-06-13 15:13:09 +02:00
6e951f279b psql: Forbid use of COPY and \copy while in a pipeline
Running COPY within a pipeline can break protocol synchronization in
multiple ways.  psql is limited in terms of result processing if mixing
COPY commands with normal queries while controlling a pipeline with the
new meta-commands, as an effect of the following reasons:
- In COPY mode, the backend ignores additional Sync messages and will
not send a matching ReadyForQuery expected by the frontend.  Doing a
\syncpipeline just after COPY will leave the frontend waiting for a
ReadyForQuery message that won't be sent, leaving psql out-of-sync.
- libpq automatically sends a Sync with the Copy message which is not
tracked in the command queue, creating an unexpected synchronisation
point that psql cannot really know about.  While it is possible to track
such activity for a \copy, this cannot really be done sanely with plain
COPY queries.  Backend failures during a COPY would leave the pipeline
in an aborted state while the backend would be in a clean state, ready
to process commands.

At the end, fixing those issues would require modifications in how libpq
handles pipeline and COPY.  So, rather than implementing workarounds in
psql to shortcut the libpq internals (with command queue handling for
one), and because meta-commands for pipelines in psql are a new feature
with COPY in a pipeline having a limited impact compared to other
queries, this commit forbids the use of COPY within a pipeline to avoid
possible break of protocol synchronisation within psql.  If there is a
use-case for COPY support within pipelines in libpq, this could always
be added in the future, if necessary.

Most of the changes of this commit impacts the tests for psql pipelines,
removing the tests related to COPY.  Some TAP tests still exist for COPY
TO/FROM and \copy to/from, to check that that connections are aborted
when this operation is attempted.

Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/AC468509-06E8-4E2A-A4B1-63046A4AC6AB@postgrespro.ru
2025-06-13 10:15:17 +09:00
2c76c6ac47 Replace %llu by PRIu64 in AIO io_uring code
This is a continuation of 15a79c7311, cleaning up the AIO io_uring
code that has been committed after that while still using %llu.

The code changed here is new in v18, so cleaning things now means less
conflicts if this area of the code changes on backpatch once the 18
stable branch is created.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/aEZcGCnYFq642q8k@paquier.xyz
2025-06-13 08:59:47 +09:00
84914e964b pg_restore: Fix wrong descriptions of --with-{schema,data,statistics} options.
Commit bde2fb797a added the --with-schema, --with-data, and --with-statistics
options to pg_restore. These options control whether to restore schema, data,
or statistics if present in the archive. However, the help message and
documentation incorrectly described them as affecting what gets dumped.

This commit corrects those descriptions to clarify that the options control
restoration, not dumping.

Bug: #18952
Reported-by: TAKATSUKA Haruka <harukat@sraoss.co.jp>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: TAKATSUKA Haruka <harukat@sraoss.co.jp>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/18952-be40a620f8b1e755@postgresql.org
2025-06-12 23:25:21 +09:00
0f65f3eec4 Fix squashing algorithm for query texts
The algorithm to squash lists of constants added by commit 62d712ecfd
was a bit too simplistic; we wanted to avoid adding unnecessary
complexity, but cases like direct function calls of typecasting
functions (and others) were missed, and bogus SQL syntax was being shown
in pg_stat_statements normalized query text field.  To fix normalization
for those cases, we need the parser to transmit information about were
each list of constant values starts and ends, so add that to a couple of
nodes.  Also add a few more test cases to make sure we're doing the
right thing.

The patch initially submitted by Sami added a new private struct in
gram.y to carry the start/end information for A_Expr, but I (Álvaro)
decided that a better fix was to remove the parser indirection via the
in_expr production, and instead create separate components in the a_expr
rule.  I'm surprised that this works and doesn't require more changes,
but I assume (without checking) that the grammar used to be more complex
and got simplified at some point.

Bump catversion.

Author: Sami Imseih <samimseih@gmail.com>
Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0tRXoPG2y6bMgBCWNDt0Tn=unRerbzYM=oW0syi1=C1OA@mail.gmail.com
2025-06-12 14:21:21 +02:00
f85f6ab051 Revert support for improved tracking of nested queries
This commit reverts the two following commits:
- 499edb0974, track more precisely query locations for nested
statements.
- 06450c7b8c, a follow-up fix of 499edb0974 with query locations.
The test introduced in this commit is not reverted.  This is proving
useful to track a problem that only pgaudit was able to detect.

These prove to have issues with the tracking of SELECT statements, when
these use multiple parenthesis which is something supported by the
grammar.  Incorrect location and lengths are causing pg_stat_statements
to become confused, failing its job in query normalization with
potential out-of-bound writes because the location and the length may
not match with what can be handled.  A lot of the query patterns
discussed when this issue was reported have no test coverage in the main
regression test suite, or the recovery test 027_stream_regress.pl would
have caught the problems as pg_stat_statements is loaded by the node
running the regression tests.  A first step would be to improve the test
coverage to stress more the query normalization logic.

A different portion of this work was done in 45e0ba30fc, with the
addition of tests for nested queries.  These can be left in the tree.
They are useful to track the way inner queries are currently tracked by
PGSS with non-top-level entries, and will be useful when reconsidering
in the future the work reverted here.

Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Discussion: https://postgr.es/m/18947-cdd2668beffe02bf@postgresql.org
2025-06-12 10:08:55 +09:00
dd2ce37927 Revert "nbtree: Remove useless row compare arg."
This reverts commit 54c6ea8c81.

Further analysis has shown that the forcenonrequired row compare
behavior is in fact necessary, despite the new restrictions on
RowCompares imposed by _bt_set_startikey following commit 5f4d98d4.

Discussion: https://postgr.es/m/CAH2-Wzm3bKcz3TbHGem3_+SinEyG=VZVPbApQghp7YiZj+MM3g@mail.gmail.com
2025-06-11 18:16:15 -04:00
e1458f2f1b Revert a few small patches that were intended for version 19.
- 4c787a24e7
- 78bd364ee3
- 7a6880fadc
- 8898082a5d

Suggested-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+TgmoZ=J=PVNZUNKaxULu+KUVSt3Y-aJ1DZ9Y3Co6mu0z62jA@mail.gmail.com
Discussion: https://postgr.es/m/60e8c6d0a6c08e67f15dbbe9e53df0119c710065.camel@j-davis.com
2025-06-11 15:10:12 -07:00
b774ad4933 Add tab completion for REJECT_LIMIT option.
This addresses an oversight in commit 4ac2a9bec, which introduced the
REJECT_LIMIT option to the COPY command.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/ac23e824d1d602f113a89c91ee56fb23@oss.nttdata.com
2025-06-11 11:44:25 -07:00
7c319f5491 Make _bt_killitems drop pins it acquired itself.
Teach nbtree's _bt_killitems to leave the so->currPos page that it sets
LP_DEAD items on in whatever state it was in when _bt_killitems was
called.  In particular, make sure that so->dropPin scans don't acquire a
pin whose reference is saved in so->currPos.buf.

Allowing _bt_killitems to change so->currPos.buf like this is wrong.
The immediate consequence of allowing it is that code in _bt_steppage
(that copies so->currPos into so->markPos) will behave as if the scan is
a !so->dropPin scan.  so->markPos will therefore retain the buffer pin
indefinitely, even though _bt_killitems only needs to acquire a pin
(along with a lock) for long enough to mark known-dead items LP_DEAD.

This issue came to light following a report of a failure of an assertion
from recent commit e6eed40e.  The test case in question involves the use
of mark and restore.  An initial call to _bt_killitems takes place that
leaves so->currPos.buf in a state that is inconsistent with the scan
being so->dropPin.  A subsequent call to _bt_killitems for the same
position (following so->currPos being saved in so->markPos, and then
restored as so->currPos) resulted in the failure of an assertion that
tests that so->currPos.buf is InvalidBuffer when the scan is so->dropPin
(non-assert builds got a "resource was not closed" WARNING instead).

The same problem exists on earlier releases, though the issue is far
more subtle there.  Recent commit e6eed40e introduced the so->dropPin
field as a partial replacement for testing so->currPos.buf directly.
Earlier releases won't get an assertion failure (or buffer pin leak),
but they will allow the second _bt_killitems call from the test case to
behave as if a buffer pin was consistently held since the original call
to _bt_readpage.  This is wrong; there will have been an initial window
during which no pin was held on the so->currPos page, and yet the second
_bt_killitems call will neglect to check if so->currPos.lsn continues to
match the page's now-current LSN.

As a result of all this, it's just about possible that _bt_killitems
will set the wrong items LP_DEAD (on release branches).  This could only
happen with merge joins (the sole user of nbtree mark/restore support),
when a concurrently inserted index tuple used a recently-recycled TID
(and only when the new tuple was inserted onto the same page as a
distinct concurrently-removed tuple with the same TID).  This is exactly
the scenario that _bt_killitems' check of the page's now-current LSN
against the LSN stashed in currPos was supposed to prevent.

A follow-up commit will make nbtree completely stop conditioning whether
or not a position's pin needs to be dropped on whether the 'buf' field
is set.  All call sites that might need to drop a still-held pin will be
taught to rely on the scan-level so->dropPin field recently introduced
by commit e6eed40e.  That will make bugs of the same general nature as
this one impossible (or make them much easier to detect, at least).

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/545be1e5-3786-439a-9257-a90d30f8b849@gmail.com
Backpatch-through: 13
2025-06-11 09:17:35 -04:00
361499538c psql: Remove PARTITION BY clause in tab completion for unlogged tables
CREATE UNLOGGED TABLE was still being recommended by psql's tab
completion as a possible pattern, but the backend is rejecting this
option since e2bab2d792.

Reported-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Discussion: https://postgr.es/m/CAOzEurQZ1a+6d1K8b=+Ww1NFQVwAt9KSCQsBWXYBaPnYCenK3g@mail.gmail.com
2025-06-11 09:27:28 +09:00
137935bd11 Don't reduce output request size on non-Unix-socket connections.
Traditionally, libpq's pqPutMsgEnd has rounded down the amount-to-send
to be a multiple of 8K when it is eagerly writing some data.  This
still seems like a good idea when sending through a Unix socket, as
pipes typically have a buffer size of 8K or some fraction/multiple of
that.  But there's not much argument for it on a TCP connection, since
(a) standard MTU values are not commensurate with that, and (b) the
kernel typically applies its own packet splitting/merging logic.

Worse, our SSL and GSSAPI code paths both have API stipulations that
if they fail to send all the data that was offered in the previous
write attempt, we mustn't offer less data in the next attempt; else
we may get "SSL error: bad length" or "GSSAPI caller failed to
retransmit all data needing to be retried".  The previous write
attempt might've been pqFlush attempting to send everything in the
buffer, so pqPutMsgEnd can't safely write less than the full buffer
contents.  (Well, we could add some more state to track exactly how
much the previous write attempt was, but there's little value evident
in such extra complication.)  Hence, apply the round-down only on
AF_UNIX sockets, where we never use SSL or GSSAPI.

Interestingly, we had a very closely related bug report before,
which I attempted to fix in commit d053a879b.  But the test case
we had then seemingly didn't trigger this pqFlush-then-pqPutMsgEnd
scenario, or at least we failed to recognize this variant of the bug.

Bug: #18907
Reported-by: Dorjpalam Batbaatar <htgn.dbat.95@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18907-d41b9bcf6f29edda@postgresql.org
Backpatch-through: 13
2025-06-10 18:39:34 -04:00
8898082a5d inet_net_pton.c: use pg_ascii_tolower() rather than tolower().
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:23:20 -07:00
4c787a24e7 copyfromparse.c: use pg_ascii_tolower() rather than tolower().
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:22:57 -07:00
3feff3916e Use exported symbols list on macOS for loadable modules as well
On macOS, when building with the make system, the exported symbols
list $(SHLIB_EXPORTS) was ignored.  This was probably not intentional,
it was probably just forgotten, since that combination has never
actually been used until now (for libpq-oauth).

The meson build system handles this correctly.  Also, other platforms
have been doing this correctly.

This fixes it.  It also does a bit of refactoring to make the code
match the layout for other platforms.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/c70ca32e-b109-460d-9810-6e23ebb4473f%40eisentraut.org
2025-06-10 07:04:43 +02:00
166b4f4560 pg_restore: fix incompatibility with old directory-format dumps.
pg_restore failed to restore large objects (blobs) out of
directory-format dumps made by versions before PG v12.
That's because, due to a bug fixed in commit 548e50976, those
old versions put the wrong filename into the BLOBS TOC entry.
Said bug was harmless before v17, because we ignored the
incorrect filename field --- but commit a45c78e32 assumed it
would be correct.

Reported-by: Pavel Stehule <pavel.stehule@gmail.com>
Author: Pavel Stehule <pavel.stehule@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFj8pRCrZ=_e1Rv1N+6vDaH+6gf=9A2mE2J4RvnvKA1bLiXvXA@mail.gmail.com
Backpatch-through: 17
2025-06-08 17:06:39 -04:00
7d4667c620 Revert "postgres_fdw: Inherit the local transaction's access/deferrable modes."
We concluded that commit e5a3c9d9b is a feature rather than a fix; since
it was added after feature freeze, revert it.

Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reported-by: Michael Paquier <michael@paquier.xyz>
Reported-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/ed2296f1-1a6b-4932-b870-5bb18c2591ae%40oss.nttdata.com
2025-06-08 17:30:00 +09:00
1a857348e4 plpython: Remove obsolete test expected file
Move plpython_error_5.out to plpython_error.out, since the pre-3.5
version is no longer needed, since we raised the Python requirement to
3.6 (commit 45363fca63).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/d620e7c6-becc-4a8e-9b43-eea0da55faf2@eisentraut.org
2025-06-07 09:04:29 +02:00
5b40feab59 Improve CREATE DATABASE error message for invalid libc locale.
Discussion: https://postgr.es/m/73959a14-267b-49c1-8293-291b175682cb@manitou-mail.org
Reviewed-by: Daniel Verite <daniel@manitou-mail.org>
2025-06-06 15:28:51 -07:00
a31767fc09 Use NULL instead of 0 for pointer arguments.
Commit 5fe08c006c fixed this for calls to dshash_create().  This
commit fixes calls to dshash_attach() and dsa_create_in_place().

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aECi_gSD9JnVWQ8T%40nathan
2025-06-06 12:08:17 -05:00
304862973e Fixed signed/unsigned mismatch in test_dsm_registry.
Oversight in commit 8b2bcf3f28.

Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/aECi_gSD9JnVWQ8T%40nathan
Backpatch-through: 17
2025-06-06 11:40:52 -05:00
e6eed40e44 Avoid BufferGetLSNAtomic() calls during nbtree scans.
Delay calling BufferGetLSNAtomic() until we finish reading a page that
actually contains items that btgettuple will return to the executor.
This reduces the number of calls during plain index scans (we'll only
call BufferGetLSNAtomic() when _bt_readpage returns true), and totally
eliminates calls during index-only scans, bitmap index scans, and plain
index scans of an unlogged relation.

Currently, when checksums (or wal_log_hints) are enabled, acquiring a
page's LSN in BufferGetLSNAtomic() involves locking the buffer header
(which involves the use of spinlocks).  Testing has shown that enabling
page-level checksums causes large regressions with certain workloads,
especially on larger multi-socket systems.

The regression isn't tied to any Postgres 18 commit.  However, Postgres
18 commit 04bec894 made initdb use checksums by default, so it seems
prudent to address the problem now.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/941f0190-e3c6-4622-9ac7-c04e936e5fdb@vondra.me
Discussion: https://postgr.es/m/CAH2-Wzk-Dg5XWs_jDuiHt4_7ryrSY+n=vxmHY51EVqPDFsKXmg@mail.gmail.com
2025-06-06 10:19:44 -04:00
54c6ea8c81 nbtree: Remove useless row compare arg.
Use of a RowCompare key makes nbtree index scans ineligible to use
pstate.forcenonrequired following recent bugfix commit 5f4d98d4.
There's no longer any need for _bt_check_rowcompare to accept a
forcenonrequired argument, so remove it.
2025-06-05 14:50:43 -04:00
e6f98d8848 Avoid bogus scans of partitions when marking FKs enforced
Similar to commit cc733ed164: when an unenforced foreign key that
references a partitioned table is altered to be enforced, we scan
the constrained table based on each partition on the referenced
partitioned table.  This is bogus and likely to cause the ALTER TABLE to
fail: we must only scan the constrained table as pointing to the
top-level partitioned table.  Oversight in commit eec0040c4b.  Fix by
eliding those scans.

Author: Amul Sul <sulamul@gmail.com>
Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF1e_gPOLtsDoaE4VCgQPC8KZW_kPAjPR5Rvv4Ew=fb2A@mail.gmail.com
2025-06-05 18:39:06 +02:00
cc733ed164 Avoid bogus scans of partitions when validating FKs to partitioned tables
Validating an unvalidated foreign key that references a partitioned
table would try to queue validations for each individual partition of
the referenced table, but this is wrong: each individual partition would
not necessarily have all the referenced rows, so errors would be raised.
Avoid doing that.  The pg_constraint rows that cause this to happen are
only there to support the action triggers that implement the DELETE/
UPDATE actions of the FK, so no validating scan is necessary.

This was an oversight in commit b663b9436e.

An equivalent oversight exists for NOT ENFORCED constraints, which is
not fixed in this commit.

Author: Amul Sul <sulamul@gmail.com>
Reported-by: Antonin Houska <ah@cybertec.at>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/26983.1748418675@localhost
2025-06-05 17:17:13 +02:00
4b05ebf095 Change role names used in trigger test.
The choices made in commit 01463e1cc might pose copyright hazards,
and are more cutesy than informative anyway.

Reported-by: Noah Misch <noah@leadboat.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20250415155850.9b.nmisch@google.com
2025-06-05 11:05:53 -04:00
112e40b867 psql: fix order of join clauses when listing extensions
Commit d696406a9b added a new join to the query for extensions, but did
so in the wrong place, causing the AND clause to be applied to the wrong
join.

Author:	Suraj Kharage <suraj.kharage@enterprisedb.com>
Reviewed-By: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CAF1DzPVBrN-cmPB2zb7ZU=2J4vEF2fNdArGCG9w+9fnKq4v8tg@mail.gmail.com
2025-06-05 09:54:16 +02:00
b87163e5f3 Fix copy-pasto with process count calculation in method_io_uring.c
This commit replaces the formula used for "TotalProcs" with a call to
pgaio_uring_procs() in pgaio_uring_shmem_init() for the shared memory
initialization, which is exactly the same, removing a duplication.

pgaio_uring_procs() is used for shared memory sizing and a sanity check,
and it has some documentation explaining some reasoning behind the
formula.

Author: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/ME0P300MB044521067A1EDDA9EDEC3793B66DA@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-06-05 09:39:24 +09:00
f777d77387 Don't strip $libdir from LOAD command
Commit 4f7f7b0375 implemented the extension_control_path GUC, and to
make it work it was decided that we should strip the $libdir/ on
module_pathname from .control files, so that extensions don't need to
worry about this change.

This strip logic was implemented on expand_dynamic_library_name()
which works fine when executing the SQL functions from extensions, but
this function is also called when the LOAD command is executed, and
since the user may explicitly pass the $libdir prefix on LOAD
parameter, we should not strip in this case.

This commit fixes this issue by moving the strip logic from
expand_dynamic_library_name() to load_external_function() that is
called when the running the SQL script from extensions.

Reported-by: Evan Si <evsi@amazon.com>
Author: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Bug: #18920
Discussion: https://www.postgresql.org/message-id/flat/18920-b350b1c0a30af006%40postgresql.org
2025-06-04 11:38:12 +02:00
7f3381c7ee psql: Abort connection when using \syncpipeline after COPY TO/FROM
When the backend reads COPY data, it ignores all sync messages, as per
c01641f8ae.  With psql pipelines, it is possible to manually send sync
messages with \sendpipeline which leaves the frontend in an
unrecoverable state as the backend will not send the necessary
ReadyForQuery message that is expected to feed psql result consumption
logic.

It could be possible to artificially reduce the piped_syncs and
requested_results, however libpq's state would still have queued sync
messages in its command queue, and the only way to consume those without
directly calling pqCommandQueueAdvance() is to process ReadyForQuery
messages that won't be sent since the backend ignores these.  Perhaps
this could be improved in the future, but I am not really excited about
introducing this amount of complications in libpq to manipulate the
message queues without a better use case to support it.

Hence, this patch aborts the connection if we detect excessive sync
messages after a COPY in a pipeline to avoid staying in an inconsistent
protocol state, which is the best thing we can do with pipelines in
psql for now.  Note that this change does not prevent wrapping a set
of queries inside a block made of \startpipeline and \endpipeline, only
the use of \syncpipeline for a COPY.

Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/18944-8a926c30f68387dd@postgresql.org
2025-06-04 09:01:29 +09:00
58fbfde152 Fix incorrect format placeholders 2025-06-03 21:38:04 +02:00
0e164eb9f4 Fix a pg_dump scenario for platforms where SEEK_CUR != 1.
POSIX allows such platforms.  Given the lack of complaints, we may not
currently test on such a platform.  This is new in v18 (commit
7d5c83b4e9), so no back-patch.
2025-06-03 11:18:52 -07:00
73bdcfab35 Rename log_lock_failure GUC to log_lock_failures for consistency.
This commit renames the GUC log_lock_failure to log_lock_failures
to align with the existing similar setting log_lock_waits, which uses
the plural form. This improves naming consistency across related GUCs.

Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Author: Fujii Masao <masao.fujii@gmail.com
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/7a8198b6-d5b8-4910-b41e-8d3efcbb015d@eisentraut.org
2025-06-03 10:02:55 +09:00
aa87f69c00 Disallow "=" in names of reloptions and foreign-data options.
We store values for these options as array elements with the syntax
"name=value", hence a name containing "=" confuses matters when
it's time to read the array back in.  Since validation of the
options is often done (long) after this conversion to array format,
that leads to confusing and off-point error messages.  We can
improve matters by rejecting names containing "=" up-front.

(Probably a better design would have involved pairs of array
elements, but it's too late now --- and anyway, there's no
evident use-case for option names like this.  We already
reject such names in some other contexts such as GUCs.)

Reported-by: Chapman Flack <jcflack@acm.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chapman Flack <jcflack@acm.org>
Discussion: https://postgr.es/m/6830EB30.8090904@acm.org
Backpatch-through: 13
2025-06-02 15:22:44 -04:00
31a7e175fd Correct heap vacuum boundary state setup ordering
052026c9b9 mistakenly reordered setup steps in heap_vacuum_rel(),
incorrectly moving RelationGetNumberOfBlocks() before
vacuum_get_cutoffs().

OldestXmin must be determined before RelationGetNumberOfBlocks()
calculates the number of blocks in the relation that will be vacuumed.
Otherwise tuples older than OldestXmin may be inserted into the end of
the relation into blocks that are not vacuumed. If additional tuples
newer than those inserted into unscanned blocks but older than
OldestXmin are inserted into free space earlier in the relation, the
result could be advancing pg_class.relfrozenxid to a newer value than an
unfrozen XID in one of the unscanned heap pages.

Assigning an incorrect relfrozenxid can lead to data loss, so it is
imperative that it correctly reflect the oldest unfrozen xid.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzntqvVEdbbpqG5JqSZGuLWmy4PBfUO-OswfivKchr2gvw%40mail.gmail.com
2025-06-02 10:54:07 -04:00
fc32be3c94 Fix incorrect format placeholders
Fixes for return type of dclist_count().
2025-06-02 10:12:58 +02:00
32edf732e8 Rename gist stratnum support function
Commit 7406ab623f added a gist support function that we internally
refer to by the symbol GIST_STRATNUM_PROC.  This translated from
"well-known" strategy numbers to opfamily-specific strategy numbers.
However, we later (commit 630f9a43ce) changed this to fit into
index-AM-level compare type mapping, so this function actually now
maps from compare type to opfamily-specific strategy numbers.  So this
name is no longer fitting.

Moreover, the index AM level also supports the opposite, a function to
map from strategy number to compare type.  This is currently not
supported in gist, but one might wonder what this function is supposed
to be called when it is added.

This patch changes the naming of the gist-level functionality to be
more in line with the index-AM-level functionality.  This makes sense
because these are essentially the same thing on different levels.
This also changes the names of the externally visible functions that
are provided for use as such a support function.

Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/37ebb1d9-9036-485f-a215-e55435689917%40eisentraut.org
2025-06-02 08:41:27 +02:00
5231ed8262 Use replay LSN as target for cascading logical WAL senders
A cascading WAL sender doing logical decoding (as known as doing its
work on a standby) has been using as flush LSN the value returned by
GetStandbyFlushRecPtr() (last position safely flushed to disk).  This is
incorrect as such processes are only able to decode changes up to the
LSN that has been replayed by the startup process.

This commit changes cascading logical WAL senders to use the replay LSN,
as returned by GetXLogReplayRecPtr().  This distinction is important
particularly during shutdown, when WAL senders need to send any
remaining available data to their clients, switching WAL senders to a
caught-up state.  Using the latest flush LSN rather than the replay LSN
could cause the WAL senders to be stuck in an infinite loop preventing
them to shut down, as the startup process does not run when WAL senders
attempt to catch up, so they could keep waiting for work that would
never happen.

Backpatch down to v16, where logical decoding on standbys has been
introduced.

Author: Alexey Makhmutov <a.makhmutov@postgrespro.ru>
Reviewed-by: Ajin Cherian <itsajin@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/52138028-7246-421c-9161-4fa108b88070@postgrespro.ru
Backpatch-through: 16
2025-06-02 12:03:59 +09:00
4672b62239 Run pgindent on the previous commit.
Clean up after rearranging PG_TRY blocks.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2954090.1748723636@sss.pgh.pa.us
Backpatch-through: 13
2025-06-01 14:55:24 -04:00
c6f7f11d8f Fix edge-case resource leaks in PL/Python error reporting.
PLy_elog_impl and its subroutine PLy_traceback intended to avoid
leaking any PyObject reference counts, but their coverage of the
matter was sadly incomplete.  In particular, out-of-memory errors
in most of the string-construction subroutines could lead to
reference count leaks, because those calls were outside the
PG_TRY blocks responsible for dropping reference counts.

Fix by (a) adjusting the scopes of the PG_TRY blocks, and
(b) moving the responsibility for releasing the reference counts
of the traceback-stack objects to PLy_elog_impl.  This requires
some additional "volatile" markers, but not too many.

In passing, fix an ancient thinko: use of the "e_module_o" PyObject
was guarded by "if (e_type_s)", where surely "if (e_module_o)"
was meant.  This would only have visible consequences if the
"__name__" attribute were present but the "__module__" attribute
wasn't, which apparently never happens; but someday it might.

Rearranging the PG_TRY blocks requires indenting a fair amount
of code one more tab stop, which I'll do separately for clarity.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2954090.1748723636@sss.pgh.pa.us
Backpatch-through: 13
2025-06-01 14:48:35 -04:00
e5a3c9d9b5 postgres_fdw: Inherit the local transaction's access/deferrable modes.
Previously, postgres_fdw always 1) opened a remote transaction in READ
WRITE mode even when the local transaction was READ ONLY, causing a READ
ONLY transaction using it that references a foreign table mapped to a
remote view executing a volatile function to write in the remote side,
and 2) opened the remote transaction in NOT DEFERRABLE mode even when
the local transaction was DEFERRABLE, causing a SERIALIZABLE READ ONLY
DEFERRABLE transaction using it to abort due to a serialization failure
in the remote side.

To avoid these, modify postgres_fdw to open a remote transaction in the
same access/deferrable modes as the local transaction.  This commit also
modifies it to open a remote subtransaction in the same access mode as
the local subtransaction.

Although these issues exist since the introduction of postgres_fdw,
there have been no reports from the field.  So it seems fine to just fix
them in master only.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAPmGK16n_hcUUWuOdmeUS%2Bw4Q6dZvTEDHb%3DOP%3D5JBzo-M3QmpQ%40mail.gmail.com
2025-06-01 17:30:00 +09:00
b006bcd531 Fix MERGE into a plain inheritance parent table.
When a MERGE's target table is the parent of an inheritance tree, any
INSERT actions insert into the parent table using ModifyTableState's
rootResultRelInfo. However, there are two bugs in the way is
initialized:

1. ExecInitMerge() incorrectly uses a different ResultRelInfo entry
from ModifyTableState's resultRelInfo array to build the insert
projection, which may not be compatible with rootResultRelInfo.

2. ExecInitModifyTable() does not fully initialize rootResultRelInfo.
Specifically, ri_WithCheckOptions, ri_WithCheckOptionExprs,
ri_returningList, and ri_projectReturning are not initialized.

This can lead to crashes, or incorrect query results due to failing to
check WCO's or process the RETURNING list for INSERT actions.

Fix both these bugs in ExecInitMerge(), noting that it is only
necessary to fully initialize rootResultRelInfo if the MERGE has
INSERT actions and the target table is a plain inheritance parent.

Backpatch to v15, where MERGE was introduced.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/4rlmjfniiyffp6b3kv4pfy4jw3pciy6mq72rdgnedsnbsx7qe5@j5hlpiwdguvc
Backpatch-through: 15
2025-05-31 12:12:58 +01:00
e050af2868 Change internal plan ID type from uint64 to int64
uint64 was chosen to be consistent with the type used by the query ID,
but the conclusion of a recent discussion for the query ID is that int64
is a better fit as the signed form is shown to the user, for PGSS or
EXPLAIN outputs.

This commit changes the plan ID to use int64, following c3eda50b06
that has done the same for the query ID.

The plan ID is new to v18, introduced in 2a0cd38da5.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aCvzJNwetyEI3Sgo@paquier.xyz
2025-05-31 09:40:45 +09:00
706054b11b Ensure we have a snapshot when updating various system catalogs.
A few places that access system catalogs don't set up an active
snapshot before potentially accessing their TOAST tables.  To fix,
push an active snapshot just before each section of code that might
require accessing one of these TOAST tables, and pop it shortly
afterwards.  While at it, this commit adds some rather strict
assertions in an attempt to prevent such issues in the future.

Commit 16bf24e0e4 recently removed pg_replication_origin's TOAST
table in order to fix the same problem for that catalog.  On the
back-branches, those bugs are left in place.  We cannot easily
remove a catalog's TOAST table on released major versions, and only
replication origins with extremely long names are affected.  Given
the low severity of the issue, fixing older versions doesn't seem
worth the trouble of significantly modifying the patch.

Also, on v13 and v14, the aforementioned strict assertions have
been omitted because commit 2776922201, which added
HaveRegisteredOrActiveSnapshot(), was not back-patched.  While we
could probably back-patch it now, I've opted against it because it
seems unlikely that new TOAST snapshot issues will be introduced in
the oldest supported versions.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18127-fe54b6a667f29658%40postgresql.org
Discussion: https://postgr.es/m/18309-c0bf914950c46692%40postgresql.org
Discussion: https://postgr.es/m/ZvMSUPOqUU-VNADN%40nathan
Backpatch-through: 13
2025-05-30 15:17:28 -05:00
d98cefe114 Allow larger packets during GSSAPI authentication exchange.
Our GSSAPI code only allows packet sizes up to 16kB.  However it
emerges that during authentication, larger packets might be needed;
various authorities suggest 48kB or 64kB as the maximum packet size.
This limitation caused login failure for AD users who belong to many
AD groups.  To add insult to injury, we gave an unintelligible error
message, typically "GSSAPI context establishment error: The routine
must be called again to complete its function: Unknown error".

As noted in code comments, the 16kB packet limit is effectively a
protocol constant once we are doing normal data transmission: the
GSSAPI code splits the data stream at those points, and if we change
the limit then we will have cross-version compatibility problems
due to the receiver's buffer being too small in some combinations.
However, during the authentication exchange the packet sizes are
not determined by us, but by the underlying GSSAPI library.  So we
might as well just try to send what the library tells us to.
An unpatched recipient will fail on a packet larger than 16kB,
but that's not worse than the sender failing without even trying.
So this doesn't introduce any meaningful compatibility problem.

We still need a buffer size limit, but we can easily make it be
64kB rather than 16kB until transport negotiation is complete.
(Larger values were discussed, but don't seem likely to add
anything.)

Reported-by: Chris Gooch <cgooch@bamfunds.com>
Fix-suggested-by: Jacob Champion <jacob.champion@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/DS0PR22MB5971A9C8A3F44BCC6293C4DABE99A@DS0PR22MB5971.namprd22.prod.outlook.com
Backpatch-through: 13
2025-05-30 12:55:15 -04:00
961553daf5 Make XactLockTableWait() and ConditionalXactLockTableWait() interruptable more.
Previously, XactLockTableWait() and ConditionalXactLockTableWait() could enter
a non-interruptible loop when they successfully acquired a lock on a transaction
but the transaction still appeared to be running. Since this loop continued
until the transaction completed, it could result in long, uninterruptible waits.

Although this scenario is generally unlikely since XactLockTableWait() and
ConditionalXactLockTableWait() can basically acquire a transaction lock
only when the transaction is not running, it can occur in a hot standby.
In such cases, the transaction may still appear active due to
the KnownAssignedXids list, even while no lock on the transaction exists.
For example, this situation can happen when creating a logical replication
slot on a standby.

The cause of the non-interruptible loop was the absence of CHECK_FOR_INTERRUPTS()
within it. This commit adds CHECK_FOR_INTERRUPTS() to the loop in both functions,
ensuring they can be interrupted safely.

Back-patch to all supported branches.

Author: Kevin K Biju <kevinkbiju@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAM45KeELdjhS-rGuvN=ZLJ_asvZACucZ9LZWVzH7bGcD12DDwg@mail.gmail.com
Backpatch-through: 13
2025-05-31 00:08:40 +09:00