1
0
mirror of https://github.com/postgres/postgres.git synced 2025-10-18 04:29:09 +03:00
Commit Graph

46205 Commits

Author SHA1 Message Date
Nathan Bossart
3ef2b863a3 Use PqMsg_* macros in fe-protocol3.c.
Oversight in commit f4b54e1ed9.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Discussion: https://postgr.es/m/aKx5vEbbP03JNgtp%40nathan
2025-08-25 11:08:26 -05:00
Peter Eisentraut
878656dbde Formatting cleanup of guc_tables.c
This cleans up a few minor formatting inconsistencies.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org
2025-08-25 09:10:27 +02:00
Noah Misch
ad4412480d Rewrite previous commit's test for TestUpgradeXversion compatibility.
v17 introduced the MAINTAIN ON TABLES privilege.  That changed the
applicable "baseacls" reaching buildACLCommands().  That yielded
spurious TestUpgradeXversion diffs.  Change to use a TYPES privilege.
Types have the same one privilege in all supported versions, so they
avoid the problem.  Per buildfarm.  Back-patch to v13, like that commit.

Discussion: https://postgr.es/m/20250823144505.88.nmisch@google.com
Backpatch-through: 13
2025-08-23 16:46:20 -07:00
Noah Misch
b61a5c4bed Sort DO_DEFAULT_ACL dump objects independent of OIDs.
Commit 0decd5e89d missed DO_DEFAULT_ACL,
leading to assertion failures, potential dump order instability, and
spurious schema diffs.  Back-patch to v13, like that commit.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/d32aaa8d-df7c-4f94-bcb3-4c85f02bea21@gmail.com
Backpatch-through: 13
2025-08-22 20:50:28 -07:00
Alexander Korotkov
c13070a27b Revert "Get rid of WALBufMappingLock"
This reverts commit bc22dc0e0d.
It appears that conditional variables are not suitable for use inside
critical sections.  If WaitLatch()/WaitEventSetWaitBlock() face postmaster
death, they exit, releasing all locks instead of PANIC.  In certain
situations, this leads to data corruption.

Reported-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/B3C69B86-7F82-4111-B97F-0005497BB745%40yandex-team.ru
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Yura Sokolov <y.sokolov@postgrespro.ru>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Backpatch-through: 18
2025-08-22 19:26:38 +03:00
Nathan Bossart
b63952a781 vacuumdb: Fix --missing-stats-only with virtual generated columns.
Statistics aren't created for virtual generated columns, so
"vacuumdb --missing-stats-only" always chooses to analyze tables
that have them.  To fix, modify vacuumdb's query for retrieving
relations that are missing statistics to exclude those columns.

Oversight in commit edba754f05.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/20250820104226.8ba51e43164cd590b863ce41%40sraoss.co.jp
Backpatch-through: 18
2025-08-22 11:11:28 -05:00
Heikki Linnakangas
807ee417e5 Revert unnecessary check for NULL
Jelte pointed out that this was unnecessary, but I failed to remove it
before pushing f6f0542266. Oops.

Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/CAGECzQT%3DxNV-V%2BvFC7YQwYQMj0wGN61b3p%3DJ1_rL6M0vbjTtrA@mail.gmail.com
Backpatch-through: 18
2025-08-22 14:47:19 +03:00
Heikki Linnakangas
e411a8d25a libpq: Be strict about cancel key lengths
The protocol documentation states that the maximum length of a cancel
key is 256 bytes. This starts checking for that limit in libpq.
Otherwise third party backend implementations will probably start
using more bytes anyway. We also start requiring that a protocol 3.0
connection does not send a longer cancel key, to make sure that
servers don't start breaking old 3.0-only clients by accident. Finally
this also restricts the minimum key length to 4 bytes (both in the
protocol spec and in the libpq implementation).

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Jacob Champion <jchampion@postgresql.org>
Discussion: https://www.postgresql.org/message-id/df892f9f-5923-4046-9d6f-8c48d8980b50@iki.fi
Backpatch-through: 18
2025-08-22 14:39:29 +03:00
Heikki Linnakangas
f6f0542266 libpq: Handle OOM by disconnecting instead of hanging or skipping msgs
In most cases, if an out-of-memory situation happens, we attach the
error message to the connection and report it at the next
PQgetResult() call. However, there are a few cases, while processing
messages that are not associated with any particular query, where we
handled failed allocations differently and not very nicely:

- If we ran out of memory while processing an async notification,
  getNotify() either returned EOF, which stopped processing any
  further data until more data was received from the server, or
  silently dropped the notification. Returning EOF is problematic
  because if more data never arrives, e.g. because the connection was
  used just to wait for the notification, or because the next
  ReadyForQuery was already received and buffered, it would get stuck
  forever. Silently dropping a notification is not nice either.

- (New in v18) If we ran out of memory while receiving BackendKeyData
  message, getBackendKeyData() returned EOF, which has the same issues
  as in getNotify().

- If we ran out of memory while saving a received a ParameterStatus
  message, we just skipped it. A later call to PQparameterStatus()
  would return NULL, even though the server did send the status.

Change all those cases to terminate the connnection instead. Our
options for reporting those errors are limited, but it seems better to
terminate than try to soldier on. Applications should handle
connection loss gracefully, whereas silently missing a notification,
parameter status, or cancellation key could cause much weirder
problems.

This also changes the error message on OOM while expanding the input
buffer. It used to report "cannot allocate memory for input buffer",
followed by "lost synchronization with server: got message type ...".
The "lost synchronization" message seems unnecessary, so remove that
and report only "cannot allocate memory for input buffer". (The
comment speculated that the out of memory could indeed be caused by
loss of sync, but that seems highly unlikely.)

This evolved from a more narrow patch by Jelte Fennema-Nio, which was
reviewed by Jacob Champion.

Somewhat arbitrarily, backpatch to v18 but no further. These are
long-standing issues, but we haven't received any complaints from the
field. We can backpatch more later, if needed.

Co-authored-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Jacob Champion <jchampion@postgresql.org>
Discussion: https://www.postgresql.org/message-id/df892f9f-5923-4046-9d6f-8c48d8980b50@iki.fi
Backpatch-through: 18
2025-08-22 14:39:25 +03:00
Heikki Linnakangas
661f821ef0 Use ereport() rather than elog()
Noah pointed this out before I committed 50f770c3d9, but I
accidentally pushed the old version with elog() anyway. Oops.

Reported-by: Noah Misch <noah@leadboat.com>
Discussion: https://www.postgresql.org/message-id/20250820003756.31.nmisch@google.com
2025-08-22 13:35:05 +03:00
Heikki Linnakangas
50f770c3d9 Revert GetTransactionSnapshot() to return historic snapshot during LR
Commit 1585ff7387 changed GetTransactionSnapshot() to throw an error
if it's called during logical decoding, instead of returning the
historic snapshot. I made that change for extra protection, because a
historic snapshot can only be used to access catalog tables while
GetTransactionSnapshot() is usually called when you're executing
arbitrary queries. You might get very subtle visibility problems if
you tried to use the historic snapshot for arbitrary queries.

There's no built-in code in PostgreSQL that calls
GetTransactionSnapshot() during logical decoding, but it turns out
that the pglogical extension does just that, to evaluate row filter
expressions. You would get weird results if the row filter runs
arbitrary queries, but it is sane as long as you don't access any
non-catalog tables. Even though there are no checks to enforce that in
pglogical, a typical row filter expression does not access any tables
and works fine. Accessing tables marked with the user_catalog_table =
true option is also OK.

To fix pglogical with row filters, and any other extensions that might
do similar things, revert GetTransactionSnapshot() to return a
historic snapshot during logical decoding.

To try to still catch the unsafe usage of historic snapshots, add
checks in heap_beginscan() and index_beginscan() to complain if you
try to use a historic snapshot to scan a non-catalog table. We're very
close to the version 18 release however, so add those new checks only
in master.

Backpatch-through: 18
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://www.postgresql.org/message-id/20250809222338.cc.nmisch@google.com
2025-08-22 13:07:46 +03:00
Peter Eisentraut
16a0039dc0 Reduce lock level for ALTER DOMAIN ... VALIDATE CONSTRAINT
Reduce from ShareLock to ShareUpdateExclusivelock.  Validation during
ALTER DOMAIN ... ADD CONSTRAINT keeps using ShareLock.

Example:

    create domain d1 as int;
    create table t (a d1);
    alter domain d1 add constraint cc10 check (value > 10) not valid;

    begin;
    alter domain d1 validate constraint cc10;

    -- another session
    insert into t values (8);

Now we should still be able to perform DML operations on table t while
the domain constraint is being validated.  The equivalent works
already on table constraints.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxHz92A88NLRTA2msgE2dpXpE-EoZ2QO61od76-6bfqurA%40mail.gmail.com
2025-08-22 08:56:11 +02:00
Michael Paquier
13b935cd52 Change dynahash.c and hsearch.h to use int64 instead of long
This code was relying on "long", which is signed 8 bytes everywhere
except on Windows where it is 4 bytes, that could potentially expose it
to overflows, even if the current uses in the code are fine as far as I
know.  This code is now able to rely on the same sizeof() variable
everywhere, with int64.  long was used for sizes, partition counts and
entry counts.

Some callers of the dynahash.c routines used long declarations, that can
be cleaned up to use int64 instead.  There was one shortcut based on
SIZEOF_LONG, that can be removed.  long is entirely removed from
dynahash.c and hsearch.h.

Similar work was done in b1e5c9fa9a.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aKQYp-bKTRtRauZ6@paquier.xyz
2025-08-22 11:59:02 +09:00
Michael Paquier
ef03ea01fe Ignore temporary relations in RelidByRelfilenumber()
Temporary relations may share the same RelFileNumber with a permanent
relation, or other temporary relations associated with other sessions.

Being able to uniquely identify a temporary relation would require
RelidByRelfilenumber() to know about the proc number of the temporary
relation it wants to identify, something it is not designed for since
its introduction in f01d1ae3a1.

There are currently three callers of RelidByRelfilenumber():
- autoprewarm.
- Logical decoding, reorder buffer.
- pg_filenode_relation(), that attempts to find a relation OID based on
a tablespace OID and a RelFileNumber.

This makes the situation problematic particularly for the first two
cases, leading to the possibility of random ERRORs due to
inconsistencies that temporary relations can create in the cache
maintained by RelidByRelfilenumber().  The third case should be less of
an issue, as I suspect that there are few direct callers of
pg_filenode_relation().

The window where the ERRORs are happen is very narrow, requiring an OID
wraparound to create a lookup conflict in RelidByRelfilenumber() with a
temporary table reusing the same OID as another relation already cached.
The problem is easier to reach in workloads with a high OID consumption
rate, especially with a higher number of temporary relations created.

We could get pg_filenode_relation() and RelidByRelfilenumber() to work
with temporary relations if provided the means to identify them with an
optional proc number given in input, but the years have also shown that
we do not have a use case for it, yet.  Note that this could not be
backpatched if pg_filenode_relation() needs changes.  It is simpler to
ignore temporary relations.

Reported-by: Shenhao Wang <wangsh.fnst@fujitsu.com>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-By: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-By: Robert Haas <robertmhaas@gmail.com>
Reviewed-By: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-By: Takamichi Osumi <osumi.takamichi@fujitsu.com>
Reviewed-By: Michael Paquier <michael@paquier.xyz>
Reviewed-By: Masahiko Sawada <sawada.mshk@gmail.com>
Reported-By: Shenhao Wang <wangsh.fnst@fujitsu.com>
Discussion: https://postgr.es/m/bbaaf9f9-ebb2-645f-54bb-34d6efc7ac42@fujitsu.com
Backpatch-through: 13
2025-08-22 09:03:59 +09:00
Peter Eisentraut
47932f3cdc Use consistent type for pgaio_io_get_id() result
The result of pgaio_io_get_id() was being assigned to a mix of int and
uint32 variables.  This fixes it to use int consistently, which seems
the most correct.  Also change the queue empty special value in
method_worker.c to -1 from UINT32_MAX.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/70c784b3-f60b-4652-b8a6-75e5f051243e%40eisentraut.org
2025-08-21 19:45:25 +02:00
Fujii Masao
12da45742c Disallow server start with sync_replication_slots = on and wal_level < logical.
Replication slot synchronization (sync_replication_slots = on)
requires wal_level to be logical. This commit prevents the server
from starting if sync_replication_slots is enabled but wal_level
is set to minimal or replica.

Failing early during startup helps users catch invalid configurations
immediately, which is important because changing wal_level requires
a server restart.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Discussion: https://postgr.es/m/CAH0PTU_pc3oHi__XESF9ZigCyzai1Mo3LsOdFyQA4aUDkm01RA@mail.gmail.com
2025-08-21 22:18:11 +09:00
Peter Eisentraut
53eff471c6 PL/Python: Add event trigger support
Allow event triggers to be written in PL/Python.  It provides a TD
dictionary with some information about the event trigger.

Author: Euler Taveira <euler@eulerto.com>
Co-authored-by: Dimitri Fontaine <dimitri@2ndQuadrant.fr>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/03f03515-2068-4f5b-b357-8fb540883c38%40app.fastmail.com
2025-08-21 09:21:11 +02:00
Peter Eisentraut
6e09c960eb PL/Python: Refactor for event trigger support
Change is_trigger type from boolean to enum.  That's a preparation for
adding event trigger support.

Author: Euler Taveira <euler@eulerto.com>
Co-authored-by: Dimitri Fontaine <dimitri@2ndQuadrant.fr>
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/03f03515-2068-4f5b-b357-8fb540883c38%40app.fastmail.com
2025-08-21 09:16:29 +02:00
Michael Paquier
e8eb98754b Apply some fat commas to commands of TAP tests
This is similar to 19c6e92b13, in order to keep the style used in the
scripts consistent for the option names and values used in commands.
The places updated in this commit have been added recently in
71ea0d6795.

These changes are cosmetic; there is no need for a backpatch.
2025-08-21 14:17:26 +09:00
Tom Lane
a67d4847a4 Fix re-execution of a failed SQLFunctionCache entry.
If we error out during execution of a SQL-language function, we will
often leave behind non-null pointers in its SQLFunctionCache's cplan
and eslist fields.  This is problematic if the SQLFunctionCache is
re-used, because those pointers will point at resources that were
released during error cleanup.  This problem escaped detection so far
because ordinarily we won't re-use an FmgrInfo+SQLFunctionCache struct
after a query error.  However, in the rather improbable case that
someone implements an opclass support function in SQL language, there
will be long-lived FmgrInfos for it in the relcache, and then the
problem is reachable after the function throws an error.

To fix, add a flag to SQLFunctionCache that tracks whether execution
escapes out of fmgr_sql, and clear out the relevant fields during
init_sql_fcache if so.  (This is going to need more thought if we ever
try to share FMgrInfos across threads; but it's very far from being
the only problem such a project will encounter, since many functions
regard fn_extra as being query-local state.)

This broke at commit 0313c5dc6; before that we did not try to re-use
SQLFunctionCache state across calls.  Hence, back-patch to v18.

Bug: #19026
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19026-90aed5e71d0c8af3@postgresql.org
Backpatch-through: 18
2025-08-20 16:09:18 -04:00
Peter Eisentraut
e9c043a11a Minor error message enhancement
In refuseDupeIndexAttach(), change from

    errdetail("Another index is already attached for partition \"%s\"."...)

to

    errdetail("Another index \"%s\" is already attached for partition \"%s\"."...)

so we can easily understand which index is already attached for
partition \"%s\".

Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CACJufxGBfykJ_1ztk9T%2BL_gLmkOSOF%2BmL9Mn4ZPydz-rh%3DLccQ%40mail.gmail.com
2025-08-20 18:14:24 +02:00
Michael Paquier
1f2e51e3c7 Fix assertion failure with replication slot release in single-user mode
Some replication slot manipulations (logical decoding via SQL,
advancing) were failing an assertion when releasing a slot in
single-user mode, because active_pid was not set in a ReplicationSlot
when its slot is acquired.

ReplicationSlotAcquire() has some logic to be able to work with the
single-user mode.  This commit sets ReplicationSlot->active_pid to
MyProcPid, to let the slot-related logic fall-through, considering the
single process as the one holding the slot.

Some TAP tests are added for various replication slot functions with the
single-user mode, while on it, for slot creation, drop, advancing, copy
and logical decoding with multiple slot types (temporary, physical vs
logical).  These tests are skipped on Windows, as direct calls of
postgres --single would fail on permission failures.  There is no
platform-specific behavior that needs to be checked, so living with this
restriction should be fine.  The CI is OK with that, now let's see what
the buildfarm tells.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Mutaamba Maasha <maasha@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966ED588A0328DAEBE8CB25F5FA2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-08-20 15:00:04 +09:00
Fujii Masao
6429e5b771 vacuumdb: Make vacuumdb --analyze-only process partitioned tables.
vacuumdb should follow the behavior of the underlying VACUUM and ANALYZE
commands. When --analyze-only is used, it ought to analyze regular tables,
materialized views, and partitioned tables, just as ANALYZE (with no explicit
target tables) does. Otherwise, it should only process regular tables and
materialized views, since VACUUM skips partitioned tables when no targets
are given.

Previously, vacuumdb --analyze-only skipped partitioned tables. This was
inconsistent, and also inconvenient after pg_upgrade, where --analyze-only
is typically used to gather missing statistics.

This commit fixes the behavior so that vacuumdb --analyze-only also processes
partitioned tables. As a result, both vacuumdb --analyze-only and
ANALYZE (with no explicit targets) now analyze regular tables,
partitioned tables, and materialized views, but not foreign tables.

Because this is a nontrivial behavior change, it is applied only to master.

Reported-by: Zechman, Derek S <Derek.S.Zechman@snapon.com>
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Co-authored-by: Mircea Cadariu <cadariu.mircea@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CO1PR04MB8281387B9AD9DE30976966BBC045A%40CO1PR04MB8281.namprd04.prod.outlook.com
2025-08-20 13:16:06 +09:00
Nathan Bossart
3eec0e6533 Fix comment for MAX_SIMUL_LWLOCKS.
This comment mentions that pg_buffercache locks all buffer
partitions simultaneously, but it hasn't done so since v10.

Oversight in commit 6e654546fb.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/aKTuAHVEuYCUmmIy%40nathan
2025-08-19 16:48:22 -05:00
Nathan Bossart
c6abf24ebf Fix misspelling of "tranche" in dsa.h.
Oversight in commit bb952c8c8b.

Discussion: https://postgr.es/m/aKOWzsCPgrsoEG1Q%40nathan
2025-08-19 10:43:15 -05:00
Peter Eisentraut
16d434d53d Add src/include/catalog/README
This just includes a link to the bki documentation, to help people get
started.

Before commit 372728b0d4, there was a README at
src/backend/catalog/README, but then this was moved to the SGML
documentation.  So this effectively puts back a link to what was
moved.  But src/include/catalog/ is probably a better location,
because that's where all the interesting files are.

Co-authored-by: Florents Tselai <florents.tselai@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+v5N400GJFJ9RyXAX7hFKbtF7vVQGvWdFWEfcSQmvVhi9xfrA@mail.gmail.com
2025-08-19 08:41:42 +02:00
Amit Kapila
aa21e49225 Fix self-deadlock during DROP SUBSCRIPTION.
The DROP SUBSCRIPTION command performs several operations: it stops the
subscription workers, removes subscription-related entries from system
catalogs, and deletes the replication slot on the publisher server.
Previously, this command acquired an AccessExclusiveLock on
pg_subscription before initiating these steps.

However, while holding this lock, the command attempts to connect to the
publisher to remove the replication slot. In cases where the connection is
made to a newly created database on the same server as subscriber, the
cache-building process during connection tries to acquire an
AccessShareLock on pg_subscription, resulting in a self-deadlock.

To resolve this issue, we reduce the lock level on pg_subscription during
DROP SUBSCRIPTION from AccessExclusiveLock to RowExclusiveLock. Earlier,
the higher lock level was used to prevent the launcher from starting a new
worker during the drop operation, as a restarted worker could become
orphaned.

Now, instead of relying on a strict lock, we acquire an AccessShareLock on
the specific subscription being dropped and re-validate its existence
after acquiring the lock. If the subscription is no longer valid, the
worker exits gracefully. This approach avoids the deadlock while still
ensuring that orphan workers are not created.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/18988-7312c868be2d467f@postgresql.org
2025-08-19 05:33:17 +00:00
Michael Paquier
a977e419ee Refactor ReadMultiXactCounts() into GetMultiXactInfo()
This provides a single entry point to access some information about the
state of MultiXacts, able to return some data about multixacts offsets
and counts.  Originally this function was only able to return some
information about the number of multixacts and multixact members,
extended here to provide some data about the oldest multixact ID in use
and the oldest offset, if known.

This change has been proposed in a patch that aims at providing more
monitoring capabilities for multixacts, and it is useful on its own.
GetMultiXactInfo() is added to multixact.h, becoming available for
out-of-core code.

Extracted from a larger patch by the same author.

Author: Naga Appani <nagnrik@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
2025-08-19 14:04:09 +09:00
Michael Paquier
9b7eb6f02e Remove useless pointer update in StatsShmemInit()
This pointer was not used after its last update.  This variable
assignment was most likely a vestige artifact of the earlier versions of
the patch set that have led to 5891c7a8ed.

This pointer update is useless, so let's remove it.  It removes one call
to pgstat_dsa_init_size(), making the code slightly easier to grasp.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aKLsu2sdpnyeuSSc@ip-10-97-1-34.eu-west-3.compute.internal
2025-08-19 09:54:18 +09:00
Richard Guo
bf9ee294e5 Simplify relation_has_unique_index_for()
Now that the only call to relation_has_unique_index_for() that
supplied an exprlist and oprlist has been removed, the loop handling
those lists is effectively dead code.  This patch removes that loop
and simplifies the function accordingly.

Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
2025-08-19 09:37:04 +09:00
Richard Guo
24225ad9aa Pathify RHS unique-ification for semijoin planning
There are two implementation techniques for semijoins: one uses the
JOIN_SEMI jointype, where the executor emits at most one matching row
per left-hand side (LHS) row; the other unique-ifies the right-hand
side (RHS) and then performs a plain inner join.

The latter technique currently has some drawbacks related to the
unique-ification step.

* Only the cheapest-total path of the RHS is considered during
unique-ification.  This may cause us to miss some optimization
opportunities; for example, a path with a better sort order might be
overlooked simply because it is not the cheapest in total cost.  Such
a path could help avoid a sort at a higher level, potentially
resulting in a cheaper overall plan.

* We currently rely on heuristics to choose between hash-based and
sort-based unique-ification.  A better approach would be to generate
paths for both methods and allow add_path() to decide which one is
preferable, consistent with how path selection is handled elsewhere in
the planner.

* In the sort-based implementation, we currently pay no attention to
the pathkeys of the input subpath or the resulting output.  This can
result in redundant sort nodes being added to the final plan.

This patch improves semijoin planning by creating a new RelOptInfo for
the RHS rel to represent its unique-ified version.  It then generates
multiple paths that represent elimination of distinct rows from the
RHS, considering both a hash-based implementation using the cheapest
total path of the original RHS rel, and sort-based implementations
that either exploit presorted input paths or explicitly sort the
cheapest total path.  All resulting paths compete in add_path(), and
those deemed worthy of consideration are added to the new RelOptInfo.
Finally, the unique-ified rel is joined with the other side of the
semijoin using a plain inner join.

As a side effect, most of the code related to the JOIN_UNIQUE_OUTER
and JOIN_UNIQUE_INNER jointypes -- used to indicate that the LHS or
RHS path should be made unique -- has been removed.  Besides, the
T_Unique path now has the same meaning for both semijoins and upper
DISTINCT clauses: it represents adjacent-duplicate removal on
presorted input.  This patch unifies their handling by sharing the
same data structures and functions.

This patch also removes the UNIQUE_PATH_NOOP related code along the
way, as it is dead code -- if the RHS rel is provably unique, the
semijoin should have already been simplified to a plain inner join by
analyzejoins.c.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/CAMbWs4-EBnaRvEs7frTLbsXiweSTUXifsteF-d3rvv01FKO86w@mail.gmail.com
2025-08-19 09:35:40 +09:00
Michael Paquier
3c07944d04 test_ddl_deparse: Rename test create_sequence_1 to create_sequence
This test was the only one named following the convention used for
alternate output files.  This was a little bit confusing when looking at
the diffs of the test, because one would think that the diffs are based
on an uncommon case, as alternate outputs are usually used for uncommon
configuration scenarios.

create_sequence_1 was the only test in the tree using such a name, and
it had no alternate output.

Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/aKLY6wCa_OInr3kY@paquier.xyz
2025-08-19 09:08:57 +09:00
Michael Paquier
24e71d53f8 Remove unneeded header declarations in multixact.c
Two header declarations were related to SQL-callable functions, that
should have been cleaned up in df9133fa63.  Some more includes can be
removed on closer inspection, so let's clean up these as well, while on
it.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/345438.1755524834@sss.pgh.pa.us
2025-08-19 08:57:20 +09:00
David Rowley
a98ccf727e Remove HASH_DEBUG output from dynahash.c
This existed in a semi broken stated from be0a66666 until 296cba276.
Recent discussion has questioned the value of having this at all as it
only outputs static information from various of the hash table's
properties when the hash table is created.

Author: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/OSCPR01MB1496650D03FA0293AB9C21416F534A@OSCPR01MB14966.jpnprd01.prod.outlook.com
2025-08-19 11:14:21 +12:00
David Rowley
05fcb9667c Use elog(DEBUG4) for dynahash.c statistics output
Previously this was being output to stderr.  This commit adjusts things
to use elog(DEBUG4).  Here we also adjust the format of the message to
add the hash table name and also put the message on a single line.  This
should make grepping the logs for this information easier.

Also get rid of the global hash table statistics.  This seems very dated
and didn't fit very well with trying to put all the statistics for a
specific hash table on a single log line.

The main aim here is to allow it so we can have at least one buildfarm
member build with HASH_STATISTICS to help prevent future changes from
breaking things in that area.  ca3891251 recently fixed some issues
here.

In passing, switch to using uint64 data types rather than longs for the
usage counters.  The long type is 32 bits on some platforms we support.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAApHDvoccvJ9CG5zx+i-EyCzJbcL5K=CzqrnL_YN59qaL5hiaw@mail.gmail.com
2025-08-19 10:57:44 +12:00
Tom Lane
5e8f05cd70 Fix missing "use Test::More" in Kerberos.pm.
Apparently the only Test::More function this script uses is
BAIL_OUT, so this omission just results in the wrong error
output appearing in the cases where it bails out.

Seems to have been an oversight in commit 9f899562d which
split Kerberos.pm out of another script.

Author: Maxim Orlov <orlovmg@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACG=ezY1Dp-S94b78nN0ZuaBGGcMUB6_nF-VyYUwPt1ArFqmGA@mail.gmail.com
Backpatch-through: 17
2025-08-18 14:54:59 -04:00
Peter Eisentraut
c61d51d500 Detect buffer underflow in get_th()
Input with zero length can result in a buffer underflow when
accessing *(num + (len - 1)), as (len - 1) would produce a negative
index.  Add an assertion for zero-length input to prevent it.

This was found by ALT Linux Team.

Reviewing the call sites shows that get_th() currently cannot be
applied to an empty string: it is always called on a string containing
a number we've just printed.  Therefore, an assertion rather than a
user-facing error message is sufficient.

Co-authored-by: Alexander Kuznetsov <kuznetsovam@altlinux.org>
Discussion: https://www.postgresql.org/message-id/flat/e22df993-cdb4-4d0a-b629-42211ebed582@altlinux.org
2025-08-18 11:03:22 +02:00
Michael Paquier
df9133fa63 Move SQL-callable code related to multixacts into its own file
A patch is under discussion to add more SQL capabilities related to
multixacts, and this move avoids bloating the file more than necessary.
This affects pg_get_multixact_members().  A side effect of this move is
the requirement to add mxstatus_to_string() to multixact.h.

Extracted from a larger patch by the same author, tweaked by me.

Author: Naga Appani <nagnrik@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
2025-08-18 14:57:55 +09:00
Michael Paquier
ba3d93b2e8 Refactor init_params() in sequence.c to not use FormData_pg_sequence_data
init_params() sets up "last_value" and "is_called" for a sequence
relation holdind its metadata, based on the sequence properties in
pg_sequences.  "log_cnt" is the third property that can be updated in
this routine for FormData_pg_sequence_data, tracking when WAL records
should be generated for a sequence after nextval() iterations.  This
routine is called when creating or altering a sequence.

This commit refactors init_params() to not depend anymore on
FormData_pg_sequence_data, removing traces of it in sequence.c, making
easier the manipulation of metadata related to sequences.  The knowledge
about "log_cnt" is replaced with a more general "reset_state" flag, to
let the caller know if the sequence state should be reset.  In the case
of in-core sequences, this relates to WAL logging.  We still need to
depend on FormData_pg_sequence.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/ZWlohtKAs0uVVpZ3@paquier.xyz
2025-08-18 11:38:44 +09:00
Michael Paquier
97ca67377a Remove md5() call from isolation test for CLUSTER and TOAST
This test was failing because MD5 computations are not supported in
these environments.  This switches the test to rely on sha256() instead,
providing the same coverage while avoiding the failure.

Oversight in f57e214d1c.  Per buildfarm members gecko, molamola,
shikra and froghopper.

Discussion: https://postgr.es/m/aKJijS2ZRfRZiYb0@paquier.xyz
2025-08-18 08:18:09 +09:00
Etsuro Fujita
5a8ab650a7 Update obsolete comments in ResultRelInfo struct.
Commit c5b7ba4e6 changed things so that the ri_RootResultRelInfo field
of this struct is set for both partitions and inheritance children and
used for tuple routing and transition capture (before that commit, it
was only set for partitions to route tuples into), but failed to update
these comments.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAPmGK14NF5CcdCmTZpxrvpvBiT0y4EqKikW1r_wAu1CEHeOmUA%40mail.gmail.com
Backpatch-through: 14
2025-08-17 19:40:00 +09:00
Michael Paquier
f57e214d1c Add isolation test for TOAST value reuse during CLUSTER
This test exercises the corner case in toast_save_datum() where CLUSTER
operations encounter duplicated TOAST references, reusing the existing
TOAST data instead of creating redundant copies.

During table rewrites like CLUSTER, both live and recently-dead versions
of a row may reference the same TOAST value.  When copying the second or
later version of such a row, the system checks if a TOAST value already
exists in the new TOAST table using toastrel_valueid_exists().  If
found, toast_save_datum() sets data_todo = 0 so as redundant data is not
stored, ensuring only one copy of the TOAST value exists in the new
table.

The test relies on a combination of UPDATE, CLUSTER, and checks of the
TOAST values used before and after the relation rewrite, to make sure
that the same values are reused across the rewrite.

This is a continuation of 69f75d6714 to make sure that this corner
case keeps working should we mess with this area of the code.

Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com>
Discussion: https://postgr.es/m/CAFAfj_E+kw5P713S8_jZyVgQAGVFfzFiTUJPrgo-TTtJJoazQw@mail.gmail.com
2025-08-17 15:20:01 +09:00
Masahiko Sawada
928da6ff12 Fix typos in comments.
Oversight in commit fd5a1a0c3e.

Author: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAHewXNmTT3M_w4NngG=6G3mdT3iJ6DdncTqV9YnGXBPHW8XYtA@mail.gmail.com
2025-08-16 01:11:40 -07:00
Masahiko Sawada
37265ca01f Fix constant when extracting timestamp from UUIDv7.
When extracting a timestamp from a UUIDv7, a conversion from
milliseconds to microseconds was using the incorrect constant
NS_PER_US instead of US_PER_MS. Although both constants have the same
value, this fix improves code clarity by using the semantically
correct constant.

Backpatch to v18, where UUIDv7 was introduced.

Author: Erik Nordström <erik@tigerdata.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CACAa4V+i07eaP6h4MHNydZeX47kkLPwAg0sqe67R=M5tLdxNuQ@mail.gmail.com
Backpatch-through: 18
2025-08-15 11:58:53 -07:00
Peter Eisentraut
8212c83939 Add TAP tests for LDAP connection parameter lookup
Add TAP tests that tests the LDAP Lookup of Connection Parameters
functionality in libpq.  Prior to this commit, LDAP test coverage only
existed for the server-side authentication functionality and for
connection service file with parameters directly specified in the
file.  The tests included here test a pg_service.conf that contains a
link to an LDAP system that contains all of the connection parameters.

Author: Andrew Jackson <andrewjackson947@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAKK5BkHixcivSCA9pfd_eUp7wkLRhvQ6OtGLAYrWC%3Dk7E76LDQ%40mail.gmail.com
2025-08-15 10:17:22 +02:00
David Rowley
296cba2760 Fix invalid format string in HASH_DEBUG code
This seems to have been broken back in be0a66666.

Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/OSCPR01MB14966E11EEFB37D7857FCEDB7F535A@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 14
2025-08-15 18:05:44 +12:00
David Rowley
ca38912512 Fix failing -D HASH_STATISTICS builds
This seems to have been broken for a few years by cc5ef90ed.

Author: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/OSCPR01MB14966E11EEFB37D7857FCEDB7F535A@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
2025-08-15 17:23:45 +12:00
David Rowley
b4632883d4 Add Asserts to validate prevbit values in bms_prev_member
bms_prev_member() could attempt to access memory outside of the words[]
array in cases where the prevbit was a number < -1 or > a->nwords *
BITS_PER_BITMAPWORD + 1.

Here we add the Asserts to help draw attention to bogus callers so we're
more likely to catch them during development.

In passing, fix wording of bms_prev_member's header comment which talks
about how we expect the callers to ensure only valid prevbit values are
used.

Author: Greg Burd <greg@burd.me>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2000A717-1FFE-4031-827B-9330FB2E9065%40getmailspring.com
2025-08-15 16:33:07 +12:00
Michael Paquier
69f75d6714 Add SQL test for TOAST value allocations on rewrite
The SQL test added in this commit check a specific code path that had no
coverage until now.  When a TOAST datum is rewritten, toast_save_datum()
has a dedicated path to make sure that a new value is allocated if it
does not exist on the TOAST table yet.

This test uses a trick with PLAIN and EXTERNAL storage, with a tuple
large enough to be toasted and small enough to fit on a page.  It is
initially stored in plain more, and the rewrite forces the tuple to be
stored externally.  The key point is that there is no value allocated
during the initial insert, and that there is one after the rewrite.  A
second pattern checked is the reuse of the same value across rewrites,
using \gset.

A set of patches under discussion is messing up with this area of the
code, so this makes sure that such rewrite cases remain consistent
across the board.

Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAFAfj_E+kw5P713S8_jZyVgQAGVFfzFiTUJPrgo-TTtJJoazQw@mail.gmail.com
2025-08-15 12:30:36 +09:00
Andres Freund
49cba82bec ci: Per-repo configuration for manually trigger tasks
We do not want to trigger some tasks by default, to avoid using too many
compute credits. These tasks have to be manually triggered to be run. But
e.g. for cfbot we do have sufficient resources, so we always want to start
those tasks.

With this commit, an individual repository can be configured to trigger
them automatically using an environment variable defined under
"Repository Settings", for example:

REPO_CI_AUTOMATIC_TRIGGER_TASKS="mingw netbsd openbsd"

This will enable cfbot to turn them on by default when running tests for the
Commitfest app.

Backpatch this back to PG 15, even though PG 15 does not have any manually
triggered task. Keeping the CI infrastructure the same seems advantageous.

Author: Andres Freund <andres@anarazel.de>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/20240413021221.hg53rvqlvldqh57i%40awork3.anarazel.de
Backpatch-through: 16
2025-08-14 11:54:03 -04:00