1
0
mirror of https://github.com/postgres/postgres.git synced 2025-10-16 17:07:43 +03:00
Commit Graph

62243 Commits

Author SHA1 Message Date
Nathan Bossart
ed1aad15e0 Move named LWLock tranche requests to shared memory.
In EXEC_BACKEND builds, GetNamedLWLockTranche() can segfault when
called outside of the postmaster process, as it might access
NamedLWLockTrancheRequestArray, which won't be initialized.  Given
the lack of reports, this is apparently unusual, presumably because
it is usually called from a shmem_startup_hook like this:

    mystruct = ShmemInitStruct(..., &found);
    if (!found)
    {
        mystruct->locks = GetNamedLWLockTranche(...);
        ...
    }

This genre of shmem_startup_hook evades the aforementioned
segfaults because the struct is initialized in the postmaster, so
all other callers skip the !found path.

We considered modifying the documentation or requiring
GetNamedLWLockTranche() to be called from the postmaster, but
ultimately we decided to simply move the request array to shared
memory (and add it to the BackendParameters struct), thereby
allowing calls outside postmaster on all platforms.  Since the main
shared memory segment is initialized after accepting LWLock tranche
requests, postmaster builds the request array in local memory first
and then copies it to shared memory later.

Given the lack of reports, back-patching seems unnecessary.

Reported-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0v1_15QPg5Sqd2Qz5rh_qcsyCeHHmRDY89xVHcy2yt5BQ%40mail.gmail.com
2025-09-11 16:13:55 -05:00
Tom Lane
a0b99fc122 Report the correct is_temporary flag for column defaults.
pg_event_trigger_dropped_objects() would report a column default
object with is_temporary = false, even if it belongs to a
temporary table.  This seems clearly wrong, so adjust it to
report the table's temp-ness.

While here, refactor EventTriggerSQLDropAddObject to make its
handling of namespace objects less messy and avoid duplication
of the schema-lookup code.  And add some explicit test coverage
of dropped-object reports for dependencies of temp tables.

Back-patch to v15.  The bug exists further back, but the
GetAttrDefaultColumnAddress function this patch depends on does not,
and it doesn't seem worth adjusting it to cope with the older code.

Author: Antoine Violin <violin.antuan@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFjUV9x3-hv0gihf+CtUc-1it0hh7Skp9iYFhMS7FJjtAeAptA@mail.gmail.com
Backpatch-through: 15
2025-09-11 17:11:57 -04:00
Álvaro Herrera
1d5800019f Improve comment about snapshot macros
The comment mistakenly had "the others" for "the other", but this
commit also reorders the comment so it matches the macros below.  Now we
describe the levels in increasing strictness.  In addition, it seems
easier to follow if we introduce one level at a time, rather than
describing two, followed by "the other" (and then jumping back to one of
the first two).  Finally, reword the sentence about the purpose of the
macros, which was slightly off-point.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+renyUp=xja80rBaB6NpY3RRdi750y046x28bo_xg29zKY72Q@mail.gmail.com
2025-09-11 19:49:57 +02:00
Álvaro Herrera
e8cec3d179 Add test for temporal referential integrity
This commit adds an isolation test showing that temporal foreign keys do
not permit referential integrity violations under concurrency, like
fk-snapshot-2.  You can show that the test fails by passing false for
detectNewRows to ri_PerformCheck in ri_restrict.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+renyUp=xja80rBaB6NpY3RRdi750y046x28bo_xg29zKY72Q@mail.gmail.com
2025-09-11 18:16:25 +02:00
Álvaro Herrera
a2b4102a21 Fill testing gap for possible referential integrity violation
This commit adds a missing isolation test for (non-PERIOD) foreign keys.
With REPEATABLE READ, one transaction can insert a referencing row while
another deletes the referenced row, and both see a valid state.  But
after they have committed, the table violates referential integrity.

If the INSERT precedes the DELETE, we use a crosscheck snapshot to see
the just-added row, so that the DELETE can raise a foreign key error.
You can see the table violate referential integrity if you change
ri_restrict to pass false for detectNewRows to ri_PerformCheck.

A crosscheck snapshot is not needed when the DELETE comes first, because
the INSERT's trigger takes a FOR KEY SHARE lock that sees the row now
marked for deletion, waits for that transaction to commit, and raises a
serialization error.  I (Paul) added a test for that too though.

We already have a similar test (in ri-triggers.spec) for SERIALIZABLE
snapshot isolation showing that you can implement foreign keys with just
pl/pgSQL, but that test does nothing to validate ri_triggers.c.  We also
have tests (in fk-snapshot.spec) for other concurrency scenarios, but
not this one: we test concurrently deleting both the referencing and
referenced row, when the constraint activates a cascade/set null action.
But those tests don't exercise ri_restrict, and the consequence of
omitting a crosscheck comparison is different: a serialization failure,
not a referential integrity violation.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Rustam ALLAKOV <rustamallakov@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+renyUp=xja80rBaB6NpY3RRdi750y046x28bo_xg29zKY72Q@mail.gmail.com
2025-09-11 18:11:46 +02:00
Peter Eisentraut
4fbe015145 Remove checks for no longer supported GCC versions
Since commit f5e0186f86 (Raise C requirement to C11), we effectively
require at least GCC version 4.7, so checks for older versions can be
removed.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/a0f817ee-fb86-483a-8a14-b6f7f5991b6e%40eisentraut.org
2025-09-11 12:05:59 +02:00
Peter Eisentraut
368c38dd47 Remove stray semicolon at global scope
The Sun Studio compiler complains about an empty declaration here.

Note for future historians:  This does not mean that this compiler is
still of current interest for anyone using PostgreSQL.  But we can let
this small fix be its parting gift.

Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/a0f817ee-fb86-483a-8a14-b6f7f5991b6e%40eisentraut.org
2025-09-11 12:03:15 +02:00
Amit Kapila
01d793698f Fix intermittent test failure introduced in 6456c6e2c4.
The test assumes that a backend will execute COMMIT PREPARED on the
publisher and hit the injection point commit-after-delay-checkpoint within
the commit critical section. This should cause the apply worker on the
subscriber to wait for the transaction to complete.

However, the test does not guarantee that the injection point is actually
triggered, creating a race condition where the apply worker may proceed
prematurely during COMMIT PREPARED.

This commit resolves the issue by explicitly waiting for the injection
point to be hit before continuing with the test, ensuring consistent and
reliable behavior.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Discussion: https://postgr.es/m/TY4PR01MB1690751D1CA8C128B0770EC6F9409A@TY4PR01MB16907.jpnprd01.prod.outlook.com
2025-09-11 09:33:48 +00:00
Dean Rasheed
2bbbb2eca9 doc: Fix indentation in func-datetime.sgml.
Incorrect indentation introduced by commit faf071b553.
2025-09-11 09:49:39 +01:00
Dean Rasheed
9c24111c4d doc: Improve description of new random(min, max) functions.
Mention that the new variants of random(min, max) are affected by
setseed(), like the original functions.

Reported-by: Marcos Pegoraro <marcos@f10.com.br>
Discussion: https://postgr.es/m/CAB-JLwb1=drA3Le6uZXDBi_tCpeS1qm6XQU7dKwac_x91Z4qDg@mail.gmail.com
2025-09-11 09:25:47 +01:00
Michael Paquier
26eadf4d2b Fix description of WAL record blocks in hash_xlog.h
hash_xlog.h included descriptions for the blocks used in WAL records
that were was not completely consistent with how the records are
generated, with one block missing for SQUEEZE_PAGE, and inconsistent
descriptions used for block 0 in VACUUM_ONE_PAGE and MOVE_PAGE_CONTENTS.

This information was incorrect since c11453ce0a, cross-checking the
logic for the record generation.

Author: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CALdSSPj1j=a1d1hVA3oabRFz0hSU3KKrYtZPijw4UPUM7LY9zw@mail.gmail.com
Backpatch-through: 13
2025-09-11 17:17:04 +09:00
Michael Paquier
c88ce73eda Fix incorrect file reference in guc.h
GucSource_Names was documented as being in guc.c, but since 0a20ff54f5
it is located in guc_tables.c.  The reference to the location of
GucSource_Names is important, as GucSource needs to be kept in sync with
GucSource_Names.

Author: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/CAKFQuwYPgAHWPYjPzK7iXzhSZ6MKR8w20_Nz7ZXpOvx=kZbs7A@mail.gmail.com
Backpatch-through: 16
2025-09-11 10:15:33 +09:00
Tom Lane
09036dc71c Avoid faulty alignment of Datums in build_sorted_items().
If sizeof(Pointer) is 4 then sizeof(SortItem) will be 12, so that
if data->numrows is odd then we placed the values array at a location
that's not a multiple of 8.  That was fine when sizeof(Datum) was also
4, but in the wake of commit 2a600a93c it makes some alignment-picky
machines unhappy.  (You need a 32-bit machine that nonetheless expects
8-byte alignment of 8-byte quantities, which is an odd-seeming
combination but it does exist outside the Intel universe.)

To fix, MAXALIGN the space allocated to the SortItem array.
In passing, let's make the "len" variable be Size not int,
just for paranoia's sake.

This code was arguably not too safe even before 2a600a93c, but at
present I don't see a strong argument for back-patching.

Reported-by: Tomas Vondra <tomas@vondra.me>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/87036018-8d70-40ad-a0ac-192b07bd7b04@vondra.me
2025-09-10 17:51:24 -04:00
Tom Lane
bdc6cfcd12 Eliminate duplicative hashtempcxt in nodeSubplan.c.
Instead of building a separate memory context that's used just
for running hash functions, make the hash functions run in the
per-tuple context of the node's innerecontext.  This saves a
little space at runtime, and it avoids needing to reset two
contexts instead of one inside buildSubPlanHash's main loop.

This largely reverts commit 133924e13.  That's safe to do now
because bf6c614a2 decoupled the evaluation context used by
TupleHashTableMatch from that used for hash function evaluation,
so that there's no longer a risk of resetting the innerecontext
too soon.

Per discussion of bug #19040, although this is not directly
a fix for that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Haiyang Li <mohen.lhy@alibaba-inc.com>
Reviewed-by: Fei Changhong <feichanghong@qq.com>
Discussion: https://postgr.es/m/19040-c9b6073ef814f48c@postgresql.org
2025-09-10 16:15:08 -04:00
Tom Lane
abdeacdb09 Fix memory leakage in nodeSubplan.c.
If the hash functions used for hashing tuples leaked any memory,
we failed to clean that up, resulting in query-lifespan memory
leakage in queries using hashed subplans.  One way that could
happen is if the values being hashed require de-toasting, since
most of our hash functions don't trouble to clean up de-toasted
inputs.

Prior to commit bf6c614a2, this leakage was largely masked
because TupleHashTableMatch would reset hashtable->tempcxt
(via execTuplesMatch).  But it doesn't do that anymore, and
that's not really the right place for this anyway: doing it
there could reset the tempcxt many times per hash lookup,
or not at all.  Instead put reset calls into ExecHashSubPlan
and buildSubPlanHash.  Along the way to that, rearrange
ExecHashSubPlan so that there's just one place to call
MemoryContextReset instead of several.

This amounts to accepting the de-facto API spec that the caller
of the TupleHashTable routines is responsible for resetting the
tempcxt adequately often.  Although the other callers seem to
get this right, it was not documented anywhere, so add a comment
about it.

Bug: #19040
Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com>
Author: Haiyang Li <mohen.lhy@alibaba-inc.com>
Reviewed-by: Fei Changhong <feichanghong@qq.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19040-c9b6073ef814f48c@postgresql.org
Backpatch-through: 13
2025-09-10 16:05:03 -04:00
Nathan Bossart
9016fa7e3b meson: Build numeric.c with -ftree-vectorize.
autoconf builds have compiled this file with -ftree-vectorize since
commit 8870917623, but meson builds seem to have missed the memo.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/aL85CeasM51-0D1h%40nathan
Backpatch-through: 16
2025-09-10 11:21:12 -05:00
Peter Eisentraut
33eec80940 Fix CREATE TABLE LIKE with not-valid check constraint
In CREATE TABLE ... LIKE, any check constraints copied from the source
table should be set to valid if they are ENFORCED (the default).

Bug introduced in commit ca87c415e2.

Author: jian he <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACJufxH%3D%2Bod8Wy0P4L3_GpapNwLUP3oAes5UFRJ7yTxrM_M5kg%40mail.gmail.com
2025-09-10 13:25:58 +02:00
Michael Paquier
e6da68a6e1 Remove dynahash.h
All the callers of my_log2() are now limited inside dynahash.c, so let's
remove this header.  The same capability is provided by pg_bitutils.h
already.

Discussion: https://postgr.es/m/CAEZATCUJPQD_7sC-wErak2CQGNa6bj2hY-mr8wsBki=kX7f2_A@mail.gmail.com
2025-09-10 14:11:50 +09:00
Michael Paquier
b1187266e0 Replace callers of dynahash.h's my_log() by equivalent in pg_bitutils.h
All the calls replaced by this commit use 4-byte integers for their
variables used in input of my_log2().  Hence, the limit against
too-large inputs does not really apply.  Thresholds are also applied, as
of:
- In nodeAgg.c, the number of partitions is limited by
HASHAGG_MAX_PARTITIONS.
- In nodeHash.c, ExecChooseHashTableSize() caps its maximum number of
buckets based on HashJoinTuple and palloc() allocation limit.
- In worker.c, the number of subxacts tracked by ApplySubXactData uses
uint32, making pg_ceil_log2_64() safe to use directly.

Several approaches have been discussed, like an integration with
thresholds in pg_bitutils.h, but it was found confusing.  This uses
Dean's idea, which gives a simpler result than what I came up with to be
able to remove dynahash.h.  dynahash.h will be removed in a follow-up
commit, removing some duplication with the ceil log2 routines.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAEZATCUJPQD_7sC-wErak2CQGNa6bj2hY-mr8wsBki=kX7f2_A@mail.gmail.com
2025-09-10 11:20:46 +09:00
Michael Paquier
8c8f7b199d Fix leak with SMgrRelations in startup process
The startup process does not process shared invalidation messages, only
sending them, and never calls AtEOXact_SMgr() which clean up any
unpinned SMgrRelations.  Hence, it is never able to free SMgrRelations
on a periodic basis, bloating its hashtable over time.

Like the checkpointer and the bgwriter, this commit takes a conservative
approach by freeing periodically SMgrRelations when replaying a
checkpoint record, either online or shutdown, so as the startup process
has a way to perform a periodic cleanup.

Issue caused by 21d9c3ee4e, so backpatch down to v17.

Author: Jingtang Zhang <mrdrivingduck@gmail.com>
Reviewed-by: Yuhang Qiu <iamqyh@gmail.com>
Discussion: https://postgr.es/m/28C687D4-F335-417E-B06C-6612A0BD5A10@gmail.com
Backpatch-through: 17
2025-09-10 07:23:05 +09:00
Nathan Bossart
d96c854dfc Fix documentation for shmem_startup_hook.
This section claims that each backend executes the
shmem_startup_hook shortly after attaching to shared memory, which
is true for EXEC_BACKEND builds, but not for others.  This commit
adds this important detail.

Oversight in commit 964152c476.

Reported-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0vEGT1eigGbVt604LkXP6mUPMwPMxQoRCbFny44w%2B9EUQ%40mail.gmail.com
Backpatch-through: 17
2025-09-09 14:35:30 -05:00
Nathan Bossart
530cfa8eb5 test_slru: Fix LWLock tranche allocation in EXEC_BACKEND builds.
Currently, test_slru's shmem_startup_hook unconditionally generates
new LWLock tranche IDs.  This is fine on non-EXEC_BACKEND builds,
where only the postmaster executes this hook, but on EXEC_BACKEND
builds, every backend executes it, too.  To fix, only generate the
tranche IDs in the postmaster process by checking the
IsUnderPostmaster variable.

This is arguably a bug fix and could be back-patched, but since the
damage is limited to some extra unused tranche IDs in a test
module, I'm not going to bother.

Reported-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0vaAuonaf12CeDddQJu5xKL%2B6xVyS%2B_q1%2BcH%3D33JXV82w%40mail.gmail.com
2025-09-09 14:09:36 -05:00
Peter Eisentraut
81a61fde84 Fix typo in comment
Author: Alexandra Wang <alexandra.wang.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAK98qZ0whQ%3Dc%2BJGXbGSEBxCtLgy6sf-YGYqsKTAGsS-wt0wj%2BA%40mail.gmail.com
2025-09-09 15:33:46 +02:00
Dean Rasheed
faf071b553 Add date and timestamp variants of random(min, max).
This adds 3 new variants of the random() function:

    random(min date, max date) returns date
    random(min timestamp, max timestamp) returns timestamp
    random(min timestamptz, max timestamptz) returns timestamptz

Each returns a random value x in the range min <= x <= max.

Author: Damien Clochard <damien@dalibo.info>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/f524d8cab5914613d9e624d9ce177d3d@dalibo.info
2025-09-09 10:39:30 +01:00
Amit Kapila
5ac3c1ac22 Fix Coverity issue reported in commit a850be2fe.
Address a potential SIGSEGV that may occur when the tablesync worker
attempts to locate a deleted row while applying changes. This situation
arises during conflict detection for update-deleted scenarios.

To prevent this crash, ensure that the operation is errored out early if
the leader apply worker is unavailable. Since the leader worker maintains
the necessary conflict detection metadata, proceeding without it serves no
purpose and risks reporting incorrect conflict type.

In the passing, improve a nearby comment.

Reported by Tom Lane as per Coverity
Author: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/334468.1757280992@sss.pgh.pa.us
2025-09-09 03:18:22 +00:00
Melanie Plageman
8ec97e78a7 Add error codes when vacuum discovers VM corruption
Commit fd6ec93bf8 and other previous work established the
principle that when an error is potentially reachable in case of on-disk
corruption but is not expected to be reached otherwise,
ERRCODE_DATA_CORRUPTED should be used. This allows log monitoring
software to search for evidence of corruption by filtering on the error
code.

Enhance the existing log messages emitted when the heap page is found to
be inconsistent with the VM by adding this error code.

Suggested-by: Andrey Borodin <x4mmm@yandex-team.ru>
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/87DD95AA-274F-4F4F-BAD9-7738E5B1F905%40yandex-team.ru
2025-09-08 17:13:31 -04:00
Jeff Davis
9af672bcb2 meson: build checksums with extra optimization flags.
Use -funroll-loops and -ftree-vectorize when building checksum.c to
match what autoconf does.

Discussion: https://postgr.es/m/a81f2f7ef34afc24a89c613671ea017e3651329c.camel@j-davis.com
Reviewed-by: Andres Freund <andres@anarazel.de>
2025-09-08 12:29:42 -07:00
Nathan Bossart
3bcfcd815e pg_upgrade: Transfer pg_largeobject_metadata's files when possible.
Commit 161a3e8b68 taught pg_upgrade to use COPY for large object
metadata for upgrades from v12 and newer, which is much faster to
restore than the proper large object commands.  For upgrades from
v16 and newer, we can take this a step further and transfer the
large object metadata files as if they were user tables.  We can't
transfer the files from older versions because the aclitem data
type (needed by pg_largeobject_metadata.lomacl) changed its storage
format in v16 (see commit 7b378237aa).  Note that this commit is
essentially a revert of commit 12a53c732c.

There are a couple of caveats.  First, we still need to COPY the
corresponding pg_shdepend rows for large objects.  Second, we need
to COPY anything in pg_largeobject_metadata with a comment or
security label, else restoring those will fail.  This means that an
upgrade in which every large object has a comment or security label
won't gain anything from this commit, but it should at least avoid
making those unusual use-cases any worse.

pg_upgrade must also take care to transfer the relfilenodes of
pg_largeobject_metadata and its index, as was done for
pg_largeobject in commits d498e052b4 and bbe08b8869.

Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aJ3_Gih_XW1_O2HF%40nathan
2025-09-08 14:19:48 -05:00
Melanie Plageman
4b5f206de2 Remove unused xl_heap_prune member, reason
f83d709760 refactored xl_heap_prune and added an unused member,
reason. While PruneReason is used when constructing this WAL record to
set the WAL record definition, it doesn't need to be stored in a
separate field in the record. Remove it.

We won't backport this, since modifying an exposed struct definition to
remove an unused field would do more harm than good.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>

Discussion: https://postgr.es/m/tvvtfoxz5ykpsctxjbzxg3nldnzfc7geplrt2z2s54pmgto27y%40hbijsndifu45
2025-09-08 14:25:10 -04:00
Robert Haas
5a170e992a Don't generate fake "*TLOCRN*" or "*TROCRN*" aliases, either.
This is just like the previous two commits, except that this fix
actually doesn't change any regression test outputs.

Author: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com
2025-09-08 12:58:07 -04:00
Robert Haas
6f79024df3 Don't generate fake "ANY_subquery" aliases, either.
This is just like the previous commit, but for a different invented
alias name.

Author: Robert Haas <rhaas@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com
2025-09-08 12:24:02 -04:00
Robert Haas
585e31fcb6 Don't generate fake "*SELECT*" or "*SELECT* %d" subquery aliases.
rte->alias should point only to a user-written alias, but in these
cases that principle was violated. Fixing this causes some regression
test output changes: wherever rte->alias previously had a value and
is now NULL, rte->eref is now set to a generated name rather than to
rte->alias; and the scheme used to generate eref names differs from
what we were doing for aliases.

The upshot is that instead of "*SELECT*" or "*SELECT* %d",
EXPLAIN will now emit "unnamed_subquery" or "unnamed_subquery_%d".
But that's a reasonable descriptor, and we were already producing
that in yet other cases, so this seems not too objectionable.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Co-authored-by: Robert Haas <rhaas@postgresql.org>
Discussion: https://postgr.es/m/CA+TgmoYSYmDA2GvanzPMci084n+mVucv0bJ0HPbs6uhmMN6HMg@mail.gmail.com
2025-09-08 11:50:33 -04:00
Melanie Plageman
3399c26554 Remove unneeded VM pin from VM replay
Previously, heap_xlog_visible() called visibilitymap_pin() even after
getting a buffer from XLogReadBufferForRedoExtended() -- which returns a
pinned buffer containing the specified block of the visibility map.

This would just have resulted in visibilitymap_pin() returning early
since the specified page was already present and pinned, but it was
confusing extraneous code, so remove it. It doesn't seem worth
backporting, though.

It appears to be an oversight in 2c03216.

While we are at it, remove two VM-related redundant asserts in the COPY
FREEZE code path. visibilitymap_set() already asserts that
PD_ALL_VISIBLE is set on the heap page and checks that the vmbuffer
contains the bits corresponding to the specified heap block, so callers
do not also need to check this.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Melanie Plageman <melanieplageman@gmail.com>
Reported-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>

Discussion: https://postgr.es/m/CALdSSPhu7WZd%2BEfQDha1nz%3DDC93OtY1%3DUFEdWwSZsASka_2eRQ%40mail.gmail.com
2025-09-08 10:22:42 -04:00
Amit Kapila
6456c6e2c4 Add test to prevent premature removal of conflict-relevant data.
A test has been added to ensure that conflict-relevant data is not
prematurely removed when a concurrent prepared transaction is being
committed on the publisher.

This test introduces an injection point that simulates the presence of a
prepared transaction in the commit phase, validating that the system
correctly delays conflict slot advancement until the transaction is fully
committed.

Additionally, the test serves as a safeguard for developers, ensuring that
the acquisition of the commit timestamp does not occur before marking
DELAY_CHKPT_IN_COMMIT in RecordTransactionCommitPrepared.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS9PR01MB16913F67856B0DA2A909788129400A@OS9PR01MB16913.jpnprd01.prod.outlook.com
2025-09-08 12:06:03 +00:00
Michael Paquier
8191e0c16a Fix corruption of pgstats shared hashtable due to OOM failures
A new pgstats entry is created as a two-step process:
- The entry is looked at in the shared hashtable of pgstats, and is
inserted if not found.
- When not found and inserted, its fields are then initialized.  This
part include a DSA chunk allocation for the stats data of the new entry.

As currently coded, if the DSA chunk allocation fails due to an
out-of-memory failure, an ERROR is generated, leaving in the pgstats
shared hashtable an inconsistent entry due to the first step, as the
entry has already been inserted in the hashtable.  These broken entries
can then be found by other backends, crashing them.

There are only two callers of pgstat_init_entry(), when loading the
pgstats file at startup and when creating a new pgstats entry.  This
commit changes pgstat_init_entry() so as we use dsa_allocate_extended()
with DSA_ALLOC_NO_OOM, making it return NULL on allocation failure
instead of failing.  This way, a backend failing an entry creation can
take appropriate cleanup actions in the shared hashtable before throwing
an error.  Currently, this means removing the entry from the shared
hashtable before throwing the error for the allocation failure.

Out-of-memory errors unlikely happen in the wild, and we do not bother
with back-patches when these are fixed, usually.  However, the problem
dealt with here is a degree worse as it breaks the shared memory state
of pgstats, impacting other processes that may look at an inconsistent
entry that a different process has failed to create.

Author: Mikhail Kot <mikhail.kot@databricks.com>
Discussion: https://postgr.es/m/CAAi9E7jELo5_-sBENftnc2E8XhW2PKZJWfTC3i2y-GMQd2bcqQ@mail.gmail.com
Backpatch-through: 15
2025-09-08 15:52:23 +09:00
Amit Kapila
1f7e9ba3ac Post-commit review fixes for 228c370868.
This commit fixes three issues:

1) When a disabled subscription is created with retain_dead_tuples set to true,
the launcher is not woken up immediately, which may lead to delays in creating
the conflict detection slot.

Creating the conflict detection slot is essential even when the subscription is
not enabled. This ensures that dead tuples are retained, which is necessary for
accurately identifying the type of conflict during replication.

2) Conflict-related data was unnecessarily retained when the subscription does
not have a table.

3) Conflict-relevant data could be prematurely removed before applying
prepared transactions on the publisher that are in the commit critical section.

This issue occurred because the backend executing COMMIT PREPARED was not
accounted for during the computation of oldestXid in the commit phase on
the publisher. As a result, the subscriber could advance the conflict
slot's xmin without waiting for such COMMIT PREPARED transactions to
complete.

We fixed this issue by identifying prepared transactions that are in the
commit critical section during computation of oldestXid in commit phase.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS9PR01MB16913DACB64E5721872AA5C02943BA@OS9PR01MB16913.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/OS9PR01MB16913F67856B0DA2A909788129400A@OS9PR01MB16913.jpnprd01.prod.outlook.com
2025-09-08 06:10:15 +00:00
Michael Paquier
43eb2c5419 Update parser README to include parse_jsontable.c
The README was missing parse_jsontable.c which handles JSON_TABLE.
Oversight in de3600452b.

Author: Karthik S <karthikselvaam@gmail.com>
Discussion: https://postgr.es/m/CAK4gQD9gdcj+vq_FZGp=Rv-W+41v8_C7cmCUmDeu=cfrOdfXEw@mail.gmail.com
Backpatch-through: 17
2025-09-08 10:07:14 +09:00
Tatsuo Ishii
06473f5a34 Allow to log raw parse tree.
This commit allows to log the raw parse tree in the same way we
currently log the parse tree, rewritten tree, and plan tree.

To avoid unnecessary log noise for users not interested in this
detail, a new GUC option, "debug_print_raw_parse", has been added.

When starting the PostgreSQL process with "-d N", and N is 3 or higher,
debug_print_raw_parse is enabled automatically, alongside
debug_print_parse.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2mcO0Gpo4vd8kPMAFWeJLSp0MeUUnaLdE1x0tSVd-VzUw%40mail.gmail.com
2025-09-06 07:49:51 +09:00
Andres Freund
2c78940527 bufmgr: Remove freelist, always use clock-sweep
This set of changes removes the list of available buffers and instead simply
uses the clock-sweep algorithm to find and return an available buffer.  This
also removes the have_free_buffer() function and simply caps the
pg_autoprewarm process to at most NBuffers.

While on the surface this appears to be removing an optimization it is in fact
eliminating code that induces overhead in the form of synchronization that is
problematic for multi-core systems.

The main reason for removing the freelist, however, is not the moderate
improvement in scalability, but that having the freelist would require
dedicated complexity in several upcoming patches. As we have not been able to
find a case benefiting from the freelist...

Author: Greg Burd <greg@burd.me>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/70C6A5B5-2A20-4D0B-BC73-EB09DD62D61C@getmailspring.com
2025-09-05 12:25:59 -04:00
Andres Freund
50e4c6ace5 bufmgr: Use consistent naming of the clock-sweep algorithm
Minor edits to comments only.

Author: Greg Burd <greg@burd.me>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/70C6A5B5-2A20-4D0B-BC73-EB09DD62D61C@getmailspring.com
2025-09-05 12:25:59 -04:00
Melanie Plageman
e3d5ddb7ca Add assert and log message to visibilitymap_set
Add an assert to visibilitymap_set() that the provided heap buffer is
exclusively locked, which is expected.

Also, enhance the debug logging message to specify which VM flags were
set.

Based on a related suggestion by Kirill Reshke on an in-progress
patchset.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CALdSSPhAU56g1gGVT0%2BwG8RrSWE6qW8TOfNJS1HNAWX6wPgbFA%40mail.gmail.com
2025-09-05 09:33:36 -04:00
Dean Rasheed
6ede13d1b5 Fix concurrent update issue with MERGE.
When executing a MERGE UPDATE action, if there is more than one
concurrent update of the target row, the lock-and-retry code would
sometimes incorrectly identify the latest version of the target tuple,
leading to incorrect results.

This was caused by using the ctid field from the TM_FailureData
returned by table_tuple_lock() in a case where the result was TM_Ok,
which is unsafe because the TM_FailureData struct is not guaranteed to
be fully populated in that case. Instead, it should use the tupleid
passed to (and updated by) table_tuple_lock().

To reduce the chances of similar errors in the future, improve the
commentary for table_tuple_lock() and TM_FailureData to make it
clearer that table_tuple_lock() updates the tid passed to it, and most
fields of TM_FailureData should not be relied on in non-failure cases.
An exception to this is the "traversed" field, which is set in both
success and failure cases.

Reported-by: Dmitry <dsy.075@yandex.ru>
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/1570d30e-2b95-4239-b9c3-f7bf2f2f8556@yandex.ru
Backpatch-through: 15
2025-09-05 08:18:18 +01:00
Michael Paquier
567d27e8e2 Fix outdated comments in slru.c
SlruRecentlyUsed() is an inline function since 53c2a97a92, not a
macro.  The description of long_segment_names was missing at the top of
SimpleLruInit(), part forgotten in 4ed8f0913b.

Author: Julien Rouhaud <rjuju123@gmail.com>
Discussion: https://postgr.es/m/aLpBLMOYwEQkaleF@jrouhaud
Backpatch-through: 17
2025-09-05 14:10:08 +09:00
Michael Paquier
4246a977ba Switch some numeric-related functions to use soft error reporting
This commit changes some functions related to the data type numeric to
use the soft error reporting rather than a custom boolean flag (called
"have_error") that callers of these functions could rely on to bypass
the generation of ERROR reports, letting the callers do their own error
handling (timestamp, jsonpath and numeric_to_char() require them).

This results in the removal of some boilerplate code that was required
to handle both the ereport() and the "have_error" code paths bypassing
ereport(), unifying everything under the soft error reporting facility.

While on it, some duplicated error messages are removed.  The function
upgraded in this commit were suffixed with "_opt_error" in their names.
They are renamed to "_safe" instead.

This change relies on d9f7f5d32f, that has introduced the soft error
reporting infrastructure.

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b96No5h5tRuR+KhcC44YcYUCw8WAHuLoqqyyop8_k3+JDQ@mail.gmail.com
2025-09-05 13:53:47 +09:00
Michael Paquier
ae45312008 Change pg_lsn_in_internal() to use soft error reporting
pg_lsn includes pg_lsn_in_internal() for the purpose of parsing a LSN
position for the GUC recovery_target_lsn (21f428ebde).  It relies on a
boolean called "have_error" that would be set when the LSN parsing
fails, then let its callers handle any errors.

d9f7f5d32f has added support for soft error reporting.  This commit
removes some boilerplate code and switches the routine to use soft error
reporting directly, giving to the callers of pg_lsn_in_internal()
the possibility to be fed the error message generated on failure.

pg_lsn_in_internal() routine is renamed to pg_lsn_in_safe(), for
consistency with other similar routines that are given an escontext.

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b96No5h5tRuR+KhcC44YcYUCw8WAHuLoqqyyop8_k3+JDQ@mail.gmail.com
2025-09-05 12:59:29 +09:00
Nathan Bossart
d814d7fc3d Revert recent change to RequestNamedLWLockTranche().
Commit 38b602b028 modified this function to allocate enough space
for MAX_NAMED_TRANCHES (256) requests, which is likely far more
than most clusters need.  This commit reverts that change so that
it first allocates enough space for only 16 requests and resizes
the array when necessary.  While at it, remove the check for too
many tranches from this function.  We can now rely on
InitializeLWLocks() to do that check via its calls to
LWLockNewTrancheId() for the named tranches.

Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aLmzwC2dRbqk14y6%40nathan
2025-09-04 15:34:48 -05:00
Peter Eisentraut
f0478149c3 Clean up newly added guc_tables.inc.c
There was a missing makefile rule to clean up the guc_tables.inc.c
symlink in src/include/.  Oversight in commit 6359989654.

Author: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/dae6fe89-1e0c-4c3f-8d92-19d23374fb10%40eisentraut.org
2025-09-04 17:25:43 +02:00
Nathan Bossart
1129d3e4c8 Adjust commentary for WaitEventLWLock in wait_event_names.txt.
In addition to changing a couple of references for clarity, this
commit combines the two similar comments.
2025-09-04 10:18:42 -05:00
Dean Rasheed
fc6600fc1c Fix replica identity check for MERGE.
When executing a MERGE, check that the target relation supports all
actions mentioned in the MERGE command. Specifically, check that it
has a REPLICA IDENTITY if it publishes updates or deletes and the
MERGE command contains update or delete actions. Failing to do this
can silently break replication.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/OS3PR01MB57180C87E43A679A730482DF94B62@OS3PR01MB5718.jpnprd01.prod.outlook.com
Backpatch-through: 15
2025-09-04 11:45:44 +01:00
Dean Rasheed
5386bfb9c1 Fix replica identity check for INSERT ON CONFLICT DO UPDATE.
If an INSERT has an ON CONFLICT DO UPDATE clause, the executor must
check that the target relation supports UPDATE as well as INSERT. In
particular, it must check that the target relation has a REPLICA
IDENTITY if it publishes updates. Formerly, it was not doing this
check, which could lead to silently breaking replication.

Fix by adding such a check to CheckValidResultRel(), which requires
adding a new onConflictAction argument. In back-branches, preserve ABI
compatibility by introducing a wrapper function with the original
signature.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/OS3PR01MB57180C87E43A679A730482DF94B62@OS3PR01MB5718.jpnprd01.prod.outlook.com
Backpatch-through: 13
2025-09-04 11:27:53 +01:00