mirror of https://github.com/postgres/postgres.git synced 2025-12-07 12:02:30 +03:00
Commit Graph

62744 Commits

Author SHA1 Message Date
Daniel Gustafsson
64527a17a5 doc: Consistently use restartpoint in the documentation
The majority of cases already used "restartpoint" with just a few
instances of "restart point". Changing the latter spelling to the
former ensures consistency in the user facing documentation. Code
comments are not affected by this since it is not worth the churn
to change anything there.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/0F6E38D0-649F-4489-B2C1-43CD937E6636@yesql.se
2025-12-03 15:22:38 +01:00
Peter Eisentraut
9790affcce Fix stray references to SubscriptRef
This type never existed.  SubscriptingRef was meant instead.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/2eaa45e3-efc5-4d75-b082-f8159f51445f%40eisentraut.org
2025-12-03 14:44:14 +01:00
Peter Eisentraut
1b2bb5077e Change Pointer to void *
The comment for the Pointer type said 'XXX Pointer arithmetic is done
with this, so it can't be void * under "true" ANSI compilers.'.  This
has been fixed in the previous commit 756a436893.  This now changes
the definition of the type from char * to void *, as envisaged by that
comment.

Extension code that relies on using Pointer for pointer arithmetic
will need to make changes similar to commit 756a436893, but those
changes would be backward compatible.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-03 10:22:17 +01:00
Peter Eisentraut
756a436893 Don't rely on pointer arithmetic with Pointer type
The comment for the Pointer type says 'XXX Pointer arithmetic is done
with this, so it can't be void * under "true" ANSI compilers.'.  This
fixes that.  Change from Pointer to use char * explicitly where
pointer arithmetic is needed.  This makes the meaning of the code
clearer locally and removes a dependency on the actual definition of
the Pointer type.  (The definition of the Pointer type is not changed
in this commit.)

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-03 09:54:15 +01:00
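
The kind of change described in 756a436893 can be sketched as follows. This is a minimal illustration, not code from the commit: the Pointer typedef below is a local stand-in and advance_bytes() is a hypothetical helper. The point is that byte offsets are applied through an explicit char * cast, so the code no longer depends on how Pointer is defined.

```c
#include <stddef.h>

/* Local stand-in for the Pointer typedef once it is defined as void *. */
typedef void *Pointer;

/*
 * Hypothetical helper: return the address 'offset' bytes past 'base'.
 * Arithmetic on void * is not valid ISO C, so cast to char * first.
 */
static Pointer
advance_bytes(Pointer base, size_t offset)
{
    return (char *) base + offset;
}
```

Extension code that previously wrote "ptr + n" on a Pointer value could use the same cast, which compiles whether Pointer is char * or void *.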
Peter Eisentraut
8c6bbd674e Use more appropriate DatumGet* function
Use DatumGetCString() instead of DatumGetPointer() for returning a C
string.  Right now, they are the same, but that doesn't always have to
be so.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-03 08:52:28 +01:00
Peter Eisentraut
623801b3bd Remove useless casts to Pointer
in arguments of memcpy() and memmove() calls

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org
2025-12-03 08:40:33 +01:00
Amit Kapila
c252d37d8c Fix shadow variable warning in subscriptioncmds.c.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsF8R0Bt4J3c92+T2F0mun0rRfK=-GH+iBv2s-O8ahJJw@mail.gmail.com
2025-12-03 03:31:31 +00:00
Nathan Bossart
a6d05c8193 Use LW_SHARED in dsa.c where possible.
Both dsa_get_total_size() and dsa_get_total_size_from_handle() take
an exclusive lock just to read a variable.  This commit reduces the
lock level to LW_SHARED in those functions.

Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/aS8fMzWs9e8iHxk2%40nathan
2025-12-02 16:40:23 -06:00
Heikki Linnakangas
cbe04e5d72 Fix amcheck's handling of half-dead B-tree pages
amcheck incorrectly reported the following error if there were any
half-dead pages in the index:

ERROR:  mismatch between parent key and child high key in index
"amchecktest_id_idx"

It's expected that a half-dead page does not have a downlink in the
parent level, so skip the test.

Reported-by: Konstantin Knizhnik <knizhnik@garret.ru>
Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru
Backpatch-through: 14
2025-12-02 21:11:15 +02:00
Heikki Linnakangas
c085aab278 Add a test for half-dead pages in B-tree indexes
To increase our test coverage in general, and because I will use this
in the next commit to test a bug we currently have in amcheck.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru
2025-12-02 21:11:05 +02:00
Heikki Linnakangas
6c05ef5729 Fix amcheck's handling of incomplete root splits in B-tree
When the root page is being split, it's normal that the root page
according to the metapage is not marked BTP_ROOT. Fix bogus error in
amcheck about that case.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi
Backpatch-through: 14
2025-12-02 21:10:51 +02:00
Heikki Linnakangas
1e4e5783e7 Add a test for incomplete splits in B-tree indexes
To increase our test coverage in general, and because I will add onto
this in the next commit to also test amcheck with incomplete splits.

This is copied from the similar test we had for GIN indexes. B-tree's
incomplete splits work similarly to GIN's, so with small changes, the
same test works for B-tree too.

Reviewed-by: Peter Geoghegan <pg@bowt.ie>
Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi
2025-12-02 21:10:47 +02:00
Nathan Bossart
f894acb24a Show size of DSAs and dshashes in pg_dsm_registry_allocations.
Presently, this view reports NULL for the size of DSAs and dshash
tables because 1) the current backend might not be attached to them
and 2) the registry doesn't save the pointers to the dsa_area or
dshash_table in local memory.  Also, the view doesn't show
partially-initialized entries to avoid ambiguity, since those
entries would report a NULL size as well.

This commit introduces a function that looks up the size of a DSA
given its handle (transiently attaching to the control segment if
needed) and teaches pg_dsm_registry_allocations to use it to show
the size of successfully-initialized DSA and dshash entries.
Furthermore, the view now reports partially-initialized entries
with a NULL size.

Reviewed-by: Rahila Syed <rahilasyed90@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aSeEDeznAsHR1_YF%40nathan
2025-12-02 10:29:45 -06:00
Álvaro Herrera
758479213d Remove doc and code comments about ON CONFLICT deficiencies
They have been fixed, so we don't need this text anymore.  This reverts
commit 8b18ed6dfb.

Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Discussion: https://postgr.es/m/CADzfLwWo+FV9WSeOah9F1r=4haa6eay1hNvYYy_WfziJeK+aLQ@mail.gmail.com
2025-12-02 16:47:18 +01:00
Álvaro Herrera
5dee7a603f Avoid use of NOTICE to wait for snapshot invalidation
This idea (implemented in commits bc32a12e0d and 9e8fa05d34) of
using notices to detect that a session is sleeping was unreliable, so
simplify the concurrency controller session to just look at
pg_stat_activity for a process sleeping on the injection point we want
it to hit.  This change allows us to remove a secondary injection point
and the alternative expected output files.

Reproduced by Alexander Lakhin following a report in buildfarm member
skink (which runs the server under valgrind).

Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/3e302c96-cdd2-45ec-af84-03dbcdccde4a@gmail.com
2025-12-02 16:43:27 +01:00
Álvaro Herrera
90eae926ab Fix ON CONFLICT with REINDEX CONCURRENTLY and partitions
When planning queries with ON CONFLICT on partitioned tables, the
indexes to consider as arbiters for each partition are determined based
on those found in the parent table.  However, it's possible for an index
on a partition to be reindexed, and in that case, the auxiliary indexes
created on the partition must be considered as arbiters as well; failing
to do that may result in spurious "duplicate key" errors given
sufficient bad luck.

We fix that in this commit by matching every index that doesn't have a
parent to each initially-determined arbiter index.  Every unparented
matching index is considered an additional arbiter index.

Closely related to the fixes in bc32a12e0d and 2bc7e886fc, and for
identical reasons, not backpatched (for now) even though it's a
longstanding issue.

Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com
2025-12-02 13:51:53 +01:00
Peter Eisentraut
4f941d432b Remove useless casting to same type
This removes some casts where the input already has the same type as
the type specified by the cast.  Their presence could cause risks of
hiding actual type mismatches in the future or silently discarding
qualifiers.  It also improves readability.  Same kind of idea as
7f798aca1d and ef8fe69360.  (This does not change all such
instances, but only those hand-picked by the author.)

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal
2025-12-02 10:09:32 +01:00
Peter Eisentraut
35988b31db Simplify hash_xlog_split_allocate_page()
Instead of complicated pointer arithmetic, overlay a uint32 array and
just access the array members.  That's safe thanks to
XLogRecGetBlockData() returning a MAXALIGNed buffer.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal
2025-12-02 09:18:54 +01:00
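
The "overlay an array" idea from 35988b31db can be sketched in isolation. This is an assumption-laden illustration rather than the actual hash_xlog_split_allocate_page() code: the Pair struct and read_pair() are hypothetical, and the caller is assumed to pass a buffer that is suitably aligned for uint32_t (which the commit says holds for XLogRecGetBlockData()).

```c
#include <stdint.h>

/* Hypothetical container for two values stored back to back in a buffer. */
typedef struct
{
    uint32_t first;
    uint32_t second;
} Pair;

/*
 * Read two consecutive uint32 values by indexing an overlaid array
 * instead of advancing a byte pointer by sizeof(uint32_t) by hand.
 * The buffer must be aligned for uint32_t.
 */
static Pair
read_pair(const void *data)
{
    const uint32_t *vals = data;
    Pair        result = {vals[0], vals[1]};

    return result;
}
```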
Peter Eisentraut
ec782f56b0 Replace pointer comparisons and assignments to literal zero with NULL
While 0 is technically correct, NULL is the semantically appropriate
choice for pointers.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/aS1AYnZmuRZ8g%2B5G%40ip-10-97-1-34.eu-west-3.compute.internal
2025-12-02 08:39:24 +01:00
Peter Eisentraut
376c649634 Update comment related to C99
One could do more work here to eliminate the Windows difference
described in the comment, but that can be a separate project.  The
purpose of this change is to update comments that might confusingly
indicate that C99 is not required.

Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/170308e6-a7a3-4484-87b2-f960bb564afa%40eisentraut.org
2025-12-02 08:20:43 +01:00
Michael Paquier
713d9a847e Update some timestamp[tz] functions to use soft-error reporting
This commit updates two functions that convert "timestamptz" to
"timestamp", and vice-versa, to use the soft error reporting rather than
their own logic to do the same.  These are now named as follows:
- timestamp2timestamptz_safe()
- timestamptz2timestamp_safe()

These functions were suffixed with "_opt_overflow", previously.

This shaves some code, as it is possible to detect how a timestamp[tz]
overflowed based on the returned value rather than a custom state.  It
is optionally possible for the callers of these functions to rely on the
error generated internally by these functions, depending on the error
context.

Similar work has been done in d03668ea05 and 4246a977ba.

Reviewed-by: Amul Sul <sulamul@gmail.com>
Discussion: https://postgr.es/m/aS09YF2GmVXjAxbJ@paquier.xyz
2025-12-02 09:30:23 +09:00
Jeff Davis
19b966243c Make regex "max_chr" depend on encoding, not provider.
The regex mechanism scans through the first "max_chr" character values
to cache character property ranges (isalpha, etc.). For single-byte
encodings, there's no sense in scanning beyond UCHAR_MAX; but for
UTF-8 it makes sense to cache higher code point values (though not all
of them; only up to MAX_SIMPLE_CHR).

Prior to 5a38104b36, the logic about how many character values to scan
was based on the pg_regex_strategy, which was dependent on the
provider. Commit 5a38104b36 preserved that logic exactly, allowing
different providers to define the "max_chr".

Now, change it to depend only on the encoding and whether
ctype_is_c. For this specific calculation, distinguishing between
providers creates more complexity than it's worth.

Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
2025-12-01 11:06:17 -08:00
Jeff Davis
99cd8890be Change some callers to use pg_ascii_toupper().
The input is ASCII anyway, so it's better to be clear that it's not
locale-dependent.

Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-01 09:24:03 -08:00
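
For readers unfamiliar with the distinction in 99cd8890be, a locale-independent uppercase conversion can be sketched like this. The function name ascii_toupper() is chosen for illustration; it only folds the ASCII range a-z and leaves every other byte unchanged, unlike the standard toupper(), whose result depends on the current locale.

```c
/*
 * Illustrative ASCII-only uppercase folding: bytes outside 'a'..'z'
 * are returned unchanged, regardless of the process locale.
 */
static unsigned char
ascii_toupper(unsigned char ch)
{
    if (ch >= 'a' && ch <= 'z')
        ch += 'A' - 'a';
    return ch;
}
```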
Álvaro Herrera
2bc7e886fc Fix ON CONFLICT ON CONSTRAINT during REINDEX CONCURRENTLY
When REINDEX CONCURRENTLY is processing the index that supports a
constraint, there are periods during which multiple indexes match the
constraint index's definition.  Those must all be included in the set of
inferred indexes for INSERT ON CONFLICT, in order to avoid spurious
"duplicate key" errors.

To fix, we set things up to match all indexes against attributes,
expressions and predicates of the constraint index, then return all
indexes that match those, rather than just the one constraint index.
This is more onerous than before, where we would just test the named
constraint for validity, but it's not more onerous than processing
"conventional" inference (where a list of attribute names etc is given).

This is closely related to the misbehaviors fixed by bc32a12e0d, for a
different situation.  We're not backpatching this one for now either,
for the same reasons.

Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com
2025-12-01 17:34:13 +01:00
Peter Eisentraut
2fcc5a7151 Fix a strict aliasing violation
This one is almost a textbook example of an aliasing violation, and it
is straightforward to fix, so clean it up.  (The warning only shows up
if you remove the -fno-strict-aliasing option.)  Also, move the code
after the error checking.  Doesn't make a difference technically, but
it seems strange to do actions before errors are checked.

Reported-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/20240724.155525.366150353176322967.ishii%40postgresql.org
2025-12-01 16:41:08 +01:00
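
The message for 2fcc5a7151 does not say how the violation was fixed. As general background, one common way to avoid a strict aliasing violation is to copy bytes with memcpy() instead of dereferencing a pointer cast to an unrelated type. The sketch below shows that technique on a hypothetical float_bits() helper and is not taken from the commit.

```c
#include <stdint.h>
#include <string.h>

/*
 * Return the bit pattern of a float.  Writing *(uint32_t *) &f would
 * access a float object through an incompatible lvalue type; copying
 * the bytes with memcpy() is well defined.
 */
static uint32_t
float_bits(float f)
{
    uint32_t    bits;

    _Static_assert(sizeof(uint32_t) == sizeof(float),
                   "this sketch assumes a 32-bit float");
    memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

Compilers typically turn such a fixed-size memcpy() into a plain register move, so the well-defined form usually costs nothing at run time.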
Michael Paquier
a87987cafc Move WAL sequence code into its own file
This split exists for most of the other RMGRs, and makes cleaner the
separation between the WAL code, the redo code and the record
description code (already in its own file) when it comes to the sequence
RMGR.  The redo and masking routines are moved to a new file,
sequence_xlog.c.  All the RMGR routines are now located in a new header,
sequence_xlog.h.

This separation is useful for a different patch related to sequences
that I have been working on, where it makes a refactoring of sequence.c
easier if its RMGR routines and its core routines are split.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/aSfTxIWjiXkTKh1E@paquier.xyz
2025-12-01 16:21:41 +09:00
Michael Paquier
d03668ea05 Switch some date/timestamp functions to use the soft error reporting
This commit changes some functions related to the data types date and
timestamp to use the soft error reporting rather than a custom boolean
flag called "overflow", used to let the callers of these functions know
if an overflow happens.

This results in the removal of some boilerplate code, as it is possible
to rely on an error context rather than a custom state, with the
possibility to use the error generated inside the functions updated
here, if necessary.

These functions were suffixed with "_opt_overflow".  They are now
renamed to use "_safe" as suffix.

This work is similar to 4246a977ba.

Author: Amul Sul <sulamul@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAAJ_b95HEmFyzHZfsdPquSHeswcopk8MCG1Q_vn4tVkZ+xxofw@mail.gmail.com
2025-12-01 15:22:20 +09:00
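
The pattern described in d03668ea05 (and continued in 713d9a847e) can be illustrated with a simplified, self-contained sketch. Everything here is hypothetical: SoftErrorContext is a stand-in for a real error context, and add_int_safe() is not a PostgreSQL function. It shows a "_safe" function that records the failure in an optional context, returns a value from which the caller can tell the direction of the overflow, and raises a hard error only when no context is supplied.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a soft-error context. */
typedef struct
{
    bool        error_occurred;
} SoftErrorContext;

/*
 * Add two ints.  On overflow with a context, set error_occurred and
 * return INT_MAX or INT_MIN to indicate the direction.  Without a
 * context, report the error and exit (standing in for a hard error).
 */
static int
add_int_safe(int a, int b, SoftErrorContext *escontext)
{
    bool        over = (b > 0 && a > INT_MAX - b);
    bool        under = (b < 0 && a < INT_MIN - b);

    if (!over && !under)
        return a + b;

    if (escontext == NULL)
    {
        fprintf(stderr, "integer out of range\n");
        exit(EXIT_FAILURE);
    }

    escontext->error_occurred = true;
    return over ? INT_MAX : INT_MIN;
}
```

With this shape the caller no longer needs a separate "overflow" output parameter: the context says whether an error happened, and the returned value says which way.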
David Rowley
5424f4da90 Don't call simplify_aggref with a NULL PlannerInfo
42473b3b3 added prosupport infrastructure to allow simplification of
Aggrefs during constant-folding.  In some cases the context->root that's
given to eval_const_expressions_mutator() can be NULL.  42473b3b3 failed
to take that into account, which could result in a crash.

To fix, add a check and only call simplify_aggref() when the PlannerInfo
is set.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Birler, Altan <altan.birler@tum.de>
Discussion: https://postgr.es/m/132d4da23b844d5ab9e352d34096eab5@tum.de
2025-11-30 12:55:34 +13:00
Peter Geoghegan
c902bd57af Update obsolete row compare preprocessing comments.
We have some limited ability to detect redundant and contradictory
conditions involving an nbtree row comparison key following commits
f09816a0 and bd3f59fd: we can do so in simple cases involving IS NULL
and IS NOT NULL keys on a row compare key's first column.  We can
likewise determine that a scan's qual is unsatisfiable given a row
compare whose first subkey's arg is NULL.  Update obsolete comments that
claimed that we merely copied row compares into the output key array
"without any editorialization".

Also update another _bt_preprocess_keys header comment paragraph: add a
parenthetical remark that points out that preprocessing will generate a
skip array for the preceding example qual.  That will ultimately lead to
preprocessing marking the example's lower-order y key required -- which
is exactly what the example supposes cannot happen.  Keep the original
comment, though, since it accurately describes the mechanical rules that
determine which keys get marked required in the absence of skip arrays
(which can occasionally still matter).  This fixes an oversight in
commit 92fe23d9, which added the nbtree skip scan optimization.

Author: Peter Geoghegan <pg@bowt.ie>
Backpatch-through: 18
2025-11-29 16:41:51 -05:00
Dean Rasheed
3881561d77 Avoid rewriting data-modifying CTEs more than once.
Formerly, when updating an auto-updatable view, or a relation with
rules, if the original query had any data-modifying CTEs, the rewriter
would rewrite those CTEs multiple times as RewriteQuery() recursed
into the product queries. In most cases that was harmless, because
RewriteQuery() is mostly idempotent. However, if the CTE involved
updating an always-generated column, it would trigger an error because
any subsequent rewrite would appear to be attempting to assign a
non-default value to the always-generated column.

This could perhaps be fixed by attempting to make RewriteQuery() fully
idempotent, but that looks quite tricky to achieve, and would probably
be quite fragile, given that more generated-column-type features might
be added in the future.

Instead, fix by arranging for RewriteQuery() to rewrite each CTE
exactly once (by tracking the number of CTEs already rewritten as it
recurses). This has the advantage of being simpler and more efficient,
but it does make RewriteQuery() dependent on the order in which
rewriteRuleAction() joins the CTE lists from the original query and
the rule action, so care must be taken if that is ever changed.

Reported-by: Bernice Southey <bernice.southey@gmail.com>
Author: Bernice Southey <bernice.southey@gmail.com>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CAEDh4nyD6MSH9bROhsOsuTqGAv_QceU_GDvN9WcHLtZTCYM1kA@mail.gmail.com
Backpatch-through: 14
2025-11-29 12:28:59 +00:00
Peter Eisentraut
87c6f8b047 Generate translator comments for GUC parameter descriptions
Automatically generate comments like

    /* translator: GUC parameter "client_min_messages" short description */

in the generated guc_tables.inc.c.

This provides translators more context.

Reviewed-by: Pavlo Golub <pavlo.golub@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Stéphane Schildknecht <sas.postgresql@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/1a89b3f0-e588-41ef-b712-aba766143cad%40eisentraut.org
2025-11-28 16:01:59 +01:00
Peter Eisentraut
8b3e2c622a Fix pg_isblank()
There was a pg_isblank() function that claimed to be a replacement for
the standard isblank() function, which was thought to be "not very
portable yet".  We can now assume that it's portable (it's in C99).

But pg_isblank() actually diverged from the standard isblank() by also
accepting '\r', while the standard one only accepts space and tab.
This was added to support parsing pg_hba.conf under Windows.  But the
hba parsing code now works completely differently and already handles
line endings before we get to pg_isblank().  The other user of
pg_isblank() is for ident protocol message parsing, which also handles
'\r' separately.  So this behavior is now obsolete and confusing.

To improve clarity, I separated those concerns.  The ident parsing now
gets its own function that hardcodes the whitespace characters
mentioned by the relevant RFC.  pg_isblank() is now static in hba.c
and is a wrapper around the standard isblank(), with some extra logic
to ensure robust treatment of non-ASCII characters.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/170308e6-a7a3-4484-87b2-f960bb564afa%40eisentraut.org
2025-11-28 08:33:07 +01:00
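
A wrapper of the kind 8b3e2c622a describes might look roughly like the sketch below. The name is_blank_ascii() and the exact high-bit test are assumptions for illustration, not a copy of the function in hba.c. It defers to the standard isblank() but refuses bytes with the high bit set, so the result does not vary with locale for non-ASCII input, and it no longer treats '\r' as blank.

```c
#include <ctype.h>
#include <stdbool.h>

/*
 * Illustrative wrapper around isblank(): accept only ASCII blanks
 * (space and tab in the "C" locale).  The cast to unsigned char avoids
 * passing a negative value to isblank(), which would be undefined.
 */
static bool
is_blank_ascii(char c)
{
    unsigned char uc = (unsigned char) c;

    return uc < 0x80 && isblank(uc) != 0;
}
```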
Amit Kapila
e68b6adad9 Add slotsync_skip_reason column to pg_replication_slots view.
Introduce a new column, slotsync_skip_reason, in the pg_replication_slots
view. This column records the reason why the last slot synchronization was
skipped. It is primarily relevant for logical replication slots on standby
servers where the 'synced' field is true. The value is NULL when
synchronization succeeds.

Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com
2025-11-28 05:21:35 +00:00
Michael Paquier
9ccc049dfe pg_buffercache: Add pg_buffercache_mark_dirty{,_relation,_all}()
This commit introduces three new functions for marking shared buffers as
dirty by using the functions introduced in 9660906dbd:
* pg_buffercache_mark_dirty() for one shared buffer.
* pg_buffercache_mark_dirty_relation() for all the shared buffers in a
relation.
* pg_buffercache_mark_dirty_all() for all the shared buffers in the pool.

The "_all" and "_relation" flavors are designed to address the
inefficiency of repeatedly calling pg_buffercache_mark_dirty() for each
individual buffer, which can be time-consuming when dealing with a
large shared buffer pool.

These functions are intended as developer tools and are available only
to superusers.  There is no need to bump the version of pg_buffercache,
4b203d499c having done this job in this release cycle.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Yuhang Qiu <iamqyh@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com
2025-11-28 09:04:04 +09:00
David Rowley
d167c19295 Fix possibly uninitialized HeapScanDesc.rs_startblock
The solution used in 0ca3b1697 to determine the Parallel TID Range
Scan's start location was to modify the signature of
table_block_parallelscan_startblock_init() to allow the startblock
to be passed in as a parameter.  This allows the scan limits to be
adjusted before that function is called so that the limits are picked up
when the parallel scan starts.  The commit made it so the call to
table_block_parallelscan_startblock_init uses the HeapScanDesc's
rs_startblock to pass the startblock to the parallel scan.  That all
works ok for Parallel TID Range scans as the HeapScanDesc rs_startblock
gets set by heap_setscanlimits(), but for Parallel Seq Scans, initscan()
does not initialize rs_startblock, and that results in passing an
uninitialized value to table_block_parallelscan_startblock_init() as
noted by the buildfarm member skink, running Valgrind.

To fix this issue, make it so initscan() sets the rs_startblock for
parallel scans unless we're doing a rescan.  This makes it so
table_block_parallelscan_startblock_init() will be called with the
startblock set to InvalidBlockNumber, and that'll allow the syncscan
code to find the correct start location (when enabled).  For Parallel
TID Range Scans, this InvalidBlockNumber value will be overwritten in
the call to heap_setscanlimits().

initscan() is a bit light on documentation on what's meant to get
initialized where for parallel scans.  From what I can tell, it looks like
it just didn't matter prior to 0ca3b1697 that rs_startblock was left
uninitialized for parallel scans.  To address the light documentation,
I've also added some comments to mention that the syncscan location for
parallel scans is figured out in table_block_parallelscan_startblock_init.
I've also taken the liberty to adjust the if/else if/else code in
initscan() to make it clearer which parts apply to parallel scans and
which parts are for the serial scans.

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqALm+k7FyfdQdCw1yF_8HojvR61YRrNhwRQPE=zSmnQA@mail.gmail.com
2025-11-28 12:40:50 +13:00
Michael Paquier
c75bf57a90 doc: Add missing tags in pg_buffercache page
Issue noticed while looking at this area of the documentation, for a
different patch.  This is a matter of style, so no backpatch is done.

Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com
2025-11-28 08:00:23 +09:00
Michael Paquier
9660906dbd Add routines for marking buffers dirty efficiently
This commit introduces new internal bufmgr routines for marking shared
buffers as dirty:
* MarkDirtyUnpinnedBuffer()
* MarkDirtyRelUnpinnedBuffers()
* MarkDirtyAllUnpinnedBuffers()

These functions provide an efficient mechanism to respectively mark one
buffer, all the buffers of a relation, or the entire shared buffer pool
as dirty, something that can be useful to force patterns for the
checkpointer.  MarkDirtyUnpinnedBufferInternal(), an extra routine, is
used by these three, to mark as dirty an unpinned buffer.

They are intended as developer tools to manipulate buffer dirtiness in
bulk, and will be used in a follow-up commit.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Aidar Imamov <a.imamov@postgrespro.ru>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Joseph Koshakow <koshy44@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Yuhang Qiu <iamqyh@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0h_YoSqqutxV6DES1RW8ig6wcA8CR9rJk358YRMxZFmw@mail.gmail.com
2025-11-28 07:39:33 +09:00
Tom Lane
5528e8d104 Allow indexscans on partial hash indexes with implied quals.
Normally, if a WHERE clause is implied by the predicate of a partial
index, we drop that clause from the set of quals used with the index,
since it's redundant to test it if we're scanning that index.
However, if it's a hash index (or any !amoptionalkey index), this
could result in dropping all available quals for the index's first
key, preventing us from generating an indexscan.

It's fair to question the practical usefulness of this case.  Since
hash only supports equality quals, the situation could only arise
if the index's predicate is "WHERE indexkey = constant", implying
that the index contains only one hash value, which would make hash
a really poor choice of index type.  However, perhaps there are
other !amoptionalkey index AMs out there with which such cases are
more plausible.

To fix, just don't filter the candidate indexquals this way if
the index is !amoptionalkey.  That's a bit hokey because it may
result in testing quals we didn't need to test, but to do it
more accurately we'd have to redundantly identify which candidate
quals are actually usable with the index, something we don't know
at this early stage of planning.  Doesn't seem worth the effort.

Reported-by: Sergei Glukhov <s.glukhov@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/e200bf38-6b45-446a-83fd-48617211feff@postgrespro.ru
Backpatch-through: 14
2025-11-27 13:09:59 -05:00
Fujii Masao
246ec4a51c doc: Fix misleading synopsis for CREATE/ALTER PUBLICATION.
The documentation for CREATE/ALTER PUBLICATION previously showed:

        [ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ] [, ... ]

to indicate that the table/column specification could be repeated.
However, placing [, ... ] directly after a multi-part construct was
misleading and made it unclear which portion was repeatable.

This commit introduces a new term, table_and_columns, to represent:

        [ ONLY ] table_name [ * ] [ ( column_name [, ... ] ) ] [ WHERE ( expression ) ]

and updates the synopsis to use:

        table_and_columns [, ... ]

which clearly identifies the repeatable element.

Backpatched to v15, where the misleading syntax was introduced.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+PtsyvYL3KmA6C8f0ZpXQ=7FEqQtETVy-BOF+cm9WPvfMQ@mail.gmail.com
Backpatch-through: 15
2025-11-27 23:29:57 +09:00
Álvaro Herrera
9e8fa05d34 Fix new test for CATCACHE_FORCE_RELEASE builds
Two of the isolation tests introduced by commit bc32a12e0d had a
problem under CATCACHE_FORCE_RELEASE, as evidenced by buildfarm member
prion.  An injection point is hit ahead of what the test spec expects,
so a session goes to sleep and there's no one there to wake it up.  Fix
in the simplest possible way, which is to conditionally wake the process
up if it's waiting.  An alternative output file is necessary to cover
both cases.

This suggests a couple of possible improvements to the injection points
infrastructure: a conditional wakeup (doing nothing if no one is
sleeping, as opposed to throwing an error), as well as a way to attach
to a point in "deactivated" mode, to be activated later.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com>
Discussion: https://postgr.es/m/202511261817.fyixgtt3hqdr@alvherre.pgsql
2025-11-27 13:10:56 +01:00
Daniel Gustafsson
e396a18f32 doc: Fix typo in pg_dump documentation
Reported-by: Erik Rijkers <er@xs4all.nl>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/7596672c-43e8-a030-0850-2dd09af98cac@xs4all.nl
2025-11-27 09:25:56 +01:00
Peter Eisentraut
e7075a3405 Use C11 alignas in pg_atomic_uint64 definitions
They were already using pg_attribute_aligned.  This replaces that with
alignas and moves that into the required syntactic position.  This
ends up making these three atomics implementations appear a bit more
consistent, but shouldn't change anything otherwise.
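
As a rough illustration of the "required syntactic position" mentioned
above, here is a minimal sketch of a C11 alignas specifier on a struct
member.  The type name sketch_atomic_uint64 is hypothetical and this is
not the PostgreSQL definition; it only shows that alignas is written
ahead of the member's type rather than as a trailing attribute.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a 64-bit atomic wrapper type.  The alignas
 * specifier precedes the member's type, which is where C11 expects it.
 */
typedef struct sketch_atomic_uint64
{
	alignas(8) volatile uint64_t value;
} sketch_atomic_uint64;

/* The member's alignment requirement propagates to the whole struct. */
static_assert(alignof(sketch_atomic_uint64) >= 8,
			  "sketch_atomic_uint64 must be 8-byte aligned");
```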

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/46f05236-d4d4-4b4e-84d4-faa500f14691%40eisentraut.org
2025-11-27 07:53:34 +01:00
Amit Langote
519fa0433b Fix error reporting for SQL/JSON path type mismatches
transformJsonFuncExpr() used exprType()/exprLocation() on the
possibly coerced path expression, which could be NULL when
coercion to jsonpath failed, leading to "cache lookup failed
for type 0" errors.

Preserve the original expression node so that type and location
in the "must be of type jsonpath" error are reported correctly.
Add regression tests to cover these cases.

Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/CACJufxHunVg81JMuNo8Yvv_hJD0DicgaVN2Wteu8aJbVJPBjZA@mail.gmail.com
Backpatch-through: 17
2025-11-27 12:07:01 +09:00
David Rowley
0ca3b16973 Add parallelism support for TID Range Scans
In v14, bb437f995 added support for scanning for ranges of TIDs using a
dedicated executor node for the purpose.  Here, we allow these scans to
be parallelized.  The range of blocks to scan is divvied up similarly to
how a Parallel Seq Scan does that, where 'chunks' of blocks are
allocated to each worker and the size of those chunks is slowly reduced
down to 1 block per worker by the time we're nearing the end of the
scan.  Doing that means workers finish at roughly the same time.

Allowing TID Range Scans to be parallelized removes the dilemma from the
planner as to whether a Parallel Seq Scan will cost less than a
non-parallel TID Range Scan due to the CPU concurrency of the Seq Scan
(disk costs are not divided by the number of workers).  It was possible
the planner could choose the Parallel Seq Scan which would result in
reading more blocks during execution than the TID Range Scan would have.
Allowing Parallel TID Range Scans removes the trade-off the planner
makes when choosing between reduced CPU costs due to parallelism vs
additional I/O from the Parallel Seq Scan due to it scanning blocks from
outside of the required TID range.  There are also, of course, the
traditional parallelism performance benefits to be gained as well, which
likely don't need to be explained here.
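
The chunk ramp-down described above can be sketched roughly as follows.
This is a simplified, single-process illustration and not the code from
this commit: the names RangeAllocatorSketch and next_chunk, and the
"remaining / (workers * 4)" sizing rule, are assumptions made up for the
example.  A real implementation would hand out chunks from shared memory
using atomic operations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shared state describing the block range still to scan. */
typedef struct RangeAllocatorSketch
{
	uint32_t	next_block;		/* first block not yet handed out */
	uint32_t	end_block;		/* one past the last block in the range */
	uint32_t	nworkers;		/* number of participating workers */
} RangeAllocatorSketch;

/*
 * Hand out the next chunk of blocks.  Returns the chunk size in blocks and
 * stores its first block in *start, or returns 0 once the range is used up.
 * The chunk size shrinks as the remaining range shrinks, bottoming out at
 * one block, so that workers tend to finish at about the same time.
 */
uint32_t
next_chunk(RangeAllocatorSketch *alloc, uint32_t *start)
{
	uint32_t	remaining;
	uint32_t	size;

	assert(alloc->nworkers > 0);
	assert(alloc->next_block <= alloc->end_block);

	remaining = alloc->end_block - alloc->next_block;
	if (remaining == 0)
		return 0;

	/* Assumed sizing rule: a fraction of what is left, never below 1. */
	size = remaining / (alloc->nworkers * 4);
	if (size < 1)
		size = 1;

	*start = alloc->next_block;
	alloc->next_block += size;
	return size;
}
```

Each worker would call next_chunk() in a loop and scan only the blocks
it was given, so no block outside the requested TID range is ever read.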

Author: Cary Huang <cary.huang@highgo.ca>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Steven Niu <niushiji@gmail.com>
Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca
2025-11-27 14:05:04 +13:00
David Rowley
42473b3b31 Have the planner replace COUNT(ANY) with COUNT(*), when possible
This adds SupportRequestSimplifyAggref to allow pg_proc.prosupport
functions to receive an Aggref and allow them to determine if there is a
way that the Aggref call can be optimized.

Also added is a support function to allow transformation of COUNT(ANY)
into COUNT(*).  This is possible when the given "ANY" cannot be NULL
and there are no ORDER BY / DISTINCT clauses within the Aggref.  This
is a useful transformation as it is common that people write COUNT(1),
which until now has added unneeded overhead.  When counting a NOT NULL
column, the overhead can be worse, as that might mean deforming more of
the tuple, which for large fact tables may be many columns in.
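
The conditions above can be summarized in a small sketch.  The struct
CountCallSketch and the function count_arg_is_redundant are hypothetical
names invented for illustration; they are not the planner's Aggref node
or the support function added by this commit, only a restatement of the
rule in code form.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the properties of a COUNT(arg) call. */
typedef struct CountCallSketch
{
	bool		arg_can_be_null;	/* could the argument ever be NULL? */
	bool		has_order_by;		/* COUNT(arg ORDER BY ...) */
	bool		has_distinct;		/* COUNT(DISTINCT arg) */
} CountCallSketch;

/*
 * Returns true when COUNT(arg) must give the same result as COUNT(*):
 * the argument is never NULL, so no row is skipped, and there is no
 * ORDER BY or DISTINCT that depends on the argument's value.
 */
bool
count_arg_is_redundant(const CountCallSketch *call)
{
	assert(call != NULL);

	return !call->arg_can_be_null &&
		!call->has_order_by &&
		!call->has_distinct;
}
```

Under this rule COUNT(1) and COUNT(not_null_col) qualify, while
COUNT(nullable_col) and COUNT(DISTINCT col) do not.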

It may be possible to add prosupport functions for other aggregates.  We
could consider if ORDER BY could be dropped for some calls, e.g. the
ORDER BY is quite useless in MAX(c ORDER BY c).

There is a little bit of passing fallout from adjusting
expr_is_nonnullable() to handle Const which results in a plan change in
the aggregates.out regression test.  Previously, nothing was able to
determine that "One-Time Filter: (100 IS NOT NULL)" was always true,
therefore useless to include in the plan.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAApHDvqGcPTagXpKfH=CrmHBqALpziThJEDs_MrPqjKVeDF9wA@mail.gmail.com
2025-11-27 10:43:28 +13:00
Nathan Bossart
dbdc717ac6 Teach DSM registry to retry entry initialization if needed.
If DSM registry entry initialization fails, backends could try to
use an uninitialized DSM segment, DSA, or dshash table (since the
entry is still added to the registry).  To fix, restructure the
code so that the registry retries initialization as needed.  This
commit also modifies pg_get_dsm_registry_allocations() to leave out
partially-initialized entries, as they shouldn't have any allocated
memory.

DSM registry entry initialization shouldn't fail often in practice,
but retrying was deemed better than leaving entries in a
permanently failed state (as was done by commit 1165a933aa, which
has since been reverted).

Suggested-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/E1vJHUk-006I7r-37%40gemulon.postgresql.org
Backpatch-through: 17
2025-11-26 15:12:25 -06:00
Jeff Davis
1476028225 Allow pg_locale_t APIs to work when ctype_is_c.
Previously, the caller needed to check ctype_is_c first for some
routines and not others. Now, the APIs consistently work, and the
caller can just check ctype_is_c for optimization purposes.

Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
2025-11-26 12:54:37 -08:00
Daniel Gustafsson
1cdb84bb1b Check for correct version of perltidy
pgperltidy requires a particular version of perltidy, but the version
wasn't checked the way pgindent checks the underlying indent binary.
Fix by checking the version of perltidy and erroring out if an incorrect
version is used.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/1209850.1764092152@sss.pgh.pa.us
2025-11-26 20:43:09 +01:00
Jeff Davis
8d299052fe Add #define for UNICODE_CASEMAP_BUFSZ.
Useful for mapping a single codepoint at a time into a
statically-allocated buffer.

Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
2025-11-26 10:05:11 -08:00
Jeff Davis
ec4997a9d7 Inline pg_ascii_tolower() and pg_ascii_toupper().
Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
2025-11-26 10:04:32 -08:00
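
For readers unfamiliar with what inlining ASCII-only case conversion
looks like (commit ec4997a9d7 above), here is a minimal sketch.  The
function names sketch_ascii_tolower and sketch_ascii_toupper are
hypothetical and the bodies are an assumption about the general
technique, not a copy of the PostgreSQL source.

```c
#include <assert.h>

/* The arithmetic below assumes an ASCII-compatible character set. */
static_assert('a' - 'A' == 32, "ASCII-compatible character set required");

/*
 * Fold an ASCII upper-case letter to lower case; every other byte value,
 * including bytes with the high bit set, is returned unchanged.
 */
static inline unsigned char
sketch_ascii_tolower(unsigned char ch)
{
	if (ch >= 'A' && ch <= 'Z')
		ch += 'a' - 'A';
	return ch;
}

/* The reverse mapping: ASCII lower case to upper case only. */
static inline unsigned char
sketch_ascii_toupper(unsigned char ch)
{
	if (ch >= 'a' && ch <= 'z')
		ch -= 'a' - 'A';
	return ch;
}
```

Because the functions are locale-independent and only a few
instructions long, defining them as static inline in a header lets the
compiler fold them into callers instead of emitting a function call.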