1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-18 02:02:55 +03:00
Commit Graph

27476 Commits

Author SHA1 Message Date
Nathan Bossart
4c5e1d0785 Remove make_temptable_name_n().
This small function is only used in one place, and it fails to
handle quoted table names (although the table name portion of the
input should never be quoted in current usage).  In addition to
removing make_temptable_name_n() in favor of open-coding it where
needed, this commit ensures the "diff" table name is properly
quoted in order to future-proof this area a bit.

Author: Aleksander Alekseev <aleksander@tigerdata.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Discussion: https://postgr.es/m/CAJ7c6TO3a5q2NKRsjdJ6sLf8isVe4aMaaX1-Hj2TdHdhFw8zRA%40mail.gmail.com
2025-10-22 12:31:55 -05:00
Fujii Masao
f33e60a53a Make invalid primary_slot_name follow standard GUC error reporting.
Previously, if primary_slot_name was set to an invalid slot name and
the configuration file was reloaded, both the postmaster and all other
backend processes reported a WARNING. With many processes running,
this could produce a flood of duplicate messages. The problem was that
the GUC check hook for primary_slot_name reported errors at WARNING
level via ereport().

This commit changes the check hook to use GUC_check_errdetail() and
GUC_check_errhint() for error reporting. As with other GUC parameters,
this causes non-postmaster processes to log the message at DEBUG3,
so by default, only the postmaster's message appears in the log file.

Backpatch to all supported versions.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/CAHGQGwFud-cvthCTfusBfKHBS6Jj6kdAPTdLWKvP2qjUX6L_wA@mail.gmail.com
Backpatch-through: 13
2025-10-22 20:09:43 +09:00
Tatsuo Ishii
2d7b247cb4 Fix multi WinGetFuncArgInFrame/Partition calls with IGNORE NULLS.
Previously it was mistakenly assumed that there's only one window
function argument which needs to be processed by WinGetFuncArgInFrame
or WinGetFuncArgInPartition when IGNORE NULLS option is specified. To
eliminate the limitation, WindowObject->notnull_info is modified from
"uint8 *" to "uint8 **" so that WindowObject->notnull_info could store
pointers to "uint8 *" which holds NOT NULL info corresponding to each
window function argument. Moreover, WindowObject->num_notnull_info is
changed from "int" to "int64 *" so that WindowObject->num_notnull_info
could store the number of NOT NULL info corresponding to each function
argument. Memories for these data structures will be allocated when
WinGetFuncArgInFrame or WinGetFuncArgInPartition is called. Thus no
memory except the pointers is allocated for function arguments which
do not call these functions

Also fix the set mark position logic in WinGetFuncArgInPartition to
not raise a "cannot fetch row before WindowObject's mark position"
error in IGNORE NULLS case.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://postgr.es/m/2952409.1760023154%40sss.pgh.pa.us
2025-10-22 12:06:33 +09:00
Fujii Masao
883a95646a Fix stalled lag columns in pg_stat_replication when replay LSN stops advancing.
Previously, when the replay LSN reported in feedback messages from a standby
stopped advancing, for example, due to a recovery conflict, the write_lag and
flush_lag columns in pg_stat_replication would initially update but then stop
progressing. This prevented users from correctly monitoring replication lag.

The problem occurred because when any LSN stopped updating, the lag tracker's
cyclic buffer became full (the write head reached the slowest read head).
In that state, the lag tracker could no longer compute round-trip lag values
correctly.

This commit fixes the issue by handling the slowest read entry (the one
causing the buffer to fill) as a separate overflow entry and freeing space
so the write and other read heads can continue advancing in the buffer.
As a result, write_lag and flush_lag now continue updating even if the reported
replay LSN remains stalled.

Backpatch to all supported versions.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGdGQ=1-X-71Caee-LREBUXSzyohkoQJd4yZZCMt24C0g@mail.gmail.com
Backpatch-through: 13
2025-10-22 11:27:15 +09:00
Michael Paquier
2b75c38b70 Add error_on_null(), checking if the input is the null value
This polymorphic function produces an error if the input value is
detected as being the null value; otherwise it returns the input value
unchanged.

This function can for example become handy in SQL function bodies, to
enforce that exactly one row was returned.

Author: Joel Jacobson <joel@compiler.org>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/ece8c6d1-2ab1-45d5-ba12-8dec96fc8886@app.fastmail.com
Discussion: https://postgr.es/m/de94808d-ed58-4536-9e28-e79b09a534c7@app.fastmail.com
2025-10-22 09:55:17 +09:00
David Rowley
2470ca435c Use CompactAttribute more often, when possible
5983a4cff added CompactAttribute for storing commonly used fields from
FormData_pg_attribute.  5983a4cff didn't go to the trouble of adjusting
every location where we can use CompactAttribute rather than
FormData_pg_attribute, so here we change the remaining ones.

There are some locations where I've left the code using
FormData_pg_attribute.  These are mostly in the ALTER TABLE code.  Using
CompactAttribute here seems more risky as often the TupleDesc is being
changed and those changes may not have been flushed to the
CompactAttribute yet.

I've also left record_recv(), record_send(), record_cmp(), record_eq()
and record_image_eq() alone as it's not clear to me that accessing the
CompactAttribute is a win here due to the FormData_pg_attribute still
having to be accessed for most cases.  Switching the relevant parts to
use CompactAttribute would result in having to access both for common
cases.  Careful benchmarking may reveal that something can be done to
make this better, but in absence of that, the safer option is to leave
these alone.

In ReorderBufferToastReplace(), there was a check to skip attnums < 0
while looping over the TupleDesc.  Doing this is redundant since
TupleDescs don't store < 0 attnums.  Removing that code allows us to
move to using CompactAttribute.

The change in validateDomainCheckConstraint() just moves fetching the
FormData_pg_attribute into the ERROR path, which is cold due to calling
errstart_cold() and results in code being moved out of the common path.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAApHDvrMy90o1Lgkt31F82tcSuwRFHq3vyGewSRN=-QuSEEvyQ@mail.gmail.com
2025-10-22 11:36:26 +13:00
Jeff Davis
ff53907c35 Make char2wchar() static.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-21 09:32:12 -07:00
Jeff Davis
844385d12e Remove obsolete global database_ctype_is_c.
Now that tsearch uses the database default locale, there's no need to
track the database CTYPE separately.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-21 09:32:04 -07:00
Jeff Davis
e113f9c102 tsearch: use database default collation for parsing.
Previously, tsearch used the database's CTYPE setting, which only
matches the database default collation if the locale provider is libc.

Note that tsearch types (tsvector and tsquery) are not collatable
types. The locale affects parsing the original text, which is a lossy
process, so a COLLATE clause on the already-parsed value would not
make sense.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-21 09:31:49 -07:00
Nathan Bossart
e94a7afe44 Re-pgindent brin.c.
Backpatch-through: 13
2025-10-21 09:56:26 -05:00
Álvaro Herrera
b7cc6474e9 Make smgr access for a BufferManagerRelation safer in relcache inval
Currently there's no bug, because we have no code path where we
invalidate relcache entries where it'd cause a problem.  But it's more
robust to do it this way in case we introduce such a path later, as some
Postgres forks reportedly already have.

Author: Daniil Davydov <3danissimo@gmail.com>
Reviewed-by: Stepan Neretin <slpmcf@gmail.com>
Discussion: https://postgr.es/m/CAJDiXgj3FNzAhV+jjPqxMs3jz=OgPohsoXFj_fh-L+nS+13CKQ@mail.gmail.com
2025-10-21 10:51:55 +03:00
David Rowley
9fd29d7ff4 Fix BRIN 32-bit counter wrap issue with huge tables
A BlockNumber (32-bit) might not be large enough to add bo_pagesPerRange
to when the table contains close to 2^32 pages.  At worst, this could
result in a cancellable infinite loop during the BRIN index scan with
power-of-2 pagesPerRange, and slow (inefficient) BRIN index scans and
scanning of unneeded heap blocks for non power-of-2 pagesPerRange.

Backpatch to all supported versions.

Author: sunil s <sunilfeb26@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAOG6S4-tGksTQhVzJM19NzLYAHusXsK2HmADPZzGQcfZABsvpA@mail.gmail.com
Backpatch-through: 13
2025-10-21 20:46:14 +13:00
Michael Paquier
e4e496e88c Fix comment in pg_get_shmem_allocations_numa()
The comment fixed in this commit described the function as dealing with
database blocks, but in reality it processes shared memory allocations.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aH4DDhdiG9Gi0rG7@ip-10-97-1-34.eu-west-3.compute.internal
Backpatch-through: 18
2025-10-21 16:12:30 +09:00
Richard Guo
18d2614093 Fix pushdown of degenerate HAVING clauses
67a54b9e8 taught the planner to push down HAVING clauses even when
grouping sets are present, as long as the clause does not reference
any columns that are nullable by the grouping sets.  However, there
was an oversight: if any empty grouping sets are present, the
aggregation node can produce a row that did not come from the input,
and pushing down a HAVING clause in this case may cause us to fail to
filter out that row.

Currently, non-degenerate HAVING clauses are not pushed down when
empty grouping sets are present, since the empty grouping sets would
nullify the vars they reference.  However, degenerate (variable-free)
HAVING clauses are not subject to this restriction and may be
incorrectly pushed down.

To fix, explicitly check for the presence of empty grouping sets and
retain degenerate clauses in HAVING when they are present.  This
ensures that we don't emit a bogus aggregated row.  A copy of each
such clause is also put in WHERE so that query_planner() can use it in
a gating Result node.

To facilitate this check, this patch expands the groupingSets tree of
the query to a flat list of grouping sets before applying the HAVING
pushdown optimization.  This does not add any additional planning
overhead, since we need to do this expansion anyway.

In passing, make a small tweak to preprocess_grouping_sets() by
reordering its initial operations a bit.

Backpatch to v18, where this issue was introduced.

Reported-by: Yuhang Qiu <iamqyh@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0879D9C9-7FE2-4A20-9593-B23F7A0B5290@gmail.com
Backpatch-through: 18
2025-10-21 12:35:36 +09:00
Masahiko Sawada
4bea91f21f Support COPY TO for partitioned tables.
Previously, COPY TO command didn't support directly specifying
partitioned tables so users had to use COPY (SELECT ...) TO variant.

This commit adds direct COPY TO support for partitioned
tables, improving both usability and performance. Performance tests
show it's faster than the COPY (SELECT ...) TO variant as it avoids
the overheads of query processing and sending results to the COPY TO
command.

When used with partitioned tables, COPY TO copies the same rows as
SELECT * FROM table. Row-level security policies of the partitioned
table are applied in the same way as when executing COPY TO on a plain
table.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CACJufxEZt%2BG19Ors3bQUq-42-61__C%3Dy5k2wk%3DsHEFRusu7%3DiQ%40mail.gmail.com
2025-10-20 10:38:52 -07:00
Tom Lane
92cf557ffa Add static assertion that RELSEG_SIZE fits in an int.
Our configure script intended to ensure this, but it supposed that
expr(1) would report an error for integer overflow.  Maybe that
was true when the code was written (commit 3c6248a82 of 2008-05-02),
but all the modern expr's I tried will deliver bigger-than-int32
results without complaint.  Moreover, if you use --with-segsize-blocks
then there's no check at all.

Ideally we'd add a test in configure itself to check that the value
fits in int, but to do that we'd need to suppose that test(1) handles
bigger-than-int32 numbers correctly.  Probably modern ones do, but
that's an assumption I could do without; and I'm not too trusting
about meson either.  Instead, let's install a static assertion, so
that even people who ignore all the compiler warnings you get from
such values will be forced to confront the fact that it won't work.

This has been hazardous for awhile, but given that we hadn't heard
a complaint about it till now, I don't feel a need to back-patch.

Reported-by: Casey Shobe <casey.allen.shobe@icloud.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/C5DC82D6-C76D-4E8F-BC2E-DF03EFC4FA24@icloud.com
2025-10-19 18:28:46 -04:00
Tatsuo Ishii
dd766a441d Fix Coverity issue reported in commit 2273fa32bc.
Coverity complains that the return value from gettuple_eval_partition
(stored in variable "datum") in a do..while loop in
WinGetFuncArgInPartition is overwritten when exiting the while
loop. This commit tries to fix the issue by changing the
gettuple_eval_partition call to:

(void) gettuple_eval_partition()

explicitly stating that we discard the return value. We are just
interested in whether we are inside or outside of partition, NULL or
NOT NULL here.

Also enhance some comments for easier code reading.

Reported-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aPCOabSE4VfJLaky%40paquier.xyz
2025-10-19 09:29:26 +09:00
Jeff Davis
e533524b23 Add pg_database_locale() to retrieve database default locale.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-18 16:25:23 -07:00
Jeff Davis
67a8b49e96 Add pg_iswxdigit(), useful for tsearch.
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-18 16:25:11 -07:00
David Rowley
e3b9e44689 Tidyup truncate_useless_pathkeys() function
This removes a few static functions and replaces them with 2 functions
which aim to be more reusable.  The upper planner's pathkey requirements
can be simplified down to operations which require pathkeys in the same
order as the pathkeys for the given operation, and operations which can
make use of a Path's pathkeys in any order.

Here we also add some short-circuiting to truncate_useless_pathkeys().  At
any point we discover that all pathkeys are useful to a single operation,
we can stop checking the remaining operations as we're not going to be
able to find any further useful pathkeys - they're all possibly useful
already.  Adjusting this seems to warrant trying to put the checks
roughly in order of least-expensive-first so that the short-circuits
have the most chance of skipping the more expensive checks.

In passing clean up has_useful_pathkeys() as it seems to have grown a
redundant check for group_pathkeys.  This isn't needed as
standard_qp_callback will set query_pathkeys if there's any requirement
to have group_pathkeys.  All this code does is waste run-time effort and
take up needless space.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpbsEoTksvW5901MMoZo-hHf78E5up3uDOfkJnxDe_WAw@mail.gmail.com
2025-10-19 10:13:13 +13:00
David Rowley
5c0a20003b Fix reset of incorrect hash iterator in GROUPING SETS queries
This fixes an unlikely issue when fetching GROUPING SET results from
their internally stored hash tables.  It was possible in rare cases that
the hash iterator would be set up incorrectly which could result in a
crash.

This was introduced in 4d143509c, so backpatch to v18.

Many thanks to Yuri Zamyatin for reporting and helping to debug this
issue.

Bug: #19078
Reported-by: Yuri Zamyatin <yuri@yrz.am>
Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org
Backpatch-through: 18
2025-10-18 16:07:04 +13:00
Tomas Vondra
b85c4700fc Fix hashjoin memory balancing logic
Commit a1b4f289be improved the hashjoin sizing to also consider the
memory used by BufFiles for batches. The code however had multiple
issues, making it ineffective or not working as expected in some cases.

* The amount of memory needed by buffers was calculated using uint32,
  so it would overflow for nbatch >= 262144. If this happened the loop
  would exit prematurely and the memory usage would not be reduced.

  The nbatch overflow is fixed by reworking the condition to not use a
  multiplication at all, so there's no risk of overflow. An explicit
  cast was added to a similar calculation in ExecHashIncreaseBatchSize.

* The loop adjusting the nbatch value used hash_table_bytes to calculate
  the old/new size, but then updated only space_allowed. The consequence
  is the total memory usage was not reduced, but all the memory saved by
  reducing the number of batches was used for the internal hash table.

  This was fixed by using only space_allowed. This is also more correct,
  because hash_table_bytes does not account for skew buckets.

* The code was also doubling multiple parameters (e.g. the number of
  buckets for hash table), but was missing overflow protections.

  The loop now checks for overflow, and terminates if needed. It'd be
  possible to cap the value and continue the loop, but it's not worth
  the complexity. And the overflow implies the in-memory hash table is
  already very large anyway.

While at it, rework the comment explaining how the memory balancing
works, to make it more concise and easier to understand.

The initial nbatch overflow issue was reported by Vaibhav Jain. The
other issues were noticed by me and Melanie Plageman. Fix by me, with a
lot of review and feedback by Melanie.

Backpatch to 18, where the hashjoin memory balancing was introduced.

Reported-by: Vaibhav Jain <jainva@google.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CABa-Az174YvfFq7rLS+VNKaQyg7inA2exvPWmPWqnEn6Ditr_Q@mail.gmail.com
2025-10-17 22:21:50 +02:00
Peter Eisentraut
e1a912c86d Change config_generic.vartype to be initialized at compile time
Previously, this was initialized at run time so that it did not have
to be maintained by hand in guc_tables.c.  But since that table is now
generated anyway, we might as well generate this bit as well.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org
2025-10-17 10:33:54 +02:00
Peter Eisentraut
0a7bde4610 Use designated initializers for guc_tables
This makes the generating script simpler and the output easier to
read.  In the future, it will make it easier to reorder and rearrange
the underlying C structures.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org
2025-10-17 10:29:42 +02:00
Daniel Gustafsson
6aa184c80f Replace defunct URL with stable archive.org URL in rbtree.c
The URL for "Sorting and Searching Algorithms: A Cookbook"
by Thomas Niemann has started returning 404, and since we
refer to the page for license terms this replaces the now
defunct link with one to the copy on archive.org.

Author: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/6DED3DEF-875E-4D1D-8F8F-7353D5AF7B79@gmail.com
2025-10-17 09:38:49 +02:00
Nathan Bossart
812221b204 Remove partColsUpdated.
This information appears to have been unused since commit
c5b7ba4e67.  We could not find any references in third-party code,
either.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aO_CyFRpbVMtgJWM%40nathan
2025-10-16 11:31:38 -05:00
Amit Kapila
41c674d2e3 Refactor logical worker synchronization code into a separate file.
To support the upcoming addition of a sequence synchronization worker,
this patch extracts common synchronization logic shared by table sync
workers and the new sequence sync worker into a dedicated file. This
modularization improves code reuse, maintainability, and clarity in the
logical workers framework.

Author: vignesh C <vignesh21@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
2025-10-16 05:10:50 +00:00
Amit Langote
905e932f09 Fix EPQ crash from missing partition directory in EState
EvalPlanQualStart() failed to propagate es_partition_directory into
the child EState used for EPQ rechecks. When execution time partition
pruning ran during the EPQ scan, executor code dereferenced a NULL
partition directory and crashed.

Previously, propagating es_partition_directory into the EPQ EState was
unnecessary because CreatePartitionPruneState(), which sets it on
demand, also initialized the exec-pruning context.  After commit
d47cbf474, CreatePartitionPruneState() now initializes only the init-
time pruning context, leaving exec-pruning context initialization to
ExecInitNode(). Since EvalPlanQualStart() runs only ExecInitNode() and
not CreatePartitionPruneState(), it can encounter a NULL
es_partition_directory.  Other executor fields initialized during
CreatePartitionPruneState() are already copied into the child EState
thanks to commit 8741e48e5d, but es_partition_directory was missed.

Fix by borrowing the parent estate's  es_partition_directory in
EvalPlanQualStart(), and by clearing that field in EvalPlanQualEnd()
so the parent remains responsible for freeing the directory.

Add an isolation test permutation that triggers EPQ with execution-
time partition pruning, the case that reproduces this crash.

Bug: #19078
Reported-by: Yuri Zamyatin <yuri@yrz.am>
Diagnosed-by: David Rowley <dgrowleyml@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org
Backpatch-through: 18
2025-10-16 14:01:44 +09:00
Nathan Bossart
079480dc20 Fix lookup code for REINDEX INDEX.
This commit adjusts RangeVarCallbackForReindexIndex() to handle an
extremely unlikely race condition involving concurrent OID reuse.
In short, if REINDEX INDEX is executed at the same time that the
index is re-created with the same name and OID but a different
parent table OID, we might lock the wrong parent table.  To fix,
simply detect when this happens and emit an ERROR.  Unfortunately,
we can't gracefully handle this situation because we will have
already locked the index, and we must lock the parent table before
the index to avoid deadlocks.

While at it, I've replaced all but one early return in this
callback function with ERRORs that should be unreachable.  While I
haven't verified the presence of a live bug, the checks in question
appear to be unnecessary, and the early returns seem prone to
breaking the parent table locking code in subtle ways.  If nothing
else, this simplifies the code a bit.

This is a bug fix and could be back-patched, but given the presumed
rarity of the race condition and the lack of reports, I'm not going
to bother.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z8zwVmGzXyDdkAXj%40nathan
2025-10-15 16:32:40 -05:00
Jeff Davis
af164f31b9 Add pg_iswalpha() and related functions.
Per-character pg_locale_t APIs. Useful for tsearch parsing and
potentially other places.

Significant overlap with the regc_wc_isalpha() and related functions
in regc_pg_locale.c, but this change leaves those intact for
now.

Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-15 12:54:01 -07:00
Nathan Bossart
688dc6299a Fix lookups in pg_{clear,restore}_{attribute,relation}_stats().
Presently, these functions look up the relation's OID, lock it, and
then check privileges.  Not only does this approach provide no
guarantee that the locked relation matches the arguments of the
lookup, but it also allows users to briefly lock relations for
which they do not have privileges, which might enable
denial-of-service attacks.  This commit adjusts these functions to
use RangeVarGetRelidExtended(), which is purpose-built to avoid
both of these issues.  The new RangeVarGetRelidCallback function is
somewhat complicated because it must handle both tables and
indexes, and for indexes, we must check privileges on the parent
table and lock it first.  Also, it needs to handle a couple of
extremely unlikely race conditions involving concurrent OID reuse.

A downside of this change is that the coding doesn't allow for
locking indexes in AccessShare mode anymore; everything is locked
in ShareUpdateExclusive mode.  Per discussion, the original choice
of lock levels was intended for a now defunct implementation that
used in-place updates, so we believe this change is okay.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z8zwVmGzXyDdkAXj%40nathan
Backpatch-through: 18
2025-10-15 12:47:33 -05:00
Peter Eisentraut
5f4c3b33a9 Change reset_extra into a config_generic common field
This is not specific to the GUC parameter type, so it can be part of
the generic struct rather than the type-specific struct (like the
related "extra" field).  This allows for some code simplifications.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org
2025-10-15 15:20:28 +02:00
Peter Eisentraut
dd3ae37830 Add log_autoanalyze_min_duration
The log output functionality of log_autovacuum_min_duration applies to
both VACUUM and ANALYZE, so it is not possible to separate the VACUUM
and ANALYZE log output thresholds. Logs are likely to be output only for
VACUUM and not for ANALYZE.

Therefore, we decided to separate the threshold for log output of VACUUM
by autovacuum (log_autovacuum_min_duration) and the threshold for log
output of ANALYZE by autovacuum (log_autoanalyze_min_duration).

Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Kasahara Tatsuhito <kasaharatt@oss.nttdata.com>
Discussion: https://www.postgresql.org/message-id/flat/CAOzEurQtfV4MxJiWT-XDnimEeZAY+rgzVSLe8YsyEKhZcajzSA@mail.gmail.com
2025-10-15 14:31:12 +02:00
Peter Eisentraut
29dc7a6687 Add some const qualifiers
in guc-related source files, in anticipation of some further
restructuring.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org
2025-10-15 10:05:53 +02:00
Peter Eisentraut
1a79518888 Modernize some for loops
in guc-related source files, in anticipation of some further
restructuring.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org
2025-10-15 10:00:37 +02:00
Amit Kapila
2436b8c047 Standardize use of REFRESH PUBLICATION in code and messages.
This patch replaces ALTER SUBSCRIPTION REFRESH with
ALTER SUBSCRIPTION REFRESH PUBLICATION in comments and error messages to
improve clarity and support future extensibility. The change aligns with
upcoming addition REFRESH SEQUENCES for sequence synchronization.

Author: vignesh C <vignesh21@gmail.com>
Author: Hou Zhijie <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com
2025-10-15 03:42:27 +00:00
Melanie Plageman
3e4705484e Make heap_page_is_all_visible independent of LVRelState
This function only requires a few fields from LVRelState, so pass them
in individually.

This change allows calling heap_page_is_all_visible() from code such as
pruneheap.c, which does not have access to an LVRelState.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek
2025-10-14 17:43:41 -04:00
Melanie Plageman
43b05b38ea Inline TransactionIdFollows/Precedes[OrEquals]()
These functions appeared prominently in a profile of a patch that sets
the visibility map on-access. Inline them to remove call overhead and
make them cheaper to use in hot paths.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek
2025-10-14 17:03:48 -04:00
Melanie Plageman
c8dd6542ba Add helper for freeze determination to heap_page_prune_and_freeze
After scanning the line pointers on a heap page during the first phase
of vacuum, we use the information collected to decide whether to use
the assembled freeze plans.

Move this decision logic into a helper function to improve readability.

While here, rename a PruneState member and disambiguate some local
variables in heap_page_prune_and_freeze().

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/2wk7jo4m4qwh5sn33pfgerdjfujebbccsmmlownybddbh6nawl%40mdyyqpqzxjek
2025-10-14 15:08:50 -04:00
Jeff Davis
8efe982fe2 pg_regc_locale.c: rename some static functions.
Use the more specific prefix "regc_" rather than the generic prefix
"pg_".

A subsequent commit will create generic versions of some of these
functions that can be called from other modules.

Discussion: https://postgr.es/m/0151ad01239e2cc7b3139644358cf8f7b9622ff7.camel@j-davis.com
2025-10-14 11:04:04 -07:00
Tatsuo Ishii
5f3808646f Use ereport rather than elog in WinCheckAndInitializeNullTreatment.
Previously WinCheckAndInitializeNullTreatment() used elog() to emit an
error message. ereport() should be used instead because it's a
user-facing error. Also use existing get_func_name() to get a
function's name, rather than own implementation.

Moreover add an assertion to validate winobj parameter, just like
other window function API.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/2952409.1760023154%40sss.pgh.pa.us
2025-10-14 19:15:24 +09:00
Richard Guo
1206df04c2 Rename apply_at to apply_agg_at for clarity
The field name "apply_at" in RelAggInfo was a bit ambiguous.  Rename
it to "apply_agg_at" to improve clarity and make its purpose clearer.

Per complaint from David Rowley, Robert Haas.

Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoZ0KR2_XCWHy17=HHcQ3p2Mamc9c6Dnnhf1J6wPYFD9ng@mail.gmail.com
2025-10-14 16:35:22 +09:00
Melanie Plageman
add323da40 Eliminate XLOG_HEAP2_VISIBLE from vacuum phase III
Instead of emitting a separate XLOG_HEAP2_VISIBLE WAL record for each
page that becomes all-visible in vacuum's third phase, specify the VM
changes in the already emitted XLOG_HEAP2_PRUNE_VACUUM_CLEANUP record.

Visibility checks are now performed before marking dead items unused.
This is safe because the heap page is held under exclusive lock for the
entire operation.

This reduces the number of WAL records generated by VACUUM phase III by
up to 50%.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZMw6Npd_qm2KM%2BFwQ3cMOMx1Dh3VMhp8-V7SOLxdK9-g%40mail.gmail.com
2025-10-13 18:01:06 -04:00
Tom Lane
03bf7a12c5 Fix incorrect message-printing in win32security.c.
log_error() would probably fail completely if used, and would
certainly print garbage for anything that needed to be interpolated
into the message, because it was failing to use the correct printing
subroutine for a va_list argument.

This bug likely went undetected because the error cases this code
is used for are rarely exercised - they only occur when Windows
security API calls fail catastrophically (out of memory, security
subsystem corruption, etc).

The FRONTEND variant can be fixed just by calling vfprintf()
instead of fprintf().  However, there was no va_list variant
of write_stderr(), so create one by refactoring that function.
Following the usual naming convention for such things, call
it vwrite_stderr().

Author: Bryan Green <dbryan.green@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=TzaM+tWPnw@mail.gmail.com
Backpatch-through: 13
2025-10-13 17:56:45 -04:00
Peter Geoghegan
7a662a46eb Remove unused nbtree array advancement variable.
Remove a variable that is no longer in use following commit 9a2e2a28.
It's not immediately clear why there were no compiler warnings about
this oversight.

Author: Peter Geoghegan <pg@bowt.ie>
Backpatch-through: 18
2025-10-12 14:04:08 -04:00
Álvaro Herrera
3231fd0455 Stop creating constraints during DETACH CONCURRENTLY
Commit 71f4c8c6f7 (which implemented DETACH CONCURRENTLY) added code
to create a separate table constraint when a table is detached
concurrently, identical to the partition constraint, on the theory that
such a constraint was needed in case the optimizer had constructed any
query plans that depended on the constraint being there.  However, that
theory was apparently bogus because any such plans would be invalidated.

For hash partitioning, those constraints are problematic, because their
expressions reference the OID of the parent partitioned table, to which
the detached table is no longer related; this causes all sorts of
problems (such as inability of restoring a pg_dump of that table, and
the table no longer working properly if the partitioned table is later
dropped).

We'd like to get rid of all those constraints.  In fact, for branch
master, do that -- no longer create any substitute constraints.
However, out of fear that some users might somehow depend on these
constraints for other partitioning strategies, for stable branches
(back to 14, which added DETACH CONCURRENTLY), only do it for hash
partitioning.

(If you repeatedly DETACH CONCURRENTLY and then ATTACH a partition, then
with this constraint addition you don't need to scan the table in the
ATTACH step, which presumably is good.  But if users really valued this
feature, they would have requested that it worked for non-concurrent
DETACH also.)

Author: Haiyang Li <mohen.lhy@alibaba-inc.com>
Reported-by: Fei Changhong <feichanghong@qq.com>
Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/18371-7fef49f63de13f02@postgresql.org
Discussion: https://postgr.es/m/19070-781326347ade7c57@postgresql.org
2025-10-11 20:30:12 +02:00
Álvaro Herrera
ff47f9c16c dbase_redo: Fix Valgrind-reported memory leak
Introduced by my (Álvaro's) commit 9e4f914b5e, which was itself
backpatched to pg10, though only pg15 and up contain the problem
because of commit 9c08aea6a3.

This isn't a particularly significant leak, but given the fix is
trivial, we might as well backpatch to all branches where it applies, so
do that.

Author: Nathan Bossart <nathandbossart@gmail.com>
Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/x4odfdlrwvsjawscnqsqjpofvauxslw7b4oyvxgt5owoyf4ysn@heafjusodrz7
2025-10-11 16:39:22 +02:00
Peter Geoghegan
843e50208a Remove overzealous _bt_killitems assertion.
An assertion in _bt_killitems expected the scan's currPos state to
contain a valid LSN, saved from when currPos's page was initially read.
The assertion failed to account for the fact that even logged relations
can have leaf pages with an invalid LSN when built with wal_level set to
"minimal".  Remove the faulty assertion.

Oversight in commit e6eed40e (though note that the assertion was
backpatched to stable branches before 18 by commit 7c319f54).

Author: Peter Geoghegan <pg@bowt.ie>
Reported-By: Matthijs van der Vleuten <postgresql@zr40.nl>
Bug: #19082
Discussion: https://postgr.es/m/19082-628e62160dbbc1c1@postgresql.org
Backpatch-through: 13
2025-10-10 14:52:25 -04:00
Michael Paquier
3a36543d7d Fix two typos in xlogstats.h and xlogstats.c
Issue found while browsing this area of the code, introduced and
copy-pasted around by 2258e76f90.

Backpatch-through: 15
2025-10-10 11:51:45 +09:00
Michael Paquier
912af1c7e9 Remove state.tmp when failing to save a replication slot
An error happening while a slot data is saved on disk in
SaveSlotToPath() could cause a state.tmp file (temporary file holding
the slot state data, renamed to its permanent name at the end of the
function) to remain around after it has been created.  This temporary
file is created with O_EXCL, meaning that if an existing state.tmp is
found, its creation would fail.  This would prevent the slot data to be
saved, requiring a manual intervention to remove state.tmp before being
able to save again a slot.  Possible scenarios where this temporary file
could remain on disk is for example a ENOSPC case (no disk space) while
writing, syncing or renaming it.  The bug reports point to a write
failure as the principal cause of the problems.

Using O_TRUNC has been argued back in 2019 as a potential solution to
discard any temporary file that could exist.  This solution was rejected
as O_EXCL can also act as a safety measure when saving the slot state,
crash recovery offering cleanup guarantees post-crash.  This commit uses
the alternative approach that has been suggested by Andres Freund back
in 2019.  When the temporary state file cannot be written, synced,
closed or renamed (note: not when created!), an unlink() is used to
remove the temporary state file while holding the in-progress I/O
LWLock, so as any follow-up attempts to save a slot's data would not
choke on an existing file that remained around because of a previous
failure.

This problem has been reported a few times across the years, going back
to 2019, but for some reason I have never come back to do something
about it and it has been forgotten.  A recent report has reminded me
that this was still a problem.

Reported-by: Kevin K Biju <kevinkbiju@gmail.com>
Reported-by: Sergei Kornilov <sk@zsrv.org>
Reported-by: Grigory Smolkin <g.smolkin@postgrespro.ru>
Discussion: https://postgr.es/m/CAM45KeHa32soKL_G8Vk38CWvTBeOOXcsxAPAs7Jt7yPRf2mbVA@mail.gmail.com
Discussion: https://postgr.es/m/3559061693910326@qy4q4a6esb2lebnz.sas.yp-c.yandex.net
Discussion: https://postgr.es/m/08bbfab1-a61d-3750-fc18-4ab2c1aa7f09@postgrespro.ru
Backpatch-through: 13
2025-10-10 09:23:59 +09:00