If DSM entry initialization fails, backends could try to use an
uninitialized DSM segment, DSA, or dshash table (since the entry is
still added to the registry). To fix, keep track of whether
initialization completed, and ERROR if a backend tries to attach to
an uninitialized entry. We could instead retry initialization as
needed, but that seemed complicated, error prone, and unlikely to
help most cases. Furthermore, such problems probably indicate a
coding error.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/dd36d384-55df-4fc2-825c-5bc56c950fa9%40gmail.com
Backpatch-through: 17
Before we started to freeze async notify entries (commit 8eeb4a0f7c),
no one looked at the 'xid' on an entry with invalid 'dboid'. But now
we might actually need to freeze it later. Initialize them with
InvalidTransactionId to begin with, to avoid that work later.
Álvaro pointed this out in review of commit 8eeb4a0f7c, but I forgot
to include this change there.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://www.postgresql.org/message-id/202511071410.52ll56eyixx7@alvherre.pgsql
Backpatch-through: 14
Previous commit fixed a bug where VACUUM would truncate the CLOG
that's still needed to check the commit status of XIDs in the async
notify queue, but as mentioned in the commit message, it wasn't a full
fix. If a backend is executing asyncQueueReadAllNotifications() and
has just made a local copy of an async SLRU page which contains old
XIDs, vacuum can concurrently truncate the CLOG covering those XIDs,
and the backend still gets an error when it calls
TransactionIdDidCommit() on those XIDs in the local copy. This commit
fixes that race condition.
To fix, hold the SLRU bank lock across the TransactionIdDidCommit()
calls in NOTIFY processing.
Per Tom Lane's idea. Backpatch to all supported versions.
Reviewed-by: Joel Jacobson <joel@compiler.org>
Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://www.postgresql.org/message-id/2759499.1761756503@sss.pgh.pa.us
Backpatch-through: 14
The async notification queue contains the XID of the sender, and when
processing notifications we call TransactionIdDidCommit() on the
XID. But we had no safeguards to prevent the CLOG segments containing
those XIDs from being truncated away. As a result, if a backend didn't
for some reason process its notifications for a long time, or when a
new backend issued LISTEN, you could get an error like:
test=# listen c21;
ERROR: 58P01: could not access status of transaction 14279685
DETAIL: Could not open file "pg_xact/000D": No such file or directory.
LOCATION: SlruReportIOError, slru.c:1087
To fix, make VACUUM "freeze" the XIDs in the async notification queue
before truncating the CLOG. Old XIDs are replaced with
FrozenTransactionId or InvalidTransactionId.
Note: This commit is not a full fix. A race condition remains, where a
backend is executing asyncQueueReadAllNotifications() and has just
made a local copy of an async SLRU page which contains old XIDs, while
vacuum concurrently truncates the CLOG covering those XIDs. When the
backend then calls TransactionIdDidCommit() on those XIDs from the
local copy, you still get the error. The next commit will fix that
remaining race condition.
This was first reported by Sergey Zhuravlev in 2021, with many other
people hitting the same issue later. Thanks to:
- Alexandra Wang, Daniil Davydov, Andrei Varashen and Jacques Combrink
for investigating and providing reproducable test cases,
- Matheus Alcantara and Arseniy Mukhin for review and earlier proposed
patches to fix this,
- Álvaro Herrera and Masahiko Sawada for reviews,
- Yura Sokolov aka funny-falcon for the idea of marking transactions
as committed in the notification queue, and
- Joel Jacobson for the final patch version. I hope I didn't forget
anyone.
Backpatch to all supported versions. I believe the bug goes back all
the way to commit d1e027221d, which introduced the SLRU-based async
notification queue.
Discussion: https://www.postgresql.org/message-id/16961-25f29f95b3604a8a@postgresql.org
Discussion: https://www.postgresql.org/message-id/18804-bccbbde5e77a68c2@postgresql.org
Discussion: https://www.postgresql.org/message-id/CAK98qZ3wZLE-RZJN_Y%2BTFjiTRPPFPBwNBpBi5K5CU8hUHkzDpw@mail.gmail.com
Backpatch-through: 14
Previously, if async notify processing encountered an error, we would
report the error to the client and advance our read position past the
offending entry to prevent trying to process it over and over
again. Trying to continue after an error has a few problems however:
- We have no way of telling the client that a notification was
lost. They get an ERROR, but that doesn't tell you much. As such,
it's not clear if keeping the connection alive after losing a
notification is a good thing. Depending on the application logic,
missing a notification could cause the application to get stuck
waiting, for example.
- If the connection is idle, PqCommReadingMsg is set and any ERROR is
turned into FATAL anyway.
- We bailed out of the notification processing loop on first error
without processing any subsequent notifications. The subsequent
notifications would not be processed until another notify interrupt
arrives. For example, if there were two notifications pending, and
processing the first one caused an ERROR, the second notification
would not be processed until someone sent a new NOTIFY.
This commit changes the behavior so that any ERROR while processing
async notifications is turned into FATAL, causing the client
connection to be terminated. That makes the behavior more consistent
as that's what happened in idle state already, and terminating the
connection is a clear signal to the application that it might've
missed some notifications.
The reason to do this now is that the next commits will change the
notification processing code in a way that would make it harder to
skip over just the offending notification entry on error.
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://www.postgresql.org/message-id/fedbd908-4571-4bbe-b48e-63bfdcc38f64@iki.fi
Backpatch-through: 14
Instead of having to write a semicolon inside the macro argument, we can
insert a semicolon with another macro layer. This no longer gives
pg_bsd_indent indigestion, so we can remove the digestive aids that had
to be installed in the pgindent Perl script.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202511111134.njrwf5w5nbjm@alvherre.pgsql
Backpatch-through: 18
The synopsis for the ALTER PUBLICATION ... DROP ... command incorrectly
implied that a column list and WHERE clause could be specified as part of
the publication object. However, these options are not allowed for
DROP operations, making the documentation misleading.
This commit corrects the synopsis to clearly show only the valid forms
of publication objects.
Backpatched to v15, where the incorrect synopsis was introduced.
Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+PsPu+47Q7b0o6h1r-qSt90U3zgbAHMHUag5o5E1Lo+=uw@mail.gmail.com
Backpatch-through: 15
Previously, error messages for oversized injection point names, libraries,
and functions showed buffer sizes (64, 128, 128) instead of the usable
character limits (63, 127, 127) as it did not count for the
zero-terminated byte, which was confusing. These messages are adjusted
to show better the reality.
The limit enforced for the private area was also too strict by one byte,
as specifying a zone worth exactly INJ_PRIVATE_MAXLEN should be able to
work because three is no zero-terminated byte in this case.
This is a stylistic change (well, mostly, a private_area size of exactly
1024 bytes can be defined with this change, something that nobody seem
to care about based on the lack of complaints). However, this is a
testing facility let's keep the logic consistent across all the branches
where this code exists, as there is an argument in favor of out-of-core
extensions that use injection points.
Author: Xuneng Zhou <xunengzhou@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CABPTF7VxYp4Hny1h+7ejURY-P4O5-K8WZg79Q3GUx13cQ6B2kg@mail.gmail.com
Backpatch-through: 17
A similar check existed in the MSVC scripts that have been removed in
v17 by 1301c80b21, but nothing of the kind was checked in meson when
building with a 4-byte off_t.
This commit adds a check to fail the builds when trying to use a
relation file size higher than 1GB when off_t is 4 bytes, like
./configure, rather than detecting these failures at runtime because the
code is not able to handle large files in this case.
Backpatch down to v16, where meson has been introduced.
Discussion: https://postgr.es/m/aQ0hG36IrkaSGfN8@paquier.xyz
Backpatch-through: 16
This omission allowed table owners to create statistics in any
schema, potentially leading to unexpected naming conflicts. For
ALTER TABLE commands that require re-creating statistics objects,
skip this check in case the user has since lost CREATE on the
schema. The addition of a second parameter to CreateStatistics()
breaks ABI compatibility, but we are unaware of any impacted
third-party code.
Reported-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Co-authored-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Security: CVE-2025-12817
Backpatch-through: 13
Several functions could overflow their size calculations, when presented
with very large inputs from remote and/or untrusted locations, and then
allocate buffers that were too small to hold the intended contents.
Switch from int to size_t where appropriate, and check for overflow
conditions when the inputs could have plausibly originated outside of
the libpq trust boundary. (Overflows from within the trust boundary are
still possible, but these will be fixed separately.) A version of
add_size() is ported from the backend to assist with code that performs
more complicated concatenation.
Reported-by: Aleksey Solovev (Positive Technologies)
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Security: CVE-2025-12818
Backpatch-through: 13
generic-gcc.h maps our read and write barriers to C11 acquire and
release fences using compiler builtins, for platforms where we don't
have our own hand-rolled assembler. This is apparently enough for GCC,
but the C11 memory model is only defined in terms of atomic accesses,
and our barriers for non-atomic, non-volatile accesses were not always
respected under Clang's stricter interpretation of the standard.
This explains the occasional breakage observed on new RISC-V + Clang
animal greenfly in lock-free PgAioHandle manipulation code containing a
repeating pattern of loads and read barriers. The problem can also be
observed in code generated for MIPS and LoongAarch, though we aren't
currently testing those with Clang, and on x86, though we use our own
assembler there. The scariest aspect is that we use the generic version
on very common ARM systems, but it doesn't seem to reorder the relevant
code there (or we'd have debugged this long ago).
Fix by inserting an explicit compiler barrier. It expands to an empty
assembler block declared to have memory side-effects, so registers are
flushed and reordering is prevented. In those respects this is like the
architecture-specific assembler versions, but the compiler is still in
charge of generating the appropriate fence instruction. Done for write
barriers on principle, though concrete problems have only been observed
with read barriers.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/d79691be-22bd-457d-9d90-18033b78c40a%40gmail.com
Backpatch-through: 13
As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first.
Also as usual for a .1 release, there are some entries here that
are not really relevant for v18 because they already appeared in 18.0.
Those'll be removed later.
The following parameters can only be set at server start because
their context is PGC_POSTMASTER, but this information was missing
or incorrectly documented. This commit adds or corrects
that information for the following parameters:
* debug_io_direct
* dynamic_shared_memory_type
* event_source
* huge_pages
* io_max_combine_limit
* max_notify_queue_pages
* shared_memory_type
* track_commit_timestamp
* wal_decode_buffer_size
Backpatched to all supported branches.
Author: Karina Litskevich <litskevichkarina@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGfPzcin-_6XwPgVbWTOUFVZgHF5g9ROrwLUdCTfjy=0A@mail.gmail.com
Backpatch-through: 13
XLogRecPtrIsInvalid() is inconsistent with the affirmative form of
macros used for other datatypes, and leads to awkward double negatives
in a few places. This commit introduces XLogRecPtrIsValid(), which
allows code to be written more naturally.
This patch only adds the new macro. XLogRecPtrIsInvalid() is left in
place, and all existing callers remain untouched. This means all
supported branches can accept hypothetical bug fixes that use the new
macro, and at the same time any code that compiled with the original
formulation will continue to silently compile just fine.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal
Since OpenBSD core dumps do not embed executable paths, the script now
searches for the corresponding binary manually within the specified
directory before invoking LLDB. This is imperfect but should find the
right executable in practice, as needed for meaningful backtraces.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ36R74TZ8RKsFueYwLxGKDAm3LU2FHM_ZUCSB6imd3vYA@mail.gmail.com
Backpatch-through: 18
postgres_fdw supports EvalPlanQual testing by using the infrastructure
provided by the core with the RecheckForeignScan callback routine (cf.
commits 5fc4c26db and 385f337c9), but there has been no test coverage
for that, except that recent commit 12609fbac, which fixed an issue in
commit 385f337c9, added a test case to exercise only a code path added
by that commit to the core infrastructure. So let's add test cases to
exercise other code paths as well at this time.
Like commit 12609fbac, back-patch to all supported branches.
Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/CAPmGK15%2B6H%3DkDA%3D-y3Y28OAPY7fbAdyMosVofZZ%2BNc769epVTQ%40mail.gmail.com
Backpatch-through: 13
If any shell command fails, the whole script should fail. To avoid
future omissions, add this even for single-command scripts that use su
with heredoc syntax, as they might be extended or copied-and-pasted.
Extracted from a larger patch that wanted to use #error during
compilation, leading to the diagnosis of this problem.
Reviewed-by: Tristan Partin <tristan@partin.io> (earlier version)
Discussion: https://postgr.es/m/DDZP25P4VZ48.3LWMZBGA1K9RH%40partin.io
Backpatch-through: 15
In generate_orderedappend_paths(), there is an assumption that a child
relation's row estimate is always greater than zero. There is an
Assert verifying this assumption, and the estimate is also used to
convert an absolute tuple count into a fraction.
However, this assumption is not always valid -- for example, upper
relations can have their row estimates unset, resulting in a value of
zero. This can cause an assertion failure in debug builds or lead to
the tuple fraction being computed as infinity in production builds.
To fix, use the row estimate from the cheapest_total path to compute
the tuple fraction. The row estimate in this path should already have
been forced to a valid value.
In passing, update the comment for generate_orderedappend_paths() to
note that the function also considers the cheapest-fractional case
when not all tuples need to be retrieved. That is, it collects all
the cheapest fractional paths and builds an ordered append path for
each interesting ordering.
Backpatch to v18, where this issue was introduced.
Bug: #19102
Reported-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/19102-93480667e1200169@postgresql.org
Backpatch-through: 18
The test introduced by 17b2d5ec75 verifies that a WAL receiver
survives across a timeline jump by searching the server logs for
termination messages. However, it called restart() before the timeline
switch, which kills the WAL receiver and may log the exact message being
checked, hence failing the test. As TAP tests reuse the same log file
across restarts, a rotate_logfile() is used before the restart so as the
log matching check is not impacted by log entries generated by a
previous shutdown.
Recent changes to file handle inheritance altered I/O timing enough to
make this fail consistently while testing another patch.
While on it, this adds an extra check based on a PID comparison. This
test may lead to false positives as it could be possible that the WAL
receiver has processed a timeline jump before the initial PID is
grabbed, but it should be good enough in most cases.
Like 17b2d5ec75, backpatch down to v13.
Author: Bryan Green <dbryan.green@gmail.com>
Co-authored-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/9d00b597-d64a-4f1e-802e-90f9dc394c70@gmail.com
Backpatch-through: 13
The comment for ChangeVarNodes() refers to a parameter named
change_RangeTblRef, which does not exist in the code.
The comment for ChangeVarNodesExtended() contains an extra space,
while the comment for replace_relid_callback() has an awkward line
break and a typo.
This patch fixes these issues and revises some sentences for smoother
wording.
Oversights in commits ab42d643c and fc069a3a6.
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAMbWs480j16HC1JtjKCgj5WshivT8ZJYkOfTyZAM0POjFomJkg@mail.gmail.com
Backpatch-through: 18
First, the assertions in assign_io_method() were the wrong way round. Second,
the lengthof() assertion checked the length of io_method_options, which is the
wrong array to check and is always longer than pgaio_method_ops_table.
While add it, add a static assert to ensure pgaio_method_ops_table and
io_method_options stay in sync.
Per coverity and Tom Lane.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 18
In 2a0faed9d7, which added JIT compilation support for expressions, I
accidentally used sizeof(LLVMBasicBlockRef *) instead of
sizeof(LLVMBasicBlockRef) as part of computing the size of an allocation. That
turns out to have no real negative consequences due to LLVMBasicBlockRef being
a pointer itself (and thus having the same size). It still is wrong and
confusing, so fix it.
Reported by coverity.
Backpatch-through: 13
Allow pg_newlocale_from_collation(C_COLLATION_OID) to work even if
there's no catalog access, which some extensions expect.
Not known to be a bug without extensions involved, but backport to 18.
Also corrects an issue in master with dummy_c_locale (introduced in
commit 5a38104b36) where deterministic was not set. That wasn't a bug,
but could have been if that structure was used more widely.
Reported-by: Alexander Kukushkin <cyberdemn@gmail.com>
Reviewed-by: Alexander Kukushkin <cyberdemn@gmail.com>
Discussion: https://postgr.es/m/CAFh8B=nj966ECv5vi_u3RYij12v0j-7NPZCXLYzNwOQp9AcPWQ@mail.gmail.com
Backpatch-through: 18
This commit adds CHECK_FOR_INTERRUPTS to the shared buffer iteration
loops in EvictRelUnpinnedBuffers and EvictAllUnpinnedBuffers. These
functions, used by pg_buffercache's pg_buffercache_evict_relation and
pg_buffercache_evict_all, can now be interrupted during long-running
operations.
Backpatch to version 18, where these functions and their corresponding
pg_buffercache functions were introduced.
Author: Yuhang Qiu <iamqyh@gmail.com>
Discussion: https://postgr.es/m/8DC280D4-94A2-4E7B-BAB9-C345891D0B78%40gmail.com
Backpatch-through: 18
Commit a95e3d84c0 added ActiveSnapshot push+pop when processing
work-items (BRIN autosummarization), but forgot to handle the case of
a transaction failing during the run, which drops the snapshot untimely.
Fix by making the pop conditional on an element being actually there.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Backpatch-through: 13
Discussion: https://postgr.es/m/202511041648.nofajnuddmwk@alvherre.pgsql
When building intermediate TID lists during parallel GIN builds, split
the sorted lists into smaller chunks, to limit the amount of memory
needed when merging the chunks later.
The leader may need to keep in memory up to one chunk per worker, and
possibly one extra chunk (before evicting some of the data). The code
processing item pointers uses regular palloc/repalloc calls, which means
it's subject to the MaxAllocSize (1GB) limit.
We could fix this by allowing huge allocations, but that'd require
changes in many places without much benefit. Larger chunks do not
actually improve performance, so the memory usage would be wasted.
Fixed by limiting the chunk size to not hit MaxAllocSize. Each worker
gets a fair share.
This requires remembering the number of participating workers, in a
place that can be accessed from the callback. Luckily, the bs_worker_id
field in GinBuildState was unused, so repurpose that.
Report by Greg Smith, investigation and fix by me. Batchpatched to 18,
where parallel GIN builds were introduced.
Reported-by: Gregory Smith <gregsmithpgsql@gmail.com>
Discussion: https://postgr.es/m/CAHLJuCWDwn-PE2BMZE4Kux7x5wWt_6RoWtA0mUQffEDLeZ6sfA@mail.gmail.com
Backpatch-through: 18
It's possible to define BRIN indexes on functions that require a
snapshot to run, but the autosummarization feature introduced by commit
7526e10224 fails to provide one. This causes autovacuum to leave a
BRIN placeholder tuple behind after a failed work-item execution, making
such indexes less efficient. Repair by obtaining a snapshot prior to
running the task, and add a test to verify this behavior.
Author: Álvaro Herrera <alvherre@kurilemu.de>
Reported-by: Giovanni Fabris <giovanni.fabris@icon.it>
Reported-by: Arthur Nascimento <tureba@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/202511031106.h4fwyuyui6fz@alvherre.pgsql
Commit b4f584f9d2 (affecting v15~, later backpatched down to 13 as of
3635a0a35a) introduced an unconditional WAL receiver shutdown when
switching from streaming to archive WAL sources. This causes problems
during a timeline switch, when a WAL receiver enters WALRCV_WAITING
state but remains alive, waiting for instructions.
The unconditional shutdown can break some monitoring scenarios as the
WAL receiver gets repeatedly terminated and re-spawned, causing
pg_stat_wal_receiver.status to show a "streaming" instead of "waiting"
status, masking the fact that the WAL receiver is waiting for a new TLI
and a new LSN to be able to continue streaming.
This commit changes the WAL receiver behavior so as the shutdown becomes
conditional, with InstallXLogFileSegmentActive being always reset to
prevent the regression fixed by b4f584f9d2: only terminate the WAL
receiver when it is actively streaming (WALRCV_STREAMING,
WALRCV_STARTING, or WALRCV_RESTARTING). When in WALRCV_WAITING state,
just reset InstallXLogFileSegmentActive flag to allow archive
restoration without killing the process. WALRCV_STOPPED and
WALRCV_STOPPING are not reachable states in this code path. For the
latter, the startup process is the one in charge of setting
WALRCV_STOPPING via ShutdownWalRcv(), waiting for the WAL receiver to
reach a WALRCV_STOPPED state after switching walRcvState, so
WaitForWALToBecomeAvailable() cannot be reached while a WAL receiver is
in a WALRCV_STOPPING state.
A regression test is added to check that a WAL receiver is not stopped
on timeline jump, that fails when the fix of this commit is reverted.
Reported-by: Ryan Bird <ryanzxg@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org
Backpatch-through: 13
Add comments explaining when and where it is safe for nbtree to treat
row compare keys as if they were simple scalar inequality keys on the
row's most significant column. This is particularly important within
_bt_advance_array_keys, which deals with required inequality keys in a
general and uniform way, without any special handling for row compares.
Also spell out the implications of _bt_check_rowcompare's approach of
_conditionally_ evaluating lower-order row compare subkeys, particularly
when one of its lower-order subkeys might see NULL index tuple values
(these may or may not affect whether the qual as a whole is satisfied).
The behavior in this area isn't particularly intuitive, so these issues
seem worth going into.
In passing, add a few more defensive/documenting row comparison related
assertions to _bt_first and _bt_check_rowcompare.
Follow-up to commits bd3f59fd and ec986020.
Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-By: Victor Yegorov <vyegorov@gmail.com>
Reviewed-By: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAH2-Wznwkak_K7pcAdv9uH8ZfNo8QO7+tHXOaCUddMeTfaCCFw@mail.gmail.com
Backpatch-through: 18
_bt_first reliably uses the same equality key (on each index column) for
initial positioning purposes as the one that _bt_checkkeys can use to
end the scan following commit f09816a0. _bt_first no longer applies its
own independent rules to determine which initial positioning key to use
on each column (for equality and inequality keys alike). Preprocessing
is now fully in control of determining which keys start and end each
scan, ensuring that _bt_first and _bt_checkkeys have symmetric behavior.
Remove obsolete comments that described why _bt_first was expected to
use at least one of the available required equality keys for initial
positioning purposes. The rules in this area are now maximally strict
and uniform, so there's no reason to draw attention to equality keys.
Any column with a required equality key cannot have a redundant required
inequality key (nor can it have a redundant required equality key).
Oversight in commit f09816a0, which removed similar comments from
_bt_first, but missed these comments.
Author: Peter Geoghegan <pg@bowt.ie>
Backpatch-through: 18