1
0
mirror of https://github.com/postgres/postgres.git synced 2026-01-26 09:41:40 +03:00
Commit Graph

12610 Commits

Author SHA1 Message Date
Andres Freund
0b96e734c5 heapam: Add batch mode mvcc check and use it in page mode
There are two reasons for doing so:

1) It is generally faster to perform checks in a batched fashion and making
   sequential scans faster is nice.

2) We would like to stop setting hint bits while pages are being written
   out. The necessary locking becomes visible for page mode scans, if done for
   every tuple. With batching, the overhead can be amortized to only happen
   once per page.

There are substantial further optimization opportunities along these
lines:

- Right now HeapTupleSatisfiesMVCCBatch() simply uses the single-tuple
  HeapTupleSatisfiesMVCC(), relying on the compiler to inline it. We could
  instead write an explicitly optimized version that avoids repeated xid
  tests.

- Introduce batched version of the serializability test

- Introduce batched version of HeapTupleSatisfiesVacuum

Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/6rgb2nvhyvnszz4ul3wfzlf5rheb2kkwrglthnna7qhe24onwr@vw27225tkyar
2026-01-12 13:22:04 -05:00
Álvaro Herrera
225d1df1d2 Stop including {brin,gin}_tuple.h in tuplesort.h
Doing this meant that those two headers, which are supposed to be
internal to their corresponding index AMs, were being included pretty
much universally, because tuplesort.h is included by execnodes.h which
is very widely used.  Stop that, and fix fallout.

We also change indexing.h to no longer include execnodes.h (tuptable.h
is sufficient), and relscan.h to no longer include buf.h (pointless
since c2fe139c20).

Author: Mario González <gonzalemario@gmail.com>
Discussion: https://postgr.es/m/CAFsReFUcBFup=Ohv_xd7SNQ=e73TXi8YNEkTsFEE2BW7jS1noQ@mail.gmail.com
2026-01-12 18:09:49 +01:00
Álvaro Herrera
2defd00062 Move instrumentation-related structs to instrument_node.h
Some structs and enums related to parallel query instrumentation had
organically grown scattered across various files, and were causing
header pollution especially through execnodes.h.  Create a single file
where they can live together.

This only moves the structs to the new file; cleaning up the pollution
by removing no-longer-necessary cross-header inclusion will be done in
future commits.

Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de>
Co-authored-by: Mario González <gonzalemario@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/202510051642.wwmn4mj77wch@alvherre.pgsql
Discussion: https://postgr.es/m/CAFsReFUr4KrQ60z+ck9cRM4WuUw1TCghN7EFwvV0KvuncTRc2w@mail.gmail.com
2026-01-12 16:59:28 +01:00
Andres Freund
e5a5e0a907 instrumentation: Keep time fields as instrtime, convert in callers
Previously the instrumentation logic always converted to seconds, only for
many of the callers to do unnecessary division to get to milliseconds. As an
upcoming refactoring will split the Instrumentation struct, utilize instrtime
always to keep things simpler. It's also a bit faster to not have to first
convert to a double in functions like InstrEndLoop(), InstrAggNode().

Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAP53PkzZ3UotnRrrnXWAv=F4avRq9MQ8zU+bxoN9tpovEu6fGQ@mail.gmail.com
2026-01-09 13:38:00 -05:00
Heikki Linnakangas
bba81f9d3d Inline ginCompareAttEntries for speed
It is called in tight loops during GIN index build.

Author: David Geier <geidav.pg@gmail.com>
Discussion: https://www.postgresql.org/message-id/5d366878-2007-4d31-861e-19294b7a583b@gmail.com
2026-01-09 20:31:43 +02:00
Peter Eisentraut
69d76fb2ab Decouple C++ support in Meson's PGXS from LLVM enablement
This is important for Postgres extensions that are written in C++,
such as pg_duckdb, which uses PGXS as the build system currently.  In
the autotools build, C++ is not coupled to LLVM.  If the autotools
build is configured without --with-llvm, the C++ compiler and the
various flags get persisted into the Makefile.global.

Author: Tristan Partin <tristan@partin.io>
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/D98JHQF7H2A8.VSE3I4CJBTAB%40partin.io
2026-01-09 10:25:02 +01:00
Tom Lane
b352d3d80b Mark GiST inet_ops as opcdefault, and deal with ensuing fallout.
This patch completes the transition to making inet_ops be default for
inet/cidr columns, rather than btree_gist's opclasses.  Once we do
that, though, pg_upgrade has a big problem.  A dump from an older
version will see btree_gist's opclasses as being default, so it will
not mention the opclass explicitly in CREATE INDEX commands, which
would cause the restore to create the indexes using inet_ops.  Since
that's not compatible with what's actually in the files, havoc would
ensue.

This isn't readily fixable, because the CREATE INDEX command strings
are built by the older server's pg_get_indexdef() function; pg_dump
hasn't nearly enough knowledge to modify those strings successfully.
Even if we cared to put in the work to make that happen in pg_dump,
it would be counterproductive because the end goal here is to get
people off of these opclasses.  Allowing such indexes to persist
through pg_upgrade wouldn't advance that goal.

Therefore, this patch just adds code to pg_upgrade to detect indexes
that would be problematic and refuse to upgrade.

There's another issue too: even without any indexes to worry about,
pg_dump in binary-upgrade mode will reproduce the "CREATE OPERATOR
CLASS ... DEFAULT" commands for btree_gist's opclasses, and those
will fail because now we have a built-in opclass that provides a
conflicting default.  We could ask users to drop the btree_gist
extension altogether before upgrading, but that would carry very
severe penalties.  It would affect perfectly-valid indexes for other
data types, and it would drop operators that might be relied on in
views or other database objects.  Instead, put a hack in DefineOpClass
to ignore the DEFAULT clauses for these opclasses when in
binary-upgrade mode.  This will result in installing a version of
btree_gist that isn't quite the version it claims to be, but that can
be fixed by issuing ALTER EXTENSION UPDATE afterwards.

Since we don't apply that hack when not in binary-upgrade mode,
it is now impossible to install any version of btree_gist
less than 1.9 via CREATE EXTENSION.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/2483812.1754072263@sss.pgh.pa.us
2026-01-08 14:03:56 -05:00
Heikki Linnakangas
ad853bb877 Fix misc typos, mostly in comments
The only user-visible change is the fix in the "malformed
pg_dependencies" error detail. That one is new in commit e1405aa5e3,
so no backpatching required.
2026-01-08 18:10:08 +02:00
Peter Eisentraut
5e7abdac99 strnlen() is now required
Remove all configure checks and workarounds for strnlen() missing.  It
is required by POSIX 2008.

Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/98ce805c-6103-421b-adc3-fcf8f3dddbe3%40eisentraut.org
2026-01-08 08:51:20 +01:00
Nathan Bossart
a516b3f00d MSVC: Support building for AArch64.
This commit does the following to get tests passing for
MSVC/AArch64:

* Implements spin_delay() with an ISB instruction (like we do for
gcc/clang on AArch64).

* Sets USE_ARMV8_CRC32C unconditionally.  Vendor-supported versions
of Windows for AArch64 require at least ARMv8.1, which is where CRC
extension support became mandatory.

* Implements S_UNLOCK() with _InterlockedExchange().  The existing
implementation for MSVC uses _ReadWriteBarrier() (a compiler
barrier), which is insufficient for this purpose on non-TSO
architectures.

There are likely other changes required to take full advantage of
the hardware (e.g., atomics/arch-arm.h, simd.h,
pg_popcount_aarch64.c), but those can be dealt with later.

Author: Niyas Sait <niyas.sait@linaro.org>
Co-authored-by: Greg Burd <greg@burd.me>
Co-authored-by: Dave Cramer <davecramer@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Tested-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/A6152C7C-F5E3-4958-8F8E-7692D259FF2F%40greg.burd.me
Discussion: https://postgr.es/m/CAFPTBD-74%2BAEuN9n7caJ0YUnW5A0r-KBX8rYoEJWqFPgLKpzdg%40mail.gmail.com
2026-01-07 13:42:57 -06:00
Michael Paquier
b139bd3b6e Add data type oid8, 64-bit unsigned identifier
This new identifier type provides support for 64-bit unsigned values,
to be used in catalogs, like OIDs.  An advantage of a new data type is
that it becomes easier to grep for it in the code when assigning this
type to a catalog attribute, linking it to dedicated APIs and internal
structures.

The following operators are added in this commit, with dedicated tests:
- Casts with integer types and OID.
- btree and hash operators
- min/max functions.
- C type with related macros and defines, named around "Oid8".

This has been mentioned as useful on its own on the thread to add
support for 64-bit TOAST values, so as it becomes possible to attach
this data type to the TOAST code and catalog definitions.  However, as
this concept can apply to many more areas, it is implemented as its own
independent change.  This is based on a discussion with Andres Freund
and Tom Lane.

Bump catalog version.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com>
Discussion: https://postgr.es/m/1891064.1754681536@sss.pgh.pa.us
2026-01-07 11:37:00 +09:00
Jeff Davis
af2d4ca191 Clean up ICU includes.
Remove ICU includes from pg_locale.h, and instead include them in the
few C files that need ICU. Clean up a few other includes in passing.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/48911db71d953edec66df0d2ce303563d631fbe0.camel@j-davis.com
2026-01-06 17:19:51 -08:00
Jeff Davis
c4ff35f104 ICU: use UTF8-optimized case conversion API
Initializes a UCaseMap object once for use across calls, and uses
UTF8-optimized APIs.

Author: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: zengman <zengman@halodbtech.com>
Discussion: https://postgr.es/m/5a010b27-8ed9-4739-86fe-1562b07ba564@proxel.se
2026-01-06 14:09:07 -08:00
Michael Paquier
f1e251be80 Allow bgworkers to be terminated for database-related commands
Background workers gain a new flag, called BGWORKER_INTERRUPTIBLE, that
offers the possibility to terminate the workers when these are connected
to a database that is involved in one of the following commands:
ALTER DATABASE RENAME TO
ALTER DATABASE SET TABLESPACE
CREATE DATABASE
DROP DATABASE

This is useful to give background workers the same behavior as backends
and autovacuum workers, which are stopped when these commands are
executed.  The default behavior, that exists since 9.3, is still to
never terminate bgworkers connected to the database involved in any of
these commands.  The new flag has to be set to terminate the workers.

A couple of tests are added to worker_spi to track the commands that
impact the termination of the workers.  There is a test case for a
non-interruptible worker, additionally, that relies on an injection
point to make the wait time in CountOtherDBBackends() reduced from 5s to
0.3s for faster test runs.  The tests rely on the contents of the server
logs to check if a worker has been started or terminated:
- LOG generated by worker_spi_main() at startup, once connection to
database is done.
- FATAL in bgworker_die() when terminated.
A couple of tests run in the CI have showed that this method is stable
enough.  The safe_psql() calls that scan pg_stat_activity could be
replaced with some poll_query_until() for more stability, if the current
method proves to be an issue in the buildfarm.

Author: Aya Iwata <iwata.aya@fujitsu.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/OS7PR01MB11964335F36BE41021B62EAE8EAE4A@OS7PR01MB11964.jpnprd01.prod.outlook.com
2026-01-06 14:24:29 +09:00
David Rowley
c5af141cd4 Clarify where various catcache.h dlist_nodes are used
Also remove a comment which mentions we don't currently divide the
per-cache lists into hash buckets.  Since 473182c95, we do.

Author: ChangAo Chen <cca5507@qq.com>
Discussion: https://postgr.es/m/tencent_7732789707C8768EA13785A7B5EA29103208@qq.com
2026-01-06 14:39:36 +13:00
Tom Lane
7dc95cc3b9 Update to latest Snowball sources.
It's been almost a year since we last did this, and upstream has
been busy.  They've added stemmers for Polish and Esperanto,
and also deprecated their old Dutch stemmer in favor of the
Kraaij-Pohlmann algorithm.  (The "dutch" stemmer is now the
latter, and "dutch_porter" is the old algorithm.)

Upstream also decided to rename their internal header "header.h"
to something less generic: "snowball_runtime.h".  Seems like a good
thing, but it complicates this patch a bit because we were relying on
interposing our own version of "header.h" to control system header
inclusion order.  (We're partially failing at that now, because now the
generated stemmer files include <stddef.h> before snowball_runtime.h.
I think that'll be okay, but if the buildfarm complains then we'll
have to do more-extensive editing of the generated files.)

I realized that we weren't documenting the available stemmers in
any user-visible place, except indirectly through sample \dFd output.
That's incomplete because we only provide built-in dictionaries for
the recommended stemmers for each language, not alternative stemmers
such as dutch_porter.  So I added a list to the documentation.

I did not do anything with the stopword lists.  If those are still
available from snowballstem.org, they are mighty well hidden.

Discussion: https://postgr.es/m/1185975.1767569534@sss.pgh.pa.us
2026-01-05 15:22:37 -05:00
Alexander Korotkov
7a39f43d88 Extend xlogwait infrastructure with write and flush wait types
Add support for waiting on WAL write and flush LSNs in addition to the
existing replay LSN wait type. This provides the foundation for
extending the WAIT FOR command with MODE parameter.

Key changes are following.
- Add WAIT_LSN_TYPE_STANDBY_WRITE and WAIT_LSN_TYPE_STANDBY_FLUSH to
  WaitLSNType.
- Add GetCurrentLSNForWaitType() to retrieve the current LSN for each wait
  type.
- Add new wait events WAIT_EVENT_WAIT_FOR_WAL_WRITE and
  WAIT_EVENT_WAIT_FOR_WAL_FLUSH for pg_stat_activity visibility.
- Update WaitForLSN() to use GetCurrentLSNForWaitType() internally.

Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
2026-01-05 19:56:19 +02:00
Michael Paquier
7d854bdc5b Remove unneeded defines from pg_config.h.in
This commit removes HAVE_ATOMIC_H and HAVE_MBARRIER_H from
pg_config.h.in, cleanup that could have been done in 25f36066dd.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b2c0d0b7-3944-487d-a03d-d155851958ff@gmail.com
2026-01-05 09:27:19 +09:00
Michael Paquier
b8cfcb9e00 Fix typos and inconsistencies in code and comments
This change is a cocktail of harmonization of function argument names,
grammar typos, renames for better consistency and unused code (see
ltree).  All of these have been spotted by the author.

Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b2c0d0b7-3944-487d-a03d-d155851958ff@gmail.com
2026-01-05 09:19:15 +09:00
Tom Lane
ba75f71752 Include error location in errors from ComputeIndexAttrs().
Make use of IndexElem's new location field to localize these
errors better.

Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxH3OgXF1hrzGAaWyNtye2jHEmk9JbtrtGv-KJK6tsGo5w@mail.gmail.com
2026-01-04 14:16:20 -05:00
Tom Lane
62299bbd90 Add parse location to IndexElem.
This patch mostly just fills in the field, although a few error
reports in resolve_unique_index_expr() are adjusted to use it.
The next commit will add more uses.

catversion bump out of an abundance of caution: I'm not sure
IndexElem can appear in stored rules, but I'm not sure it can't
either.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Co-authored-by: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxH3OgXF1hrzGAaWyNtye2jHEmk9JbtrtGv-KJK6tsGo5w@mail.gmail.com
Discussion: https://postgr.es/m/202512121327.f2zimsr6guso@alvherre.pgsql
2026-01-04 14:16:20 -05:00
Peter Eisentraut
0eadf1767a Remove bogus const qualifier on PageGetItem() argument
The function ends up casting away the const qualifier, so it was a
lie.  No callers appear to rely on the const qualifier on the
argument, so the simplest solution is to just remove it.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/beusplf77varvhip6ryuhd2fchsx26qmmhduqz432bnglq634b%402dx4k6yxj4cm
2026-01-04 16:00:15 +01:00
Bruce Momjian
451c43974f Update copyright for 2026
Backpatch-through: 14
2026-01-01 13:24:10 -05:00
Andrew Dunstan
f3c9e341cd Add paths of extensions to pg_available_extensions
Add a new "location" column to the pg_available_extensions and
pg_available_extension_versions views, exposing the directory where
the extension is located.

The default system location is shown as '$system', the same value
that can be used to configure the extension_control_path GUC.

User-defined locations are only visible for super users, otherwise
'<insufficient privilege>' is returned as a column value, the same
behaviour that we already use in pg_stat_activity.

I failed to resist the temptation to do a little extra editorializing of
the TAP test script.

Catalog version bumped.

Author: Matheus Alcantara <mths.dev@pm.me>
Reviewed-By: Chao Li <li.evan.chao@gmail.com>
Reviewed-By: Rohit Prasad <rohit.prasad@arm.com>
Reviewed-By: Michael Banck <mbanck@gmx.net>
Reviewed-By: Manni Wood <manni.wood@enterprisedb.com>
Reviewed-By: Euler Taveira <euler@eulerto.com>
Reviewed-By: Quan Zongliang <quanzongliang@yeah.net>
2026-01-01 12:13:59 -05:00
Tom Lane
bc6374cd76 Change IndexAmRoutines to be statically-allocated structs.
Up to now, index amhandlers were expected to produce a new, palloc'd
struct on each call.  That requires palloc/pfree overhead, and creates
a risk of memory leaks if the caller fails to pfree, and the time
taken to fill such a large structure isn't nil.  Moreover, we were
storing these things in the relcache, eating several hundred bytes for
each cached index.  There is not anything in these structs that needs
to vary at runtime, so let's change the definition so that an
amhandler can return a pointer to a "static const" struct of which
there's only one copy per index AM.  Mark all the core code's
IndexAmRoutine pointers const so that we catch anyplace that might
still try to change or pfree one.

(This is similar to the way we were already handling TableAmRoutine
structs.  This commit does fix one comment that was infelicitously
copied-and-pasted into tableamapi.c.)

This commit needs to be called out in the v19 release notes as an API
change for extension index AMs.  An un-updated AM will still work
(as of now, anyway) but it risks memory leaks and will be slower than
necessary.

Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAEoWx2=vApYk2LRu8R0DdahsPNEhWUxGBZ=rbZo1EXE=uA+opQ@mail.gmail.com
2025-12-30 18:26:23 -05:00
Thomas Munro
1a28b4b455 jit: Drop redundant LLVM configure probes.
We currently require LLVM 14, so these probes for LLVM 9 functions
always succeeded.  Even when the features aren't enabled in an LLVM
build, dummy functions are defined (a problem for a later commit).

The whole PGAC_CHECK_LLVM_FUNCTIONS macro and Meson equivalent are
removed, because we switched to testing LLVM_VERSION_MAJOR at compile
time in subsequent work and these were the last holdouts.  That suits
the nature of LLVM API evolution better, and also allows for strictly
mechanical pruning in future commits like 820b5af7 and 972c2cd2.  They
advanced the minimum LLVM version but failed to spot these.

Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CA%2BhUKGJgB6gvrdDohgwLfCwzVQm%3DVMtb9m0vzQn%3DCwWn-kwG9w%40mail.gmail.com
2025-12-30 20:24:42 +13:00
Michael Paquier
97b101776c Add pg_get_multixact_stats()
This new function exposes at SQL level some information related to
multixacts, not available until now.  This data is useful for monitoring
purposes, especially for workloads that make a heavy use of multixacts:
- num_mxids, number of MultiXact IDs in use.
- num_members, number of member entries in use.
- members_size, bytes used by num_members in pg_multixact/members/.
- oldest_multixact: oldest MultiXact still needed.

This patch has been originally proposed when MultiXactOffset was still
32 bits, to monitor wraparound.  This part is not relevant anymore since
bd8d9c9bdf that has widen MultiXactOffset to 64 bits.  The monitoring
of disk space usage for the members is still relevant.

Some tests are added to check this function, in the shape of one
isolation test with concurrent transactions that take a ROW SHARE lock,
and some SQL tests for pg_read_all_stats.  Some documentation is added
to explain some patterns that can come from the information provided by
the function.

Bump catalog version.

Author: Naga Appani <nagnrik@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
2025-12-30 15:38:50 +09:00
Michael Paquier
0e3ad4b96a Add MultiXactOffsetStorageSize() to multixact_internal.h
This function calculates in bytes the storage taken between two
multixact offsets.  This will be used in an upcoming patch, introduced
separately here as this piece can be useful on its own.

Author: Naga Appani <nagnrik@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aUyTvZMq2CLgNEB4@paquier.xyz
2025-12-30 14:13:40 +09:00
Michael Paquier
9cf746a453 Change GetMultiXactInfo() to return the next multixact offset
This routine returned a number of members as a MultiXactOffset,
calculated based on the difference between the next-to-be-assigned
offset and the oldest offset.  However, this number is not actually an
offset but a number.

This type confusion comes from the original implementation of
MultiXactMemberFreezeThreshold(), in 53bb309d2d.  The number of
members is now defined as a uint64, large enough for MultiXactOffset.
This change will be used in a follow-up patch.

Reviewed-by: Naga Appani <nagnrik@gmail.com>
Discussion: https://postgr.es/m/aUyTvZMq2CLgNEB4@paquier.xyz
2025-12-30 14:03:49 +09:00
Richard Guo
ad66f705fa Strip PlaceHolderVars from index operands
When pulling up a subquery, we may need to wrap its targetlist items
in PlaceHolderVars to enforce separate identity or as a result of
outer joins.  However, this causes any upper-level WHERE clauses
referencing these outputs to contain PlaceHolderVars, which prevents
indxpath.c from recognizing that they could be matched to index
columns or index expressions, potentially affecting the planner's
ability to use indexes.

To fix, explicitly strip PlaceHolderVars from index operands.  A
PlaceHolderVar appearing in a relation-scan-level expression is
effectively a no-op.  Nevertheless, to play it safe, we strip only
PlaceHolderVars that are not marked nullable.

The stripping is performed recursively to handle cases where
PlaceHolderVars are nested or interleaved with other node types.  To
minimize performance impact, we first use a lightweight walker to
check for the presence of strippable PlaceHolderVars.  The expensive
mutator is invoked only if a candidate is found, avoiding unnecessary
memory allocation and tree copying in the common case where no
PlaceHolderVars are present.

Back-patch to v18.  Although this issue exists before that, changes in
this version made it common enough to notice.  Given the lack of field
reports for older versions, I am not back-patching further.

Reported-by: Haowu Ge <gehaowu@bitmoe.com>
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/62af586c-c270-44f3-9c5e-02c81d537e3d.gehaowu@bitmoe.com
Backpatch-through: 18
2025-12-29 11:38:49 +09:00
Peter Eisentraut
b7057e4346 Change some Datum to void * for opaque pass-through pointer
Here, Datum was used to pass around an opaque pointer between a group
of functions.  But one might as well use void * for that; the use of
Datum doesn't achieve anything here and is just distracting.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/1c5d23cb-288b-4154-b1cd-191fe2301707%40eisentraut.org
2025-12-28 14:34:12 +01:00
Michael Paquier
9adf32da6b Split some long Makefile lists
This change makes more readable code diffs when adding new items or
removing old items, while ensuring that lines do not get excessively
long.  Some SUBDIRS, PROGRAMS and REGRESS lists are split.

Note that there are a few more REGRESS lists that could be split,
particularly in contrib/.

Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Co-Authored-By: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Man Zeng <zengman@halodbtech.com>
Discussion: https://postgr.es/m/DF6HDGB559U5.3MPRFCWPONEAE@jeltef.nl
2025-12-28 09:17:42 +09:00
Peter Eisentraut
b63443718a Remove MsgType type
Presumably, the C type MsgType was meant to hold the protocol message
type in the pre-version-3 era, but this was never fully developed even
then, and the name is pretty confusing nowadays.  It has only one
vestigial use for cancel requests that we can get rid of.  Since a
cancel request is indicated by a special protocol version number, we
can use the ProtocolVersion type, which MsgType was based on.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/505e76cb-0ca2-4e22-ba0f-772b5dc3f230%40eisentraut.org
2025-12-27 23:46:28 +01:00
Michael Paquier
213a1b8952 Move attribute statistics functions to stat_utils.c
Many of the operations done for attribute stats in attribute_stats.c
share the same logic as extended stats, as done by a patch under
discussion to add support for extended stats import and export.  All the
pieces necessary for extended statistics are moved to stats_utils.c,
which is the file where common facilities are shared for stats files.

The following renames are done:
* get_attr_stat_type() -> statatt_get_type()
* init_empty_stats_tuple() -> statatt_init_empty_tuple()
* set_stats_slot() -> statatt_set_slot()
* get_elem_stat_type() -> statatt_get_elem_type()

While on it, this commit adds more documentation for all these
functions, describing more their internals and the dependencies that
have been implied for attribute statistics.  The same concepts apply to
extended statistics, at some degree.

Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Yu Wang <wangyu_runtime@163.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CADkLM=dpz3KFnqP-dgJ-zvRvtjsa8UZv8wDAQdqho=qN3kX0Zg@mail.gmail.com
2025-12-25 15:13:39 +09:00
Richard Guo
325808cac9 Fix planner error with SRFs and grouping sets
If there are any SRFs in a PathTarget, we must separate it into
SRF-computing and SRF-free targets.  This is because the executor can
only handle SRFs that appear at the top level of the targetlist of a
ProjectSet plan node.

If we find a subexpression that matches an expression already computed
in the previous plan level, we should treat it like a Var and should
not split it again.  setrefs.c will later replace the expression with
a Var referencing the subplan output.

However, when processing the grouping target for grouping sets, the
planner can fail to recognize that an expression is already computed
in the scan/join phase.  The root cause is a mismatch in the
nullingrels bits.  Expressions in the grouping target carry the
grouping nulling bit in their nullingrels to indicate that they can be
nulled by the grouping step.  However, the corresponding expressions
in the scan/join target do not have these bits.

As a result, the exact match check in list_member() fails, leading the
planner to incorrectly believe that the expression needs to be
re-evaluated from its arguments, which are often not available in the
subplan.  This can lead to planner errors such as "variable not found
in subplan target list".

To fix, ignore the grouping nulling bit when checking whether an
expression from the grouping target is available in the pre-grouping
input target.  This aligns with the matching logic in setrefs.c.

Backpatch to v18, where this issue was introduced.

Bug: #19353
Reported-by: Marian MULLER REBEYROL <marian.muller@serli.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/19353-aaa179bba986a19b@postgresql.org
Backpatch-through: 18
2025-12-25 12:12:52 +09:00
Masahiko Sawada
67c20979ce Toggle logical decoding dynamically based on logical slot presence.
Previously logical decoding required wal_level to be set to 'logical'
at server start. This meant that users had to incur the overhead of
logical-level WAL logging even when no logical replication slots were
in use.

This commit adds functionality to automatically control logical
decoding availability based on logical replication slot presence. The
newly introduced module logicalctl.c allows logical decoding to be
dynamically activated when needed when wal_level is set to
'replica'.

When the first logical replication slot is created, the system
automatically increases the effective WAL level to maintain
logical-level WAL records. Conversely, after the last logical slot is
dropped or invalidated, it decreases back to 'replica' WAL level.

While activation occurs synchronously right after creating the first
logical slot, deactivation happens asynchronously through the
checkpointer process. This design avoids a race condition at the end
of recovery; a concurrent deactivation could happen while the startup
process enables logical decoding at the end of recovery, but WAL
writes are still not permitted until recovery fully completes. The
checkpointer will handle it after recovery is done. Asynchronous
deactivation also avoids excessive toggling of the logical decoding
status in workloads that repeatedly create and drop a single logical
slot. On the other hand, this lazy approach can delay changes to
effective_wal_level and the disabling logical decoding, especially
when the checkpointer is busy with other tasks. We chose this lazy
approach in all deactivation paths to keep the implementation simple,
even though laziness is strictly required only for end-of-recovery
cases. Future work might address this limitation either by using a
dedicated worker instead of the checkpointer, or by implementing
synchronous waiting during slot drops if workloads are significantly
affected by the lazy deactivation of logical decoding.

The effective WAL level, determined internally by XLogLogicalInfo, is
allowed to change within a transaction until an XID is assigned. Once
an XID is assigned, the value becomes fixed for the remainder of the
transaction. This behavior ensures that the logging mode remains
consistent within a writing transaction, similar to the behavior of
GUC parameters.

A new read-only GUC parameter effective_wal_level is introduced to
monitor the actual WAL level in effect. This parameter reflects the
current operational WAL level, which may differ from the configured
wal_level setting.

Bump PG_CONTROL_VERSION as it adds a new field to CheckPoint struct.

Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAD21AoCVLeLYq09pQPaWs+Jwdni5FuJ8v2jgq-u9_uFbcp6UbA@mail.gmail.com
2025-12-23 10:13:16 -08:00
Michael Paquier
e5f3839af6 Switch buffile.c/h to use pgoff_t instead of off_t
off_t was previously used for offsets, which is 4 bytes on Windows,
hence limiting the backend code to a hard limit for files longer than
2GB.  This leads to some simplification in these files, removing some
casts based on long, also 4 bytes on Windows.

This commit removes one comment introduced in db3c4c3a2d, not relevant
anymore as pgoff_t is a safe 8-byte alternative on Windows.

This change is surprisingly not invasive, as the callers of
BufFileTell(), BufFileSeek() and BufFileTruncateFileSet() (worker.c,
tuplestore.c, etc.) track offsets in local structures that just to
switch from off_t to pgoff_t for the most part.

The file is still relying on a maximum file size of
MAX_PHYSICAL_FILESIZE (1GB).  This change allows the code to make this
maximum potentially larger in the future, or larger on a per-demand
basis.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aUStrqoOCDRFAq1M@paquier.xyz
2025-12-23 07:41:34 +09:00
Heikki Linnakangas
47a9f61fca Use proper type for RestoreTransactionSnapshot's PGPROC arg
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/08cbaeb5-aaaf-47b6-9ed8-4f7455b0bc4b@iki.fi
2025-12-19 13:40:02 +02:00
Michael Paquier
167cb26718 Fix const correctness in pgstat data serialization callbacks
4ba012a8ed defined the "header" (pointer to the stats data) of
from_serialized_data() as a const, even though it is fine (and
expected!) for the callback to modify the shared memory entry when
loading the stats at startup.

While on it, this commit updates the callback to_serialized_data() in
the test module test_custom_stats to make the data extracted from the
"header" parameter a const since it should never be modified: the stats
are written to disk and no modifications are expected in the shared
memory entry.

This clarifies the API contract of these new callbacks.

Reported-By: Peter Eisentraut <peter@eisentraut.org>
Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/d87a93b0-19c7-4db6-b9c0-d6827e7b2da1@eisentraut.org
2025-12-18 07:33:40 +09:00
Michael Paquier
f4e797171e Change pgstat_report_vacuum() to use Relation
This change makes pgstat_report_vacuum() more consistent with
pgstat_report_analyze(), that also uses a Relation.  This enforces a
policy that callers of this routine should open and lock the relation
whose statistics are updated before calling this routine.  We will
unlikely have a lot of callers of this routine in the tree, but it seems
like a good idea to imply this requirement in the long run.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Suggested-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aUEA6UZZkDCQFgSA@ip-10-97-1-34.eu-west-3.compute.internal
2025-12-17 11:26:17 +09:00
Jeff Davis
0a90df58cf Avoid global LC_CTYPE dependency in pg_locale_icu.c.
ICU still depends on libc for compatibility with certain historical
behavior for single-byte encodings. Make the dependency explicit by
holding a locale_t object when required.

We should consider a better solution in the future, such as decoding
the text to UTF-32 and using u_tolower(). That would be a behavior
change and require additional infrastructure though; so for now, just
avoid the global LC_CTYPE dependency.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-16 15:32:57 -08:00
Jeff Davis
87b2968df0 downcase_identifier(): use method table from locale provider.
Previously, libc's tolower() was always used for lowercasing
identifiers, regardless of the database locale (though only characters
beyond 127 in single-byte encodings were affected). Refactor to allow
each provider to supply its own implementation of identifier
downcasing.

For historical compatibility, when using a single-byte encoding, ICU
still relies on tolower().

One minor behavior change is that, before the database default locale
is initialized, it uses ASCII semantics to downcase the
identifiers. Previously, it would use the postmaster's LC_CTYPE
setting from the environment. While that could have some effect during
GUC processing, for example, it would have been fragile to rely on the
environment setting anyway. (Also, it only matters when the encoding
is single-byte.)

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-16 15:32:41 -08:00
Jeff Davis
24bf379cb1 Clarify a #define introduced in 8d299052fe.
The value is the same, but use the right symbol for clarity.
2025-12-16 12:48:53 -08:00
Nathan Bossart
48d4a1423d Allow passing a pointer to GetNamedDSMSegment()'s init callback.
This commit adds a new "void *arg" parameter to
GetNamedDSMSegment() that is passed to the initialization callback
function.  This is useful for reusing an initialization callback
function for multiple DSM segments.

Author: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAN4CZFMjh8TrT9ZhWgjVTzBDkYZi2a84BnZ8bM%2BfLPuq7Cirzg%40mail.gmail.com
2025-12-15 14:27:16 -06:00
Noah Misch
64bf53dd61 Revisit cosmetics of "For inplace update, send nontransactional invalidations."
This removes a never-used CacheInvalidateHeapTupleInplace() parameter.
It adds README content about inplace update visibility in logical
decoding.  It rewrites other comments.

Back-patch to v18, where commit 243e9b40f1
first appeared.  Since this removes a CacheInvalidateHeapTupleInplace()
parameter, expect a v18 ".abi-compliance-history" edit to follow.  PGXN
contains no calls to that function.

Reported-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reported-by: Ilyasov Ian <ianilyasov@outlook.com>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Surya Poondla <s_poondla@apple.com>
Discussion: https://postgr.es/m/CA+renyU+LGLvCqS0=fHit-N1J-2=2_mPK97AQxvcfKm+F-DxJA@mail.gmail.com
Backpatch-through: 18
2025-12-15 12:19:49 -08:00
Jeff Davis
95a19fefdc Remove incorrect declarations in pg_wchar.h.
Oversight in commit 9acae56ce0.

Discussion: https://postgr.es/m/541F240E-94AD-4D65-9794-7D6C316BC3FF@gmail.com
2025-12-15 10:38:55 -08:00
Jeff Davis
54c41a6deb Remove unused single-byte char_is_cased() API.
https://postgr.es/m/450ceb6260cad30d7afdf155d991a9caafee7c0d.camel@j-davis.com
2025-12-15 10:24:57 -08:00
Peter Eisentraut
17f446784d Refactor static_assert() support.
HAVE__STATIC_ASSERT was really a test for GCC statement expressions,
as needed for StaticAssertExpr() now that _Static_assert could be
assumed to be available through our C11 requirement.  This
artificially prevented Visual Studio from being able to use
static_assert() in other contexts.

Instead, make a new test for HAVE_STATEMENT_EXPRESSIONS, and use that
to control only whether StaticAssertExpr() uses fallback code, not the
other variants.  This improves the quality of failure messages in the
(much more common) other variants under Visual Studio.

Also get rid of the two separate implementations for C++, since the C
implementation is also also valid as C++11.  While it is a stretch to
apply HAVE_STATEMENT_EXPRESSIONS tested with $CC to a C++ compiler,
the previous C++ coding assumed that the C++ compiler had them
unconditionally, so it isn't a new stretch.  In practice, the C and
C++ compilers are very likely to agree, and if a combination is ever
reported that falsifies this assumption we can always reconsider that.

Author: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGKvr0x_oGmQTUkx%3DODgSksT2EtgCA6LmGx_jQFG%3DsDUpg%40mail.gmail.com
2025-12-15 11:54:23 +01:00
Michael Paquier
4ba012a8ed Allow cumulative statistics to read/write auxiliary data from/to disk
Cumulative stats kinds gain the capability to write additional per-entry
data when flushing the stats at shutdown, and read this data when
loading back the stats at startup.  This can be fit for example in the
case of variable-length data (like normalized query strings), so as it
becomes possible to link the shared memory stats entries to data that is
stored in a different area, like a DSA segment.

Three new optional callbacks are added to PgStat_KindInfo, available to
variable-numbered stats kinds:
* to_serialized_data: writes auxiliary data for an entry.
* from_serialized_data: reads auxiliary data for an entry.
* finish: performs actions after read/write/discard operations.  This is
invoked after processing all the entries of a kind, allowing extensions
to close file handles and clean up resources.

Stats kinds have the option to store this data in the existing pgstats
file, but can as well store it in one or more additional files whose
names can be built upon the entry keys.  The new serialized callbacks
are called once an entry key is read or written from the main stats
file.  A file descriptor to the main pgstats file is available in the
arguments of the callbacks.

Author: Sami Imseih <samimseih@gmail.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0s9SDOu+Z6veoJCHWk+kDeTktAtC-KY9fQ9Z6BJdDUirQ@mail.gmail.com
2025-12-15 09:40:56 +09:00
Tom Lane
58dad7f349 Update typedefs.list to match what the buildfarm currently reports.
The current list from the buildfarm includes quite a few typedef
names that it used to miss.  The reason is a bit obscure, but it
seems likely to have something to do with our recent increased
use of palloc_object and palloc_array.  In any case, this makes
the relevant struct declarations be much more nicely formatted,
so I'll take it.  Install the current list and re-run pgindent
to update affected code.

Syncing with the current list also removes some obsolete
typedef names and fixes some alphabetization errors.

Discussion: https://postgr.es/m/1681301.1765742268@sss.pgh.pa.us
2025-12-14 17:03:53 -05:00