1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-22 12:22:45 +03:00
Commit Graph

9018 Commits

Author SHA1 Message Date
Melanie Plageman
052026c9b9 Eagerly scan all-visible pages to amortize aggressive vacuum
Aggressive vacuums must scan every unfrozen tuple in order to advance
the relfrozenxid/relminmxid. Because data is often vacuumed before it is
old enough to require freezing, relations may build up a large backlog
of pages that are set all-visible but not all-frozen in the visibility
map. When an aggressive vacuum is triggered, all of these pages must be
scanned. These pages have often been evicted from shared buffers and
even from the kernel buffer cache. Thus, aggressive vacuums often incur
large amounts of extra I/O at the expense of foreground workloads.

To amortize the cost of aggressive vacuums, eagerly scan some
all-visible but not all-frozen pages during normal vacuums.

All-visible pages that are eagerly scanned and set all-frozen in the
visibility map are counted as successful eager freezes and those not
frozen are counted as failed eager freezes.

If too many eager scans fail in a row, eager scanning is temporarily
suspended until a later portion of the relation. The number of failures
tolerated is configurable globally and per table.

To effectively amortize aggressive vacuums, we cap the number of
successes as well. Capping eager freeze successes also limits the amount
of potentially wasted work if these pages are modified again before the
next aggressive vacuum. Once we reach the maximum number of blocks
successfully eager frozen, eager scanning is disabled for the remainder
of the vacuum of the relation.

Original design idea from Robert Haas, with enhancements from
Andres Freund, Tomas Vondra, and me

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
2025-02-11 13:53:48 -05:00
Andres Freund
4dd09a1d41 config: Rename "Asynchronous Behavior" to "I/O"
"I/O" seems more descriptive than "Asynchronous Behavior", given that some of
the GUCs in the section don't relate to anything asynchronous.

Most other abbreviations in the config sections are un-abbreviated, but
"Input/Output" seems less likely to be helpful than just IO or I/O.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq
2025-02-11 12:53:40 -05:00
Andres Freund
740766d37c config: Split "Worker Processes" out of "Asynchronous Behavior"
Having all the worker related GUCs in the same section as IO controlling GUCs
doesn't really make sense. Create a separate section for them.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/x3tlw2jk5gm3r3mv47hwrshffyw7halpczkfbk3peksxds7bvc@lguk43z3bsyq
2025-02-11 12:53:40 -05:00
Tom Lane
c366d2bdba Allow extension functions to participate in in-place updates.
Commit 1dc5ebc90 allowed PL/pgSQL to perform in-place updates
of expanded-object variables that are being updated with
assignments like "x := f(x, ...)".  However this was allowed
only for a hard-wired list of functions f(), since we need to
be sure that f() will not modify the variable if it fails.
It was always envisioned that we should make that extensible,
but at the time we didn't have a good way to do so.  Since
then we've invented the idea of "support functions" to allow
attaching specialized optimization knowledge to functions,
and that is a perfect mechanism for doing this.

Hence, adjust PL/pgSQL to use a support function request instead
of hard-wired logic to decide if in-place update is safe.
Preserve the previous optimizations by creating support functions
for the three functions that were previously hard-wired.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Discussion: https://postgr.es/m/CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com
2025-02-11 12:49:34 -05:00
Peter Eisentraut
9926f854d0 Cache NO ACTION foreign keys separately from RESTRICT foreign keys
Now that we generate different SQL for temporal NO ACTION vs RESTRICT
foreign keys, we should cache their query plans with different keys.
Since the key also includes the constraint oid, this shouldn't be
necessary, but we have been seeing build farm failures that suggest we
might be sometimes using a cached NO ACTION plan to implement a RESTRICT
constraint.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2025-02-09 13:43:56 +01:00
Peter Eisentraut
83ea6c5402 Virtual generated columns
This adds a new variant of generated columns that are computed on read
(like a view, unlike the existing stored generated columns, which are
computed on write, like a materialized view).

The syntax for the column definition is

    ... GENERATED ALWAYS AS (...) VIRTUAL

and VIRTUAL is also optional.  VIRTUAL is the default rather than
STORED to match various other SQL products.  (The SQL standard makes
no specification about this, but it also doesn't know about VIRTUAL or
STORED.)  (Also, virtual views are the default, rather than
materialized views.)

Virtual generated columns are stored in tuples as null values.  (A
very early version of this patch had the ambition to not store them at
all.  But so much stuff breaks or gets confused if you have tuples
where a column in the middle is completely missing.  This is a
compromise, and it still saves space over being forced to use stored
generated columns.  If we ever find a way to improve this, a bit of
pg_upgrade cleverness could allow for upgrades to a newer scheme.)

The capabilities and restrictions of virtual generated columns are
mostly the same as for stored generated columns.  In some cases, this
patch keeps virtual generated columns more restricted than they might
technically need to be, to keep the two kinds consistent.  Some of
that could maybe be relaxed later after separate careful
considerations.

Some functionality that is currently not supported, but could possibly
be added as incremental features, some easier than others:

- index on or using a virtual column
- hence also no unique constraints on virtual columns
- extended statistics on virtual columns
- foreign-key constraints on virtual columns
- not-null constraints on virtual columns (check constraints are supported)
- ALTER TABLE / DROP EXPRESSION
- virtual column cannot have domain type
- virtual columns are not supported in logical replication

The tests in generated_virtual.sql have been copied over from
generated_stored.sql with the keyword replaced.  This way we can make
sure the behavior is mostly aligned, and the differences can be
visible.  Some tests for currently not supported features are
currently commented out.

Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Tested-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/a368248e-69e4-40be-9c07-6c3b5880b0a6@eisentraut.org
2025-02-07 09:46:59 +01:00
Nathan Bossart
527f8fec22 Fix autovacuum_vacuum_max_threshold's GUC description.
Most GUCs that accept a special value to disable the feature
mention it in their GUC description.  This commit adds that
information to autovacuum_vacuum_max_threshold's description.

Oversight in commit 306dc520b9.
2025-02-06 11:59:12 -06:00
Nathan Bossart
306dc520b9 Introduce autovacuum_vacuum_max_threshold.
One way autovacuum chooses tables to vacuum is by comparing the
number of updated or deleted tuples with a value calculated using
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
The threshold specifies the base value for comparison, and the
scale factor specifies the fraction of the table size to add to it.
This strategy ensures that smaller tables are vacuumed after fewer
updates/deletes than larger tables, which is reasonable in many
cases but can result in infrequent vacuums on very large tables.
This is undesirable for a couple of reasons, such as very large
tables incurring a huge amount of bloat between vacuums.

This new parameter provides a way to set a limit on the value
calculated with autovacuum_vacuum_threshold and
autovacuum_vacuum_scale_factor so that very large tables are
vacuumed more frequently.  By default, it is set to 100,000,000
tuples, but it can be disabled by setting it to -1.  It can also be
adjusted for individual tables by changing storage parameters.

Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Frédéric Yhuel <frederic.yhuel@dalibo.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Joe Conway <mail@joeconway.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Vinícius Abrahão <vinnix.bsd@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com
2025-02-05 15:48:18 -06:00
Michael Paquier
a051e71e28 Add data for WAL in pg_stat_io and backend statistics
This commit adds WAL IO stats to both pg_stat_io view and per-backend IO
statistics (pg_stat_get_backend_io()).  This change is possible since
f92c854cf4, as WAL IO is not counted in blocks in some code paths
where its stats data is measured (like WAL read in xlogreader.c).

IOContext gains IOCONTEXT_INIT and IOObject IOOBJECT_WAL, with the
following combinations allowed:
- IOOBJECT_WAL/IOCONTEXT_NORMAL is used to track I/O operations done on
already-created WAL segments.
- IOOBJECT_WAL/IOCONTEXT_INIT is used for tracking I/O operations done
when initializing WAL segments.

The core changes are done in pg_stat_io.c, backend statistics inherit
them.  Backend statistics and pg_stat_io are now available for the WAL
writer, the WAL receiver and the WAL summarizer processes.

I/O timing data is controlled by the GUC track_io_timing, like the
existing data of pg_stat_io for consistency.  The timings related to
IOOBJECT_WAL show up if the GUC is enabled (disabled by default).

Bump pgstats file version, due to the additions in IOObject and
IOContext, impacting the amount of data written for the fixed-numbered
IO stats kind in the pgstats file.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot, Nitin Jadhav, Amit Kapila, Michael
Paquier, Melanie Plageman, Bharath Rupireddy
Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2025-02-04 16:50:00 +09:00
Michael Paquier
b998fedab7 Improve comment on top of pgstat_count_io_op_time()
This commit adds more documentation to pgstat_count_io_op_time() in
pgstat_io.c, explaining its internals for pgstat_count_buffer_*(),
pgBufferUsage and the contexts where these are used.

Extracted from a larger patch by the same author.

Author: Nazir Bilal Yavuz
Discussion: https://postgr.es/m/CAN55FZ3AiQ+ZMxUuXnBpd0Rrh1YhwJ5FudkHg=JU0P+-W8T4Vg@mail.gmail.com
2025-02-03 11:19:58 +09:00
Michael Paquier
d61b9662b0 Mention jsonlog in description of logging_collector in GUC table
logging_collector was only mentioning stderr and csvlog, and forgot
about jsonlog.  Oversight in dc686681e0, that has added support for
jsonlog in log_destination.

While on it, the description in the GUC table is tweaked to be more
consistent with the documentation and postgresql.conf.sample.

Author: Umar Hayat
Reviewed-by: Ashutosh Bapat, Tom Lane
Discussion: https://postgr.es/m/CAD68Dp1K_vBYqBEukHw=1jF7e76t8aszGZTFL2ugi=H7r=a7MA@mail.gmail.com
Backpatch-through: 13
2025-02-02 11:31:21 +09:00
Peter Eisentraut
43493cceda Add get_opfamily_name() function
This refactors and simplifies various existing code to make use of the
new function.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-02-01 10:42:58 +01:00
Tom Lane
d4c3a6b8ad Remove obsolete restriction on the range of log_rotation_size.
When syslogger.c was first written, we didn't want to assume that
all platforms have 64-bit ftello.  But we've been assuming that
since v13 (cf commit 799d22461), so let's use that in syslogger.c
and allow log_rotation_size to range up to INT_MAX kilobytes.

The old code effectively limited log_rotation_size to 2GB regardless
of platform.  While nobody's complained, that doesn't seem too far
away from what might be thought reasonable these days.

I noticed this while searching for instances of "1024L" in connection
with commit 041e8b95b.  These were the last such instances.
(We still have instances of L-suffixed literals, but most of them
are associated with wait intervals for pg_usleep or similar functions.
I don't see any urgent reason to change that.)
2025-01-31 14:36:56 -05:00
Tom Lane
041e8b95b8 Get rid of our dependency on type "long" for memory size calculations.
Consistently use "Size" (or size_t, or in some places int64 or double)
as the type for variables holding memory allocation sizes.  In most
places variables' data types were fine already, but we had an ancient
habit of computing bytes from kilobytes-units GUCs with code like
"work_mem * 1024L".  That risks overflow on Win64 where they did not
make "long" as wide as "size_t".  We worked around that by restricting
such GUCs' ranges, so you couldn't set work_mem et al higher than 2GB
on Win64.  This patch removes that restriction, after replacing such
calculations with "work_mem * (Size) 1024" or variants of that.

It should be noted that this patch was constructed by searching
outwards from the GUCs that have MAX_KILOBYTES as upper limit.
So I can't positively guarantee there are no other places doing
memory-size arithmetic in int or long variables.  I do however feel
pretty confident that increasing MAX_KILOBYTES on Win64 is safe now.
Also, nothing in our code should be dealing in multiple-gigabyte
allocations without authorization from a relevant GUC, so it seems
pretty likely that this search caught everything that could be at
risk of overflow.

Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
2025-01-31 13:52:40 -05:00
Amit Kapila
f41d8468dd Raise an error while trying to acquire an invalid slot.
Once a replication slot is invalidated, it cannot be altered or used to
fetch changes. However, a process could still acquire an invalid slot and
fail later.

For example, if a process acquires a logical slot that was invalidated due
to wal_removed, it will eventually fail in CreateDecodingContext() when
attempting to access the removed WAL. Similarly, for physical replication
slots, even if the slot is invalidated and invalidation_reason is set to
wal_removed, the walsender does not currently check for invalidation when
starting physical replication. Instead, replication starts, and an error
is only reported later while trying to access WAL. Similarly, we prohibit
modifying slot properties for invalid slots but give the error for the
same after acquiring the slot.

This patch improves error handling by detecting invalid slots earlier at
the time of slot acquisition which is the first step. This also helped in
unifying different ERROR messages at different places and gave a
consistent message for invalid slots. This means that the message for
invalid slots will change to a generic message.

This will also be helpful for future patches where we are planning to
invalidate slots due to more reasons like idle_timeout because we don't
have to modify multiple places in such cases and avoid the chances of
missing out on a particular place.

Author: Nisha Moond <nisha.moond412@gmail.com>
Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CABdArM6pBL5hPnSQ+5nEVMANcF4FCH7LQmgskXyiLY75TMnKpw@mail.gmail.com
2025-01-31 10:27:35 +05:30
Michael Paquier
ce5c620fb6 Add pgstat_drop_matching_entries() to pgstats
This allows users of the cumulative statistics to drop entries in the
shared hash stats table, deleting as well local references.  Callers of
this function can optionally define a callback able to filter which
entries to drop, similarly to pgstat_reset_matching_entries() with its
callback do_reset().

pgstat_drop_all_entries() is refactored so as it uses this new function.

Author: Lukas Fitti
Discussion: https://postgr.es/m/CAP53PkwuFbo3NkwZgxwNRMjMfqPEqidD-SggaoQ4ijotBVLJAA@mail.gmail.com
2025-01-31 12:27:19 +09:00
Tom Lane
b9d232b9de Use "ssize_t" not "long" in max_stack_depth-related code.
This change adapts these functions to the machine's address width
without depending on "long" to be the right size.  (It isn't on
Win64, for example.)  While it seems unlikely anyone would care
to run with a stack depth limit exceeding 2GB, this is part of a
general push to avoid using type "long" to represent memory sizes.

It's convenient to use ssize_t rather than the perhaps-more-obvious
choice of size_t/Size, because the code involved depends on working
with a signed data type.  Our MAX_KILOBYTES limit already ensures
that ssize_t will be sufficient to represent the maximum value of
max_stack_depth.

Extracted from a larger patch by Vladlen, plus additional hackery
by me.

Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1a01f0-66ec2d80-3b-68487680@27595217
2025-01-30 16:44:47 -05:00
John Naylor
128897b101 Fix grammatical typos around possessive "its"
Some places spelled it "it's", which is short for "it is".
In passing, fix a couple other nearby grammatical errors.

Author: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaAO8g1KJCV0T48=CkJMjAnnfTGLWOATz+2aCh40c2Nm+g@mail.gmail.com
2025-01-29 14:39:14 +07:00
Amit Kapila
75eb9766ec Rename pubgencols_type to pubgencols in pg_publication.
The column added in commit e65dbc9927, pubgencols_type, was inconsistent
with the naming conventions of other columns in the pg_publication
catalog.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/CALDaNm1u-ufVOW-RUsXSooqzkpohxfZYy=z78fbcr_9Pq5hbCg@mail.gmail.com
2025-01-28 10:42:46 +05:30
Michael Paquier
30a6ed0ce4 Track per-relation cumulative time spent in [auto]vacuum and [auto]analyze
This commit adds four fields to the statistics of relations, aggregating
the amount of time spent for each operation on a relation:
- total_vacuum_time, for manual vacuum.
- total_autovacuum_time, for vacuum done by the autovacuum daemon.
- total_analyze_time, for manual analyze.
- total_autoanalyze_time, for analyze done by the autovacuum daemon.

This gives users the option to derive the average time spent for these
operations with the help of the related "count" fields.

Bump catalog version (for the catalog changes) and PGSTAT_FILE_FORMAT_ID
(for the additions in PgStat_StatTabEntry).

Author: Sami Imseih
Reviewed-by: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/CAA5RZ0uVOGBYmPEeGF2d1B_67tgNjKx_bKDuL+oUftuoz+=Y1g@mail.gmail.com
2025-01-28 09:57:32 +09:00
Noah Misch
d28cd3e7b2 At update of non-LP_NORMAL TID, fail instead of corrupting page header.
The right mix of DDL and VACUUM could corrupt a catalog page header such
that PageIsVerified() durably fails, requiring a restore from backup.
This affects only catalogs that both have a syscache and have DDL code
that uses syscache tuples to construct updates.  One of the test
permutations shows a variant not yet fixed.

This makes !TransactionIdIsValid(TM_FailureData.xmax) possible with
TM_Deleted.  I think core and PGXN are indifferent to that.

Per bug #17821 from Alexander Lakhin.  Back-patch to v13 (all supported
versions).  The test case is v17+, since it uses INJECTION_POINT.

Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
2025-01-25 11:28:14 -08:00
Noah Misch
81772a495e Merge copies of converting an XID to a FullTransactionId.
Assume twophase.c is the performance-sensitive caller, and preserve its
choice of unlikely() branch hint.  Add some retrospective rationale for
that choice.  Back-patch to v17, for the next commit to use it.

Reviewed (in earlier versions) by Michael Paquier.

Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
Discussion: https://postgr.es/m/20250116010051.f3.nmisch@google.com
2025-01-25 11:28:14 -08:00
Andres Freund
87a6690cc6 Change shutdown sequence to terminate checkpointer last
The main motivation for this change is to have a process that can serialize
stats after all other processes have terminated. Serializing stats already
happens in checkpointer, even though walsenders can be active longer.

The only reason the current shutdown sequence does not actively cause problems
is that walsender currently does not generate any stats. However, there is an
upcoming patch changing that.

Another need for this change originates in the AIO patchset, where IO
workers (which, in some edge cases, can emit stats of their own) need to run
while the shutdown checkpoint is being written.

This commit changes the shutdown sequence so checkpointer is signalled (via
SIGINT) to trigger writing the shutdown checkpoint without also causing
checkpointer to exit.  Once checkpointer wrote the shutdown checkpoint it
notifies postmaster via PMSIGNAL_XLOG_IS_SHUTDOWN and waits for the
termination signal (SIGUSR2, as before).  Checkpointer now is terminated after
all children, other than dead-end children and logger, have been terminated,
tracked using the new PM_WAIT_CHECKPOINTER PMState.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-25 11:37:13 -05:00
Jeff Davis
bfc5992069 Add SQL function CASEFOLD().
Useful for caseless matching. Similar to LOWER(), but avoids edge-case
problems with using LOWER() for caseless matching.

For collations that support it, CASEFOLD() handles characters with
more than two case variations or multi-character case variations. Some
characters may fold to uppercase. The results of case folding are also
more stable across Unicode versions than LOWER() or UPPER().

Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com
Reviewed-by: Ian Lawrence Barwick
2025-01-24 14:56:22 -08:00
Tom Lane
a5579a90af Make jsonb casts to scalar types translate JSON null to SQL NULL.
Formerly, these cases threw an error "cannot cast jsonb null to type
<whatever>".  That seems less than helpful though.  It's also
inconsistent with the behavior of the ->> operator, which translates
JSON null to SQL NULL, as do some other jsonb functions.

Discussion: https://postgr.es/m/3851203.1722552717@sss.pgh.pa.us
2025-01-24 13:20:44 -05:00
Peter Eisentraut
b15b8c5cf8 Add some const decorations (htup.h)
Discussion: https://www.postgresql.org/message-id/flat/5b558da8-99fb-0a99-83dd-f72f05388517@enterprisedb.com
2025-01-23 12:12:08 +01:00
Amit Kapila
e65dbc9927 Change publication's publish_generated_columns option type to enum.
The current boolean publish_generated_columns option only supports a
binary choice, which is insufficient for future enhancements where
generated columns can be of different types (e.g., stored or virtual). The
supported values for the publish_generated_columns option are 'none' and
'stored'.

Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/d718d219-dd47-4a33-bb97-56e8fc4da994@eisentraut.org
Discussion: https://postgr.es/m/B80D17B2-2C8E-4C7D-87F2-E5B4BE3C069E@gmail.com
2025-01-23 15:28:37 +05:30
Tom Lane
172e6b3adb Support RN (roman-numeral format) in to_number().
We've long had roman-numeral output support in to_char(),
but lacked the reverse conversion.  Here it is.

Author: Hunaid Sohail <hunaidpgml@gmail.com>
Reviewed-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tomas Vondra <tomas@vondra.me>
Discussion: https://postgr.es/m/CAMWA6ybh4M1VQqpmnu2tfSwO+3gAPeA8YKnMHVADeB=XDEvT_A@mail.gmail.com
2025-01-22 15:18:50 -05:00
Álvaro Herrera
db19a5061c Reword recent error messages: "should" -> "must"
Most were introduced in the 17 timeframe.  The ones in wparser_def.c are
very old.

I also changed "JSON path expression for column \"%s\" should return
single item without wrapper" to "JSON path expression for column \"%s\"
must return single item when no wrapper is requested" to avoid
ambiguity.

Backpatch to 17.

Crickets: https://postgr.es/m/202501131819.26ors7oouafu@alvherre.pgsql
2025-01-21 15:24:49 +01:00
Peter Eisentraut
1772d554b0 Fix NO ACTION temporal foreign keys when the referenced endpoints change
If a referenced UPDATE changes the temporal start/end times, shrinking
the span the row is valid, we get a false return from
ri_Check_Pk_Match(), but overlapping references may still be valid, if
their reference didn't overlap with the removed span.

We need to consider what span(s) are still provided in the referenced
table.  Instead of returning that from ri_Check_Pk_Match(), we can
just look it up in the main SQL query.

Reported-by: Sam Gabrielsson <sam@movsom.se>
Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2025-01-21 14:39:24 +01:00
Michael Paquier
4feba03d8b Rework handling of pending data for backend statistics
9aea73fc61 has added support for backend statistics, relying on
PgStat_EntryRef->pending for its data pending for flush.  This design
lacks in flexibility, because the pending list does some memory
allocation, making it unsuitable if incrementing counters in critical
sections.

Pending data of backend statistics is reworked so the implementation
does not depend on PgStat_EntryRef->pending anymore, relying on a static
area of memory to store the counters that are flushed when stats are
reported to the pgstats dshash.  An advantage of this approach is to
allow the pending data to be manipulated in critical sections; some
patches are under discussion and require that.

The pending data is tracked by PendingBackendStats, local to
pgstat_backend.c.  Two routines are introduced to allow IO statistics to
update the backend-side counters.  have_static_pending_cb and
flush_static_cb are used for the flush, instead of flush_pending_cb.

Author: Bertrand Drouvot, Michael Paquier
Discussion: https://postgr.es/m/66efowskppsns35v5u2m7k4sdnl7yoz5bo64tdjwq7r5lhplrz@y7dme5xwh2r5
2025-01-21 11:30:42 +09:00
Michael Paquier
28de66cee5 Rename some pgstats callbacks related to flush of entries
The two callbacks have_fixed_pending_cb and flush_fixed_cb have been
introduced in fc415edf8c to provide a way for fixed-numbered
statistics to control the flush of their data.  These are renamed to
respectively have_static_pending_cb and flush_static_cb.  The
restriction that these only apply to fixed-numbered stats is removed.

A follow-up patch will make use of them for backend statistics.  This
stats kind is variable-numbered, and patches are under discussion to
track WAL data for IO and backend stats which cannot use
PgStat_EntryRef->pending as pending data would be touched in critical
sections, where no memory allocation can happen.

Per discussion with Andres Freund.

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/66efowskppsns35v5u2m7k4sdnl7yoz5bo64tdjwq7r5lhplrz@y7dme5xwh2r5
2025-01-21 10:12:39 +09:00
Jeff Davis
d3d0983169 Support PG_UNICODE_FAST locale in the builtin collation provider.
The PG_UNICODE_FAST locale uses code point sort order (fast,
memcmp-based) combined with Unicode character semantics. The character
semantics are based on Unicode full case mapping.

Full case mapping can map a single codepoint to multiple codepoints,
such as "ß" uppercasing to "SS". Additionally, it handles
context-sensitive mappings like the "final sigma", and it uses
titlecase mappings such as "Dž" when titlecasing (rather than plain
uppercase mappings).

Importantly, the uppercasing of "ß" as "SS" is specifically mentioned
by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics
for case mapping and pattern matching, so if we changed it to use the
PG_UNICODE_FAST locale, it would offer better compliance with the
standard. For now, though, do not change the behavior of UCS_BASIC.

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:30 -08:00
Jeff Davis
286a365b9c Support Unicode full case mapping and conversion.
Generate tables from Unicode SpecialCasing.txt to support more
sophisticated case mapping behavior:

 * support case mappings to multiple codepoints, such as "ß"
   uppercasing to "SS"
 * support conditional case mappings, such as the "final sigma"
 * support titlecase variants, such as "dž" uppercasing to "DŽ" but
   titlecasing to "Dž"

Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com
Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org
Reviewed-by: Peter Eisentraut, Daniel Verite
2025-01-17 15:56:20 -08:00
John Naylor
e24d77080b Speed up hex_encode with bytewise lookup
Previously, hex_encode looked up each nibble of the input
separately. We now use a larger lookup table containing the two-byte
encoding of every possible input byte, resulting in a 1/3 reduction
in encoding time.

Reviewed by Tom Lane, Michael Paquier, Nathan Bossart, David Rowley

Discussion: https://postgr.es/m/CANWCAZZvXuJMgqMN4u068Yqa19CEjS31tQKZp_qFFFbgYfaXqQ%40mail.gmail.com
2025-01-17 16:29:25 +07:00
Peter Eisentraut
0869ea43e9 Remove flex version checks
Remove the flex version checks from configure and meson.  The cutoff
versions are all so ancient that this is no longer relevant, and what
the actual cutoff should be is a bit fuzzy.

This also removes the ancient behavior that configure would also
accept a "lex" program if it is actuall flex.  This aligns the check
with meson in this respect.

For future reference, as of this commit, these are relevant flex
versions:

- The hard required minimum is flex 2.5.34 as of commit b1ef48980d,
  but this has not actually been tested.

- Prior to this, the minimum enforced by configure/meson was flex
  2.5.35, which is the oldest present in the buildfarm right now.

- As of commit 6fdd5d9563, the oldest version that will compile
  without warnings due to flex-generated code is flex 2.5.36.

- The oldest version that probably still has some practical relevance
  is flex 2.5.37, which ships with CentOS/RHEL 7.

Discussion: https://www.postgresql.org/message-id/1a204ccd-7ae6-478c-a431-407b5c48ccc6@eisentraut.org
2025-01-17 09:30:42 +01:00
Michael Paquier
7b6468cc95 Rework macro pgstat_is_ioop_tracked_in_bytes()
As written, it was triggering a compilation warning for old versions of
clang, as reported by buildfarm members ayu, batfish and demoiselle.
Forcing a cast with "unsigned int" should fix the warning.

While on it, the macro is moved to pgstat.h, closer to the declaration
of IOOp, per suggestion from Tom Lane.

Reported-by: Tom Lane
Reviewed-by: Bertrand Drouvot, Tom Lane, Nazir Bilal Yavuz
Discussion: https://postgr.es/m/1272824.1736961543@sss.pgh.pa.us
2025-01-17 08:26:17 +09:00
Tom Lane
d7674c9fab Seek zone abbreviations in the IANA data before timezone_abbreviations.
If a time zone abbreviation used in datetime input is defined in
the currently active timezone, use that definition in preference
to looking in the timezone_abbreviations list.  That allows us to
correctly handle abbreviations that have different meanings in
different timezones.  Also, it eliminates an inconsistency between
datetime input and datetime output: the non-ISO datestyles for
timestamptz have always printed abbreviations taken from the IANA
data, not from timezone_abbreviations.  Before this fix, it was
possible to demonstrate cases where casting a timestamp to text
and back fails or changes the value significantly because of that
inconsistency.

While this change removes the ability to override the IANA data about
an abbreviation known in the current zone, it's not clear that there's
any real use-case for doing so.  But it is clear that this makes life
a lot easier for dealing with abbreviations that have conflicts across
different time zones.

Also update the pg_timezone_abbrevs view to report abbreviations
that are recognized via the IANA data, and *not* report any
timezone_abbreviations entries that are thereby overridden.
Under the hood, there are now two SRFs, one that pulls the IANA
data and one that pulls timezone_abbreviations entries.  They're
combined by logic in the view.  This approach was useful for
debugging (since the functions can be called on their own).
While I don't intend to document the functions explicitly,
they might be useful to call directly.

Also improve DecodeTimezoneAbbrev's caching logic so that it can
cache zone abbreviations found in the IANA data.  Without that,
this patch would have caused a noticeable degradation of the
runtime of timestamptz_in.

Per report from Aleksander Alekseev and additional investigation.

Discussion: https://postgr.es/m/CAJ7c6TOATjJqvhnYsui0=CO5XFMF4dvTGH+skzB--jNhqSQu5g@mail.gmail.com
2025-01-16 14:11:19 -05:00
Dean Rasheed
80feb727c8 Add OLD/NEW support to RETURNING in DML queries.
This allows the RETURNING list of INSERT/UPDATE/DELETE/MERGE queries
to explicitly return old and new values by using the special aliases
"old" and "new", which are automatically added to the query (if not
already defined) while parsing its RETURNING list, allowing things
like:

  RETURNING old.colname, new.colname, ...

  RETURNING old.*, new.*

Additionally, a new syntax is supported, allowing the names "old" and
"new" to be changed to user-supplied alias names, e.g.:

  RETURNING WITH (OLD AS o, NEW AS n) o.colname, n.colname, ...

This is useful when the names "old" and "new" are already defined,
such as inside trigger functions, allowing backwards compatibility to
be maintained -- the interpretation of any existing queries that
happen to already refer to relations called "old" or "new", or use
those as aliases for other relations, is not changed.

For an INSERT, old values will generally be NULL, and for a DELETE,
new values will generally be NULL, but that may change for an INSERT
with an ON CONFLICT ... DO UPDATE clause, or if a query rewrite rule
changes the command type. Therefore, we put no restrictions on the use
of old and new in any DML queries.

Dean Rasheed, reviewed by Jian He and Jeff Davis.

Discussion: https://postgr.es/m/CAEZATCWx0J0-v=Qjc6gXzR=KtsdvAE7Ow=D=mu50AgOe+pvisQ@mail.gmail.com
2025-01-16 14:57:35 +00:00
Peter Eisentraut
b6463ea6ef Downgrade error in object_aclmask_ext() to internal
The "does not exist" error in object_aclmask_ext() was written as
ereport(), suggesting that it is user-facing.  This is problematic:
get_object_class_descr() is meant to be for internal errors only and
does not support translation.

For the has_xxx_privilege functions, the error has not been
user-facing since commit 403ac226dd.  The remaining users are
pg_database_size() and pg_tablespace_size().  The call stack here is
pretty deep and this dependency is not obvious.  Here we can put in an
explicit existence check with a bespoke error message early in the
function.

Then we can downgrade the error in object_aclmask_ext() to a normal
"cache lookup failed" internal error.

Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/flat/da2f8942-be6d-48d0-ac1c-a053370a6b1f@eisentraut.org
2025-01-15 16:58:44 +01:00
Peter Eisentraut
6339f6468e Rename RowCompareType to CompareType
RowCompareType served as a way to describe the fundamental meaning of
an operator, notionally independent of an operator class (although so
far this was only really supported for btrees).  Its original purpose
was for use inside RowCompareExpr, and it has also found some small
use outside, such as for get_op_btree_interpretation().

We want to expand this now, as a more general way to describe operator
semantics for other index access methods, including gist (to improve
GistTranslateStratnum()) and others not written yet.  To avoid future
confusion, we rename the type to CompareType and the symbols from
ROWCOMPARE_XXX to COMPARE_XXX to reflect their more general purpose.

Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/E72EAA49-354D-4C2E-8EB9-255197F55330@enterprisedb.com
2025-01-15 08:44:01 +01:00
Melanie Plageman
2ae98ea5ab Synchronize guc_tables.c categories with vacuum docs categories
ca9c6a5680 consolidated most of the vacuum-related GUCs' documentation
into a new subsection. af2317652d then enforced this order in
postgresql.conf.sample. This commit reorganizes the GUC groups in
guc_tables.c/h to match the updated ordering in the docs.

Reported-by: Álvaro Herrera
Reviewed-by: Álvaro Herrera, Alena Rybakina
Discussion: https://postgr.es/m/202501132046.m4mcvxxswznu%40alvherre.pgsql
2025-01-14 15:31:00 -05:00
Dean Rasheed
4cb560b53f Consistently spell "leakproof" without a hyphen.
The overwhelming majority of places already did this, but a small
handful of places had a hyphen.

Yugo Nagata.

Discussion: https://postgr.es/m/CAEZATCXnnuORE2BoGwHw2zbtVvsPOLhbfVmEk9GxRzK%2Bx3OW-Q%40mail.gmail.com
2025-01-14 13:50:54 +00:00
Heikki Linnakangas
af8cd1639a Fix catcache invalidation of a list entry that's being built
If a new catalog tuple is inserted that belongs to a catcache list
entry, and cache invalidation happens while the list entry is being
built, the list entry might miss the newly inserted tuple.

To fix, change the way we detect concurrent invalidations while a
catcache entry is being built. Keep a stack of entries that are being
built, and apply cache invalidation to those entries in addition to
the real catcache entries. This is similar to the in-progress list in
relcache.c.

Back-patch to all supported versions.

Reviewed-by: Noah Misch
Discussion: https://www.postgresql.org/message-id/2234dc98-06fe-42ed-b5db-ac17384dc880@iki.fi
2025-01-14 14:28:49 +02:00
Michael Paquier
d2181b3218 Remove assertion in pgstat_count_io_op()
An equivalent check is done with pgstat_is_ioop_tracked_in_bytes(), so
there is no need for this extra one.  Small cleanup that should have
been included in f92c854cf4.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/CAN55FZ0oqxBaaHAEsj=xFqkzE3n5P=3RA1V_igXwL-RV7QRzyw@mail.gmail.com
2025-01-14 12:19:51 +09:00
Michael Paquier
f92c854cf4 Make pg_stat_io count IOs as bytes instead of blocks for some operations
Currently in pg_stat_io view, IOs are counted as blocks of size
BLCKSZ.  There are two limitations with this design:
* The actual number of I/O requests sent to the kernel is lower because
I/O requests may be merged before being sent.  Additionally, it gives
the impression that all I/Os are done in block size, which shadows the
benefits of merging I/O requests.
* Some patches are under work to extend pg_stat_io for the tracking of
operations that may not be linked to the block size.  For example, WAL
read IOs are done in variable bytes and it is not possible to correctly
show these IOs in pg_stat_io view, and we want to keep all this data in
a single system view rather than spread it across multiple relations to
ease monitoring.

WaitReadBuffers() can now be tracked as a single read operation
worth N blocks.  Same for ExtendBufferedRelShared() and
ExtendBufferedRelLocal() for extensions.

Three columns are added to pg_stat_io for reads, writes and extensions
for the byte calculations.  op_bytes, which was always hardcoded to
BLCKSZ, is removed.  IO backend statistics are updated to reflect these
changes.

Bump catalog version.

Author: Nazir Bilal Yavuz
Reviewed-by: Bertrand Drouvot, Melanie Plageman
Discussion: https://postgr.es/m/CAN55FZ0oqxBaaHAEsj=xFqkzE3n5P=3RA1V_igXwL-RV7QRzyw@mail.gmail.com
2025-01-14 12:14:29 +09:00
Melanie Plageman
af2317652d Reorder vacuum GUCs in postgresql.conf.sample to match docs
ca9c6a5680 consolidated most of vacuum-related GUCs' documentation into
a new subsection. It neglected, however, to reorganize
postgresql.conf.sample to match the new order. Do this now.

Reported-by: Álvaro Herrera
Discussion: https://postgr.es/m/202501110902.5banlseavz7c%40alvherre.pgsql
2025-01-13 15:21:04 -05:00
Daniel Gustafsson
97698cc517 Fix HBA option count
Commit 27a1f8d108 missed updating the max HBA option count to
account for the new option added.  Fix by bumping the counter
and adjust the relevant comment to match.  Backpatch down to
all supported branches like the erroneous commit.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/286764.1736697356@sss.pgh.pa.us
Backpatch-through: v13
2025-01-12 23:44:39 +01:00
Dean Rasheed
a93e2a1e25 Fix JsonExpr deparsing to quote variable names in the PASSING clause.
When deparsing a JsonExpr, variable names in the PASSING clause were
not quoted. However, since they are parsed as ColLabel tokens, some
variable names require double quotes to ensure that they are properly
interpreted. Fix by using quote_identifier() in the deparsing code.

This oversight was limited to the SQL/JSON query functions
JSON_EXISTS(), JSON_QUERY(), and JSON_VALUE().

Back-patch to v17, where these functions were added.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com
2025-01-12 13:35:12 +00:00
Dean Rasheed
d673eefd41 Fix XMLTABLE() deparsing to quote namespace names if necessary.
When deparsing an XMLTABLE() expression, XML namespace names were not
quoted. However, since they are parsed as ColLabel tokens, some names
require double quotes to ensure that they are properly interpreted.
Fix by using quote_identifier() in the deparsing code.

Back-patch to all supported versions.

Dean Rasheed, reviewed by Tom Lane.

Discussion: https://postgr.es/m/CAEZATCXTpAS%3DncfLNTZ7YS6O5puHeLg_SUYAit%2Bcs7wsrd9Msg%40mail.gmail.com
2025-01-12 12:54:32 +00:00