While this doesn't significantly change runtime now, it arranges for
STRATEGY=WAL_LOG to benefit automatically from future optimizations to
the read_stream subsystem. For large tables in the template database,
this does read 16x as many bytes per system call. Platforms with high
per-call overhead, if any, may see an immediate benefit.
Nazir Bilal Yavuz
Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
Reaching that code would have required multiple processes performing
relation extension during recovery, which does not happen. That caller
has the persistence available, so pass it. This was dead code as soon
as commit 210622c60e1a9db2e2730140b8106ab57d259d15 added it.
Discussion: https://postgr.es/m/CAN55FZ0JKL6vk1xQp6rfOXiNFV1u1H0tJDPPGHWoiO3ea2Wc=A@mail.gmail.com
If vacuum fails to prune a tuple killed before OldestXmin, it will
decide to freeze its xmax and later error out in pre-freeze checks.
Add a test reproducing this scenario to the recovery suite which creates
a table on a primary, updates the table to generate dead tuples for
vacuum, and then, during the vacuum, uses a replica to force
GlobalVisState->maybe_needed on the primary to move backwards and
precede the value of OldestXmin set at the beginning of vacuuming the
table.
This commit is separate from the fix in case there are test stability
issues.
Author: Melanie Plageman
Reviewed-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAAKRu_apNU2MPBK96V%2BbXjTq0RiZ-%3DA4ZTaysakpx9jxbq1dbQ%40mail.gmail.com
If vacuum fails to remove a tuple with xmax older than
VacuumCutoffs->OldestXmin and younger than GlobalVisState->maybe_needed,
it may attempt to freeze the tuple's xmax and then ERROR out in
pre-freeze checks with "cannot freeze committed xmax".
Fix this by having vacuum always remove tuples older than OldestXmin.
It is possible for GlobalVisState->maybe_needed to precede OldestXmin if
maybe_needed is forced to go backward while vacuum is running. This can
happen if a disconnected standby with a running transaction older than
VacuumCutoffs->OldestXmin reconnects to the primary after vacuum
initially calculates GlobalVisState and OldestXmin.
In back branches starting with 14, the first version using
GlobalVisState, failing to remove tuples older than OldestXmin during
pruning caused vacuum to infinitely loop in lazy_scan_prune(), as
investigated on this [1] thread. After 1ccc1e05ae removed the retry loop
in lazy_scan_prune() and stopped comparing tuples to OldestXmin, the
hang could no longer happen, but we could still attempt to freeze dead
tuples with xmax older than OldestXmin -- resulting in an ERROR.
Fix this by always removing dead tuples with xmax older than
VacuumCutoffs->OldestXmin. This is okay because the standby won't replay
the tuple removal until the tuple is removable. Thus, the worst that can
happen is a recovery conflict.
[1] https://postgr.es/m/20240415173913.4zyyrwaftujxthf2%40awork3.anarazel.de#1b216b7768b5bd577a3d3d51bd5aadee
Back-patch through 14
Author: Melanie Plageman
Reviewed-by: Peter Geoghegan, Robert Haas, Andres Freund, Heikki Linnakangas, and Noah Misch
Discussion: https://postgr.es/m/CAAKRu_bDD7oq9ZwB2OJqub5BovMG6UjEYsoK2LVttadjEqyRGg%40mail.gmail.com
Only the LLVM specific code uses it since resource owners were made
extensible in commit b8bff07daa85c837a2747b4d35cd5a27e73fb7b2. This is
new in v17, so backpatch there to keep the branches from diverging
just yet.
Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://www.postgresql.org/message-id/fd3a2a00-6605-4e30-a118-48418b478e6e@proxel.se
There was no coverage for the code path to unwrap an array before
applying ".*" to it, so add tests to provide more coverage for both
objects and arrays.
This shows, for example, that no results are returned for an array of
scalars, and what results are returned when the array contains an
object. A few more scenarios are covered with the strict/lax modes and
the operator "@?".
Author: David Wheeler
Reported-by: David G. Johnston, Stepan Neretin
Discussion: https://postgr.es/m/A95346F9-6147-46E0-809E-532A485D71D6@justatheory.com
Commit d844cd75a disallowed rewind in a non-scrollable cursor to resolve
anomalies arising from such a cursor operation. However, this failed to
take into account the assumption in postgres_fdw that when rescanning a
foreign relation, it can rewind the cursor created for scanning the
foreign relation without specifying the SCROLL option, regardless of its
scrollability, causing this error when it tried to do such a rewind in a
non-scrollable cursor. Fix by modifying postgres_fdw to instead
recreate the cursor, regardless of its scrollability, when rescanning
the foreign relation. (If we had a way to check its scrollability, we
could improve this by rewinding it if it is scrollable and recreating it
if not, but we do not have it, so this commit modifies it to recreate it
in any case.)
Per bug #17889 from Eric Cyr. Devrim Gunduz also reported this problem.
Back-patch to v15 where that commit enforced the prohibition.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/17889-e8c39a251d258dda%40postgresql.org
Discussion: https://postgr.es/m/b415ac3255f8352d1ea921cf3b7ba39e0587768a.camel%40gunduz.org
For utility statements defined within a function, the query tree is
copied to a PlannedStmt as utility commands do not require planning.
However, the query ID was missing from the information passed down.
This leads to plugins relying on the query ID like pg_stat_statements to
not be able to track utility statements within function calls. Tests
are added to check this behavior, depending on pg_stat_statements.track.
This is an old bug. Now, query IDs for utilities are compiled using
their parsed trees rather than the query string since v16
(3db72ebcbe20), leading to less bloat with utilities, so backpatch down
only to this version.
Author: Anthonin Bonnefoy
Discussion: https://postgr.es/m/CAO6_XqrGp-uwBqi3vBPLuRULKkddjC7R5QZCgsFren=8E+m2Sg@mail.gmail.com
Backpatch-through: 16
If pg_ctl tries to start the postmaster, but the postmaster shuts down
because it completed a point-in-time recovery, pg_ctl used to report
a message that indicated a failure. It's not really a failure, so
instead say "server shut down because of recovery target settings".
Zhao Junwang, Crisp Lee, Laurenz Albe
Discussion: https://postgr.es/m/CAGHPtV7GttPZ-HvxZuYRy70jLGQMEm5=LQc4fKGa=J74m2VZbg@mail.gmail.com
RAISE accepts either = or := in the USING clause, so fix the
syntax synopsis to show that.
Rearrange and wordsmith the descriptions of the different syntax
variants, in hopes of improving clarity.
Igor Gnatyuk, reviewed by Jian He and Laurenz Albe; minor additional
wordsmithing by me
Discussion: https://postgr.es/m/CAEu6iLvhF5sdGeat2x4_L0FvWW_SiN--ma8ya7CZd-oJoV+yqQ@mail.gmail.com
To do this, we must include the wal_level in the first WAL record
covered by each summary file; so add wal_level to struct Checkpoint
and the payload of XLOG_CHECKPOINT_REDO and XLOG_END_OF_RECOVERY.
This, in turn, requires bumping XLOG_PAGE_MAGIC and, since the
Checkpoint is also stored in the control file, also
PG_CONTROL_VERSION. It's not great to do that so late in the release
cycle, but the alternative seems to ship v17 without robust
protections against this scenario, which could result in corrupted
incremental backups.
A side effect of this patch is that, when a server with
wal_level=replica is started with summarize_wal=on for the first time,
summarization will no longer begin with the oldest WAL that still
exists in pg_wal, but rather from the first checkpoint after that.
This change should be harmless, because a WAL summary for a partial
checkpoint cycle can never make an incremental backup possible when
it would otherwise not have been.
Report by Fujii Masao. Patch by me. Review and/or testing by Jakub
Wartak and Fujii Masao.
Discussion: http://postgr.es/m/6e30082e-041b-4e31-9633-95a66de76f5d@oss.nttdata.com
This new macro is able to perform a direct lookup from the local cache
of injection points (refreshed each time a point is loaded or run),
without touching the shared memory state of injection points at all.
This works in combination with INJECTION_POINT_LOAD(), and it is better
than INJECTION_POINT() in a critical section due to the fact that it
would avoid all memory allocations should a concurrent detach happen
since a LOAD(), as it retrieves a callback from the backend-private
memory.
The documentation is updated to describe in more details how to use this
new macro with a load. Some tests are added to the module
injection_points based on a new SQL function that acts as a wrapper of
INJECTION_POINT_CACHED().
Based on a suggestion from Heikki Linnakangas.
Author: Heikki Linnakangas, Michael Paquier
Discussion: https://postgr.es/m/58d588d0-e63f-432f-9181-bed29313dece@iki.fi
Commit f4b54e1ed9, which introduced macros for protocol characters,
missed updating a few places. It also did not introduce macros for
messages sent from parallel workers to their leader processes.
This commit adds a new section in protocol.h for those.
Author: Aleksander Alekseev
Discussion: https://postgr.es/m/CAJ7c6TNTd09AZq8tGaHS3LDyH_CCnpv0oOz2wN1dGe8zekxrdQ%40mail.gmail.com
Backpatch-through: 17
This essentially reverts c2d93c3802b except tests. The problem with
c2d93c3802b was that it only changed the casting behavior for types
with typmod, and had coding issues noted in the post-commit review.
This commit changes coerceJsonFuncExpr() to use assignment-level casts
instead of explicit casts to coerce the result of JSON constructor
functions to the specified or the default RETURNING type. Using
assignment-level casts fixes the problem that using explicit casts was
leading to the wrong typmod / length coercion behavior -- truncating
results longer than the specified length instead of erroring out --
which c2d93c3802b aimed to solve.
That restricts the set of allowed target types to string types, the
same set that's currently allowed.
Discussion: https://postgr.es/m/202406291824.reofujy7xdj3@alvherre.pgsql
This switches the pgstats write code to use durable_rename() rather than
rename(). This ensures that the stats file's data is durable when the
statistics are written, which is something only happening at shutdown
now with the checkpointer doing the job.
This could cause the statistics to be lost even after PostgreSQL is shut
down, should a host failure happen, for example.
Suggested-by: Konstantin Knizhnik
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/ZpDQTZ0cAz0WEbh7@paquier.xyz
Previously, CREATE MATERIALIZED VIEW ... WITH DATA populated the MV
the same way as CREATE TABLE ... AS.
Instead, reuse the REFRESH logic, which locks down security-restricted
operations and restricts the search_path. This reduces the chance that
a subsequent refresh will fail.
Reported-by: Noah Misch
Backpatch-through: 17
Discussion: https://postgr.es/m/20240630222344.db.nmisch@google.com
This test was added by commit d2b74882ca, but fails if
log_error_verbosity is set to verbose. Adjust the regex that checks the
error message to allow for it containing an SQL status code.
Using <replaceable>text</replaceable> inside parantheses is not a
common or good style, so rephrase a sentence to avoid that style.
Also rephrase the text in that paragraph a bit while at it.
Reported-by: Marcos Pegoraro <marcos@f10.com.br>
Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CAB-JLwZqH3Yec6Kz-4-+pa0ZG9QJBsxjJZwYcMZYzEDR_fXnKw@mail.gmail.com
This commit provides testig coverage for ccd38024bc3c, checking that a
role granted pg_signal_autovacuum_worker is able to stop a vacuum
worker.
An injection point with a wait is placed at the beginning of autovacuum
worker startup to make sure that a worker is still alive when sending
and processing the signal sent.
Author: Anthony Leung, Michael Paquier, Kirill Reshke
Reviewed-by: Andrey Borodin, Nathan Bossart
Discussion: https://postgr.es/m/CALdSSPiQPuuQpOkF7x0g2QkA5eE-3xXt7hiJFvShV1bHKDvf8w@mail.gmail.com
gcc emits a warning for LLVM 14 code outside of our control. To avoid that,
update to a newer LLVM version. Do so both in the CompilerWarnings and normal
tasks - the latter don't fail, but the warnings make it more likely that we'd
miss other warnings.
We might want to backpatch this eventually. The higher priority right now is
to unbreak CI though - which is only broken on master, due to 0c3930d0768
interacting badly with c8a6ec206a9 (mea culpa, I should have noticed this
before pushing, but I missed it due to another, independent CI failure).
Discussion: https://postgr.es/m/20240715193754.awdxgrzurxnwwu2t@awork3.anarazel.de
Bullseye is getting long in the tooth, upgrade to the current stable version.
Backpatch to all versions with CI support, we don't want to generate CI images
for multiple Debian versions.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ0fY5EFHXLKCO_%3Dp4pwFmHRoVom_qSE_7B48gpchfAqzw%40mail.gmail.com
Backpatch: 15-, where CI was added
Before this change guc_var_compare() cast the input arguments to
const struct config_generic *. That's not quite right however, as the input
on one side is often just a char * on one side.
Instead just use char *, the first field in config_generic.
This fixes a -Warray-bounds warning with some versions of gcc. While the
warning is only known to be triggered for <= 15, the issue the warning points
out seems real, so apply the fix everywhere.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Reported-by: Erik Rijkers <er@xs4all.nl>
Suggested-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/a74a1a0d-0fd2-3649-5224-4f754e8f91aa%40xs4all.nl
Point out that savepoint commands cannot be issued in PL/pgSQL,
and suggest that exception blocks can usually be used instead.
Add a caveat to the discussion of cursor loops vs. transactions,
pointing out that any locks taken by the cursor query will be lost
at COMMIT. This is implicit in what's already said, but the existing
text leaves the distinct impression that the auto-hold behavior is
transparent, which it's not really.
Per a couple of recent complaints (one unsigned, and one in bug #18531
from Dzmitry Jachnik). Back-patch to v17, just so this makes it into
current docs in less than a year-and-a-half.
Discussion: https://postgr.es/m/172076354433.736586.14347210271966220018@wrigleys.postgresql.org
Discussion: https://postgr.es/m/18531-c6dddd33b8555fd2@postgresql.org
The problem fixed by commit 53c8d6c9 would have been noticed if we'd
been running LLVM's verify pass on generated IR. Doing so also reveals
a complaint about incorrect name mangling, fixed here. Only enabled for
LLVM 17+ because it uses the new pass manager API.
Suggested-by: Dmitry Dolgov <9erthalion6@gmail.com>
Discussion: https://postgr.es/m/CAFj8pRACpVFr7LMdVYENUkScG5FCYMZDDdSGNU-tch%2Bw98OxYg%40mail.gmail.com
The tests added by commit c086896625 were unstable due to
missing schema names when checking pg_tables and pg_indexes.
Backpatch to v17.
Reported by buildfarm.
As commit ca4103025d stated, new partitions without a specified tablespace
should inherit the parent relation's tablespace. However, previously,
ALTER TABLE MERGE PARTITIONS and ALTER TABLE SPLIT PARTITION commands
always created new partitions in the default tablespace, ignoring
the parent's tablespace. This commit ensures new partitions inherit
the parent's tablespace.
Backpatch to v17 where these commands were introduced.
Author: Fujii Masao
Reviewed-by: Masahiko Sawada
Discussion: https://postgr.es/m/abaf390b-3320-40a5-8815-ef476db5cfe7@oss.nttdata.com
If we intend to generate a Memoize node on top of a path, we need
cache keys of some sort. Currently we search for the cache keys in
the parameterized clauses of the path as well as the lateral_vars of
its parent. However, it turns out that this is not sufficient because
there might be lateral references derived from PlaceHolderVars, which
we fail to take into consideration.
This oversight can cause us to miss opportunities to utilize the
Memoize node. Moreover, in some plans, failing to recognize all the
cache keys could result in performance regressions. This is because
without identifying all the cache keys, we would need to purge the
entire cache every time we get a new outer tuple during execution.
This patch fixes this issue by extracting lateral Vars from within
PlaceHolderVars and subsequently including them in the cache keys.
In passing, this patch also includes a comment clarifying that Memoize
nodes are currently not added on top of join relation paths. This
explains why this patch only considers PlaceHolderVars that are due to
be evaluated at baserels.
Author: Richard Guo
Reviewed-by: Tom Lane, David Rowley, Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48jLxn0pAPZpJ50EThZ569Xrw+=4Ac3QvkpQvNszbeoNg@mail.gmail.com
checkWellFormedRecursion would issue "missing recursive reference"
if a WITH RECURSIVE query contained a single self-reference but
that self-reference was inside a top-level WITH, ORDER BY, LIMIT,
etc, rather than inside the second arm of the UNION as expected.
We already intended to throw more-on-point errors for such cases,
but those error checks must be done before examining the UNION arm
in order to have the desired results. So this patch need only
move some code (and improve the comments).
Per bug #18536 from Alexander Lakhin. Back-patch to all supported
branches.
Discussion: https://postgr.es/m/18536-0a342ec07901203e@postgresql.org
Such queries don't expand automatically updatable views, and ModifyTable
uses the wholerow attribute unconditionally. The user-visible behavior
is fine, so change to more-specific assertions. Commit
d5f788b41dc2cbdde6e7694c70dda54d829a5ed5 added the wrong assertion.
Back-patch to v17, where commit 5f2e179bd31e5f5803005101eb12a8d7bf8db8f3
introduced MERGE view_name.
Reported by Alexander Lakhin.
Discussion: https://postgr.es/m/e4b40a88-c134-6926-3196-bc4501cb87a2@gmail.com
ANALYZE sets relhassubclass=f when a partitioned table no longer has
partitions. An ANALYZE doing that proceeded to apply the inplace update
of pg_class.reltuples to the old pg_class tuple instead of the new
tuple, losing that reltuples=0 change if the ANALYZE committed.
Non-partitioning inheritance trees were unaffected. Back-patch to v14,
where commit 375aed36ad83f0e021e9bdd3a0034c0c992c66dc introduced
maintenance of partitioned table pg_class.reltuples.
Reported by Alexander Lakhin.
Discussion: https://postgr.es/m/a295b499-dcab-6a99-c06e-01cf60593344@gmail.com
The current code can have pg_isready unexpectedly succeed if there is a
server running on the default port. To avoid this we delay running the
test until after a node has been created but before it starts, and then
use that node's port, so we are fairly sure there is nothing running on
the port.
Backpatch to all live branches.
Winsock only signals an FD_CLOSE event once if the other end of the
socket shuts down gracefully. Because each WaitLatchOrSocket() call
constructs and destroys a new event handle every time, with unlucky
timing we can lose it and hang. We get away with this only if the other
end disconnects non-gracefully, because FD_CLOSE is repeatedly signaled
in that case.
To fix this design flaw in our Windows socket support fundamentally,
we'd probably need to rearchitect it so that a single event handle
exists for the lifetime of a socket, or switch to completely different
multiplexing or async I/O APIs. That's going to be a bigger job
and probably wouldn't be back-patchable.
This brute force kludge closes the race by explicitly polling with
MSG_PEEK before sleeping.
Back-patch to all supported releases. This should hopefully clear up
some random build farm and CI hang failures reported over the years. It
might also allow us to try using graceful shutdown in more places again
(reverted in commit 29992a6) to fix instability in the transmission of
FATAL error messages, but that isn't done by this commit.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/176008.1715492071%40sss.pgh.pa.us
When a partitioned table has an index that doesn't support a constraint,
but a partition has an equivalent index that does, then a DETACH
operation would misbehave: a crash in assertion-enabled systems (because
we fail to find the constraint in the parent that we expect to), or a
broken coninhcount value (-1) in production systems (because we blindly
believe that we've successfully detached the parent).
While we should reject an ATTACH of a partition with such an index, we
have failed to do so in existing releases, so adding an error in stable
releases might break the (unlikely) existing applications that rely on
this behavior. At this point I don't even want to reject them in
master, because it'd break pg_upgrade if such databases exist, and there
would be no easy way to fix existing databases without expensive index
rebuilds.
(Later on we could add ALTER TABLE ... ADD CONSTRAINT USING INDEX to
partitioned tables, which would allow the user to fix such patterns. At
that point we could add more restrictions to prevent the problem from
its root.)
Also, add a test case that leaves one table in this condition, so that
we can verify that pg_upgrade continues to work if we later decide to
change the policy on the master branch.
Backpatch to all supported branches.
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18500-62948b6fe5522f56@postgresql.org
This routine can currently only be called from the postmaster in
single-user mode or the checkpointer, but there was no sanity check to
make sure that this was always the case.
This has proved to be useful when hacking the zone (at least to me), to
make sure that the write of the pgstats file happens at shutdown, as
wanted by design, in the correct process context.
Discussion: https://postgr.es/m/ZnEiqAITL-VgZDoY@paquier.xyz
The slot synchronization failed because the local slot's (created during
slot synchronization) catalog_xmin on standby is ahead of remote slot.
This happens because the INSERT before slot synchronization results in the
generation of a new xid that could be replicated to the standby. Now
before the xmin of the physical slot on the primary catches up via
hot_standby_feedback, the test has created a logical slot that got some
prior value of catalog_xmin.
To fix this we could try to ensure that the physical slot's catalog_xmin
is caught up to latest value before creating a logical slot but we took a
simpler path to move the INSERT after synchronizing the logical slot.
Reported-by: Alexander Lakhin as per buildfarm
Diagnosed-by: Amit Kapila, Hou Zhijie, Alexander Lakhin
Author: Hou Zhijie
Backpatch-through: 17
Discussion: https://postgr.es/m/bde6ac67-69cc-c104-5ab6-dd4f5deadf24@gmail.com
When generating non-parallel nestloop paths for each available outer
path, we always consider materializing the cheapest inner path if
feasible. Similarly, in this patch, we also consider materializing
the cheapest inner path when building partial nestloop paths. This
approach potentially reduces the need to rescan the inner side of a
partial nestloop path for each outer tuple.
Author: Tender Wang
Reviewed-by: Richard Guo, Robert Haas, David Rowley, Alena Rybakina
Reviewed-by: Tomasz Rybak, Paul Jungwirth, Yuki Fujii
Discussion: https://postgr.es/m/CAHewXNkPmtEXNfVQMou_7NqQmFABca9f4etjBtdbbm0ZKDmWvw@mail.gmail.com
The comment at the top of pgstat_read_statsfile() mentioned that the
stats are read from the on-disk file into the pgstats dshash. This is
incorrect for fix-numbered stats as these are loaded directly into
shared memory. This commit simplifies the comment to be more general.
Author: Bertrand Drouvot
Discussion: https://postgr.es/m/Zo/eJIHUcqKxeSgv@ip-10-97-1-34.eu-west-3.compute.internal