1
0
mirror of https://github.com/postgres/postgres.git synced 2025-12-21 05:21:08 +03:00
Commit Graph

2361 Commits

Author SHA1 Message Date
Tom Lane
126ec0bc76 Restore relmapper state early enough in parallel workers.
We need to do RestoreRelationMap before loading catalog-derived
state, else the worker may end up with catalog relcache entries
containing stale relfilenode data.  Move up RestoreReindexState
too, on the principle that that should also happen before we
do much of any catalog access.

I think ideally these things would happen even before InitPostgres,
but there are various problems standing in the way of that, notably
that the relmapper thinks "active" mappings should be discarded at
transaction end.  The implication of this is that InitPostgres and
RestoreLibraryState will see the same catalog state as an independent
backend would see, which is probably fine; at least, it's been like
that all along.

Per report from Justin Pryzby.  There is a case to be made that
this should be back-patched.  But given the lack of complaints
before 6e086fa2e and the short amount of time remaining before
17.0 wraps, I'll just put it in HEAD for now.

Discussion: https://postgr.es/m/ZuoU_8EbSTE14o1U@pryzbyj2023
2024-09-19 20:58:21 -04:00
Alexander Korotkov
014f9f34d2 Move pg_wal_replay_wait() to xlogfuncs.c
This commit moves pg_wal_replay_wait() procedure to be a neighbor of
WAL-related functions in xlogfuncs.c.  The implementation of LSN waiting
continues to reside in the same place.

By proposal from Michael Paquier.

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
2024-09-19 14:26:11 +03:00
Alexander Korotkov
85b98b8d5a Minor cleanup related to pg_wal_replay_wait() procedure
* Rename $node_standby1 to $node_standby in 043_wal_replay_wait.pl as there
   is only one standby.
 * Remove useless debug printing in 043_wal_replay_wait.pl.
 * Fix typo in one check description in 043_wal_replay_wait.pl.
 * Fix some wording in comments and documentation.

Reported-by: Alexander Lakhin
Discussion: https://postgr.es/m/1d7b08f2-64a2-77fb-c666-c9a74c68eeda%40gmail.com
Reviewed-by: Alexander Lakhin
2024-09-17 22:50:43 +03:00
Michael Paquier
fba49d5293 Remove emode argument from XLogFileRead() and XLogFileReadAnyTLI()
This change makes the code slightly easier to reason about, because
there is actually no need to know if a specific caller of one of these
routines should fail hard on a PANIC, or just let it go through with a
DEBUG2.

The only caller of XLogFileReadAnyTLI() used DEBUG2, and XLogFileRead()
has never used its emode.  This can be simplified since 1bb2558046
that has introduced XLogFileReadAnyTLI(), splitting both.

Author: Yugo Nagata
Discussion: https://postgr.es/m/20240906201043.a640f3b44e755d4db2b6943e@sraoss.co.jp
2024-09-10 08:44:31 +09:00
Michael Paquier
a68159ff2b Unify some error messages to ease work of translators
This commit updates a couple of error messages around control file data,
GUCs and server settings, unifying to the same message where possible.
This reduces the translation burden a bit.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-09-04 14:53:18 +09:00
Michael Paquier
b4db64270e Apply more quoting to GUC names in messages
This is a continuation of 17974ec259.  More quotes are applied to
GUC names in error messages and hints, taking care of what seems to be
all the remaining holes currently in the tree for the GUCs.

Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w@mail.gmail.com
2024-09-04 13:50:44 +09:00
Thomas Munro
813fde73d4 Standardize "read-ahead advice" terminology.
Commit 6654bb920 added macOS's equivalent of POSIX_FADV_WILLNEED, and
changed some explicit references to posix_fadvise to use this more
general name for the concept.  Update some remaining references.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/0827edec-1317-4917-a186-035eb1e3241d%40eisentraut.org
2024-09-04 10:28:53 +12:00
Peter Eisentraut
2b5f57977f Add const qualifiers to XLogRegister*() functions
Add const qualifiers to XLogRegisterData() and XLogRegisterBufData().
Several unconstify() calls can be removed.

Reviewed-by: Aleksander Alekseev <aleksander@timescale.com>
Discussion: https://www.postgresql.org/message-id/dd889784-9ce7-436a-b4f1-52e4a5e577bd@eisentraut.org
2024-09-03 08:06:03 +02:00
Michael Paquier
4236825197 Fix typos and grammar in code comments and docs
Author: Alexander Lakhin
Discussion: https://postgr.es/m/f7e514cf-2446-21f1-a5d2-8c089a6e2168@gmail.com
2024-09-03 14:49:04 +09:00
Michael Paquier
c7cd2d6ed0 Define PG_TBLSPC_DIR for path pg_tblspc/ in data folder
Similarly to 2065ddf5e3, this introduces a define for "pg_tblspc".
This makes the style more consistent with the existing PG_STAT_TMP_DIR,
for example.

There is a difference with the other cases with the introduction of
PG_TBLSPC_DIR_SLASH, required in two places for recovery and backups.

Author: Bertrand Drouvot
Reviewed-by: Ashutosh Bapat, Álvaro Herrera, Yugo Nagata, Michael
Paquier
Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
2024-09-03 09:11:54 +09:00
Peter Eisentraut
edee0c621d Message style improvements 2024-08-29 14:43:34 +02:00
Michael Paquier
94a3373ac5 Rework new SLRU test with injection points
Rather than the SQL injection_points_load(), this commit changes the
injection point test introduced in 768a9fd553 to rely on the two
macros INJECTION_POINT_LOAD() and INJECTION_POINT_CACHED(), that have
been originally introduced for the sake of this test.

This runs the test as a two-step process: load the injection point, then
run its callback directly from the local cache loaded.  What the test
did originally was also fine, but the point here is to have an example
in core of how to use these new macros.

While on it, fix the header ordering in multixact.c, as pointed out by
Alexander Korotkov.  This was an oversight in 768a9fd553.

Per discussion with Álvaro Herrera.

Author: Michael Paquier
Discussion: https://postgr.es/m/ZsUnJUlSOBNAzwW1@paquier.xyz
Discussion: https://postgr.es/m/CAPpHfduzaBz7KMhwuVOZMTpG=JniPG4aUosXPZCxZydmzq_oEQ@mail.gmail.com
2024-08-23 12:11:36 +09:00
Alvaro Herrera
768a9fd553 Add injection-point test for new multixact CV usage
Before commit a0e0fb1ba5, multixact.c contained a case in the
multixact-read path where it would loop sleeping 1ms each time until
another multixact-create path completed, which was uncovered by any
tests.  That commit changed the code to rely on a condition variable
instead.  Add a test now, which relies on injection points and "loading"
thereof (because of it being in a critical section), per commit
4b211003ec.

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/0925F9A9-4D53-4B27-A87E-3D83A757B0E0@yandex-team.ru
2024-08-20 14:21:34 -04:00
Michael Paquier
bd06cc338d Fix more holes with SLRU code in need of int64 for segment numbers
This is a continuation of c9e2457390, containing changes included into
the proposed patch that have been missed in the actual commit.  I have
managed to miss these diffs while doing a rebase of the original patch.

Thanks to Noah Misch, Peter Eisentraut and Alexander Korotkov for the
pokes.

Discussion: https://postgr.es/m/92fe572d-638e-4162-aef6-1c42a2936f25@eisentraut.org
Discussion: https://postgr.es/m/20240810175055.cd.nmisch@google.com
Backpatch-through: 17
2024-08-19 12:34:18 +09:00
Alvaro Herrera
7b063ff26a Search for SLRU page only in its own bank
One of the two slot scans in SlruSelectLRUPage was not walking only the
slots in the specific bank where the buffer could be; change it to do
that.

Oversight in 53c2a97a92.

Author: Sergey Sargsyan <sergey.sargsyan.2001@gmail.com>
Discussion: https://postgr.es/m/18582-5f301dd30ba91a38@postgresql.org
2024-08-18 20:49:57 -04:00
David Rowley
ffabb56c94 Fix a series of typos and outdated references
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c1d63754-cb85-2d8a-8409-bde2c4d2d04b@gmail.com
2024-08-12 23:27:09 +12:00
Heikki Linnakangas
8de5ca1dc9 Fix bad indentation introduced in commit f011e82c2c 2024-08-12 10:57:03 +03:00
Heikki Linnakangas
f011e82c2c Initialize HASHCTL differently, to suppress Coverity warning
Coverity complained that the hash_create() call might access
hash_table_ctl->hctl. That's a false alarm, hash_create() only
accesses that field when passed the HASH_SHARED_MEM flag. Try to
silence it by using a plain local variable instead of a const. That's
how the HASHCTL is initialized in all the other hash_create() calls.
2024-08-11 20:21:16 +03:00
Heikki Linnakangas
fe8dd65bf2 Mark misc static global variables as const
Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/54c29fb0-edf2-48ea-9814-44e918bbd6e8@iki.fi
2024-08-06 23:04:48 +03:00
Tom Lane
6e086fa2e7 Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
BEGIN;
CREATE USER foo;
SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

This un-reverts commit f5f30c22e.  The original plan was to back-patch
that, but the fact that 0ae5b763e proved to be a pre-requisite shows
that the subtle API change for GUC hooks might actually break some of
them.  The problem we're trying to fix seems not worth taking such a
risk for in stable branches.

Per bug #18545 from Andrey Rachitskiy.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-08-06 12:36:42 -04:00
Tom Lane
0ae5b763ea Clean up handling of client_encoding GUC in parallel workers.
The previous coding here threw an error from assign_client_encoding
if it was invoked in a parallel worker.  That's a very fundamental
violation of the GUC hook API: assign hooks must not throw errors.
The place to complain is in the check hook, so move the test to
there, and use the regular check-hook API (ie return false) to
report it.

The reason this coding is a problem is that it breaks GUC rollback,
which may occur after we leave InitializingParallelWorker state.
That case seems not actually reachable before now, but commit
f5f30c22e made it reachable, so we need to fix this before that
can be un-reverted.

In passing, improve the commentary in ParallelWorkerMain, and
add a check for failure of SetClientEncoding.  That's another
case that can't happen now but might become possible after
foreseeable code rearrangements (notably, if the shortcut of
skipping PrepareClientEncoding stops being OK).

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-08-06 12:21:53 -04:00
Alexander Korotkov
3c5db1d6b0 Implement pg_wal_replay_wait() stored procedure
pg_wal_replay_wait() is to be used on standby and specifies waiting for
the specific WAL location to be replayed.  This option is useful when
the user makes some data changes on primary and needs a guarantee to see
these changes are on standby.

The queue of waiters is stored in the shared memory as an LSN-ordered pairing
heap, where the waiter with the nearest LSN stays on the top.  During
the replay of WAL, waiters whose LSNs have already been replayed are deleted
from the shared memory pairing heap and woken up by setting their latches.

pg_wal_replay_wait() needs to wait without any snapshot held.  Otherwise,
the snapshot could prevent the replay of WAL records, implying a kind of
self-deadlock.  This is why it is only possible to implement
pg_wal_replay_wait() as a procedure working without an active snapshot,
not a function.

Catversion is bumped.

Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru
Author: Kartyshov Ivan, Alexander Korotkov
Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila
Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira
Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi
2024-08-02 21:16:56 +03:00
Michael Paquier
b860848232 Add redo LSN to pgstats files
This is used in the startup process to check that the pgstats file we
are reading includes the redo LSN referring to the shutdown checkpoint
where it has been written.  The redo LSN in the pgstats file needs to
match with what the control file has.

This is intended to be used for an upcoming change that will extend the
write of the stats file to happen during checkpoints, rather than only
shutdown sequences.

Bump PGSTAT_FILE_FORMAT_ID.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz
2024-08-02 01:57:28 +09:00
Tom Lane
e6a9637488 Revert "Allow parallel workers to cope with a newly-created session user ID."
This reverts commit f5f30c22ed.

Some buildfarm animals are failing with "cannot change
"client_encoding" during a parallel operation".  It looks like
assign_client_encoding is unhappy at being asked to roll back a
client_encoding setting after a parallel worker encounters a
failure.  There must be more to it though: why didn't I see this
during local testing?  In any case, it's clear that moving the
RestoreGUCState() call is not as side-effect-free as I thought.
Given that the bug f5f30c22e intended to fix has gone unreported
for years, it's not something that's urgent to fix; I'm not
willing to risk messing with it further with only days to our
next release wrap.
2024-07-31 20:57:00 -04:00
Tom Lane
f5f30c22ed Allow parallel workers to cope with a newly-created session user ID.
Parallel workers failed after a sequence like
	BEGIN;
	CREATE USER foo;
	SET SESSION AUTHORIZATION foo;
because check_session_authorization could not see the uncommitted
pg_authid row for "foo".  This is because we ran RestoreGUCState()
in a separate transaction using an ordinary just-created snapshot.
The same disease afflicts any other GUC that requires catalog lookups
and isn't forgiving about the lookups failing.

To fix, postpone RestoreGUCState() into the worker's main transaction
after we've set up a snapshot duplicating the leader's.  This affects
check_transaction_isolation and check_transaction_deferrable, which
think they should only run during transaction start.  Make them
act like check_transaction_read_only, which already knows it should
silently accept the value when InitializingParallelWorker.

Per bug #18545 from Andrey Rachitskiy.  Back-patch to all
supported branches, because this has been wrong for awhile.

Discussion: https://postgr.es/m/18545-feba138862f19aaa@postgresql.org
2024-07-31 18:54:10 -04:00
Heikki Linnakangas
da8a587e2e Replace static buf with a stack-allocated one in ReadControlFile
It's only used very locally within the function.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:05:59 +03:00
Heikki Linnakangas
6151cb7876 Replace static buf with palloc in str_time()
The function is used only once in the startup process, so the leak
into current memory context is harmless.

This is a tiny step in making the server thread-safe.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi
2024-07-30 22:05:51 +03:00
Michael Paquier
c9e2457390 Fix more holes with SLRU code in need of int64 for segment numbers
This is a continuation of 3937cadfd4, taking care of more areas I have
managed to miss previously.

Reported-by: Noah Misch
Reviewed-by: Noah Misch
Discussion: https://postgr.es/m/20240724130059.1f.nmisch@google.com
Backpatch-through: 17
2024-07-27 07:16:52 +09:00
Robert Haas
8a53539bd6 Wait for WAL summarization to catch up before creating .partial file.
When a standby is promoted, CleanupAfterArchiveRecovery() may decide
to rename the final WAL file from the old timeline by adding ".partial"
to the name. If WAL summarization is enabled and this file is renamed
before its partial contents are summarized, WAL summarization breaks:
the summarizer gets stuck at that point in the WAL stream and just
errors out.

To fix that, first make the startup process wait for WAL summarization
to catch up before renaming the file. Generally, this should be quick,
and if it's not, the user can shut off summarize_wal and try again.
To make this fix work, also teach the WAL summarizer that after a
promotion has occurred, no more WAL can appear on the previous
timeline: previously, the WAL summarizer wouldn't switch to the new
timeline until we actually started writing WAL there, but that meant
that when the startup process was waiting for the WAL summarizer, it
was waiting for an action that the summarizer wasn't yet prepared to
take.

In the process of fixing these bugs, I realized that the logic to wait
for WAL summarization to catch up was spread out in a way that made
it difficult to reuse properly, so this code refactors things to make
it easier.

Finally, add a test case that would have caught this bug and the
previously-fixed bug that WAL summarization sometimes needs to back up
when the timeline changes.

Discussion: https://postgr.es/m/CA+TgmoZGEsZodXC4f=XZNkAeyuDmWTSkpkjCEOcF19Am0mt_OA@mail.gmail.com
2024-07-26 15:00:48 -04:00
Amit Kapila
1462aad2e4 Allow altering of two_phase option of a SUBSCRIPTION.
The two_phase option is controlled by both the publisher (as a slot
option) and the subscriber (as a subscription option), so the slot option
must also be modified.

Changing the 'two_phase' option for a subscription from 'true' to 'false'
is permitted only when there are no pending prepared transactions
corresponding to that subscription. Otherwise, the changes of already
prepared transactions can be replicated again along with their corresponding
commit leading to duplicate data or errors.

To avoid data loss, the 'two_phase' option for a subscription can only be
changed from 'false' to 'true' once the initial data synchronization is
completed. Therefore this is performed later by the logical replication worker.

Author: Hayato Kuroda, Ajin Cherian, Amit Kapila
Reviewed-by: Peter Smith, Hou Zhijie, Amit Kapila, Vitaly Davydov, Vignesh C
Discussion: https://postgr.es/m/8fab8-65d74c80-1-2f28e880@39088166
2024-07-24 10:13:36 +05:30
Michael Paquier
3937cadfd4 Use more consistently int64 for page numbers in SLRU-related code
clog.c, async.c and predicate.c included some SLRU page numbers still
handled as 4-byte integers, while int64 should be used for this purpose.

These holes have been introduced in 4ed8f0913b, that has introduced
the use of 8-byte integers for SLRU page numbers, still forgot about the
code paths updated by this commit.

Reported-by: Noah Misch
Author: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/20240626002747.dc.nmisch@google.com
Backpatch-through: 17
2024-07-23 17:59:05 +09:00
Peter Eisentraut
935e675f3c Get rid of a global variable
bootstrap_data_checksum_version can just as easily be passed to where
it is used via function arguments.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
2024-07-23 10:00:41 +02:00
Michael Paquier
ffb0603929 Improve comments in slru.{c,h} about segment name format
slru.h described incorrectly how SLRU segment names are formatted
depending on the segment number and if long or short segment names are
used.  This commit closes the gap with a better description, fitting
with the reality.

Reported-by: Noah Misch
Author: Aleksander Alekseev
Discussion: https://postgr.es/m/20240626002747.dc.nmisch@google.com
Backpatch-through: 17
2024-07-23 16:54:51 +09:00
Robert Haas
6a6ebb92b0 Initialize wal_level in the initial checkpoint record.
As per Coverity and Tom Lane, commit 402b586d0 (back-patched to v17
as 2b5819e2b) forgot to initialize this new structure member in this
code path.
2024-07-22 15:32:43 -04:00
Robert Haas
402b586d0a Do not summarize WAL if generated with wal_level=minimal.
To do this, we must include the wal_level in the first WAL record
covered by each summary file; so add wal_level to struct Checkpoint
and the payload of XLOG_CHECKPOINT_REDO and XLOG_END_OF_RECOVERY.

This, in turn, requires bumping XLOG_PAGE_MAGIC and, since the
Checkpoint is also stored in the control file, also
PG_CONTROL_VERSION. It's not great to do that so late in the release
cycle, but the alternative seems to ship v17 without robust
protections against this scenario, which could result in corrupted
incremental backups.

A side effect of this patch is that, when a server with
wal_level=replica is started with summarize_wal=on for the first time,
summarization will no longer begin with the oldest WAL that still
exists in pg_wal, but rather from the first checkpoint after that.
This change should be harmless, because a WAL summary for a partial
checkpoint cycle can never make an incremental backup possible when
it would otherwise not have been.

Report by Fujii Masao. Patch by me. Review and/or testing by Jakub
Wartak and Fujii Masao.

Discussion: http://postgr.es/m/6e30082e-041b-4e31-9633-95a66de76f5d@oss.nttdata.com
2024-07-18 12:09:48 -04:00
Michael Paquier
b81a71aa05 Assign error codes where missing for user-facing failures
All the errors triggered in the code paths patched here would cause the
backend to issue an internal_error errcode, which is a state that should
be used only for "can't happen" situations.  However, these code paths
are reachable by the regression tests, and could be seen by users in
valid cases.  Some regression tests expect internal errcodes as they
manipulate the backend state to cause corruption (like checksums), or
use elog() because it is more convenient (like injection points), these
have no need to change.

This reduces the number of internal failures triggered in a check-world
by more than half, while providing correct errcodes for these valid
cases.

Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/Zic_GNgos5sMxKoa@paquier.xyz
2024-07-04 09:48:40 +09:00
Daniel Gustafsson
e930c872b6 Use safe string copy routine
Using memcpy with strlen as the size parameter will not take the
NULL terminator into account, relying instead on the destination
buffer being properly initialized. Replace with strlcpy which is
a safer alternative, and more in line with how we handle copying
strings elsewhere.

Author: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://postgr.es/m/CAEudQApAsbLsQ+gGiw-hT+JwGhgogFa_=5NUkgFO6kOPxyNidQ@mail.gmail.com
2024-07-02 11:16:56 +02:00
Tom Lane
1afe31f03c Preserve CurrentMemoryContext across Start/CommitTransactionCommand.
Up to now, committing a transaction has caused CurrentMemoryContext to
get set to TopMemoryContext.  Most callers did not pay any particular
heed to this, which is problematic because TopMemoryContext is a
long-lived context that never gets reset.  If the caller assumes it
can leak memory because it's running in a limited-lifespan context,
that behavior translates into a session-lifespan memory leak.

The first-reported instance of this involved ProcessIncomingNotify,
which is called from the main processing loop that normally runs in
MessageContext.  That outer-loop code assumes that whatever it
allocates will be cleaned up when we're done processing the current
client message --- but if we service a notify interrupt, then whatever
gets allocated before the next switch to MessageContext will be
permanently leaked in TopMemoryContext.  sinval catchup interrupts
have a similar problem, and I strongly suspect that some places in
logical replication do too.

To fix this in a generic way, let's redefine the behavior as
"CommitTransactionCommand restores the memory context that was current
at entry to StartTransactionCommand".  This clearly fixes the issue
for the notify and sinval cases, and it seems to match the mental
model that's in use in the logical replication code, to the extent
that anybody thought about it there at all.

For consistency, likewise make subtransaction exit restore the context
that was current at subtransaction start (rather than always selecting
the CurTransactionContext of the parent transaction level).  This case
has less risk of resulting in a permanent leak than the outer-level
behavior has, but it would not meet the principle of least surprise
for some CommitTransactionCommand calls to restore the previous
context while others don't.

While we're here, also change xact.c so that we reset
TopTransactionContext at transaction exit and then re-use it in later
transactions, rather than dropping and recreating it in each cycle.
This probably doesn't save a lot given the context recycling mechanism
in aset.c, but it should save a little bit.  Per suggestion from David
Rowley.  (Parenthetically, the text in src/backend/utils/mmgr/README
implies that this is how I'd planned to implement it as far back as
commit 1aebc3618 --- but the code actually added in that commit just
drops and recreates it each time.)

This commit also cleans up a few places outside xact.c that were
needlessly making TopMemoryContext current, and thus risking more
leaks of the same kind.  I don't think any of them represent
significant leak risks today, but let's deal with them while the
issue is top-of-mind.

Per bug #18512 from wizardbrony.  Commit to HEAD only, as this change
seems to have some risk of breaking things for some callers.  We'll
apply a narrower fix for the known-broken cases in the back branches.

Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
2024-07-01 11:55:19 -04:00
Robert Haas
065583cf46 Prevent summarizer hang when summarize_wal turned off and back on.
Before this commit, when the WAL summarizer started up or recovered
from an error, it would resume summarization from wherever it left
off. That was OK normally, but wrong if summarize_wal=off had been
turned off temporary, allowing some WAL to be removed, and then turned
back on again. In such cases, the WAL summarizer would simply hang
forever. This commit changes the reinitialization sequence for WAL
summarizer to rederive the starting position in the way we were
already doing at initial startup, fixing the problem.

Per report from Israel Barth Rubio. Reviewed by Tom Lane.

Discussion: http://postgr.es/m/CA+TgmoYN6x=YS+FoFOS6=nr6=qkXZFWhdiL7k0oatGwug2hcuA@mail.gmail.com
2024-06-28 08:29:05 -04:00
Heikki Linnakangas
cbfbda7841 Fix MVCC bug with prepared xact with subxacts on standby
We did not recover the subtransaction IDs of prepared transactions
when starting a hot standby from a shutdown checkpoint. As a result,
such subtransactions were considered as aborted, rather than
in-progress. That would lead to hint bits being set incorrectly, and
the subtransactions suddenly becoming visible to old snapshots when
the prepared transaction was committed.

To fix, update pg_subtrans with prepared transactions's subxids when
starting hot standby from a shutdown checkpoint. The snapshots taken
from that state need to be marked as "suboverflowed", so that we also
check the pg_subtrans.

Backport to all supported versions.

Discussion: https://www.postgresql.org/message-id/6b852e98-2d49-4ca1-9e95-db419a2696e0@iki.fi
2024-06-27 21:09:58 +03:00
Heikki Linnakangas
b1ffe3ff0b Fix bugs in MultiXact truncation
1. TruncateMultiXact() performs the SLRU truncations in a critical
section. Deleting the SLRU segments calls ForwardSyncRequest(), which
will try to compact the request queue if it's full
(CompactCheckpointerRequestQueue()). That in turn allocates memory,
which is not allowed in a critical section. Backtrace:

    TRAP: failed Assert("CritSectionCount == 0 || (context)->allowInCritSection"), File: "../src/backend/utils/mmgr/mcxt.c", Line: 1353, PID: 920981
    postgres: autovacuum worker template0(ExceptionalCondition+0x6e)[0x560a501e866e]
    postgres: autovacuum worker template0(+0x5dce3d)[0x560a50217e3d]
    postgres: autovacuum worker template0(ForwardSyncRequest+0x8e)[0x560a4ffec95e]
    postgres: autovacuum worker template0(RegisterSyncRequest+0x2b)[0x560a50091eeb]
    postgres: autovacuum worker template0(+0x187b0a)[0x560a4fdc2b0a]
    postgres: autovacuum worker template0(SlruDeleteSegment+0x101)[0x560a4fdc2ab1]
    postgres: autovacuum worker template0(TruncateMultiXact+0x2fb)[0x560a4fdbde1b]
    postgres: autovacuum worker template0(vac_update_datfrozenxid+0x4b3)[0x560a4febd2f3]
    postgres: autovacuum worker template0(+0x3adf66)[0x560a4ffe8f66]
    postgres: autovacuum worker template0(AutoVacWorkerMain+0x3ed)[0x560a4ffe7c2d]
    postgres: autovacuum worker template0(+0x3b1ead)[0x560a4ffecead]
    postgres: autovacuum worker template0(+0x3b620e)[0x560a4fff120e]
    postgres: autovacuum worker template0(+0x3b3fbb)[0x560a4ffeefbb]
    postgres: autovacuum worker template0(+0x2f724e)[0x560a4ff3224e]
    /lib/x86_64-linux-gnu/libc.so.6(+0x27c8a)[0x7f62cc642c8a]
    /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85)[0x7f62cc642d45]
    postgres: autovacuum worker template0(_start+0x21)[0x560a4fd16f31]

To fix, bail out in CompactCheckpointerRequestQueue() without doing
anything, if it's called in a critical section. That covers the above
call path, as well as any other similar cases where
RegisterSyncRequest might be called in a critical section.

2. After fixing that, another problem became apparent: Autovacuum
process doing that truncation can deadlock with the checkpointer
process. TruncateMultiXact() sets "MyProc->delayChkptFlags |=
DELAY_CHKPT_START". If the sync request queue is full and cannot be
compacted, the process will repeatedly sleep and retry, until there is
room in the queue. However, if the checkpointer is trying to start a
checkpoint at the same time, and is waiting for the DELAY_CHKPT_START
processes to finish, the queue will never shrink.

More concretely, the autovacuum process is stuck here:

    #0  0x00007fc934926dc3 in epoll_wait () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x000056220b24348b in WaitEventSetWaitBlock (set=0x56220c2e4b50, occurred_events=0x7ffe7856d040, nevents=1, cur_timeout=<optimized out>) at ../src/backend/storage/ipc/latch.c:1570
    #2  WaitEventSetWait (set=0x56220c2e4b50, timeout=timeout@entry=10, occurred_events=<optimized out>, occurred_events@entry=0x7ffe7856d040, nevents=nevents@entry=1,
        wait_event_info=wait_event_info@entry=150994949) at ../src/backend/storage/ipc/latch.c:1516
    #3  0x000056220b243224 in WaitLatch (latch=<optimized out>, latch@entry=0x0, wakeEvents=wakeEvents@entry=40, timeout=timeout@entry=10, wait_event_info=wait_event_info@entry=150994949)
        at ../src/backend/storage/ipc/latch.c:538
    #4  0x000056220b26cf46 in RegisterSyncRequest (ftag=ftag@entry=0x7ffe7856d0a0, type=type@entry=SYNC_FORGET_REQUEST, retryOnError=true) at ../src/backend/storage/sync/sync.c:614
    #5  0x000056220af9db0a in SlruInternalDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1495
    #6  0x000056220af9dab1 in SlruDeleteSegment (ctl=ctl@entry=0x56220b7beb60 <MultiXactMemberCtlData>, segno=segno@entry=11350) at ../src/backend/access/transam/slru.c:1566
    #7  0x000056220af98e1b in PerformMembersTruncation (oldestOffset=<optimized out>, newOldestOffset=<optimized out>) at ../src/backend/access/transam/multixact.c:3006
    #8  TruncateMultiXact (newOldestMulti=newOldestMulti@entry=3221225472, newOldestMultiDB=newOldestMultiDB@entry=4) at ../src/backend/access/transam/multixact.c:3201
    #9  0x000056220b098303 in vac_truncate_clog (frozenXID=749, minMulti=<optimized out>, lastSaneFrozenXid=749, lastSaneMinMulti=3221225472) at ../src/backend/commands/vacuum.c:1917
    #10 vac_update_datfrozenxid () at ../src/backend/commands/vacuum.c:1760
    #11 0x000056220b1c3f76 in do_autovacuum () at ../src/backend/postmaster/autovacuum.c:2550
    #12 0x000056220b1c2c3d in AutoVacWorkerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/autovacuum.c:1569

and the checkpointer is stuck here:

    #0  0x00007fc9348ebf93 in clock_nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #1  0x00007fc9348fe353 in nanosleep () from /lib/x86_64-linux-gnu/libc.so.6
    #2  0x000056220b40ecb4 in pg_usleep (microsec=microsec@entry=10000) at ../src/port/pgsleep.c:50
    #3  0x000056220afb43c3 in CreateCheckPoint (flags=flags@entry=108) at ../src/backend/access/transam/xlog.c:7098
    #4  0x000056220b1c6e86 in CheckpointerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at ../src/backend/postmaster/checkpointer.c:464

To fix, add AbsorbSyncRequests() to the loops where the checkpointer
waits for DELAY_CHKPT_START or DELAY_CHKPT_COMPLETE operations to
finish.

Backpatch to v14. Before that, SLRU deletion didn't call
RegisterSyncRequest, which avoided this failure. I'm not sure if there
are other similar scenarios on older versions, but we haven't had
any such reports.

Discussion: https://www.postgresql.org/message-id/ccc66933-31c1-4f6a-bf4b-45fef0d4f22e@iki.fi
2024-06-26 23:02:06 +03:00
Heikki Linnakangas
0099b9408e Convert confusing macros in multixact.c to static inline functions
The macros were confused about the argument data types. All the
arguments were called 'xid', and some of the macros included casts to
TransactionId, even though the arguments were actually either
MultiXactIds or MultiXactOffsets. It compiles to the same thing,
because TransactionId, MultiXactId and MultiXactOffset are all
typedefs of uint32, but it was highly misleading.

Author: Maxim Orlov <orlovmg@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACG%3DezbLUG-OD1osAW3OchOMxZtdxHh2itYR9Zhh-a13wEBEQw%40mail.gmail.com
Discussion: https://www.postgresql.org/message-id/ff143b24-a093-40da-9833-d36b83726bdf%40iki.fi
2024-06-16 20:47:07 +03:00
Heikki Linnakangas
c425113eeb Clamp result of MultiXactMemberFreezeThreshold
The purpose of the function is to reduce the effective
autovacuum_multixact_freeze_max_age if the multixact members SLRU is
approaching wraparound, to make multixid freezing more aggressive.
The returned value should therefore never be greater than plain
autovacuum_multixact_freeze_max_age.

Reviewed-by: Robert Haas
Discussion: https://www.postgresql.org/message-id/85fb354c-f89f-4d47-b3a2-3cbd461c90a3@iki.fi
Backpatch-through: 12, all supported versions
2024-06-13 19:01:30 +03:00
Heikki Linnakangas
e6cd857726 Make RelationFlushRelation() work without ResourceOwner during abort
ReorderBufferImmediateInvalidation() executes invalidation messages in
an aborted transaction. However, RelationFlushRelation sometimes
required a valid resource owner, to temporarily increment the refcount
of the relache entry. Commit b8bff07daa worked around that in the main
subtransaction abort function, AbortSubTransaction(), but missed this
similar case in ReorderBufferImmediateInvalidation().

To fix, introduce a separate function to invalidate a relcache
entry. It does the same thing as RelationClearRelation(rebuild==true)
does when outside a transaction, but can be called without
incrementing the refcount.

Add regression test. Before this fix, it failed with:

ERROR: ResourceOwnerEnlarge called after release started

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/e56be7d9-14b1-664d-0bfc-00ce9772721c@gmail.com
2024-06-06 18:56:28 +03:00
Peter Eisentraut
17974ec259 Revise GUC names quoting in messages again
After further review, we want to move in the direction of always
quoting GUC names in error messages, rather than the previous (PG16)
wildly mixed practice or the intermittent (mid-PG17) idea of doing
this depending on how possibly confusing the GUC name is.

This commit applies appropriate quotes to (almost?) all mentions of
GUC names in error messages.  It partially supersedes a243569bf6 and
8d9978a717, which had moved things a bit in the opposite direction
but which then were abandoned in a partial state.

Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
2024-05-17 11:44:26 +02:00
David Rowley
a42fc1c903 Fix an assortment of typos
Author: Alexander Lakhin
Discussion: https://postgr.es/m/ae9f2fcb-4b24-5bb0-4240-efbbbd944ca1@gmail.com
2024-05-04 02:33:25 +12:00
Alexander Korotkov
40126ac68f Refactoring for CommitTransactionCommand()/AbortCurrentTransaction()
fefd9a3fed turned tail recursion of CommitTransactionCommand() and
AbortCurrentTransaction() into iteration.  However, it splits the handling of
cases between different functions.

This commit puts the handling of all the cases into
AbortCurrentTransactionInternal() and CommitTransactionCommandInternal().
Now CommitTransactionCommand() and AbortCurrentTransaction() are just doing
the repeated calls of internal functions.

Reported-by: Andres Freund
Discussion: https://postgr.es/m/20240415224834.w6piwtefskoh32mv%40awork3.anarazel.de
Author: Andres Freund
2024-04-18 00:29:53 +03:00
Alexander Korotkov
772faafca1 Revert: Implement pg_wal_replay_wait() stored procedure
This commit reverts 06c418e163, e37662f221, bf1e650806, 25f42429e2,
ee79928441, and 74eaf66f98 per review by Heikki Linnakangas.

Discussion: https://postgr.es/m/b155606b-e744-4218-bda5-29379779da1a%40iki.fi
2024-04-11 17:28:15 +03:00
Alvaro Herrera
a0e0fb1ba5 Use conditional variable to wait for next MultiXact offset
In one multixact.c edge case, we need a mechanism to wait for one
multixact offset to be written before being allowed to read the next
one.  We used to handle this case by sleeping for one millisecond and
retrying, but such sleeps have been reported as problematic in
production cases.  We can avoid the problem by using a condition
variable: readers sleep on it and then every creator of multixacts
broadcasts into the CV when creation is sufficiently far along.

Author: Kyotaro Horiguchi <horikyotajntt@gmail.com>
Reviewed-by: Andrey Borodin <amborodin@acm.org>
Discussion: https://postgr.es/m/47A598F4-B4E7-4029-8FEC-A06A6C3CB4B5@yandex-team.ru
Discussion: https://postgr.es/m/20200515.090333.24867479329066911.horikyota.ntt
2024-04-07 20:33:45 +02:00
Alvaro Herrera
f3ff7bf83b Add XLogCtl->logInsertResult
This tracks the position of WAL that's been fully copied into WAL
buffers by all processes emitting WAL.  (For some reason we call that
"WAL insertion").  This is updated using atomic monotonic advance during
WaitXLogInsertionsToFinish, which is not when the insertions actually
occur, but it's the only place where we know where have all the
insertions have completed.

This value is useful in WALReadFromBuffers, which can verify that
callers don't try to read past what has been inserted.  (However, more
infrastructure is needed in order to actually use WAL after the flush
point, since it could be lost.)

The value is also useful in WaitXLogInsertionsToFinish() itself, since
we can now exit quickly when all WAL has been already inserted, without
even having to take any locks.
2024-04-07 14:06:30 +02:00