1
0
mirror of https://github.com/postgres/postgres.git synced 2025-12-19 17:02:53 +03:00
Commit Graph

15 Commits

Author SHA1 Message Date
Alexander Korotkov
b27e48213f Refactor WaitLSNType enum to use a macro for type count
Change WAIT_LSN_TYPE_COUNT from an enum sentinel to a macro definition,
in a similar way to IOObject, IOContext, and BackendType enums.  Remove
explicit enum value assignments well.

Author: Xuneng Zhou <xunengzhou@gmail.com>
2025-12-14 17:18:32 +02:00
Alexander Korotkov
d6ef8ee3ee Fix incorrect assertion bound in WaitForLSN()
The assertion checking MyProcNumber used MaxBackends as the upper
bound, but the procInfos array is allocated with size
MaxBackends + NUM_AUXILIARY_PROCS. This inconsistency would cause
a false assertion failure if an auxiliary process calls WaitForLSN().

Author: Xuneng Zhou <xunengzhou@gmail.com>
2025-12-04 10:38:12 +02:00
Alexander Korotkov
75e82b2f5a Optimize shared memory usage for WaitLSNProcInfo
We need separate pairing heaps for different WaitLSNType's, because there
might be waiters for different LSN's at the same time.  However, one process
can wait only for one type of LSN at a time.  So, no need for inHeap
and heapNode fields to be arrays.

Discussion: https://postgr.es/m/CAPpHfdsBR-7sDtXFJ1qpJtKiohfGoj%3DvqzKVjWxtWsWidx7G_A%40mail.gmail.com
Author: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
2025-11-18 09:50:12 +02:00
Alexander Korotkov
ede6acef49 Fix WaitLSNWakeup() fast-path check for InvalidXLogRecPtr
WaitLSNWakeup() incorrectly returned early when called with
InvalidXLogRecPtr (meaning "wake all waiters"), because the fast-path
check compared minWaitedLSN > 0 without validating currentLSN first.
This caused WAIT FOR LSN commands to wait indefinitely during standby
promotion until random signals woke them.

Add an XLogRecPtrIsValid() check before the comparison so that
InvalidXLogRecPtr bypasses the fast-path and wakes all waiters immediately.

Discussion: https://postgr.es/m/CABPTF7UieOYbOgH3EnQCasaqcT1T4N6V2wammwrWCohQTnD_Lw%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
2025-11-15 12:27:42 +02:00
Alexander Korotkov
7742f99a02 Fix checking for recovery state in WaitForLSN()
We only need to do it for WAIT_LSN_TYPE_REPLAY.  WAIT_LSN_TYPE_FLUSH can work
for both primary and follower.
2025-11-07 23:34:50 +02:00
Álvaro Herrera
a2b02293bc Use XLogRecPtrIsValid() in various places
Now that commit 06edbed478 has introduced XLogRecPtrIsValid(), we can
use that instead of:

- XLogRecPtrIsInvalid()
- direct comparisons with InvalidXLogRecPtr
- direct comparisons with literal 0

This makes the code more consistent.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aQB7EvGqrbZXrMlg@ip-10-97-1-34.eu-west-3.compute.internal
2025-11-06 20:33:57 +01:00
Alexander Korotkov
3b4e53a075 Add infrastructure for efficient LSN waiting
Implement a new facility that allows processes to wait for WAL to reach
specific LSNs, both on primary (waiting for flush) and standby (waiting
for replay) servers.

The implementation uses shared memory with per-backend information
organized into pairing heaps, allowing O(1) access to the minimum
waited LSN. This enables fast-path checks: after replaying or flushing
WAL, the startup process or WAL writer can quickly determine if any
waiters need to be awakened.

Key components:
- New xlogwait.c/h module with WaitForLSNReplay() and WaitForLSNFlush()
- Separate pairing heaps for replay and flush waiters
- WaitLSN lightweight lock for coordinating shared state
- Wait events WAIT_FOR_WAL_REPLAY and WAIT_FOR_WAL_FLUSH for monitoring

This infrastructure can be used by features that need to wait for WAL
operations to complete.

Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com
Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru>
Author: Alexander Korotkov <aekorotkov@gmail.com>
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Euler Taveira <euler@eulerto.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
2025-11-05 11:44:13 +02:00
Alexander Korotkov
3a7ae6b3d9 Revert pg_wal_replay_wait() stored procedure
This commit reverts 3c5db1d6b0, and subsequent improvements and fixes
including 8036d73ae3, 867d396ccd, 3ac3ec580c, 0868d7ae70, 85b98b8d5a,
2520226c95, 014f9f34d2, e658038772, e1555645d7, 5035172e4a, 6cfebfe88b,
73da6b8d1b, and e546989a26.

The reason for reverting is a set of remaining issues.  Most notably, the
stored procedure appears to need more effort than the utility statement
to turn the backend into a "snapshot-less" state.  This makes an approach
to use stored procedures questionable.

Catversion is bumped.

Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz
2024-11-04 22:47:57 +02:00
Heikki Linnakangas
368d8270c8 Rename two functions that wake up other processes
Instead of talking about setting latches, which is a pretty low-level
mechanism, emphasize that they wake up other processes.

This is in preparation for replacing Latches with a new abstraction.
That's still work in progress, but this seems a little tidier anyway,
so let's get this refactoring out of the way already.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
2024-11-01 13:47:24 +02:00
Heikki Linnakangas
a9c546a5a3 Use ProcNumbers instead of direct Latch pointers to address other procs
This is in preparation for replacing Latches with a new abstraction.
That's still work in progress, but this seems a little tidier anyway,
so let's get this refactoring out of the way already.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
2024-11-01 13:47:20 +02:00
Peter Eisentraut
e18512c000 Remove unused #include's from backend .c files
as determined by IWYU

These are mostly issues that are new since commit dbbca2cf29.

Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-27 08:26:50 +01:00
Alexander Korotkov
e546989a26 Add 'no_error' argument to pg_wal_replay_wait()
This argument allow skipping throwing an error.  Instead, the result status
can be obtained using pg_wal_replay_wait_status() function.

Catversion is bumped.

Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz
Reviewed-by: Pavel Borisov
2024-10-24 15:02:21 +03:00
Alexander Korotkov
73da6b8d1b Refactor WaitForLSNReplay() to return the result of waiting
Currently, WaitForLSNReplay() immediately throws an error if waiting for LSN
replay is not successful.  This commit teaches  WaitForLSNReplay() to return
the result of waiting, while making pg_wal_replay_wait() responsible for
throwing an appropriate error.

This is preparation to adding 'no_error' argument to pg_wal_replay_wait() and
new function pg_wal_replay_wait_status(), which returns the last wait result
status.

Additionally, we stop distinguishing situations when we find our instance to
be not in a recovery state before entering the waiting loop and inside
the waiting loop.  Standby promotion may happen at any moment, even between
issuing a procedure call statement and pg_wal_replay_wait() doing a first
check of recovery status.  Thus, there is no pointing distinguishing these
situations.

Also, since we may exit the waiting loop and see our instance not in recovery
without throwing an error, we need to deleteLSNWaiter() in that case. We do
this unconditionally for the sake of simplicity, even if standby was already
promoted after reaching the target LSN, the startup process surely already
deleted us.

Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZtUF17gF0pNpwZDI%40paquier.xyz
Reviewed-by: Michael Paquier, Pavel Borisov
2024-10-24 14:38:27 +03:00
Alexander Korotkov
6cfebfe88b Make WaitForLSNReplay() issue FATAL on postmaster death
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/ZvY2C8N4ZqgCFaLu%40paquier.xyz
Reviewed-by: Pavel Borisov
2024-10-24 14:38:06 +03:00
Alexander Korotkov
5035172e4a Move LSN waiting declarations and definitions to better place
3c5db1d6b implemented the pg_wal_replay_wait() stored procedure.  Due to
the patch development history, the implementation resided in
src/backend/commands/waitlsn.c (src/include/commands/waitlsn.h for headers).

014f9f34d moved pg_wal_replay_wait() itself to
src/backend/access/transam/xlogfuncs.c near to the WAL-manipulation functions.
But most of the implementation stayed in place.

The code in src/backend/commands/waitlsn.c has nothing to do with commands,
but is related to WAL.  So, this commit moves this code into
src/backend/access/transam/xlogwait.c (src/include/access/xlogwait.h for
headers).

Reported-by: Peter Eisentraut
Discussion: https://postgr.es/m/18c0fa64-0475-415e-a1bd-665d922c5201%40eisentraut.org
Reviewed-by: Pavel Borisov
2024-10-24 14:37:53 +03:00