From 9cbb1d21d67ec3cb2d5342073d220a0c1e0ad82c Mon Sep 17 00:00:00 2001
From: Amit Langote
Date: Fri, 16 Jan 2026 13:01:52 +0900
Subject: [PATCH] Fix segfault from releasing locks in detached DSM segments
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

If a FATAL error occurs while holding a lock in a DSM segment (such as
a dshash lock) and the process is not in a transaction, a segmentation
fault can occur during process exit.

The problem sequence is:

1. Process acquires a lock in a DSM segment (e.g., via dshash)
2. FATAL error occurs outside transaction context
3. proc_exit() begins, calling before_shmem_exit callbacks
4. dsm_backend_shutdown() detaches all DSM segments
5. Later, on_shmem_exit callbacks run
6. ProcKill() calls LWLockReleaseAll()
7. Segfault: the lock being released is in unmapped memory

This only manifests outside transaction contexts because
AbortTransaction() calls LWLockReleaseAll() during transaction abort,
releasing locks before DSM cleanup.  Background workers and other
non-transactional code paths are vulnerable.

Fix by calling LWLockReleaseAll() unconditionally at the start of
shmem_exit(), before any callbacks run.  Releasing locks before the
callbacks prevents the segfault: locks must be released before
dsm_backend_shutdown() detaches their memory.  This is safe because
after an error, held locks are protecting potentially inconsistent
data anyway, and callbacks can acquire fresh locks if needed.

Also add a comment noting that LWLockReleaseAll() must be safe to call
before LWLock initialization (which it is, since num_held_lwlocks will
be 0), plus an Assert for the post-condition.

This fix aligns with the original design intent from commit 001a573a2,
which noted that backends must clean up shared memory state (including
releasing lwlocks) before unmapping dynamic shared memory segments.

Reported-by: Rahila Syed
Author: Rahila Syed
Reviewed-by: Amit Langote
Reviewed-by: Dilip Kumar
Reviewed-by: Andres Freund
Reviewed-by: Álvaro Herrera
Discussion: https://postgr.es/m/CAH2L28uSvyiosL+kaic9249jRVoQiQF6JOnaCitKFq=xiFzX3g@mail.gmail.com
Backpatch-through: 14
---
 src/backend/storage/ipc/ipc.c     | 11 +++++++++--
 src/backend/storage/lmgr/lwlock.c |  6 ++++++
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 6df7413fe9b..cb944edd8df 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -29,6 +29,7 @@
 #endif
 #include "storage/dsm.h"
 #include "storage/ipc.h"
+#include "storage/lwlock.h"
 #include "tcop/tcopprot.h"
 
 
@@ -229,13 +230,19 @@ shmem_exit(int code)
 {
 	shmem_exit_inprogress = true;
 
+	/*
+	 * Release any LWLocks we might be holding before callbacks run.  This
+	 * prevents accessing locks in detached DSM segments and allows callbacks
+	 * to acquire new locks.
+	 */
+	LWLockReleaseAll();
+
 	/*
 	 * Call before_shmem_exit callbacks.
 	 *
 	 * These should be things that need most of the system to still be up and
 	 * working, such as cleanup of temp relations, which requires catalog
-	 * access; or things that need to be completed because later cleanup steps
-	 * depend on them, such as releasing lwlocks.
+	 * access.
 	 */
 	elog(DEBUG3, "shmem_exit(%d): %d before_shmem_exit callbacks to make",
 		 code, before_shmem_exit_index);
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index a133c97b992..517c55375b4 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1883,6 +1883,10 @@ LWLockReleaseClearVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
  * unchanged by this operation. This is necessary since InterruptHoldoffCount
  * has been set to an appropriate level earlier in error recovery. We could
  * decrement it below zero if we allow it to drop for each released lock!
+ *
+ * Note that this function must be safe to call even before the LWLock
+ * subsystem has been initialized (e.g., during early startup failures).
+ * In that case, num_held_lwlocks will be 0 and we do nothing.
  */
 void
 LWLockReleaseAll(void)
@@ -1893,6 +1897,8 @@ LWLockReleaseAll(void)
 
 		LWLockRelease(held_lwlocks[num_held_lwlocks - 1].lock);
 	}
+
+	Assert(num_held_lwlocks == 0);
 }
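
For context only (not part of the patch): a minimal sketch of the kind of non-transactional code path the commit message describes, assuming a background worker that looks up entries in a dshash table stored in a DSM segment. The names WorkerEntry, worker_process, ht, and the error message are hypothetical; dshash_find() and dshash_release_lock() are the real dshash API, and dshash_find() returns with the entry's partition LWLock held.

/*
 * Hypothetical background-worker fragment (illustration only).  "ht" is
 * assumed to be a dshash table created in a DSM segment elsewhere; the
 * entry layout and error message are made up for this sketch.
 */
#include "postgres.h"

#include "lib/dshash.h"
#include "utils/elog.h"

typedef struct WorkerEntry
{
	uint32		key;
	int			value;
} WorkerEntry;

static void
worker_process(dshash_table *ht, uint32 key)
{
	/* dshash_find() returns with the entry's partition LWLock held */
	WorkerEntry *entry = dshash_find(ht, &key, false);

	if (entry == NULL)
		return;					/* not found: no lock held, nothing to do */

	if (entry->value < 0)
		ereport(FATAL,			/* raised while the partition lock is held;
								 * outside a transaction there is no
								 * AbortTransaction(), so before this fix the
								 * lock in the DSM segment was still held when
								 * proc_exit() detached the segment */
				(errmsg("invalid worker entry")));

	/* normal path: use the entry, then release its partition lock */
	dshash_release_lock(ht, entry);
}

With the patch applied, shmem_exit() releases the partition lock before dsm_backend_shutdown() runs, so the FATAL exit above no longer reaches ProcKill() with a lock pointing into unmapped memory.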