Refactor per-page logic common to all redo routines to a new function.

Every redo routine uses the same idiom to determine what to do to a page: check if there's a backup block for it, and if not, read the buffer if the block exists, and check its LSN. Refactor that into a common function, XLogReadBufferForRedo, making all the redo routines shorter and more readable. This has no user-visible effect, and makes no changes to the WAL format. Reviewed by Andres Freund, Alvaro Herrera, Michael Paquier.
2025-11-15 03:41:20 +03:00 · 2014-08-13 15:39:08 +03:00
parent 26f8b99b24
commit f8f4227976
8 changed files with 1430 additions and 1739 deletions
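
The per-page decision described in the commit message can be sketched as a standalone toy model. This is not the PostgreSQL code: ToyLsn, ToyPage, toy_read_for_redo and toy_redo are hypothetical names invented for illustration, there are no buffers or locks, and stamping the restored page with the record's LSN is an assumption of the model. Only the four BLK_* result names are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins, NOT the PostgreSQL types. */
typedef uint64_t ToyLsn;

typedef enum
{
	BLK_NEEDS_REDO,		/* caller must apply the incremental change */
	BLK_DONE,			/* page LSN shows the change is already applied */
	BLK_RESTORED,		/* page was replaced by the full-page image */
	BLK_NOTFOUND		/* page does not exist */
} ToyRedoAction;

typedef struct
{
	bool		exists;	/* false models a block that was truncated away */
	ToyLsn		lsn;	/* LSN of the last change applied to the page */
} ToyPage;

/*
 * Decide what replay has to do for one page.  The order of the checks
 * follows the commit message: full-page image first, then existence,
 * then the LSN comparison.
 */
ToyRedoAction
toy_read_for_redo(ToyLsn record_lsn, bool has_full_page_image, ToyPage *page)
{
	assert(page != NULL);

	if (has_full_page_image)
	{
		/* Assumption of this model: restoring stamps the record's LSN. */
		page->exists = true;
		page->lsn = record_lsn;
		return BLK_RESTORED;
	}
	if (!page->exists)
		return BLK_NOTFOUND;
	if (record_lsn <= page->lsn)
		return BLK_DONE;
	return BLK_NEEDS_REDO;
}

/*
 * Shape of a redo routine in this model: only BLK_NEEDS_REDO needs
 * work.  Returns true if the incremental change was applied.
 */
bool
toy_redo(ToyLsn record_lsn, bool has_full_page_image, ToyPage *page)
{
	if (toy_read_for_redo(record_lsn, has_full_page_image, page) == BLK_NEEDS_REDO)
	{
		/* ... apply the change ... */
		page->lsn = record_lsn;
		return true;
	}
	return false;
}
```

Replaying the same record twice against the same ToyPage applies the change only the first time, because the second call sees a page LSN that is not older than the record. The buffer locking and release that the real function leaves to its caller are deliberately left out of the model.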
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -500,33 +500,28 @@ incrementally update the page, the rdata array *must* mention the buffer
 ID at least once; otherwise there is no defense against torn-page problems.
 The standard replay-routine pattern for this case is

-	if (record->xl_info & XLR_BKP_BLOCK(N))
+	if (XLogReadBufferForRedo(lsn, record, N, rnode, blkno, &buffer) == BLK_NEEDS_REDO)
 	{
-		/* apply the change from the full-page image */
-		(void) RestoreBackupBlock(lsn, record, N, false, false);
-		return;
-	}
+		page = (Page) BufferGetPage(buffer);

-	buffer = XLogReadBuffer(rnode, blkno, false);
-	if (!BufferIsValid(buffer))
-	{
-		/* page has been deleted, so we need do nothing */
-		return;
-	}
-	page = (Page) BufferGetPage(buffer);
+		... apply the change ...

-	if (XLByteLE(lsn, PageGetLSN(page)))
-	{
-		/* changes are already applied */
+		PageSetLSN(page, lsn);
+		MarkBufferDirty(buffer);
+	}
+	if (BufferIsValid(buffer))
 		UnlockReleaseBuffer(buffer);
-		return;
-	}

-	... apply the change ...
-
-	PageSetLSN(page, lsn);
-	MarkBufferDirty(buffer);
-	UnlockReleaseBuffer(buffer);
+XLogReadBufferForRedo reads the page from disk, and checks what action needs to
+be taken to the page.  If the XLR_BKP_BLOCK(N) flag is set, it restores the
+full page image and returns BLK_RESTORED.  If there is no full page image, but
+page cannot be found or if the change has already been replayed (i.e. the
+page's LSN >= the record we're replaying), it returns BLK_NOTFOUND or BLK_DONE,
+respectively.  Usually, the redo routine only needs to pay attention to the
+BLK_NEEDS_REDO return code, which means that the routine should apply the
+incremental change.  In any case, the caller is responsible for unlocking and
+releasing the buffer.  Note that XLogReadBufferForRedo returns the buffer
+locked even if no redo is required, unless the page does not exist.

 As noted above, for a multi-page update you need to be able to determine
 which XLR_BKP_BLOCK(N) flag applies to each page.  If a WAL record reflects
@@ -539,31 +534,8 @@ per the above discussion, fully-rewritable buffers shouldn't be mentioned in
 When replaying a WAL record that describes changes on multiple pages, you
 must be careful to lock the pages properly to prevent concurrent Hot Standby
 queries from seeing an inconsistent state.  If this requires that two
-or more buffer locks be held concurrently, the coding pattern shown above
-is too simplistic, since it assumes the routine can exit as soon as it's
-known the current page requires no modification.  Instead, you might have
-something like
-
-	if (record->xl_info & XLR_BKP_BLOCK(0))
-	{
-		/* apply the change from the full-page image */
-		buffer0 = RestoreBackupBlock(lsn, record, 0, false, true);
-	}
-	else
-	{
-		buffer0 = XLogReadBuffer(rnode, blkno, false);
-		if (BufferIsValid(buffer0))
-		{
-			... apply the change if not already done ...
-			MarkBufferDirty(buffer0);
-		}
-	}
-
-	... similarly apply the changes for remaining pages ...
-
-	/* and now we can release the lock on the first page */
-	if (BufferIsValid(buffer0))
-		UnlockReleaseBuffer(buffer0);
+or more buffer locks be held concurrently, you must lock the pages in
+appropriate order, and not release the locks until all the changes are done.

 Note that we must only use PageSetLSN/PageGetLSN() when we know the action
 is serialised. Only Startup process may modify data blocks during recovery,
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -242,6 +242,87 @@ XLogCheckInvalidPages(void)
 	invalid_page_tab = NULL;
 }

+
+/*
+ * XLogReadBufferForRedo
+ *		Read a page during XLOG replay
+ *
+ * Reads a block referenced by a WAL record into shared buffer cache, and
+ * determines what needs to be done to redo the changes to it.  If the WAL
+ * record includes a full-page image of the page, it is restored.
+ *
+ * 'lsn' is the LSN of the record being replayed.  It is compared with the
+ * page's LSN to determine if the record has already been replayed.
+ * 'rnode' and 'blkno' point to the block being replayed (main fork number
+ * is implied, use XLogReadBufferForRedoExtended for other forks).
+ * 'block_index' identifies the backup block in the record for the page.
+ *
+ * Returns one of the following:
+ *
+ *	BLK_NEEDS_REDO	- changes from the WAL record need to be applied
+ *	BLK_DONE		- block doesn't need replaying
+ *	BLK_RESTORED	- block was restored from a full-page image included in
+ *					  the record
+ *	BLK_NOTFOUND	- block was not found (because it was truncated away by
+ *					  an operation later in the WAL stream)
+ *
+ * On return, the buffer is locked in exclusive-mode, and returned in *buf.
+ * Note that the buffer is locked and returned even if it doesn't need
+ * replaying.  (Getting the buffer lock is not really necessary during
+ * single-process crash recovery, but some subroutines such as MarkBufferDirty
+ * will complain if we don't have the lock.  In hot standby mode it's
+ * definitely necessary.)
+ */
+XLogRedoAction
+XLogReadBufferForRedo(XLogRecPtr lsn, XLogRecord *record, int block_index,
+					  RelFileNode rnode, BlockNumber blkno,
+					  Buffer *buf)
+{
+	return XLogReadBufferForRedoExtended(lsn, record, block_index,
+										 rnode, MAIN_FORKNUM, blkno,
+										 RBM_NORMAL, false, buf);
+}
+
+/*
+ * XLogReadBufferForRedoExtended
+ *		Like XLogReadBufferForRedo, but with extra options.
+ *
+ * If mode is RBM_ZERO or RBM_ZERO_ON_ERROR, if the page doesn't exist, the
+ * relation is extended with all-zeroes pages up to the referenced block
+ * number.  In RBM_ZERO mode, the return value is always BLK_NEEDS_REDO.
+ *
+ * If 'get_cleanup_lock' is true, a "cleanup lock" is acquired on the buffer
+ * using LockBufferForCleanup(), instead of a regular exclusive lock.
+ */
+XLogRedoAction
+XLogReadBufferForRedoExtended(XLogRecPtr lsn, XLogRecord *record,
+							  int block_index, RelFileNode rnode,
+							  ForkNumber forkno, BlockNumber blkno,
+							  ReadBufferMode mode, bool get_cleanup_lock,
+							  Buffer *buf)
+{
+	if (record->xl_info & XLR_BKP_BLOCK(block_index))
+	{
+		*buf = RestoreBackupBlock(lsn, record, block_index,
+								  get_cleanup_lock, true);
+		return BLK_RESTORED;
+	}
+	else
+	{
+		*buf = XLogReadBufferExtended(rnode, forkno, blkno, mode);
+		if (BufferIsValid(*buf))
+		{
+			LockBuffer(*buf, BUFFER_LOCK_EXCLUSIVE);
+			if (lsn <= PageGetLSN(BufferGetPage(*buf)))
+				return BLK_DONE;
+			else
+				return BLK_NEEDS_REDO;
+		}
+		else
+			return BLK_NOTFOUND;
+	}
+}
+
 /*
 * XLogReadBuffer
 *		Read a page during XLOG replay.