Clean up WAL/buffer interactions as per my recent proposal. Get rid of the misleadingly-named WriteBuffer routine, and instead require routines that change buffer pages to call MarkBufferDirty (which does exactly what it says). We also require that they do so before calling XLogInsert; this takes care of the synchronization requirement documented in SyncOneBuffer. Note that because bufmgr takes the buffer content lock (in shared mode) while writing out any buffer, it doesn't matter whether MarkBufferDirty is executed before the buffer content change is complete, so long as the content change is completed before releasing exclusive lock on the buffer. So it's OK to set the dirtybit before we fill in the LSN.

This eliminates the former kluge of needing to set the dirtybit in LockBuffer. Aside from making the code more transparent, we can also add some new debugging assertions, in particular that the caller of MarkBufferDirty must hold the buffer content lock, not merely a pin.
2025-11-12 05:01:15 +03:00 · 2006-03-31 23:32:07 +00:00
parent 89395bfa6f
commit a8b8f4db23
24 changed files with 434 additions and 537 deletions
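
The ordering rule from the commit message (change the page, mark it dirty, insert the WAL record, set the LSN, then unlock and unpin) can be illustrated with a small self-contained mock. This is only a sketch: `MockBuffer` and every `mock_*` function are hypothetical stand-ins invented for illustration, not PostgreSQL's real types or routines, and the assertions only approximate the checks the commit describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a shared buffer; NOT PostgreSQL's real struct. */
typedef struct MockBuffer
{
	int			pin_count;		/* number of pins held */
	bool		content_locked; /* exclusive content lock held? */
	bool		dirty;			/* dirty bit */
	uint64_t	page_lsn;		/* LSN stamped on the page */
	int			payload;		/* pretend page contents */
} MockBuffer;

/* Hypothetical WAL position counter; the first record gets position 1. */
uint64_t	mock_next_lsn = 1;

/* Pin the buffer and take the exclusive content lock. */
void
mock_pin_and_lock(MockBuffer *buf)
{
	buf->pin_count++;
	buf->content_locked = true;
}

/* Mirrors the new assertion: the caller must hold the lock, not merely a pin. */
void
mock_mark_buffer_dirty(MockBuffer *buf)
{
	assert(buf->pin_count > 0);
	assert(buf->content_locked);
	buf->dirty = true;
}

/* Enforces, in this mock only, that the buffer is dirty before WAL insertion. */
uint64_t
mock_xlog_insert(const MockBuffer *buf)
{
	assert(buf->dirty);
	return mock_next_lsn++;
}

/* Release the content lock and drop the pin in one step. */
void
mock_unlock_release(MockBuffer *buf)
{
	buf->content_locked = false;
	buf->pin_count--;
}

/* The full sequence in the order the commit message requires. */
uint64_t
mock_wal_logged_update(MockBuffer *buf, int new_payload)
{
	uint64_t	recptr;

	mock_pin_and_lock(buf);			/* pin and exclusive-lock */
	/* a real caller would enter its critical section here */
	buf->payload = new_payload;		/* apply the change */
	mock_mark_buffer_dirty(buf);	/* dirty BEFORE inserting WAL */
	recptr = mock_xlog_insert(buf); /* insert the WAL record */
	buf->page_lsn = recptr;			/* LSN may be set after the dirty bit */
	/* a real caller would leave its critical section here */
	mock_unlock_release(buf);		/* unlock and unpin */
	return recptr;
}
```

Calling `mock_wal_logged_update` on a zero-initialized `MockBuffer` leaves it dirty, unlocked, and unpinned, with `page_lsn` equal to the returned position. Moving the `mock_mark_buffer_dirty` call after `mock_xlog_insert` would trip the assertion in the mock, which is the mistake the new ordering is meant to rule out.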
--- a/src/backend/access/transam/README
+++ b/src/backend/access/transam/README
@@ -1,4 +1,4 @@
-$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.4 2006/03/29 21:17:37 tgl Exp $
+$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.5 2006/03/31 23:32:05 tgl Exp $

 The Transaction System
 ----------------------
@@ -297,7 +297,7 @@ The general schema for executing a WAL-logged action is
 1. Pin and exclusive-lock the shared buffer(s) containing the data page(s)
 to be modified.

-2. START_CRIT_SECTION()  (Any error during the next two steps must cause a
+2. START_CRIT_SECTION()  (Any error during the next three steps must cause a
 PANIC because the shared buffers will contain unlogged changes, which we
 have to ensure don't get to disk.  Obviously, you should check conditions
 such as whether there's enough free space on the page before you start the
@@ -305,7 +305,10 @@ critical section.)

 3. Apply the required changes to the shared buffer(s).

-4. Build a WAL log record and pass it to XLogInsert(); then update the page's
+4. Mark the shared buffer(s) as dirty with MarkBufferDirty().  (This must
+happen before the WAL record is inserted; see notes in SyncOneBuffer().)
+
+5. Build a WAL log record and pass it to XLogInsert(); then update the page's
 LSN and TLI using the returned XLOG location.  For instance,

 		recptr = XLogInsert(rmgr_id, info, rdata);
@@ -313,16 +316,9 @@ LSN and TLI using the returned XLOG location.  For instance,
 		PageSetLSN(dp, recptr);
 		PageSetTLI(dp, ThisTimeLineID);

-5. END_CRIT_SECTION()
+6. END_CRIT_SECTION()

-6. Unlock and write the buffer(s):
-
-		LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
-		WriteBuffer(buffer);
-
-(Note: WriteBuffer doesn't really "write" the buffer anymore, it just marks it
-dirty and unpins it.  The write will not happen until a checkpoint occurs or
-the shared buffer is needed for another page.)
+7. Unlock and unpin the buffer(s).

 XLogInsert's "rdata" argument is an array of pointer/size items identifying
 chunks of data to be written in the XLOG record, plus optional shared-buffer
@@ -364,8 +360,8 @@ standard replay-routine pattern for this case is

 	PageSetLSN(page, lsn);
 	PageSetTLI(page, ThisTimeLineID);
-	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
-	WriteBuffer(buffer);
+	MarkBufferDirty(buffer);
+	UnlockReleaseBuffer(buffer);

 In the case where the WAL record provides only enough information to
 incrementally update the page, the rdata array *must* mention the buffer
@@ -384,8 +380,7 @@ The standard replay-routine pattern for this case is
 	if (XLByteLE(lsn, PageGetLSN(page)))
 	{
 		/* changes are already applied */
-		LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
-		ReleaseBuffer(buffer);
+		UnlockReleaseBuffer(buffer);
 		return;
 	}

@@ -393,8 +388,8 @@ The standard replay-routine pattern for this case is

 	PageSetLSN(page, lsn);
 	PageSetTLI(page, ThisTimeLineID);
-	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
-	WriteBuffer(buffer);
+	MarkBufferDirty(buffer);
+	UnlockReleaseBuffer(buffer);

 As noted above, for a multi-page update you need to be able to determine
 which XLR_BKP_BLOCK_n flag applies to each page.  If a WAL record reflects
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.230 2006/03/29 21:17:37 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xlog.c,v 1.231 2006/03/31 23:32:05 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -2529,8 +2529,8 @@ RestoreBkpBlocks(XLogRecord *record, XLogRecPtr lsn)

 		PageSetLSN(page, lsn);
 		PageSetTLI(page, ThisTimeLineID);
-		LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
-		WriteBuffer(buffer);
+		MarkBufferDirty(buffer);
+		UnlockReleaseBuffer(buffer);

 		blk += BLCKSZ - bkpb.hole_length;
 	}
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -11,7 +11,7 @@
 * Portions Copyright (c) 1996-2006, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/backend/access/transam/xlogutils.c,v 1.42 2006/03/29 21:17:38 tgl Exp $
+ * $PostgreSQL: pgsql/src/backend/access/transam/xlogutils.c,v 1.43 2006/03/31 23:32:06 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -30,9 +30,9 @@
 *
 * This is functionally comparable to ReadBuffer followed by
 * LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE): you get back a pinned
- * and locked buffer.  (The lock is not really necessary, since we
- * expect that this is only done during single-process XLOG replay,
- * but in some places it simplifies sharing code with the non-XLOG case.)
+ * and locked buffer.  (Getting the lock is not really necessary, since we
+ * expect that this is only used during single-process XLOG replay, but
+ * some subroutines such as MarkBufferDirty will complain if we don't.)
 *
 * If "init" is true then the caller intends to rewrite the page fully
 * using the info in the XLOG record.  In this case we will extend the
@@ -74,7 +74,7 @@ XLogReadBuffer(Relation reln, BlockNumber blkno, bool init)
 		while (blkno >= lastblock)
 		{
 			if (buffer != InvalidBuffer)
-				ReleaseBuffer(buffer);		/* must be WriteBuffer()? */
+				ReleaseBuffer(buffer);
 			buffer = ReadBuffer(reln, P_NEW);
 			lastblock++;
 		}