Mirror of https://github.com/postgres/postgres.git (synced 2025-08-30 06:01:21 +03:00)
Avoid pin scan for replay of XLOG_BTREE_VACUUM
Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds, or in some cases minutes, while the XLOG_BTREE_VACUUM was replayed.

This commit skips the “pin scan” that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here.

The current commit still performs the pin scan for toast indexes, though this can also be avoided if we recheck scans on toast indexes. A later patch will address this.

No tests included. Manual tests, using an additional patch to view WAL records and their timing, have shown that the change in WAL records and their handling has successfully reduced replication delay.
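To make the O(N) cost concrete, here is a minimal, self-contained sketch of how much work a pin scan implies for one replayed record. It is not the replay code itself: `pin_scan_page_count` and `INVALID_BLOCK` are hypothetical stand-ins. It also assumes that replay visits every page strictly between the recorded lastBlockVacuumed and the page being vacuumed, and skips the scan entirely when the recorded value is invalid.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's BlockNumber and InvalidBlockNumber. */
typedef uint32_t BlockNumber;
#define INVALID_BLOCK ((BlockNumber) 0xFFFFFFFF)

/*
 * Hypothetical helper (not a PostgreSQL function): number of intermediate
 * index pages that replay of one XLOG_BTREE_VACUUM record would have to
 * pin and cleanup-lock, under the assumptions stated above.
 */
static uint32_t
pin_scan_page_count(BlockNumber last_vacuumed, BlockNumber this_block)
{
    assert(this_block != INVALID_BLOCK);

    /* No lastBlockVacuumed recorded: the pin scan is skipped. */
    if (last_vacuumed == INVALID_BLOCK)
        return 0;

    /* Adjacent (or already covered) pages: nothing in between to visit. */
    if (last_vacuumed + 1 >= this_block)
        return 0;

    /* Pages last_vacuumed + 1 .. this_block - 1 must each be visited. */
    return this_block - (last_vacuumed + 1);
}
```

With a sparse vacuum of a large index, the gap between consecutive vacuumed pages can be very large, which is where the replay delay described above would come from; recording an invalid block number reduces that per-record cost to zero.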
@@ -22,6 +22,7 @@
 #include "access/relscan.h"
 #include "access/xlog.h"
 #include "catalog/index.h"
+#include "catalog/pg_namespace.h"
 #include "commands/vacuum.h"
 #include "storage/indexfsm.h"
 #include "storage/ipc.h"
@@ -823,6 +824,11 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
     }
 
     /*
+     * Check to see if we need to issue one final WAL record for this index,
+     * which may be needed for correctness on a hot standby node when
+     * non-MVCC index scans could take place. This now only occurs when we
+     * perform a TOAST scan, so only occurs for TOAST indexes.
+     *
      * If the WAL is replayed in hot standby, the replay process needs to get
      * cleanup locks on all index leaf pages, just as we've been doing here.
      * However, we won't issue any WAL records about pages that have no items
@@ -833,6 +839,7 @@ btvacuumscan(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
      * against the last leaf page in the index, if that one wasn't vacuumed.
      */
     if (XLogStandbyInfoActive() &&
+        rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE &&
         vstate.lastBlockVacuumed < vstate.lastBlockLocked)
     {
         Buffer      buf;
@@ -1031,6 +1038,20 @@ restart:
          */
         if (ndeletable > 0)
         {
+            BlockNumber lastBlockVacuumed = InvalidBlockNumber;
+
+            /*
+             * We may need to record the lastBlockVacuumed for use when
+             * non-MVCC scans might be performed on the index on a
+             * hot standby. See explanation in btree_xlog_vacuum().
+             *
+             * On a hot standby, a non-MVCC scan can only take place
+             * when we access a Toast Index, so we need only record
+             * the lastBlockVacuumed if we are vacuuming a Toast Index.
+             */
+            if (rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE)
+                lastBlockVacuumed = vstate->lastBlockVacuumed;
+
             /*
              * Notice that the issued XLOG_BTREE_VACUUM WAL record includes an
              * instruction to the replay code to get cleanup lock on all pages
@@ -1043,7 +1064,7 @@ restart:
              * that.
              */
             _bt_delitems_vacuum(rel, buf, deletable, ndeletable,
-                                vstate->lastBlockVacuumed);
+                                lastBlockVacuumed);
 
             /*
              * Remember highest leaf page number we've issued a
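The two decisions this diff adds on the VACUUM side can be summarized in a small standalone sketch. The helper names (`last_block_for_wal`, `needs_final_wal_record`) and the `TOAST_NAMESPACE_OID` and `INVALID_BLOCK` constants are hypothetical stand-ins for illustration only; the real code tests `rel->rd_rel->relnamespace == PG_TOAST_NAMESPACE` inline, as shown in the hunks above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL types and constants (assumed for this sketch). */
typedef uint32_t BlockNumber;
typedef uint32_t Oid;
#define INVALID_BLOCK       ((BlockNumber) 0xFFFFFFFF)
#define TOAST_NAMESPACE_OID ((Oid) 99)

/*
 * Value to pass as lastBlockVacuumed when logging a page vacuum: the
 * tracked value for toast indexes, an invalid block number otherwise.
 */
static BlockNumber
last_block_for_wal(Oid relnamespace, BlockNumber tracked_last_vacuumed)
{
    BlockNumber last = INVALID_BLOCK;

    if (relnamespace == TOAST_NAMESPACE_OID)
        last = tracked_last_vacuumed;
    return last;
}

/*
 * Whether one final WAL record is needed at the end of the index scan:
 * only when standby info is logged, the index is a toast index, and the
 * last vacuumed page lies before the last page that was locked.
 */
static bool
needs_final_wal_record(bool standby_info_active, Oid relnamespace,
                       BlockNumber last_vacuumed, BlockNumber last_locked)
{
    return standby_info_active &&
           relnamespace == TOAST_NAMESPACE_OID &&
           last_vacuumed < last_locked;
}
```

For an ordinary (non-toast) index both helpers take the cheap path: no lastBlockVacuumed is recorded and no trailing record is issued, which matches the commit message's statement that only toast indexes still get the pin scan.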