Mirror of https://github.com/postgres/postgres.git (synced 2025-07-02 09:02:37 +03:00)
Avoid pin scan for replay of XLOG_BTREE_VACUUM
Replay of XLOG_BTREE_VACUUM during Hot Standby was previously thought to require complex interlocking that matched the requirements on the master. This required an O(N) operation that became a significant problem with large indexes, causing replication delays of seconds, or in some cases minutes, while the XLOG_BTREE_VACUUM record was replayed.

This commit skips the "pin scan" that was previously required, by observing in detail when and how it is safe to do so, with full documentation. The pin scan is skipped only in replay; the VACUUM code path on master is not touched here.

The current commit still performs the pin scan for toast indexes, though this can also be avoided if we recheck scans on toast indexes. A later patch will address this.

No tests included. Manual tests, using an additional patch to view WAL records and their timing, have shown that the change in WAL records and their handling has successfully reduced replication delay.
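The replay-side decision described above can be sketched roughly as follows. This is an illustrative stand-alone model, not the code from the commit: the function `sketch_replay_btree_vacuum` and its return value are hypothetical, and the local `BlockNumber` / `InvalidBlockNumber` definitions only mimic the PostgreSQL names so the sketch compiles on its own. It assumes, as the comment changed by this commit states, that `lastBlockVacuumed` is set to `InvalidBlockNumber` when only MVCC scans are possible.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins so the sketch is self-contained (assumption: these only
 * mirror the PostgreSQL names, they are not the real headers). */
typedef uint32_t BlockNumber;
#define InvalidBlockNumber ((BlockNumber) 0xFFFFFFFF)

/*
 * Hypothetical model of replaying one XLOG_BTREE_VACUUM record.
 * Returns how many intermediate blocks the "pin scan" had to visit.
 */
static unsigned
sketch_replay_btree_vacuum(BlockNumber lastBlockVacuumed, BlockNumber thisBlock)
{
    unsigned    visited = 0;

    assert(thisBlock != InvalidBlockNumber);

    /*
     * MVCC case: the record carries InvalidBlockNumber, so the O(N) walk
     * over intermediate blocks is skipped entirely.
     */
    if (lastBlockVacuumed != InvalidBlockNumber)
    {
        /*
         * Non-MVCC case (toast indexes in this commit): every block between
         * the last vacuumed block and this one must be shown to be unpinned.
         */
        for (BlockNumber blkno = lastBlockVacuumed + 1; blkno < thisBlock; blkno++)
            visited++;          /* real code would take a cleanup lock here */
    }

    /* In both cases, the deletions are then applied to thisBlock itself. */
    return visited;
}
```

In this model, a record with `lastBlockVacuumed` equal to `InvalidBlockNumber` visits no intermediate blocks regardless of index size, while a record for a non-MVCC index still pays a cost proportional to the gap between the two block numbers.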
@@ -331,8 +331,10 @@ typedef struct xl_btree_reuse_page
  * The WAL record can represent deletion of any number of index tuples on a
  * single index page when executed by VACUUM.
  *
- * The correctness requirement for applying these changes during recovery is
- * that we must do one of these two things for every block in the index:
+ * For MVCC scans, lastBlockVacuumed will be set to InvalidBlockNumber.
+ * For a non-MVCC index scans there is an additional correctness requirement
+ * for applying these changes during recovery, which is that we must do one
+ * of these two things for every block in the index:
  * * lock the block for cleanup and apply any required changes
  * * EnsureBlockUnpinned()
  * The purpose of this is to ensure that no index scans started before we