mirror of
https://github.com/postgres/postgres.git
synced 2025-07-30 11:03:19 +03:00
Fix "base" snapshot handling in logical decoding
Two closely related bugs are fixed. First, xmin of logical slots was advanced too early. During xl_running_xacts processing, xmin of the slot was set to the oldest running xid in the record, but that's wrong: actually, snapshots which will be used for not-yet-replayed transactions might consider older txns as running too, so we need to keep xmin back for them. The problem wasn't noticed earlier because DDL which allows to delete tuple (set xmax) while some another not-yet-committed transaction looks at it is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema acquire ACCESS EXCLUSIVE lock conflicting with any inserts. The included test case (test_decoding's oldest_xmin) uses ALTER of a composite type, which doesn't have such interlocking. To deal with this, we must be able to quickly retrieve oldest xmin (oldest running xid among all assigned snapshots) from ReorderBuffer. To fix, add another list of ReorderBufferTXNs to the reorderbuffer, where transactions are sorted by base-snapshot-LSN. This is slightly different from the existing (sorted by first-LSN) list, because a transaction can have an earlier LSN but a later Xmin, if its first record does not obtain an xmin (eg. xl_xact_assignment). Note this new list doesn't fully replace the existing txn list: we still need that one to prevent WAL recycling. The second issue concerns SnapBuilder snapshots and subtransactions. SnapBuildDistributeNewCatalogSnapshot never assigned a snapshot to a transaction that is known to be a subtxn, which is good in the common case that the top-level transaction already has one (no point in doing so), but a bug otherwise. To fix, arrange to transfer the snapshot from the subtxn to its top-level txn as soon as the kinship gets known. test_decoding's snapshot_transfer verifies this. Also, fix a minor memory leak: refcount of toplevel's old base snapshot was not decremented when the snapshot is transferred from child. Liberally sprinkle code comments, and rewrite a few existing ones. This part is my (Álvaro's) contribution to this commit, as I had to write all those comments in order to understand the existing code and Arseny's patch. Reported-by: Arseny Sher <a.sher@postgrespro.ru> Diagnosed-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Arseny Sher <a.sher@postgrespro.ru> Co-authored-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://postgr.es/m/87lgdyz1wj.fsf@ars-thinkpad
This commit is contained in:
@ -830,9 +830,9 @@ SnapBuildDistributeNewCatalogSnapshot(SnapBuild *builder, XLogRecPtr lsn)
|
||||
* all. We'll add a snapshot when the first change gets queued.
|
||||
*
|
||||
* NB: This works correctly even for subtransactions because
|
||||
* ReorderBufferCommitChild() takes care to pass the parent the base
|
||||
* snapshot, and while iterating the changequeue we'll get the change
|
||||
* from the subtxn.
|
||||
* ReorderBufferAssignChild() takes care to transfer the base snapshot
|
||||
* to the top-level transaction, and while iterating the changequeue
|
||||
* we'll get the change from the subtxn.
|
||||
*/
|
||||
if (!ReorderBufferXidHasBaseSnapshot(builder->reorder, txn->xid))
|
||||
continue;
|
||||
@ -1074,7 +1074,7 @@ SnapBuildCommitTxn(SnapBuild *builder, XLogRecPtr lsn, TransactionId xid,
|
||||
/* refcount of the snapshot builder for the new snapshot */
|
||||
SnapBuildSnapIncRefcount(builder->snapshot);
|
||||
|
||||
/* add a new Snapshot to all currently running transactions */
|
||||
/* add a new catalog snapshot to all currently running transactions */
|
||||
SnapBuildDistributeNewCatalogSnapshot(builder, lsn);
|
||||
}
|
||||
}
|
||||
@ -1094,6 +1094,7 @@ void
|
||||
SnapBuildProcessRunningXacts(SnapBuild *builder, XLogRecPtr lsn, xl_running_xacts *running)
|
||||
{
|
||||
ReorderBufferTXN *txn;
|
||||
TransactionId xmin;
|
||||
|
||||
/*
|
||||
* If we're not consistent yet, inspect the record to see whether it
|
||||
@ -1126,15 +1127,21 @@ SnapBuildProcessRunningXacts(SnapBuild *builder, XLogRecPtr lsn, xl_running_xact
|
||||
/* Remove transactions we don't need to keep track off anymore */
|
||||
SnapBuildPurgeCommittedTxn(builder);
|
||||
|
||||
elog(DEBUG3, "xmin: %u, xmax: %u, oldestrunning: %u",
|
||||
builder->xmin, builder->xmax,
|
||||
running->oldestRunningXid);
|
||||
|
||||
/*
|
||||
* Increase shared memory limits, so vacuum can work on tuples we
|
||||
* prevented from being pruned till now.
|
||||
* Advance the xmin limit for the current replication slot, to allow
|
||||
* vacuum to clean up the tuples this slot has been protecting.
|
||||
*
|
||||
* The reorderbuffer might have an xmin among the currently running
|
||||
* snapshots; use it if so. If not, we need only consider the snapshots
|
||||
* we'll produce later, which can't be less than the oldest running xid in
|
||||
* the record we're reading now.
|
||||
*/
|
||||
LogicalIncreaseXminForSlot(lsn, running->oldestRunningXid);
|
||||
xmin = ReorderBufferGetOldestXmin(builder->reorder);
|
||||
if (xmin == InvalidTransactionId)
|
||||
xmin = running->oldestRunningXid;
|
||||
elog(DEBUG3, "xmin: %u, xmax: %u, oldest running: %u, oldest xmin: %u",
|
||||
builder->xmin, builder->xmax, running->oldestRunningXid, xmin);
|
||||
LogicalIncreaseXminForSlot(lsn, xmin);
|
||||
|
||||
/*
|
||||
* Also tell the slot where we can restart decoding from. We don't want to
|
||||
|
Reference in New Issue
Block a user