mirror of https://github.com/postgres/postgres.git synced 2025-06-27 23:21:58 +03:00

At update of non-LP_NORMAL TID, fail instead of corrupting page header.

The right mix of DDL and VACUUM could corrupt a catalog page header such
that PageIsVerified() durably fails, requiring a restore from backup.
This affects only catalogs that both have a syscache and have DDL code
that uses syscache tuples to construct updates.  One of the test
permutations shows a variant not yet fixed.

This makes !TransactionIdIsValid(TM_FailureData.xmax) possible with
TM_Deleted.  I think core and PGXN are indifferent to that.
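
The consequence for callers can be sketched with a toy model. This is a minimal illustration, not PostgreSQL code: every name prefixed with toy_ or TOY_ is hypothetical and merely stands in for the real line pointer states, TM_Result, TM_FailureData and TransactionIdIsValid(). It shows an update routine that reports "deleted" with an invalid xmax when the target line pointer is not normal, and a validity check that a caller would need before using xmax.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins (hypothetical); these are not the PostgreSQL definitions. */
typedef enum { TOY_LP_UNUSED, TOY_LP_NORMAL, TOY_LP_REDIRECT, TOY_LP_DEAD } ToyLpState;
typedef enum { TOY_TM_OK, TOY_TM_DELETED } ToyResult;
typedef uint32_t ToyXid;

#define TOY_INVALID_XID ((ToyXid) 0)

typedef struct
{
    unsigned offset;   /* slot the caller tried to update */
    ToyXid   xmax;     /* deleting transaction, or TOY_INVALID_XID if unknown */
} ToyFailureData;

/* Mirrors the idea of TransactionIdIsValid(): zero means "no transaction". */
bool toy_xid_is_valid(ToyXid xid)
{
    return xid != TOY_INVALID_XID;
}

/*
 * Toy update: if the slot at otid is anything other than TOY_LP_NORMAL,
 * assume it was pruned concurrently and fail cleanly instead of touching
 * the page.  The failure data carries no usable xmax in that case.
 * On success, tmfd is left untouched.
 */
ToyResult toy_update(const ToyLpState *page, unsigned nslots,
                     unsigned otid, ToyFailureData *tmfd)
{
    assert(otid < nslots);

    if (page[otid] != TOY_LP_NORMAL)
    {
        tmfd->offset = otid;
        tmfd->xmax = TOY_INVALID_XID;
        return TOY_TM_DELETED;
    }
    return TOY_TM_OK;
}
```

A caller in this model that receives TOY_TM_DELETED should test toy_xid_is_valid(tmfd.xmax) before treating xmax as the deleting transaction, since the pruned-slot path has no such transaction to report.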

Per bug #17821 from Alexander Lakhin.  Back-patch to v13 (all supported
versions).  The test case is v17+, since it uses INJECTION_POINT.

Discussion: https://postgr.es/m/17821-dd8c334263399284@postgresql.org
Noah Misch
2025-01-25 11:28:14 -08:00
parent 216294ba59
commit dc02b98bd1
2 changed files with 46 additions and 2 deletions

@@ -72,6 +72,7 @@
#include "utils/relcache.h"
#include "utils/snapmgr.h"
#include "utils/spccache.h"
#include "utils/syscache.h"
static HeapTuple heap_prepare_insert(Relation relation, HeapTuple tup,
@@ -3242,7 +3243,49 @@ heap_update(Relation relation, ItemPointer otid, HeapTuple newtup,
LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
lp = PageGetItemId(page, ItemPointerGetOffsetNumber(otid));
-Assert(ItemIdIsNormal(lp));
/*
* Usually, a buffer pin and/or snapshot blocks pruning of otid, ensuring
* we see LP_NORMAL here. When the otid origin is a syscache, we may have
* neither a pin nor a snapshot. Hence, we may see other LP_ states, each
* of which indicates concurrent pruning.
*
* Failing with TM_Updated would be most accurate. However, unlike other
* TM_Updated scenarios, we don't know the successor ctid in LP_UNUSED and
* LP_DEAD cases. While the distinction between TM_Updated and TM_Deleted
* does matter to SQL statements UPDATE and MERGE, those SQL statements
* hold a snapshot that ensures LP_NORMAL. Hence, the choice between
* TM_Updated and TM_Deleted affects only the wording of error messages.
* Settle on TM_Deleted, for two reasons. First, it avoids complicating
* the specification of when tmfd->ctid is valid. Second, it creates
* error log evidence that we took this branch.
*
* Since it's possible to see LP_UNUSED at otid, it's also possible to see
* LP_NORMAL for a tuple that replaced LP_UNUSED. If it's a tuple for an
* unrelated row, we'll fail with "duplicate key value violates unique".
* XXX if otid is the live, newer version of the newtup row, we'll discard
* changes originating in versions of this catalog row after the version
* the caller got from syscache. See syscache-update-pruned.spec.
*/
if (!ItemIdIsNormal(lp))
{
Assert(RelationSupportsSysCache(RelationGetRelid(relation)));
UnlockReleaseBuffer(buffer);
Assert(!have_tuple_lock);
if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);
tmfd->ctid = *otid;
tmfd->xmax = InvalidTransactionId;
tmfd->cmax = InvalidCommandId;
bms_free(hot_attrs);
bms_free(key_attrs);
bms_free(id_attrs);
/* modified_attrs not yet initialized */
bms_free(interesting_attrs);
return TM_Deleted;
}
/*
* Fill in enough data in oldtup for HeapDetermineColumnsInfo to work