mirror of
https://github.com/postgres/postgres.git
synced 2025-04-25 21:42:33 +03:00
Don't take ProcArrayLock while exiting a transaction that has no XID; there is
no need for serialization against snapshot-taking because the xact doesn't affect anyone else's snapshot anyway. Per discussion. Also, move various info about the interlocking of transactions and snapshots out of code comments and into a hopefully-more-cohesive discussion in access/transam/README. Also, remove a couple of now-obsolete comments about having to force some WAL to be written to persuade RecordTransactionCommit to do its thing.
This commit is contained in:
parent
85e79a4a83
commit
0a51e7073c
@ -8,7 +8,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/access/heap/heapam.c,v 1.238 2007/09/05 18:10:47 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/access/heap/heapam.c,v 1.239 2007/09/07 20:59:26 tgl Exp $
|
||||||
*
|
*
|
||||||
*
|
*
|
||||||
* INTERFACE ROUTINES
|
* INTERFACE ROUTINES
|
||||||
@ -1546,9 +1546,8 @@ UpdateXmaxHintBits(HeapTupleHeader tuple, Buffer buffer, TransactionId xid)
|
|||||||
* If use_wal is false, the new tuple is not logged in WAL, even for a
|
* If use_wal is false, the new tuple is not logged in WAL, even for a
|
||||||
* non-temp relation. Safe usage of this behavior requires that we arrange
|
* non-temp relation. Safe usage of this behavior requires that we arrange
|
||||||
* that all new tuples go into new pages not containing any tuples from other
|
* that all new tuples go into new pages not containing any tuples from other
|
||||||
* transactions, that the relation gets fsync'd before commit, and that the
|
* transactions, and that the relation gets fsync'd before commit.
|
||||||
* transaction emits at least one WAL record to ensure RecordTransactionCommit
|
* (See also heap_sync() comments)
|
||||||
* will decide to WAL-log the commit. (See also heap_sync() comments)
|
|
||||||
*
|
*
|
||||||
* use_fsm is passed directly to RelationGetBufferForTuple, which see for
|
* use_fsm is passed directly to RelationGetBufferForTuple, which see for
|
||||||
* more info.
|
* more info.
|
||||||
|
@ -1,4 +1,4 @@
|
|||||||
$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.7 2007/09/05 18:10:47 tgl Exp $
|
$PostgreSQL: pgsql/src/backend/access/transam/README,v 1.8 2007/09/07 20:59:26 tgl Exp $
|
||||||
|
|
||||||
The Transaction System
|
The Transaction System
|
||||||
----------------------
|
----------------------
|
||||||
@ -221,6 +221,110 @@ InvalidSubTransactionId.) Note that subtransactions do not have their
|
|||||||
own VXIDs; they use the parent top transaction's VXID.
|
own VXIDs; they use the parent top transaction's VXID.
|
||||||
|
|
||||||
|
|
||||||
|
Interlocking transaction begin, transaction end, and snapshots
|
||||||
|
--------------------------------------------------------------
|
||||||
|
|
||||||
|
We try hard to minimize the amount of overhead and lock contention involved
|
||||||
|
in the frequent activities of beginning/ending a transaction and taking a
|
||||||
|
snapshot. Unfortunately, we must have some interlocking for this, because
|
||||||
|
we must ensure consistency about the commit order of transactions.
|
||||||
|
For example, suppose an UPDATE in xact A is blocked by xact B's prior
|
||||||
|
update of the same row, and xact B is doing commit while xact C gets a
|
||||||
|
snapshot. Xact A can complete and commit as soon as B releases its locks.
|
||||||
|
If xact C's GetSnapshotData sees xact B as still running, then it had
|
||||||
|
better see xact A as still running as well, or it will be able to see two
|
||||||
|
tuple versions - one deleted by xact B and one inserted by xact A. Another
|
||||||
|
reason why this would be bad is that C would see (in the row inserted by A)
|
||||||
|
earlier changes by B, and it would be inconsistent for C not to see any
|
||||||
|
of B's changes elsewhere in the database.
|
||||||
|
|
||||||
|
Formally, the correctness requirement is "if A sees B as committed,
|
||||||
|
and B sees C as committed, then A must see C as committed".
|
||||||
|
|
||||||
|
What we actually enforce is strict serialization of commits and rollbacks
|
||||||
|
with snapshot-taking: we do not allow any transaction to exit the set of
|
||||||
|
running transactions while a snapshot is being taken. (This rule is
|
||||||
|
stronger than necessary for consistency, but is relatively simple to
|
||||||
|
enforce, and it assists with some other issues as explained below.) The
|
||||||
|
implementation of this is that GetSnapshotData takes the ProcArrayLock in
|
||||||
|
shared mode (so that multiple backends can take snapshots in parallel),
|
||||||
|
but xact.c must take the ProcArrayLock in exclusive mode while clearing
|
||||||
|
MyProc->xid at transaction end (either commit or abort).
|
||||||
|
|
||||||
|
GetSnapshotData must in fact acquire ProcArrayLock before it calls
|
||||||
|
ReadNewTransactionId. Otherwise it would be possible for a transaction A
|
||||||
|
postdating the xmax to commit, and then an existing transaction B that saw
|
||||||
|
A as committed to commit, before GetSnapshotData is able to acquire
|
||||||
|
ProcArrayLock and finish taking its snapshot. This would violate the
|
||||||
|
consistency requirement, because A would be still running and B not
|
||||||
|
according to this snapshot.
|
||||||
|
|
||||||
|
In short, then, the rule is that no transaction may exit the set of
|
||||||
|
currently-running transactions between the time we fetch xmax and the time
|
||||||
|
we finish building our snapshot. However, this restriction only applies
|
||||||
|
to transactions that have an XID --- read-only transactions can end without
|
||||||
|
acquiring ProcArrayLock, since they don't affect anyone else's snapshot.
|
||||||
|
|
||||||
|
Transaction start, per se, doesn't have any interlocking with these
|
||||||
|
considerations, since we no longer assign an XID immediately at transaction
|
||||||
|
start. But when we do decide to allocate an XID, we must require
|
||||||
|
GetNewTransactionId to store the new XID into the shared ProcArray before
|
||||||
|
releasing XidGenLock. This ensures that when GetSnapshotData calls
|
||||||
|
ReadNewTransactionId (which also takes XidGenLock), all active XIDs before
|
||||||
|
the returned value of nextXid are already present in the ProcArray and
|
||||||
|
can't be missed by GetSnapshotData. Unfortunately, we can't have
|
||||||
|
GetNewTransactionId take ProcArrayLock to do this, else it could deadlock
|
||||||
|
against GetSnapshotData. Therefore, we simply let GetNewTransactionId
|
||||||
|
store into MyProc->xid without any lock. We are thereby relying on
|
||||||
|
fetch/store of an XID to be atomic, else other backends might see a
|
||||||
|
partially-set XID. (NOTE: for multiprocessors that need explicit memory
|
||||||
|
access fence instructions, this means that acquiring/releasing XidGenLock
|
||||||
|
is just as necessary as acquiring/releasing ProcArrayLock for
|
||||||
|
GetSnapshotData to ensure it sees up-to-date xid fields.) This also means
|
||||||
|
that readers of the ProcArray xid fields must be careful to fetch a value
|
||||||
|
only once, rather than assume they can read it multiple times and get the
|
||||||
|
same answer each time.
|
||||||
|
|
||||||
|
Another important activity that uses the shared ProcArray is GetOldestXmin,
|
||||||
|
which must determine a lower bound for the oldest xmin of any active MVCC
|
||||||
|
snapshot, system-wide. Each individual backend advertises the smallest
|
||||||
|
xmin of its own snapshots in MyProc->xmin, or zero if it currently has no
|
||||||
|
live snapshots (eg, if it's between transactions or hasn't yet set a
|
||||||
|
snapshot for a new transaction). GetOldestXmin takes the MIN() of the
|
||||||
|
valid xmin fields. It does this with only shared lock on ProcArrayLock,
|
||||||
|
which means there is a potential race condition against other backends
|
||||||
|
doing GetSnapshotData concurrently: we must be certain that a concurrent
|
||||||
|
backend that is about to set its xmin does not compute an xmin less than
|
||||||
|
what GetOldestXmin returns. We ensure that by including all the active
|
||||||
|
XIDs into the MIN() calculation, along with the valid xmins. The rule that
|
||||||
|
transactions can't exit without taking exclusive ProcArrayLock ensures that
|
||||||
|
concurrent holders of shared ProcArrayLock will compute the same minimum of
|
||||||
|
currently-active XIDs: no xact, in particular not the oldest, can exit
|
||||||
|
while we hold shared ProcArrayLock. So GetOldestXmin's view of the minimum
|
||||||
|
active XID will be the same as that of any concurrent GetSnapshotData, and
|
||||||
|
so it can't produce an overestimate. If there is no active transaction at
|
||||||
|
all, GetOldestXmin returns the result of ReadNewTransactionId. Note that
|
||||||
|
two concurrent executions of GetOldestXmin might not see the same result
|
||||||
|
from ReadNewTransactionId --- but if there is a difference, the intervening
|
||||||
|
execution(s) of GetNewTransactionId must have stored their XIDs into the
|
||||||
|
ProcArray, so the later execution of GetOldestXmin will see them and
|
||||||
|
compute the same global xmin anyway.
|
||||||
|
|
||||||
|
GetSnapshotData also performs an oldest-xmin calculation (which had better
|
||||||
|
match GetOldestXmin's) and stores that into RecentGlobalXmin, which is used
|
||||||
|
for some tuple age cutoff checks where a fresh call of GetOldestXmin seems
|
||||||
|
too expensive. Note that while it is certain that two concurrent
|
||||||
|
executions of GetSnapshotData will compute the same xmin for their own
|
||||||
|
snapshots, as argued above, it is not certain that they will arrive at the
|
||||||
|
same estimate of RecentGlobalXmin. This is because we allow XID-less
|
||||||
|
transactions to clear their MyProc->xmin asynchronously (without taking
|
||||||
|
ProcArrayLock), so one execution might see what had been the oldest xmin,
|
||||||
|
and another not. This is OK since RecentGlobalXmin need only be a valid
|
||||||
|
lower bound. As noted above, we are already assuming that fetch/store
|
||||||
|
of the xid fields is atomic, so assuming it for xmin as well is no extra
|
||||||
|
risk.
|
||||||
|
|
||||||
|
|
||||||
pg_clog and pg_subtrans
|
pg_clog and pg_subtrans
|
||||||
-----------------------
|
-----------------------
|
||||||
|
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.248 2007/09/05 18:10:47 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/access/transam/xact.c,v 1.249 2007/09/07 20:59:26 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -747,6 +747,8 @@ AtSubStart_ResourceOwner(void)
|
|||||||
|
|
||||||
/*
|
/*
|
||||||
* RecordTransactionCommit
|
* RecordTransactionCommit
|
||||||
|
*
|
||||||
|
* This is exported only to support an ugly hack in VACUUM FULL.
|
||||||
*/
|
*/
|
||||||
void
|
void
|
||||||
RecordTransactionCommit(void)
|
RecordTransactionCommit(void)
|
||||||
@ -1552,47 +1554,54 @@ CommitTransaction(void)
|
|||||||
*/
|
*/
|
||||||
RecordTransactionCommit();
|
RecordTransactionCommit();
|
||||||
|
|
||||||
/*----------
|
PG_TRACE1(transaction__commit, MyProc->lxid);
|
||||||
|
|
||||||
|
/*
|
||||||
* Let others know about no transaction in progress by me. Note that
|
* Let others know about no transaction in progress by me. Note that
|
||||||
* this must be done _before_ releasing locks we hold and _after_
|
* this must be done _before_ releasing locks we hold and _after_
|
||||||
* RecordTransactionCommit.
|
* RecordTransactionCommit.
|
||||||
*
|
*
|
||||||
* LWLockAcquire(ProcArrayLock) is required; consider this example:
|
|
||||||
* UPDATE with xid 0 is blocked by xid 1's UPDATE.
|
|
||||||
* xid 1 is doing commit while xid 2 gets snapshot.
|
|
||||||
* If xid 2's GetSnapshotData sees xid 1 as running then it must see
|
|
||||||
* xid 0 as running as well, or it will be able to see two tuple versions
|
|
||||||
* - one deleted by xid 1 and one inserted by xid 0. See notes in
|
|
||||||
* GetSnapshotData.
|
|
||||||
*
|
|
||||||
* Note: MyProc may be null during bootstrap.
|
* Note: MyProc may be null during bootstrap.
|
||||||
*----------
|
|
||||||
*/
|
*/
|
||||||
if (MyProc != NULL)
|
if (MyProc != NULL)
|
||||||
{
|
{
|
||||||
/*
|
if (TransactionIdIsValid(MyProc->xid))
|
||||||
* Lock ProcArrayLock because that's what GetSnapshotData uses.
|
{
|
||||||
* You might assume that we can skip this step if we had no
|
/*
|
||||||
* transaction id assigned, because the failure case outlined
|
* We must lock ProcArrayLock while clearing MyProc->xid, so
|
||||||
* in GetSnapshotData cannot happen in that case. This is true,
|
* that we do not exit the set of "running" transactions while
|
||||||
* but we *still* need the lock guarantee that two concurrent
|
* someone else is taking a snapshot. See discussion in
|
||||||
* computations of the *oldest* xmin will get the same result.
|
* src/backend/access/transam/README.
|
||||||
*/
|
*/
|
||||||
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
||||||
MyProc->xid = InvalidTransactionId;
|
|
||||||
MyProc->lxid = InvalidLocalTransactionId;
|
|
||||||
MyProc->xmin = InvalidTransactionId;
|
|
||||||
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
|
||||||
|
|
||||||
/* Clear the subtransaction-XID cache too while holding the lock */
|
MyProc->xid = InvalidTransactionId;
|
||||||
MyProc->subxids.nxids = 0;
|
MyProc->lxid = InvalidLocalTransactionId;
|
||||||
MyProc->subxids.overflowed = false;
|
MyProc->xmin = InvalidTransactionId;
|
||||||
|
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
||||||
|
|
||||||
LWLockRelease(ProcArrayLock);
|
/* Clear the subtransaction-XID cache too while holding the lock */
|
||||||
|
MyProc->subxids.nxids = 0;
|
||||||
|
MyProc->subxids.overflowed = false;
|
||||||
|
|
||||||
|
LWLockRelease(ProcArrayLock);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
/*
|
||||||
|
* If we have no XID, we don't need to lock, since we won't
|
||||||
|
* affect anyone else's calculation of a snapshot. We might
|
||||||
|
* change their estimate of global xmin, but that's OK.
|
||||||
|
*/
|
||||||
|
MyProc->lxid = InvalidLocalTransactionId;
|
||||||
|
MyProc->xmin = InvalidTransactionId;
|
||||||
|
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
||||||
|
|
||||||
|
Assert(MyProc->subxids.nxids == 0);
|
||||||
|
Assert(MyProc->subxids.overflowed == false);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
PG_TRACE1(transaction__commit, s->transactionId);
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* This is all post-commit cleanup. Note that if an error is raised here,
|
* This is all post-commit cleanup. Note that if an error is raised here,
|
||||||
* it's too late to abort the transaction. This should be just
|
* it's too late to abort the transaction. This should be just
|
||||||
@ -1815,28 +1824,21 @@ PrepareTransaction(void)
|
|||||||
* Let others know about no transaction in progress by me. This has to be
|
* Let others know about no transaction in progress by me. This has to be
|
||||||
* done *after* the prepared transaction has been marked valid, else
|
* done *after* the prepared transaction has been marked valid, else
|
||||||
* someone may think it is unlocked and recyclable.
|
* someone may think it is unlocked and recyclable.
|
||||||
|
*
|
||||||
|
* We can skip locking ProcArrayLock here, because this action does not
|
||||||
|
* actually change anyone's view of the set of running XIDs: our entry
|
||||||
|
* is duplicate with the gxact that has already been inserted into the
|
||||||
|
* ProcArray.
|
||||||
*/
|
*/
|
||||||
|
|
||||||
/*
|
|
||||||
* Lock ProcArrayLock because that's what GetSnapshotData uses.
|
|
||||||
* You might assume that we can skip this step if we have no
|
|
||||||
* transaction id assigned, because the failure case outlined
|
|
||||||
* in GetSnapshotData cannot happen in that case. This is true,
|
|
||||||
* but we *still* need the lock guarantee that two concurrent
|
|
||||||
* computations of the *oldest* xmin will get the same result.
|
|
||||||
*/
|
|
||||||
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
|
||||||
MyProc->xid = InvalidTransactionId;
|
MyProc->xid = InvalidTransactionId;
|
||||||
MyProc->lxid = InvalidLocalTransactionId;
|
MyProc->lxid = InvalidLocalTransactionId;
|
||||||
MyProc->xmin = InvalidTransactionId;
|
MyProc->xmin = InvalidTransactionId;
|
||||||
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
||||||
|
|
||||||
/* Clear the subtransaction-XID cache too while holding the lock */
|
/* Clear the subtransaction-XID cache too */
|
||||||
MyProc->subxids.nxids = 0;
|
MyProc->subxids.nxids = 0;
|
||||||
MyProc->subxids.overflowed = false;
|
MyProc->subxids.overflowed = false;
|
||||||
|
|
||||||
LWLockRelease(ProcArrayLock);
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* This is all post-transaction cleanup. Note that if an error is raised
|
* This is all post-transaction cleanup. Note that if an error is raised
|
||||||
* here, it's too late to abort the transaction. This should be just
|
* here, it's too late to abort the transaction. This should be just
|
||||||
@ -1987,37 +1989,56 @@ AbortTransaction(void)
|
|||||||
*/
|
*/
|
||||||
RecordTransactionAbort(false);
|
RecordTransactionAbort(false);
|
||||||
|
|
||||||
|
PG_TRACE1(transaction__abort, MyProc->lxid);
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Let others know about no transaction in progress by me. Note that this
|
* Let others know about no transaction in progress by me. Note that this
|
||||||
* must be done _before_ releasing locks we hold and _after_
|
* must be done _before_ releasing locks we hold and _after_
|
||||||
* RecordTransactionAbort.
|
* RecordTransactionAbort.
|
||||||
|
*
|
||||||
|
* Note: MyProc may be null during bootstrap.
|
||||||
*/
|
*/
|
||||||
if (MyProc != NULL)
|
if (MyProc != NULL)
|
||||||
{
|
{
|
||||||
/*
|
if (TransactionIdIsValid(MyProc->xid))
|
||||||
* Lock ProcArrayLock because that's what GetSnapshotData uses.
|
{
|
||||||
* You might assume that we can skip this step if we have no
|
/*
|
||||||
* transaction id assigned, because the failure case outlined
|
* We must lock ProcArrayLock while clearing MyProc->xid, so
|
||||||
* in GetSnapshotData cannot happen in that case. This is true,
|
* that we do not exit the set of "running" transactions while
|
||||||
* but we *still* need the lock guarantee that two concurrent
|
* someone else is taking a snapshot. See discussion in
|
||||||
* computations of the *oldest* xmin will get the same result.
|
* src/backend/access/transam/README.
|
||||||
*/
|
*/
|
||||||
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
||||||
MyProc->xid = InvalidTransactionId;
|
|
||||||
MyProc->lxid = InvalidLocalTransactionId;
|
|
||||||
MyProc->xmin = InvalidTransactionId;
|
|
||||||
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
|
||||||
MyProc->inCommit = false; /* be sure this gets cleared */
|
|
||||||
|
|
||||||
/* Clear the subtransaction-XID cache too while holding the lock */
|
MyProc->xid = InvalidTransactionId;
|
||||||
MyProc->subxids.nxids = 0;
|
MyProc->lxid = InvalidLocalTransactionId;
|
||||||
MyProc->subxids.overflowed = false;
|
MyProc->xmin = InvalidTransactionId;
|
||||||
|
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
||||||
|
MyProc->inCommit = false; /* be sure this gets cleared */
|
||||||
|
|
||||||
LWLockRelease(ProcArrayLock);
|
/* Clear the subtransaction-XID cache too while holding the lock */
|
||||||
|
MyProc->subxids.nxids = 0;
|
||||||
|
MyProc->subxids.overflowed = false;
|
||||||
|
|
||||||
|
LWLockRelease(ProcArrayLock);
|
||||||
|
}
|
||||||
|
else
|
||||||
|
{
|
||||||
|
/*
|
||||||
|
* If we have no XID, we don't need to lock, since we won't
|
||||||
|
* affect anyone else's calculation of a snapshot. We might
|
||||||
|
* change their estimate of global xmin, but that's OK.
|
||||||
|
*/
|
||||||
|
MyProc->lxid = InvalidLocalTransactionId;
|
||||||
|
MyProc->xmin = InvalidTransactionId;
|
||||||
|
MyProc->inVacuum = false; /* must be cleared with xid/xmin */
|
||||||
|
MyProc->inCommit = false; /* be sure this gets cleared */
|
||||||
|
|
||||||
|
Assert(MyProc->subxids.nxids == 0);
|
||||||
|
Assert(MyProc->subxids.overflowed == false);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
PG_TRACE1(transaction__abort, s->transactionId);
|
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Post-abort cleanup. See notes in CommitTransaction() concerning
|
* Post-abort cleanup. See notes in CommitTransaction() concerning
|
||||||
* ordering.
|
* ordering.
|
||||||
|
@ -8,7 +8,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.285 2007/06/20 02:02:49 neilc Exp $
|
* $PostgreSQL: pgsql/src/backend/commands/copy.c,v 1.286 2007/09/07 20:59:26 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -1678,13 +1678,6 @@ CopyFrom(CopyState cstate)
|
|||||||
* rd_newRelfilenodeSubid can be cleared before the end of the transaction.
|
* rd_newRelfilenodeSubid can be cleared before the end of the transaction.
|
||||||
* However this is OK since at worst we will fail to make the optimization.
|
* However this is OK since at worst we will fail to make the optimization.
|
||||||
*
|
*
|
||||||
* When skipping WAL it's entirely possible that COPY itself will write no
|
|
||||||
* WAL records at all. This is of concern because RecordTransactionCommit
|
|
||||||
* might decide it doesn't need to log our eventual commit, which we
|
|
||||||
* certainly need it to do. However, we need no special action here for
|
|
||||||
* that, because if we have a new table or new relfilenode then there
|
|
||||||
* must have been a WAL-logged pg_class update earlier in the transaction.
|
|
||||||
*
|
|
||||||
* Also, if the target file is new-in-transaction, we assume that checking
|
* Also, if the target file is new-in-transaction, we assume that checking
|
||||||
* FSM for free space is a waste of time, even if we must use WAL because
|
* FSM for free space is a waste of time, even if we must use WAL because
|
||||||
* of archiving. This could possibly be wrong, but it's unlikely.
|
* of archiving. This could possibly be wrong, but it's unlikely.
|
||||||
|
@ -26,7 +26,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.296 2007/08/15 21:39:50 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/executor/execMain.c,v 1.297 2007/09/07 20:59:26 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -2635,12 +2635,6 @@ OpenIntoRel(QueryDesc *queryDesc)
|
|||||||
|
|
||||||
/*
|
/*
|
||||||
* We can skip WAL-logging the insertions, unless PITR is in use.
|
* We can skip WAL-logging the insertions, unless PITR is in use.
|
||||||
*
|
|
||||||
* Note that for a non-temp INTO table, this is safe only because we know
|
|
||||||
* that the catalog changes above will have been WAL-logged, and so
|
|
||||||
* RecordTransactionCommit will think it needs to WAL-log the eventual
|
|
||||||
* transaction commit. Else the commit might be lost, even though all the
|
|
||||||
* data is safely fsync'd ...
|
|
||||||
*/
|
*/
|
||||||
estate->es_into_relation_use_wal = XLogArchivingActive();
|
estate->es_into_relation_use_wal = XLogArchivingActive();
|
||||||
estate->es_into_relation_descriptor = intoRelationDesc;
|
estate->es_into_relation_descriptor = intoRelationDesc;
|
||||||
|
@ -10,7 +10,7 @@
|
|||||||
*
|
*
|
||||||
* Because of various subtle race conditions it is critical that a backend
|
* Because of various subtle race conditions it is critical that a backend
|
||||||
* hold the correct locks while setting or clearing its MyProc->xid field.
|
* hold the correct locks while setting or clearing its MyProc->xid field.
|
||||||
* See notes in GetSnapshotData.
|
* See notes in src/backend/access/transam/README.
|
||||||
*
|
*
|
||||||
* The process array now also includes PGPROC structures representing
|
* The process array now also includes PGPROC structures representing
|
||||||
* prepared transactions. The xid and subxids fields of these are valid,
|
* prepared transactions. The xid and subxids fields of these are valid,
|
||||||
@ -23,7 +23,7 @@
|
|||||||
*
|
*
|
||||||
*
|
*
|
||||||
* IDENTIFICATION
|
* IDENTIFICATION
|
||||||
* $PostgreSQL: pgsql/src/backend/storage/ipc/procarray.c,v 1.31 2007/09/07 00:58:56 tgl Exp $
|
* $PostgreSQL: pgsql/src/backend/storage/ipc/procarray.c,v 1.32 2007/09/07 20:59:26 tgl Exp $
|
||||||
*
|
*
|
||||||
*-------------------------------------------------------------------------
|
*-------------------------------------------------------------------------
|
||||||
*/
|
*/
|
||||||
@ -407,6 +407,7 @@ TransactionIdIsActive(TransactionId xid)
|
|||||||
* Note: we include all currently running xids in the set of considered xids.
|
* Note: we include all currently running xids in the set of considered xids.
|
||||||
* This ensures that if a just-started xact has not yet set its snapshot,
|
* This ensures that if a just-started xact has not yet set its snapshot,
|
||||||
* when it does set the snapshot it cannot set xmin less than what we compute.
|
* when it does set the snapshot it cannot set xmin less than what we compute.
|
||||||
|
* See notes in src/backend/access/transam/README.
|
||||||
*/
|
*/
|
||||||
TransactionId
|
TransactionId
|
||||||
GetOldestXmin(bool allDbs, bool ignoreVacuum)
|
GetOldestXmin(bool allDbs, bool ignoreVacuum)
|
||||||
@ -468,7 +469,7 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
|
|||||||
return result;
|
return result;
|
||||||
}
|
}
|
||||||
|
|
||||||
/*----------
|
/*
|
||||||
* GetSnapshotData -- returns information about running transactions.
|
* GetSnapshotData -- returns information about running transactions.
|
||||||
*
|
*
|
||||||
* The returned snapshot includes xmin (lowest still-running xact ID),
|
* The returned snapshot includes xmin (lowest still-running xact ID),
|
||||||
@ -481,12 +482,13 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
|
|||||||
* This ensures that the set of transactions seen as "running" by the
|
* This ensures that the set of transactions seen as "running" by the
|
||||||
* current xact will not change after it takes the snapshot.
|
* current xact will not change after it takes the snapshot.
|
||||||
*
|
*
|
||||||
* All running top-level XIDs are included in the snapshot. We also try
|
* All running top-level XIDs are included in the snapshot, except for lazy
|
||||||
* to include running subtransaction XIDs, but since PGPROC has only a
|
* VACUUM processes. We also try to include running subtransaction XIDs,
|
||||||
* limited cache area for subxact XIDs, full information may not be
|
* but since PGPROC has only a limited cache area for subxact XIDs, full
|
||||||
* available. If we find any overflowed subxid arrays, we have to mark
|
* information may not be available. If we find any overflowed subxid arrays,
|
||||||
* the snapshot's subxid data as overflowed, and extra work will need to
|
* we have to mark the snapshot's subxid data as overflowed, and extra work
|
||||||
* be done to determine what's running (see XidInMVCCSnapshot() in tqual.c).
|
* will need to be done to determine what's running (see XidInMVCCSnapshot()
|
||||||
|
* in tqual.c).
|
||||||
*
|
*
|
||||||
* We also update the following backend-global variables:
|
* We also update the following backend-global variables:
|
||||||
* TransactionXmin: the oldest xmin of any snapshot in use in the
|
* TransactionXmin: the oldest xmin of any snapshot in use in the
|
||||||
@ -497,7 +499,6 @@ GetOldestXmin(bool allDbs, bool ignoreVacuum)
|
|||||||
* RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
|
* RecentGlobalXmin: the global xmin (oldest TransactionXmin across all
|
||||||
* running transactions, except those running LAZY VACUUM). This is
|
* running transactions, except those running LAZY VACUUM). This is
|
||||||
* the same computation done by GetOldestXmin(true, true).
|
* the same computation done by GetOldestXmin(true, true).
|
||||||
*----------
|
|
||||||
*/
|
*/
|
||||||
Snapshot
|
Snapshot
|
||||||
GetSnapshotData(Snapshot snapshot, bool serializable)
|
GetSnapshotData(Snapshot snapshot, bool serializable)
|
||||||
@ -550,57 +551,16 @@ GetSnapshotData(Snapshot snapshot, bool serializable)
|
|||||||
|
|
||||||
/*
|
/*
|
||||||
* It is sufficient to get shared lock on ProcArrayLock, even if we are
|
* It is sufficient to get shared lock on ProcArrayLock, even if we are
|
||||||
* computing a serializable snapshot and therefore will be setting
|
* going to set MyProc->xmin.
|
||||||
* MyProc->xmin. This is because any two backends that have overlapping
|
|
||||||
* shared holds on ProcArrayLock will certainly compute the same xmin
|
|
||||||
* (since no xact, in particular not the oldest, can exit the set of
|
|
||||||
* running transactions while we hold ProcArrayLock --- see further
|
|
||||||
* discussion just below). So it doesn't matter whether another backend
|
|
||||||
* concurrently doing GetSnapshotData or GetOldestXmin sees our xmin as
|
|
||||||
* set or not; he'd compute the same xmin for himself either way.
|
|
||||||
* (We are assuming here that xmin can be set and read atomically,
|
|
||||||
* just like xid.)
|
|
||||||
*
|
|
||||||
* There is a corner case in which the above argument doesn't work: if
|
|
||||||
* there isn't any oldest xact, ie, all xids in the array are invalid.
|
|
||||||
* In that case we will compute xmin as the result of ReadNewTransactionId,
|
|
||||||
* and since GetNewTransactionId doesn't take the ProcArrayLock, it's not
|
|
||||||
* so obvious that two backends with overlapping shared locks will get
|
|
||||||
* the same answer. But GetNewTransactionId is required to store the XID
|
|
||||||
* it assigned into the ProcArray before releasing XidGenLock. Therefore
|
|
||||||
* the backend that did ReadNewTransactionId later will see that XID in
|
|
||||||
* the array, and will compute the same xmin as the earlier one that saw
|
|
||||||
* no XIDs in the array.
|
|
||||||
*/
|
*/
|
||||||
LWLockAcquire(ProcArrayLock, LW_SHARED);
|
LWLockAcquire(ProcArrayLock, LW_SHARED);
|
||||||
|
|
||||||
/*--------------------
|
/*
|
||||||
* Unfortunately, we have to call ReadNewTransactionId() after acquiring
|
* Unfortunately, we have to call ReadNewTransactionId() after acquiring
|
||||||
* ProcArrayLock above. It's not good because ReadNewTransactionId() does
|
* ProcArrayLock. It's not good because ReadNewTransactionId() does
|
||||||
* LWLockAcquire(XidGenLock), but *necessary*. We need to be sure that
|
* LWLockAcquire(XidGenLock), but *necessary*. See discussion in
|
||||||
* no transactions exit the set of currently-running transactions
|
* src/backend/access/transam/README.
|
||||||
* between the time we fetch xmax and the time we finish building our
|
|
||||||
* snapshot. Otherwise we could have a situation like this:
|
|
||||||
*
|
|
||||||
* 1. Tx Old is running (in Read Committed mode).
|
|
||||||
* 2. Tx S reads new transaction ID into xmax, then
|
|
||||||
* is swapped out before acquiring ProcArrayLock.
|
|
||||||
* 3. Tx New gets new transaction ID (>= S' xmax),
|
|
||||||
* makes changes and commits.
|
|
||||||
* 4. Tx Old changes some row R changed by Tx New and commits.
|
|
||||||
* 5. Tx S finishes getting its snapshot data. It sees Tx Old as
|
|
||||||
* done, but sees Tx New as still running (since New >= xmax).
|
|
||||||
*
|
|
||||||
* Now S will see R changed by both Tx Old and Tx New, *but* does not
|
|
||||||
* see other changes made by Tx New. If S is supposed to be in
|
|
||||||
* Serializable mode, this is wrong.
|
|
||||||
*
|
|
||||||
* By locking ProcArrayLock before we read xmax, we ensure that TX Old
|
|
||||||
* cannot exit the set of running transactions seen by Tx S. Therefore
|
|
||||||
* both Old and New will be seen as still running => no inconsistency.
|
|
||||||
*--------------------
|
|
||||||
*/
|
*/
|
||||||
|
|
||||||
xmax = ReadNewTransactionId();
|
xmax = ReadNewTransactionId();
|
||||||
|
|
||||||
/* initialize xmin calculation with xmax */
|
/* initialize xmin calculation with xmax */
|
||||||
@ -1147,9 +1107,10 @@ XidCacheRemoveRunningXids(TransactionId xid, int nxids, TransactionId *xids)
|
|||||||
|
|
||||||
/*
|
/*
|
||||||
* We must hold ProcArrayLock exclusively in order to remove transactions
|
* We must hold ProcArrayLock exclusively in order to remove transactions
|
||||||
* from the PGPROC array. (See notes in GetSnapshotData.) It's possible
|
* from the PGPROC array. (See src/backend/access/transam/README.) It's
|
||||||
* this could be relaxed since we know this routine is only used to abort
|
* possible this could be relaxed since we know this routine is only used
|
||||||
* subtransactions, but pending closer analysis we'd best be conservative.
|
* to abort subtransactions, but pending closer analysis we'd best be
|
||||||
|
* conservative.
|
||||||
*/
|
*/
|
||||||
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
|
||||||
|
|
||||||
|
Loading…
x
Reference in New Issue
Block a user