1
0
mirror of https://github.com/postgres/postgres.git synced 2025-08-19 23:22:23 +03:00
Files
postgres/src/backend/utils/time/combocid.c
Alvaro Herrera 0ac5ad5134 Improve concurrency of foreign key locking
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE".  These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE".  UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.

Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.

The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid.  Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates.  This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed.  pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.

Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header.  This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.

Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This causes additional WAL-logging, also (there was previously a single
WAL record for a locked tuple; now there are as many as updated copies
of the tuple there exist.)

With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.

As a bonus, the old behavior that a subtransaction grabbing a stronger
tuple lock than the parent (sub)transaction held on a given tuple and
later aborting caused the weaker lock to be lost, has been fixed.

Many new spec files were added for isolation tester framework, to ensure
overall behavior is sane.  There's probably room for several more tests.

There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time in it.  Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.

This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
	AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
	1290721684-sup-3951@alvh.no-ip.org
	1294953201-sup-2099@alvh.no-ip.org
	1320343602-sup-2290@alvh.no-ip.org
	1339690386-sup-8927@alvh.no-ip.org
	4FE5FF020200002500048A3D@gw.wicourts.gov
	4FEAB90A0200002500048B7D@gw.wicourts.gov
2013-01-23 12:04:59 -03:00

282 lines
7.6 KiB
C

/*-------------------------------------------------------------------------
*
* combocid.c
* Combo command ID support routines
*
* Before version 8.3, HeapTupleHeaderData had separate fields for cmin
* and cmax. To reduce the header size, cmin and cmax are now overlayed
* in the same field in the header. That usually works because you rarely
* insert and delete a tuple in the same transaction, and we don't need
* either field to remain valid after the originating transaction exits.
* To make it work when the inserting transaction does delete the tuple,
* we create a "combo" command ID and store that in the tuple header
* instead of cmin and cmax. The combo command ID can be mapped to the
* real cmin and cmax using a backend-private array, which is managed by
* this module.
*
* To allow reusing existing combo cids, we also keep a hash table that
* maps cmin,cmax pairs to combo cids. This keeps the data structure size
* reasonable in most cases, since the number of unique pairs used by any
* one transaction is likely to be small.
*
* With a 32-bit combo command id we can represent 2^32 distinct cmin,cmax
* combinations. In the most perverse case where each command deletes a tuple
* generated by every previous command, the number of combo command ids
* required for N commands is N*(N+1)/2. That means that in the worst case,
* that's enough for 92682 commands. In practice, you'll run out of memory
* and/or disk space way before you reach that limit.
*
* The array and hash table are kept in TopTransactionContext, and are
* destroyed at the end of each transaction.
*
*
* Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California
*
* IDENTIFICATION
* src/backend/utils/time/combocid.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/htup_details.h"
#include "access/xact.h"
#include "utils/combocid.h"
#include "utils/hsearch.h"
#include "utils/memutils.h"
/* Hash table to lookup combo cids by cmin and cmax */
static HTAB *comboHash = NULL;
/* Key and entry structures for the hash table */
typedef struct
{
CommandId cmin;
CommandId cmax;
} ComboCidKeyData;
typedef ComboCidKeyData *ComboCidKey;
typedef struct
{
ComboCidKeyData key;
CommandId combocid;
} ComboCidEntryData;
typedef ComboCidEntryData *ComboCidEntry;
/* Initial size of the hash table */
#define CCID_HASH_SIZE 100
/*
* An array of cmin,cmax pairs, indexed by combo command id.
* To convert a combo cid to cmin and cmax, you do a simple array lookup.
*/
static ComboCidKey comboCids = NULL;
static int usedComboCids = 0; /* number of elements in comboCids */
static int sizeComboCids = 0; /* allocated size of array */
/* Initial size of the array */
#define CCID_ARRAY_SIZE 100
/* prototypes for internal functions */
static CommandId GetComboCommandId(CommandId cmin, CommandId cmax);
static CommandId GetRealCmin(CommandId combocid);
static CommandId GetRealCmax(CommandId combocid);
/**** External API ****/
/*
* GetCmin and GetCmax assert that they are only called in situations where
* they make sense, that is, can deliver a useful answer. If you have
* reason to examine a tuple's t_cid field from a transaction other than
* the originating one, use HeapTupleHeaderGetRawCommandId() directly.
*/
CommandId
HeapTupleHeaderGetCmin(HeapTupleHeader tup)
{
CommandId cid = HeapTupleHeaderGetRawCommandId(tup);
Assert(!(tup->t_infomask & HEAP_MOVED));
Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tup)));
if (tup->t_infomask & HEAP_COMBOCID)
return GetRealCmin(cid);
else
return cid;
}
CommandId
HeapTupleHeaderGetCmax(HeapTupleHeader tup)
{
CommandId cid = HeapTupleHeaderGetRawCommandId(tup);
Assert(!(tup->t_infomask & HEAP_MOVED));
Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetUpdateXid(tup)));
if (tup->t_infomask & HEAP_COMBOCID)
return GetRealCmax(cid);
else
return cid;
}
/*
* Given a tuple we are about to delete, determine the correct value to store
* into its t_cid field.
*
* If we don't need a combo CID, *cmax is unchanged and *iscombo is set to
* FALSE. If we do need one, *cmax is replaced by a combo CID and *iscombo
* is set to TRUE.
*
* The reason this is separate from the actual HeapTupleHeaderSetCmax()
* operation is that this could fail due to out-of-memory conditions. Hence
* we need to do this before entering the critical section that actually
* changes the tuple in shared buffers.
*/
void
HeapTupleHeaderAdjustCmax(HeapTupleHeader tup,
CommandId *cmax,
bool *iscombo)
{
/*
* If we're marking a tuple deleted that was inserted by (any
* subtransaction of) our transaction, we need to use a combo command id.
* Test for HEAP_XMIN_COMMITTED first, because it's cheaper than a
* TransactionIdIsCurrentTransactionId call.
*/
if (!(tup->t_infomask & HEAP_XMIN_COMMITTED) &&
TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tup)))
{
CommandId cmin = HeapTupleHeaderGetCmin(tup);
*cmax = GetComboCommandId(cmin, *cmax);
*iscombo = true;
}
else
{
*iscombo = false;
}
}
/*
* Combo command ids are only interesting to the inserting and deleting
* transaction, so we can forget about them at the end of transaction.
*/
void
AtEOXact_ComboCid(void)
{
/*
* Don't bother to pfree. These are allocated in TopTransactionContext, so
* they're going to go away at the end of transaction anyway.
*/
comboHash = NULL;
comboCids = NULL;
usedComboCids = 0;
sizeComboCids = 0;
}
/**** Internal routines ****/
/*
* Get a combo command id that maps to cmin and cmax.
*
* We try to reuse old combo command ids when possible.
*/
static CommandId
GetComboCommandId(CommandId cmin, CommandId cmax)
{
CommandId combocid;
ComboCidKeyData key;
ComboCidEntry entry;
bool found;
/*
* Create the hash table and array the first time we need to use combo
* cids in the transaction.
*/
if (comboHash == NULL)
{
HASHCTL hash_ctl;
memset(&hash_ctl, 0, sizeof(hash_ctl));
hash_ctl.keysize = sizeof(ComboCidKeyData);
hash_ctl.entrysize = sizeof(ComboCidEntryData);
hash_ctl.hash = tag_hash;
hash_ctl.hcxt = TopTransactionContext;
comboHash = hash_create("Combo CIDs",
CCID_HASH_SIZE,
&hash_ctl,
HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
comboCids = (ComboCidKeyData *)
MemoryContextAlloc(TopTransactionContext,
sizeof(ComboCidKeyData) * CCID_ARRAY_SIZE);
sizeComboCids = CCID_ARRAY_SIZE;
usedComboCids = 0;
}
/* Lookup or create a hash entry with the desired cmin/cmax */
/* We assume there is no struct padding in ComboCidKeyData! */
key.cmin = cmin;
key.cmax = cmax;
entry = (ComboCidEntry) hash_search(comboHash,
(void *) &key,
HASH_ENTER,
&found);
if (found)
{
/* Reuse an existing combo cid */
return entry->combocid;
}
/*
* We have to create a new combo cid. Check that there's room for it in
* the array, and grow it if there isn't.
*/
if (usedComboCids >= sizeComboCids)
{
/* We need to grow the array */
int newsize = sizeComboCids * 2;
comboCids = (ComboCidKeyData *)
repalloc(comboCids, sizeof(ComboCidKeyData) * newsize);
sizeComboCids = newsize;
}
combocid = usedComboCids;
comboCids[combocid].cmin = cmin;
comboCids[combocid].cmax = cmax;
usedComboCids++;
entry->combocid = combocid;
return combocid;
}
static CommandId
GetRealCmin(CommandId combocid)
{
Assert(combocid < usedComboCids);
return comboCids[combocid].cmin;
}
static CommandId
GetRealCmax(CommandId combocid)
{
Assert(combocid < usedComboCids);
return comboCids[combocid].cmax;
}