mirror of
https://github.com/postgres/postgres.git
synced 2025-05-28 05:21:27 +03:00
917 lines
32 KiB
Plaintext
917 lines
32 KiB
Plaintext
<!--
|
|
$Header: /cvsroot/pgsql/doc/src/sgml/mvcc.sgml,v 2.38 2003/09/20 20:12:05 tgl Exp $
|
|
-->
|
|
|
|
<chapter id="mvcc">
|
|
<title>Concurrency Control</title>
|
|
|
|
<indexterm>
|
|
<primary>concurrency</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
This chapter describes the behavior of the
|
|
<productname>PostgreSQL</productname> database system when two or
|
|
more sessions try to access the same data at the same time. The
|
|
goals in that situation are to allow efficient access for all
|
|
sessions while maintaining strict data integrity. Every developer
|
|
of database applications should be familiar with the topics covered
|
|
in this chapter.
|
|
</para>
|
|
|
|
<sect1 id="mvcc-intro">
|
|
<title>Introduction</title>
|
|
|
|
<indexterm>
|
|
<primary>MVCC</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Unlike traditional database systems which use locks for concurrency control,
|
|
<productname>PostgreSQL</productname>
|
|
maintains data consistency by using a multiversion model
|
|
(Multiversion Concurrency Control, <acronym>MVCC</acronym>).
|
|
This means that while querying a database each transaction sees
|
|
a snapshot of data (a <firstterm>database version</firstterm>)
|
|
as it was some
|
|
time ago, regardless of the current state of the underlying data.
|
|
This protects the transaction from viewing inconsistent data that
|
|
could be caused by (other) concurrent transaction updates on the same
|
|
data rows, providing <firstterm>transaction isolation</firstterm>
|
|
for each database session.
|
|
</para>
|
|
|
|
<para>
|
|
The main advantage to using the <acronym>MVCC</acronym> model of
|
|
concurrency control rather than locking is that in
|
|
<acronym>MVCC</acronym> locks acquired for querying (reading) data
|
|
do not conflict with locks acquired for writing data, and so
|
|
reading never blocks writing and writing never blocks reading.
|
|
</para>
|
|
|
|
<para>
|
|
Table- and row-level locking facilities are also available in
|
|
<productname>PostgreSQL</productname> for applications that cannot
|
|
adapt easily to <acronym>MVCC</acronym> behavior. However, proper
|
|
use of <acronym>MVCC</acronym> will generally provide better
|
|
performance than locks.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="transaction-iso">
|
|
<title>Transaction Isolation</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <acronym>SQL</acronym> standard defines four levels of
|
|
transaction isolation in terms of three phenomena that must be
|
|
prevented between concurrent transactions. These undesirable
|
|
phenomena are:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>
|
|
dirty read
|
|
<indexterm><primary>dirty read</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
A transaction reads data written by a concurrent uncommitted transaction.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
nonrepeatable read
|
|
<indexterm><primary>nonrepeatable read</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
A transaction re-reads data it has previously read and finds that data
|
|
has been modified by another transaction (that committed since the
|
|
initial read).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
phantom read
|
|
<indexterm><primary>phantom read</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
A transaction re-executes a query returning a set of rows that satisfy a
|
|
search condition and finds that the set of rows satisfying the condition
|
|
has changed due to another recently-committed transaction.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
|
|
<para>
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
</indexterm>
|
|
The four transaction isolation levels and the corresponding
|
|
behaviors are described in <xref linkend="mvcc-isolevel-table">.
|
|
</para>
|
|
|
|
<table tocentry="1" id="mvcc-isolevel-table">
|
|
<title><acronym>SQL</acronym> Transaction Isolation Levels</title>
|
|
<tgroup cols="4">
|
|
<thead>
|
|
<row>
|
|
<entry>
|
|
Isolation Level
|
|
</entry>
|
|
<entry>
|
|
Dirty Read
|
|
</entry>
|
|
<entry>
|
|
Nonrepeatable Read
|
|
</entry>
|
|
<entry>
|
|
Phantom Read
|
|
</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>
|
|
Read uncommitted
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>
|
|
Read committed
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>
|
|
Repeatable read
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>
|
|
Serializable
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname>
|
|
offers the Read Committed and Serializable isolation levels.
|
|
</para>
|
|
|
|
<sect2 id="xact-read-committed">
|
|
<title>Read Committed Isolation Level</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
<secondary>read committed</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<firstterm>Read Committed</firstterm>
|
|
is the default isolation level in <productname>PostgreSQL</productname>.
|
|
When a transaction runs on this isolation level,
|
|
a <command>SELECT</command> query sees only data committed before the
|
|
query began; it never sees either uncommitted data or changes committed
|
|
during query execution by concurrent transactions. (However, the
|
|
<command>SELECT</command> does see the effects of previous updates
|
|
executed within its own transaction, even though they are not yet
|
|
committed.) In effect, a <command>SELECT</command> query
|
|
sees a snapshot of the database as of the instant that that query
|
|
begins to run. Notice that two successive <command>SELECT</command> commands can
|
|
see different data, even though they are within a single transaction, if
|
|
other transactions
|
|
commit changes during execution of the first <command>SELECT</command>.
|
|
</para>
|
|
|
|
<para>
|
|
<command>UPDATE</command>, <command>DELETE</command>, and <command>SELECT
|
|
FOR UPDATE</command> commands behave the same as <command>SELECT</command>
|
|
in terms of searching for target rows: they will only find target rows
|
|
that were committed as of the command start time. However, such a target
|
|
row may have already been updated (or deleted or marked for update) by
|
|
another concurrent transaction by the time it is found. In this case, the
|
|
would-be updater will wait for the first updating transaction to commit or
|
|
roll back (if it is still in progress). If the first updater rolls back,
|
|
then its effects are negated and the second updater can proceed with
|
|
updating the originally found row. If the first updater commits, the
|
|
second updater will ignore the row if the first updater deleted it,
|
|
otherwise it will attempt to apply its operation to the updated version of
|
|
the row. The search condition of the command (the <literal>WHERE</> clause) is
|
|
re-evaluated to see if the updated version of the row still matches the
|
|
search condition. If so, the second updater proceeds with its operation,
|
|
starting from the updated version of the row.
|
|
</para>
|
|
|
|
<para>
|
|
Because of the above rule, it is possible for an updating command to see an
|
|
inconsistent snapshot: it can see the effects of concurrent updating
|
|
commands that affected the same rows it is trying to update, but it
|
|
does not see effects of those commands on other rows in the database.
|
|
This behavior makes Read Committed mode unsuitable for commands that
|
|
involve complex search conditions. However, it is just right for simpler
|
|
cases. For example, consider updating bank balances with transactions
|
|
like
|
|
|
|
<screen>
|
|
BEGIN;
|
|
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
|
|
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;
|
|
COMMIT;
|
|
</screen>
|
|
|
|
If two such transactions concurrently try to change the balance of account
|
|
12345, we clearly want the second transaction to start from the updated
|
|
version of the account's row. Because each command is affecting only a
|
|
predetermined row, letting it see the updated version of the row does
|
|
not create any troublesome inconsistency.
|
|
</para>
|
|
|
|
<para>
|
|
Since in Read Committed mode each new command starts with a new snapshot
|
|
that includes all transactions committed up to that instant, subsequent
|
|
commands in the same transaction will see the effects of the committed
|
|
concurrent transaction in any case. The point at issue here is whether
|
|
or not within a <emphasis>single</> command we see an absolutely consistent
|
|
view of the database.
|
|
</para>
|
|
|
|
<para>
|
|
The partial transaction isolation provided by Read Committed mode is
|
|
adequate for many applications, and this mode is fast and simple to use.
|
|
However, for applications that do complex queries and updates, it may
|
|
be necessary to guarantee a more rigorously consistent view of the
|
|
database than the Read Committed mode provides.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="xact-serializable">
|
|
<title>Serializable Isolation Level</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
<secondary>serializable</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The level <firstterm>Serializable</firstterm> provides the strictest transaction
|
|
isolation. This level emulates serial transaction execution,
|
|
as if transactions had been executed one after another, serially,
|
|
rather than concurrently. However, applications using this level must
|
|
be prepared to retry transactions due to serialization failures.
|
|
</para>
|
|
|
|
<para>
|
|
When a transaction is on the serializable level,
|
|
a <command>SELECT</command> query sees only data committed before the
|
|
transaction began; it never sees either uncommitted data or changes
|
|
committed
|
|
during transaction execution by concurrent transactions. (However, the
|
|
<command>SELECT</command> does see the effects of previous updates
|
|
executed within its own transaction, even though they are not yet
|
|
committed.) This is different from Read Committed in that the
|
|
<command>SELECT</command>
|
|
sees a snapshot as of the start of the transaction, not as of the start
|
|
of the current query within the transaction. Thus, successive
|
|
<command>SELECT</command> commands within a single transaction always see the same
|
|
data.
|
|
</para>
|
|
|
|
<para>
|
|
<command>UPDATE</command>, <command>DELETE</command>, and <command>SELECT
|
|
FOR UPDATE</command> commands behave the same as <command>SELECT</command>
|
|
in terms of searching for target rows: they will only find target rows
|
|
that were committed as of the transaction start time. However, such a
|
|
target
|
|
row may have already been updated (or deleted or marked for update) by
|
|
another concurrent transaction by the time it is found. In this case, the
|
|
serializable transaction will wait for the first updating transaction to commit or
|
|
roll back (if it is still in progress). If the first updater rolls back,
|
|
then its effects are negated and the serializable transaction can proceed
|
|
with updating the originally found row. But if the first updater commits
|
|
(and actually updated or deleted the row, not just selected it for update)
|
|
then the serializable transaction will be rolled back with the message
|
|
|
|
<screen>
|
|
ERROR: could not serialize access due to concurrent update
|
|
</screen>
|
|
|
|
because a serializable transaction cannot modify rows changed by
|
|
other transactions after the serializable transaction began.
|
|
</para>
|
|
|
|
<para>
|
|
When the application receives this error message, it should abort
|
|
the current transaction and then retry the whole transaction from
|
|
the beginning. The second time through, the transaction sees the
|
|
previously-committed change as part of its initial view of the database,
|
|
so there is no logical conflict in using the new version of the row
|
|
as the starting point for the new transaction's update.
|
|
</para>
|
|
|
|
<para>
|
|
Note that only updating transactions may need to be retried; read-only
|
|
transactions will never have serialization conflicts.
|
|
</para>
|
|
|
|
<para>
|
|
The Serializable mode provides a rigorous guarantee that each
|
|
transaction sees a wholly consistent view of the database. However,
|
|
the application has to be prepared to retry transactions when concurrent
|
|
updates make it impossible to sustain the illusion of serial execution.
|
|
Since the cost of redoing complex transactions may be significant,
|
|
this mode is recommended only when updating transactions contain logic
|
|
sufficiently complex that they may give wrong answers in Read
|
|
Committed mode. Most commonly, Serializable mode is necessary when
|
|
a transaction executes several successive commands that must see
|
|
identical views of the database.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="explicit-locking">
|
|
<title>Explicit Locking</title>
|
|
|
|
<indexterm>
|
|
<primary>lock</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides various lock modes
|
|
to control concurrent access to data in tables. These modes can
|
|
be used for application-controlled locking in situations where
|
|
<acronym>MVCC</acronym> does not give the desired behavior. Also,
|
|
most <productname>PostgreSQL</productname> commands automatically
|
|
acquire locks of appropriate modes to ensure that referenced
|
|
tables are not dropped or modified in incompatible ways while the
|
|
command executes. (For example, <command>ALTER TABLE</> cannot be
|
|
executed concurrently with other operations on the same table.)
|
|
</para>
|
|
|
|
<sect2 id="locking-tables">
|
|
<title>Table-Level Locks</title>
|
|
|
|
<indexterm zone="locking-tables">
|
|
<primary>LOCK</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The list below shows the available lock modes and the contexts in
|
|
which they are used automatically by
|
|
<productname>PostgreSQL</productname>.
|
|
Remember that all of these lock modes are table-level locks,
|
|
even if the name contains the word
|
|
<quote>row</quote>; the names of the lock modes are historical.
|
|
To some extent the names reflect the typical usage of each lock
|
|
mode --- but the semantics are all the same. The only real difference
|
|
between one lock mode and another is the set of lock modes with
|
|
which each conflicts. Two transactions cannot hold locks of conflicting
|
|
modes on the same table at the same time. (However, a transaction
|
|
never conflicts with itself. For example, it may acquire
|
|
<literal>ACCESS EXCLUSIVE</literal> lock and later acquire
|
|
<literal>ACCESS SHARE</literal> lock on the same table.) Non-conflicting
|
|
lock modes may be held concurrently by many transactions. Notice in
|
|
particular that some lock modes are self-conflicting (for example,
|
|
an <literal>ACCESS EXCLUSIVE</literal> lock cannot be held by more than one
|
|
transaction at a time) while others are not self-conflicting (for example,
|
|
an <literal>ACCESS SHARE</literal> lock can be held by multiple transactions).
|
|
Once acquired, a lock is held till end of transaction.
|
|
</para>
|
|
|
|
<para>
|
|
To examine a list of the currently outstanding locks in a database
|
|
server, use the <literal>pg_locks</literal> system view. For more
|
|
information on monitoring the status of the lock manager
|
|
subsystem, refer to <xref linkend="monitoring">.
|
|
</para>
|
|
|
|
<variablelist>
|
|
<title>Table-level lock modes</title>
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ACCESS SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ACCESS EXCLUSIVE</literal> lock
|
|
mode only.
|
|
</para>
|
|
|
|
<para>
|
|
The commands <command>SELECT</command> and
|
|
<command>ANALYZE</command> acquire a lock of this mode on
|
|
referenced tables. In general, any query that only reads a table
|
|
and does not modify it will acquire this lock mode.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ROW SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>EXCLUSIVE</literal> and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
</para>
|
|
|
|
<para>
|
|
The <command>SELECT FOR UPDATE</command> command acquires a
|
|
lock of this mode on the target table(s) (in addition to
|
|
<literal>ACCESS SHARE</literal> locks on any other tables
|
|
that are referenced but not selected <option>FOR UPDATE</option>).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ROW EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>SHARE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
</para>
|
|
|
|
<para>
|
|
The commands <command>UPDATE</command>,
|
|
<command>DELETE</command>, and <command>INSERT</command>
|
|
acquire this lock mode on the target table (in addition to
|
|
<literal>ACCESS SHARE</literal> locks on any other referenced
|
|
tables). In general, this lock mode will be acquired by any
|
|
command that modifies the data in a table.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>SHARE UPDATE EXCLUSIVE</literal>,
|
|
<literal>SHARE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode protects a table against
|
|
concurrent schema changes and <command>VACUUM</> runs.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by <command>VACUUM</command> (without <option>FULL</option>).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode protects a table against concurrent data changes.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by <command>CREATE INDEX</command>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>SHARE ROW EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal>,
|
|
<literal>SHARE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
</para>
|
|
|
|
<para>
|
|
This lock mode is not automatically acquired by any
|
|
<productname>PostgreSQL</productname> command.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ROW SHARE</literal>, <literal>ROW
|
|
EXCLUSIVE</literal>, <literal>SHARE UPDATE
|
|
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
|
|
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode allows only concurrent <literal>ACCESS SHARE</literal> locks,
|
|
i.e., only reads from the table can proceed in parallel with a
|
|
transaction holding this lock mode.
|
|
</para>
|
|
|
|
<para>
|
|
This lock mode is not automatically acquired by any
|
|
<productname>PostgreSQL</productname> command.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ACCESS EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with locks of all modes (<literal>ACCESS
|
|
SHARE</literal>, <literal>ROW SHARE</literal>, <literal>ROW
|
|
EXCLUSIVE</literal>, <literal>SHARE UPDATE
|
|
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
|
|
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal>).
|
|
This mode guarantees that the
|
|
holder is the only transaction accessing the table in any way.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by the <command>ALTER TABLE</command>, <command>DROP
|
|
TABLE</command>, <command>REINDEX</command>,
|
|
<command>CLUSTER</command>, and <command>VACUUM FULL</command>
|
|
commands. This is also the default lock mode for <command>LOCK
|
|
TABLE</command> statements that do not specify a mode explicitly.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
|
|
<tip>
|
|
<para>
|
|
Only an <literal>ACCESS EXCLUSIVE</literal> lock blocks a
|
|
<command>SELECT</command> (without <option>FOR UPDATE</option>)
|
|
statement.
|
|
</para>
|
|
</tip>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="locking-rows">
|
|
<title>Row-Level Locks</title>
|
|
|
|
<para>
|
|
In addition to table-level locks, there are row-level locks.
|
|
A row-level lock on a specific row is automatically acquired when the
|
|
row is updated (or deleted or marked for update). The lock is held
|
|
until the transaction commits or rolls back.
|
|
Row-level locks do not affect data
|
|
querying; they block <emphasis>writers to the same row</emphasis>
|
|
only. To acquire a row-level lock on a row without actually
|
|
modifying the row, select the row with <command>SELECT FOR
|
|
UPDATE</command>. Note that once a particular row-level lock is
|
|
acquired, the transaction may update the row multiple times without
|
|
fear of conflicts.
|
|
</para>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> doesn't remember any
|
|
information about modified rows in memory, so it has no limit to
|
|
the number of rows locked at one time. However, locking a row
|
|
may cause a disk write; thus, for example, <command>SELECT FOR
|
|
UPDATE</command> will modify selected rows to mark them and so
|
|
will result in disk writes.
|
|
</para>
|
|
|
|
<para>
|
|
In addition to table and row locks, page-level share/exclusive locks are
|
|
used to control read/write access to table pages in the shared buffer
|
|
pool. These locks are released immediately after a row is fetched or
|
|
updated. Application developers normally need not be concerned with
|
|
page-level locks, but we mention them for completeness.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="locking-deadlocks">
|
|
<title>Deadlocks</title>
|
|
|
|
<indexterm zone="locking-deadlocks">
|
|
<primary>deadlock</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The use of explicit locking can increase the likelihood of
|
|
<firstterm>deadlocks</>, wherein two (or more) transactions each
|
|
hold locks that the other wants. For example, if transaction 1
|
|
acquires an exclusive lock on table A and then tries to acquire
|
|
an exclusive lock on table B, while transaction 2 has already
|
|
exclusive-locked table B and now wants an exclusive lock on table
|
|
A, then neither one can proceed.
|
|
<productname>PostgreSQL</productname> automatically detects
|
|
deadlock situations and resolves them by aborting one of the
|
|
transactions involved, allowing the other(s) to complete.
|
|
(Exactly which transaction will be aborted is difficult to
|
|
predict and should not be relied on.)
|
|
</para>
|
|
|
|
<para>
|
|
Note that deadlocks can also occur as the result of row-level
|
|
locks (and thus, they can occur even if explicit locking is not
|
|
used). Consider the case in which there are two concurrent
|
|
transactions modifying a table. The first transaction executes:
|
|
|
|
<screen>
|
|
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
|
|
</screen>
|
|
|
|
This acquires a row-level lock on the row with the specified
|
|
account number. Then, the second transaction executes:
|
|
|
|
<screen>
|
|
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 22222;
|
|
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 11111;
|
|
</screen>
|
|
|
|
The first <command>UPDATE</command> statement successfully
|
|
acquires a row-level lock on the specified row, so it succeeds in
|
|
updating that row. However, the second <command>UPDATE</command>
|
|
statement finds that the row it is attempting to update has
|
|
already been locked, so it waits for the transaction that
|
|
acquired the lock to complete. Transaction two is now waiting on
|
|
transaction one to complete before it continues execution. Now,
|
|
transaction one executes:
|
|
|
|
<screen>
|
|
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
|
|
</screen>
|
|
|
|
Transaction one attempts to acquire a row-level lock on the
|
|
specified row, but it cannot: transaction two already holds such
|
|
a lock. So it waits for transaction two to complete. Thus,
|
|
transaction one is blocked on transaction two, and transaction
|
|
two is blocked on transaction one: a deadlock
|
|
condition. <productname>PostgreSQL</productname> will detect this
|
|
situation and abort one of the transactions.
|
|
</para>
|
|
|
|
<para>
|
|
The best defense against deadlocks is generally to avoid them by
|
|
being certain that all applications using a database acquire
|
|
locks on multiple objects in a consistent order. That was the
|
|
reason for the previous deadlock example: if both transactions
|
|
had updated the rows in the same order, no deadlock would have
|
|
occurred. One should also ensure that the first lock acquired on
|
|
an object in a transaction is the highest mode that will be
|
|
needed for that object. If it is not feasible to verify this in
|
|
advance, then deadlocks may be handled on-the-fly by retrying
|
|
transactions that are aborted due to deadlock.
|
|
</para>
|
|
|
|
<para>
|
|
So long as no deadlock situation is detected, a transaction seeking
|
|
either a table-level or row-level lock will wait indefinitely for
|
|
conflicting locks to be released. This means it is a bad idea for
|
|
applications to hold transactions open for long periods of time
|
|
(e.g., while waiting for user input).
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="applevel-consistency">
|
|
<title>Data Consistency Checks at the Application Level</title>
|
|
|
|
<para>
|
|
Because readers in <productname>PostgreSQL</productname>
|
|
do not lock data, regardless of
|
|
transaction isolation level, data read by one transaction can be
|
|
overwritten by another concurrent transaction. In other words,
|
|
if a row is returned by <command>SELECT</command> it doesn't mean that
|
|
the row is still current at the instant it is returned (i.e., sometime
|
|
after the current query began). The row might have been modified or
|
|
deleted by an already-committed transaction that committed after this one
|
|
started.
|
|
Even if the row is still valid <quote>now</quote>, it could be changed or
|
|
deleted
|
|
before the current transaction does a commit or rollback.
|
|
</para>
|
|
|
|
<para>
|
|
Another way to think about it is that each
|
|
transaction sees a snapshot of the database contents, and concurrently
|
|
executing transactions may very well see different snapshots. So the
|
|
whole concept of <quote>now</quote> is somewhat suspect anyway.
|
|
This is not normally
|
|
a big problem if the client applications are isolated from each other,
|
|
but if the clients can communicate via channels outside the database
|
|
then serious confusion may ensue.
|
|
</para>
|
|
|
|
<para>
|
|
To ensure the current validity of a row and protect it against
|
|
concurrent updates one must use <command>SELECT FOR UPDATE</command> or
|
|
an appropriate <command>LOCK TABLE</command> statement.
|
|
(<command>SELECT FOR UPDATE</command> locks just the returned rows against
|
|
concurrent updates, while <command>LOCK TABLE</command> locks the
|
|
whole table.)
|
|
This should be taken into account when porting applications to
|
|
<productname>PostgreSQL</productname> from other environments.
|
|
|
|
<note>
|
|
<para>
|
|
Before version 6.5 <productname>PostgreSQL</productname> used
|
|
read locks, and so the above consideration is also the case when
|
|
upgrading from <productname>PostgreSQL</productname> versions
|
|
prior to 6.5.
|
|
</para>
|
|
</note>
|
|
</para>
|
|
|
|
<para>
|
|
Global validity checks require extra thought under <acronym>MVCC</acronym>. For
|
|
example, a banking application might wish to check that the sum of
|
|
all credits in one table equals the sum of debits in another table,
|
|
when both tables are being actively updated. Comparing the results of two
|
|
successive <literal>SELECT sum(...)</literal> commands will not work reliably under
|
|
Read Committed mode, since the second query will likely include the results
|
|
of transactions not counted by the first. Doing the two sums in a
|
|
single serializable transaction will give an accurate picture of the
|
|
effects of transactions that committed before the serializable transaction
|
|
started --- but one might legitimately wonder whether the answer is still
|
|
relevant by the time it is delivered. If the serializable transaction
|
|
itself applied some changes before trying to make the consistency check,
|
|
the usefulness of the check becomes even more debatable, since now it
|
|
includes some but not all post-transaction-start changes. In such cases
|
|
a careful person might wish to lock all tables needed for the check,
|
|
in order to get an indisputable picture of current reality. A
|
|
<literal>SHARE</> mode (or higher) lock guarantees that there are no
|
|
uncommitted changes in the locked table, other than those of the current
|
|
transaction.
|
|
</para>
|
|
|
|
<para>
|
|
Note also that if one is
|
|
relying on explicit locks to prevent concurrent changes, one should use
|
|
Read Committed mode, or in Serializable mode be careful to obtain the
|
|
lock(s) before performing queries. An explicit lock obtained in a
|
|
serializable transaction guarantees that no other transactions modifying
|
|
the table are still running, but if the snapshot seen by the
|
|
transaction predates obtaining the lock, it may predate some now-committed
|
|
changes in the table. A serializable transaction's snapshot is actually
|
|
frozen at the start of its first query or data-modification command
|
|
(<literal>SELECT</>, <literal>INSERT</>,
|
|
<literal>UPDATE</>, or <literal>DELETE</>), so
|
|
it's possible to obtain explicit locks before the snapshot is
|
|
frozen.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="locking-indexes">
|
|
<title>Locking and Indexes</title>
|
|
|
|
<indexterm zone="locking-indexes">
|
|
<primary>index</primary>
|
|
<secondary>locks</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Though <productname>PostgreSQL</productname>
|
|
provides nonblocking read/write access to table
|
|
data, nonblocking read/write access is not currently offered for every
|
|
index access method implemented
|
|
in <productname>PostgreSQL</productname>.
|
|
The various index types are handled as follows:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>
|
|
B-tree indexes
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Short-term share/exclusive page-level locks are used for
|
|
read/write access. Locks are released immediately after each
|
|
index row is fetched or inserted. B-tree indexes provide
|
|
the highest concurrency without deadlock conditions.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<acronym>GiST</acronym> and R-tree indexes
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Share/exclusive index-level locks are used for read/write access.
|
|
Locks are released after the command is done.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
Hash indexes
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Share/exclusive page-level locks are used for read/write
|
|
access. Locks are released after the page is processed.
|
|
Page-level locks provide better concurrency than index-level
|
|
ones but are liable to deadlocks.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
|
|
<para>
|
|
In short, B-tree indexes offer the best performance for concurrent
|
|
applications; since they also have more features than hash
|
|
indexes, they are the recommended index type for concurrent
|
|
applications that need to index scalar data. When dealing with
|
|
non-scalar data, B-trees obviously cannot be used; in that
|
|
situation, application developers should be aware of the
|
|
relatively poor concurrent performance of GiST and R-tree
|
|
indexes.
|
|
</para>
|
|
</sect1>
|
|
</chapter>
|
|
|
|
<!-- Keep this comment at the end of the file
|
|
Local variables:
|
|
mode:sgml
|
|
sgml-omittag:nil
|
|
sgml-shorttag:t
|
|
sgml-minimize-attributes:nil
|
|
sgml-always-quote-attributes:t
|
|
sgml-indent-step:1
|
|
sgml-indent-data:t
|
|
sgml-parent-document:nil
|
|
sgml-default-dtd-file:"./reference.ced"
|
|
sgml-exposed-tags:nil
|
|
sgml-local-catalogs:("/usr/lib/sgml/catalog")
|
|
sgml-local-ecat-files:nil
|
|
End:
|
|
-->
|