mirror of
https://github.com/postgres/postgres.git
synced 2025-05-28 05:21:27 +03:00
Reported-by: Tom Lane Discussion: https://postgr.es/m/31072.1585690490@sss.pgh.pa.us Backpatch-through: 9.5 - 12
1795 lines
71 KiB
Plaintext
1795 lines
71 KiB
Plaintext
<!-- doc/src/sgml/mvcc.sgml -->
|
|
|
|
<chapter id="mvcc">
|
|
<title>Concurrency Control</title>
|
|
|
|
<indexterm>
|
|
<primary>concurrency</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
This chapter describes the behavior of the
|
|
<productname>PostgreSQL</productname> database system when two or
|
|
more sessions try to access the same data at the same time. The
|
|
goals in that situation are to allow efficient access for all
|
|
sessions while maintaining strict data integrity. Every developer
|
|
of database applications should be familiar with the topics covered
|
|
in this chapter.
|
|
</para>
|
|
|
|
<sect1 id="mvcc-intro">
|
|
<title>Introduction</title>
|
|
|
|
<indexterm>
|
|
<primary>Multiversion Concurrency Control</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>MVCC</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>Serializable Snapshot Isolation</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>SSI</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides a rich set of tools
|
|
for developers to manage concurrent access to data. Internally,
|
|
data consistency is maintained by using a multiversion
|
|
model (Multiversion Concurrency Control, <acronym>MVCC</acronym>).
|
|
This means that each SQL statement sees
|
|
a snapshot of data (a <firstterm>database version</firstterm>)
|
|
as it was some
|
|
time ago, regardless of the current state of the underlying data.
|
|
This prevents statements from viewing inconsistent data produced
|
|
by concurrent transactions performing updates on the same
|
|
data rows, providing <firstterm>transaction isolation</firstterm>
|
|
for each database session. <acronym>MVCC</acronym>, by eschewing
|
|
the locking methodologies of traditional database systems,
|
|
minimizes lock contention in order to allow for reasonable
|
|
performance in multiuser environments.
|
|
</para>
|
|
|
|
<para>
|
|
The main advantage of using the <acronym>MVCC</acronym> model of
|
|
concurrency control rather than locking is that in
|
|
<acronym>MVCC</acronym> locks acquired for querying (reading) data
|
|
do not conflict with locks acquired for writing data, and so
|
|
reading never blocks writing and writing never blocks reading.
|
|
<productname>PostgreSQL</productname> maintains this guarantee
|
|
even when providing the strictest level of transaction
|
|
isolation through the use of an innovative <firstterm>Serializable
|
|
Snapshot Isolation</firstterm> (<acronym>SSI</acronym>) level.
|
|
</para>
|
|
|
|
<para>
|
|
Table- and row-level locking facilities are also available in
|
|
<productname>PostgreSQL</productname> for applications which don't
|
|
generally need full transaction isolation and prefer to explicitly
|
|
manage particular points of conflict. However, proper
|
|
use of <acronym>MVCC</acronym> will generally provide better
|
|
performance than locks. In addition, application-defined advisory
|
|
locks provide a mechanism for acquiring locks that are not tied
|
|
to a single transaction.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="transaction-iso">
|
|
<title>Transaction Isolation</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <acronym>SQL</acronym> standard defines four levels of
|
|
transaction isolation. The most strict is Serializable,
|
|
which is defined by the standard in a paragraph which says that any
|
|
concurrent execution of a set of Serializable transactions is guaranteed
|
|
to produce the same effect as running them one at a time in some order.
|
|
The other three levels are defined in terms of phenomena, resulting from
|
|
interaction between concurrent transactions, which must not occur at
|
|
each level. The standard notes that due to the definition of
|
|
Serializable, none of these phenomena are possible at that level. (This
|
|
is hardly surprising -- if the effect of the transactions must be
|
|
consistent with having been run one at a time, how could you see any
|
|
phenomena caused by interactions?)
|
|
</para>
|
|
|
|
<para>
|
|
The phenomena which are prohibited at various levels are:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>
|
|
dirty read
|
|
<indexterm><primary>dirty read</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
A transaction reads data written by a concurrent uncommitted transaction.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
nonrepeatable read
|
|
<indexterm><primary>nonrepeatable read</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
A transaction re-reads data it has previously read and finds that data
|
|
has been modified by another transaction (that committed since the
|
|
initial read).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
phantom read
|
|
<indexterm><primary>phantom read</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
A transaction re-executes a query returning a set of rows that satisfy a
|
|
search condition and finds that the set of rows satisfying the condition
|
|
has changed due to another recently-committed transaction.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
serialization anomaly
|
|
<indexterm><primary>serialization anomaly</primary></indexterm>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
The result of successfully committing a group of transactions
|
|
is inconsistent with all possible orderings of running those
|
|
transactions one at a time.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
|
|
<para>
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
</indexterm>
|
|
The SQL standard and PostgreSQL-implemented transaction isolation levels
|
|
are described in <xref linkend="mvcc-isolevel-table">.
|
|
</para>
|
|
|
|
<table tocentry="1" id="mvcc-isolevel-table">
|
|
<title>Transaction Isolation Levels</title>
|
|
<tgroup cols="5">
|
|
<thead>
|
|
<row>
|
|
<entry>
|
|
Isolation Level
|
|
</entry>
|
|
<entry>
|
|
Dirty Read
|
|
</entry>
|
|
<entry>
|
|
Nonrepeatable Read
|
|
</entry>
|
|
<entry>
|
|
Phantom Read
|
|
</entry>
|
|
<entry>
|
|
Serialization Anomaly
|
|
</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>
|
|
Read uncommitted
|
|
</entry>
|
|
<entry>
|
|
Allowed, but not in PG
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>
|
|
Read committed
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>
|
|
Repeatable read
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Allowed, but not in PG
|
|
</entry>
|
|
<entry>
|
|
Possible
|
|
</entry>
|
|
</row>
|
|
|
|
<row>
|
|
<entry>
|
|
Serializable
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
<entry>
|
|
Not possible
|
|
</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
|
|
<para>
|
|
In <productname>PostgreSQL</productname>, you can request any of
|
|
the four standard transaction isolation levels, but internally only
|
|
three distinct isolation levels are implemented, i.e. PostgreSQL's
|
|
Read Uncommitted mode behaves like Read Committed. This is because
|
|
it is the only sensible way to map the standard isolation levels to
|
|
PostgreSQL's multiversion concurrency control architecture.
|
|
</para>
|
|
|
|
<para>
|
|
The table also shows that PostgreSQL's Repeatable Read implementation
|
|
does not allow phantom reads. Stricter behavior is permitted by the
|
|
SQL standard: the four isolation levels only define which phenomena
|
|
must not happen, not which phenomena <emphasis>must</> happen.
|
|
The behavior of the available isolation levels is detailed in the
|
|
following subsections.
|
|
</para>
|
|
|
|
<para>
|
|
To set the transaction isolation level of a transaction, use the
|
|
command <xref linkend="sql-set-transaction">.
|
|
</para>
|
|
|
|
<important>
|
|
<para>
|
|
Some <productname>PostgreSQL</productname> data types and functions have
|
|
special rules regarding transactional behavior. In particular, changes
|
|
made to a sequence (and therefore the counter of a
|
|
column declared using <type>serial</type>) are immediately visible
|
|
to all other transactions and are not rolled back if the transaction
|
|
that made the changes aborts. See <xref linkend="functions-sequence">
|
|
and <xref linkend="datatype-serial">.
|
|
</para>
|
|
</important>
|
|
|
|
<sect2 id="xact-read-committed">
|
|
<title>Read Committed Isolation Level</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
<secondary>read committed</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>read committed</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<firstterm>Read Committed</firstterm> is the default isolation
|
|
level in <productname>PostgreSQL</productname>. When a transaction
|
|
uses this isolation level, a <command>SELECT</command> query
|
|
(without a <literal>FOR UPDATE/SHARE</> clause) sees only data
|
|
committed before the query began; it never sees either uncommitted
|
|
data or changes committed during query execution by concurrent
|
|
transactions. In effect, a <command>SELECT</command> query sees
|
|
a snapshot of the database as of the instant the query begins to
|
|
run. However, <command>SELECT</command> does see the effects
|
|
of previous updates executed within its own transaction, even
|
|
though they are not yet committed. Also note that two successive
|
|
<command>SELECT</command> commands can see different data, even
|
|
though they are within a single transaction, if other transactions
|
|
commit changes after the first <command>SELECT</command> starts and
|
|
before the second <command>SELECT</command> starts.
|
|
</para>
|
|
|
|
<para>
|
|
<command>UPDATE</command>, <command>DELETE</command>, <command>SELECT
|
|
FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands
|
|
behave the same as <command>SELECT</command>
|
|
in terms of searching for target rows: they will only find target rows
|
|
that were committed as of the command start time. However, such a target
|
|
row might have already been updated (or deleted or locked) by
|
|
another concurrent transaction by the time it is found. In this case, the
|
|
would-be updater will wait for the first updating transaction to commit or
|
|
roll back (if it is still in progress). If the first updater rolls back,
|
|
then its effects are negated and the second updater can proceed with
|
|
updating the originally found row. If the first updater commits, the
|
|
second updater will ignore the row if the first updater deleted it,
|
|
otherwise it will attempt to apply its operation to the updated version of
|
|
the row. The search condition of the command (the <literal>WHERE</> clause) is
|
|
re-evaluated to see if the updated version of the row still matches the
|
|
search condition. If so, the second updater proceeds with its operation
|
|
using the updated version of the row. In the case of
|
|
<command>SELECT FOR UPDATE</command> and <command>SELECT FOR
|
|
SHARE</command>, this means it is the updated version of the row that is
|
|
locked and returned to the client.
|
|
</para>
|
|
|
|
<para>
|
|
<command>INSERT</command> with an <literal>ON CONFLICT DO UPDATE</> clause
|
|
behaves similarly. In Read Committed mode, each row proposed for insertion
|
|
will either insert or update. Unless there are unrelated errors, one of
|
|
those two outcomes is guaranteed. If a conflict originates in another
|
|
transaction whose effects are not yet visible to the <command>INSERT
|
|
</command>, the <command>UPDATE</command> clause will affect that row,
|
|
even though possibly <emphasis>no</> version of that row is
|
|
conventionally visible to the command.
|
|
</para>
|
|
|
|
<para>
|
|
<command>INSERT</command> with an <literal>ON CONFLICT DO
|
|
NOTHING</> clause may have insertion not proceed for a row due to
|
|
the outcome of another transaction whose effects are not visible
|
|
to the <command>INSERT</command> snapshot. Again, this is only
|
|
the case in Read Committed mode.
|
|
</para>
|
|
|
|
<para>
|
|
Because of the above rules, it is possible for an updating command to see
|
|
an inconsistent snapshot: it can see the effects of concurrent updating
|
|
commands on the same rows it is trying to update, but it
|
|
does not see effects of those commands on other rows in the database.
|
|
This behavior makes Read Committed mode unsuitable for commands that
|
|
involve complex search conditions; however, it is just right for simpler
|
|
cases. For example, consider updating bank balances with transactions
|
|
like:
|
|
|
|
<screen>
|
|
BEGIN;
|
|
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
|
|
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;
|
|
COMMIT;
|
|
</screen>
|
|
|
|
If two such transactions concurrently try to change the balance of account
|
|
12345, we clearly want the second transaction to start with the updated
|
|
version of the account's row. Because each command is affecting only a
|
|
predetermined row, letting it see the updated version of the row does
|
|
not create any troublesome inconsistency.
|
|
</para>
|
|
|
|
<para>
|
|
More complex usage can produce undesirable results in Read Committed
|
|
mode. For example, consider a <command>DELETE</command> command
|
|
operating on data that is being both added and removed from its
|
|
restriction criteria by another command, e.g., assume
|
|
<literal>website</literal> is a two-row table with
|
|
<literal>website.hits</literal> equaling <literal>9</literal> and
|
|
<literal>10</literal>:
|
|
|
|
<screen>
|
|
BEGIN;
|
|
UPDATE website SET hits = hits + 1;
|
|
-- run from another session: DELETE FROM website WHERE hits = 10;
|
|
COMMIT;
|
|
</screen>
|
|
|
|
The <command>DELETE</command> will have no effect even though
|
|
there is a <literal>website.hits = 10</literal> row before and
|
|
after the <command>UPDATE</command>. This occurs because the
|
|
pre-update row value <literal>9</> is skipped, and when the
|
|
<command>UPDATE</command> completes and <command>DELETE</command>
|
|
obtains a lock, the new row value is no longer <literal>10</> but
|
|
<literal>11</>, which no longer matches the criteria.
|
|
</para>
|
|
|
|
<para>
|
|
Because Read Committed mode starts each command with a new snapshot
|
|
that includes all transactions committed up to that instant,
|
|
subsequent commands in the same transaction will see the effects
|
|
of the committed concurrent transaction in any case. The point
|
|
at issue above is whether or not a <emphasis>single</> command
|
|
sees an absolutely consistent view of the database.
|
|
</para>
|
|
|
|
<para>
|
|
The partial transaction isolation provided by Read Committed mode
|
|
is adequate for many applications, and this mode is fast and simple
|
|
to use; however, it is not sufficient for all cases. Applications
|
|
that do complex queries and updates might require a more rigorously
|
|
consistent view of the database than Read Committed mode provides.
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="xact-repeatable-read">
|
|
<title>Repeatable Read Isolation Level</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
<secondary>repeatable read</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>repeatable read</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <firstterm>Repeatable Read</firstterm> isolation level only sees
|
|
data committed before the transaction began; it never sees either
|
|
uncommitted data or changes committed during transaction execution
|
|
by concurrent transactions. (However, the query does see the
|
|
effects of previous updates executed within its own transaction,
|
|
even though they are not yet committed.) This is a stronger
|
|
guarantee than is required by the <acronym>SQL</acronym> standard
|
|
for this isolation level, and prevents all of the phenomena described
|
|
in <xref linkend="mvcc-isolevel-table"> except for serialization
|
|
anomalies. As mentioned above, this is
|
|
specifically allowed by the standard, which only describes the
|
|
<emphasis>minimum</emphasis> protections each isolation level must
|
|
provide.
|
|
</para>
|
|
|
|
<para>
|
|
This level is different from Read Committed in that a query in a
|
|
repeatable read transaction sees a snapshot as of the start of the
|
|
first non-transaction-control statement in the
|
|
<emphasis>transaction</>, not as of the start
|
|
of the current statement within the transaction. Thus, successive
|
|
<command>SELECT</command> commands within a <emphasis>single</>
|
|
transaction see the same data, i.e., they do not see changes made by
|
|
other transactions that committed after their own transaction started.
|
|
</para>
|
|
|
|
<para>
|
|
Applications using this level must be prepared to retry transactions
|
|
due to serialization failures.
|
|
</para>
|
|
|
|
<para>
|
|
<command>UPDATE</command>, <command>DELETE</command>, <command>SELECT
|
|
FOR UPDATE</command>, and <command>SELECT FOR SHARE</command> commands
|
|
behave the same as <command>SELECT</command>
|
|
in terms of searching for target rows: they will only find target rows
|
|
that were committed as of the transaction start time. However, such a
|
|
target row might have already been updated (or deleted or locked) by
|
|
another concurrent transaction by the time it is found. In this case, the
|
|
repeatable read transaction will wait for the first updating transaction to commit or
|
|
roll back (if it is still in progress). If the first updater rolls back,
|
|
then its effects are negated and the repeatable read transaction can proceed
|
|
with updating the originally found row. But if the first updater commits
|
|
(and actually updated or deleted the row, not just locked it)
|
|
then the repeatable read transaction will be rolled back with the message
|
|
|
|
<screen>
|
|
ERROR: could not serialize access due to concurrent update
|
|
</screen>
|
|
|
|
because a repeatable read transaction cannot modify or lock rows changed by
|
|
other transactions after the repeatable read transaction began.
|
|
</para>
|
|
|
|
<para>
|
|
When an application receives this error message, it should abort
|
|
the current transaction and retry the whole transaction from
|
|
the beginning. The second time through, the transaction will see the
|
|
previously-committed change as part of its initial view of the database,
|
|
so there is no logical conflict in using the new version of the row
|
|
as the starting point for the new transaction's update.
|
|
</para>
|
|
|
|
<para>
|
|
Note that only updating transactions might need to be retried; read-only
|
|
transactions will never have serialization conflicts.
|
|
</para>
|
|
|
|
<para>
|
|
The Repeatable Read mode provides a rigorous guarantee that each
|
|
transaction sees a completely stable view of the database. However,
|
|
this view will not necessarily always be consistent with some serial
|
|
(one at a time) execution of concurrent transactions of the same level.
|
|
For example, even a read only transaction at this level may see a
|
|
control record updated to show that a batch has been completed but
|
|
<emphasis>not</emphasis> see one of the detail records which is logically
|
|
part of the batch because it read an earlier revision of the control
|
|
record. Attempts to enforce business rules by transactions running at
|
|
this isolation level are not likely to work correctly without careful use
|
|
of explicit locks to block conflicting transactions.
|
|
</para>
|
|
|
|
<note>
|
|
<para>
|
|
Prior to <productname>PostgreSQL</productname> version 9.1, a request
|
|
for the Serializable transaction isolation level provided exactly the
|
|
same behavior described here. To retain the legacy Serializable
|
|
behavior, Repeatable Read should now be requested.
|
|
</para>
|
|
</note>
|
|
</sect2>
|
|
|
|
<sect2 id="xact-serializable">
|
|
<title>Serializable Isolation Level</title>
|
|
|
|
<indexterm>
|
|
<primary>transaction isolation level</primary>
|
|
<secondary>serializable</secondary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>serializable</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>predicate locking</primary>
|
|
</indexterm>
|
|
|
|
<indexterm>
|
|
<primary>serialization anomaly</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The <firstterm>Serializable</firstterm> isolation level provides
|
|
the strictest transaction isolation. This level emulates serial
|
|
transaction execution for all committed transactions;
|
|
as if transactions had been executed one after another, serially,
|
|
rather than concurrently. However, like the Repeatable Read level,
|
|
applications using this level must
|
|
be prepared to retry transactions due to serialization failures.
|
|
In fact, this isolation level works exactly the same as Repeatable
|
|
Read except that it monitors for conditions which could make
|
|
execution of a concurrent set of serializable transactions behave
|
|
in a manner inconsistent with all possible serial (one at a time)
|
|
executions of those transactions. This monitoring does not
|
|
introduce any blocking beyond that present in repeatable read, but
|
|
there is some overhead to the monitoring, and detection of the
|
|
conditions which could cause a
|
|
<firstterm>serialization anomaly</firstterm> will trigger a
|
|
<firstterm>serialization failure</firstterm>.
|
|
</para>
|
|
|
|
<para>
|
|
As an example,
|
|
consider a table <structname>mytab</>, initially containing:
|
|
<screen>
|
|
class | value
|
|
-------+-------
|
|
1 | 10
|
|
1 | 20
|
|
2 | 100
|
|
2 | 200
|
|
</screen>
|
|
Suppose that serializable transaction A computes:
|
|
<screen>
|
|
SELECT SUM(value) FROM mytab WHERE class = 1;
|
|
</screen>
|
|
and then inserts the result (30) as the <structfield>value</> in a
|
|
new row with <structfield>class</><literal> = 2</>. Concurrently, serializable
|
|
transaction B computes:
|
|
<screen>
|
|
SELECT SUM(value) FROM mytab WHERE class = 2;
|
|
</screen>
|
|
and obtains the result 300, which it inserts in a new row with
|
|
<structfield>class</><literal> = 1</>. Then both transactions try to commit.
|
|
If either transaction were running at the Repeatable Read isolation level,
|
|
both would be allowed to commit; but since there is no serial order of execution
|
|
consistent with the result, using Serializable transactions will allow one
|
|
transaction to commit and will roll the other back with this message:
|
|
|
|
<screen>
|
|
ERROR: could not serialize access due to read/write dependencies among transactions
|
|
</screen>
|
|
|
|
This is because if A had
|
|
executed before B, B would have computed the sum 330, not 300, and
|
|
similarly the other order would have resulted in a different sum
|
|
computed by A.
|
|
</para>
|
|
|
|
<para>
|
|
When relying on Serializable transactions to prevent anomalies, it is
|
|
important that any data read from a permanent user table not be
|
|
considered valid until the transaction which read it has successfully
|
|
committed. This is true even for read-only transactions, except that
|
|
data read within a <firstterm>deferrable</firstterm> read-only
|
|
transaction is known to be valid as soon as it is read, because such a
|
|
transaction waits until it can acquire a snapshot guaranteed to be free
|
|
from such problems before starting to read any data. In all other cases
|
|
applications must not depend on results read during a transaction that
|
|
later aborted; instead, they should retry the transaction until it
|
|
succeeds.
|
|
</para>
|
|
|
|
<para>
|
|
To guarantee true serializability <productname>PostgreSQL</productname>
|
|
uses <firstterm>predicate locking</>, which means that it keeps locks
|
|
which allow it to determine when a write would have had an impact on
|
|
the result of a previous read from a concurrent transaction, had it run
|
|
first. In <productname>PostgreSQL</productname> these locks do not
|
|
cause any blocking and therefore can <emphasis>not</> play any part in
|
|
causing a deadlock. They are used to identify and flag dependencies
|
|
among concurrent Serializable transactions which in certain combinations
|
|
can lead to serialization anomalies. In contrast, a Read Committed or
|
|
Repeatable Read transaction which wants to ensure data consistency may
|
|
need to take out a lock on an entire table, which could block other
|
|
users attempting to use that table, or it may use <literal>SELECT FOR
|
|
UPDATE</literal> or <literal>SELECT FOR SHARE</literal> which not only
|
|
can block other transactions but cause disk access.
|
|
</para>
|
|
|
|
<para>
|
|
Predicate locks in <productname>PostgreSQL</productname>, like in most
|
|
other database systems, are based on data actually accessed by a
|
|
transaction. These will show up in the
|
|
<link linkend="view-pg-locks"><structname>pg_locks</structname></link>
|
|
system view with a <literal>mode</> of <literal>SIReadLock</>. The
|
|
particular locks
|
|
acquired during execution of a query will depend on the plan used by
|
|
the query, and multiple finer-grained locks (e.g., tuple locks) may be
|
|
combined into fewer coarser-grained locks (e.g., page locks) during the
|
|
course of the transaction to prevent exhaustion of the memory used to
|
|
track the locks. A <literal>READ ONLY</> transaction may be able to
|
|
release its SIRead locks before completion, if it detects that no
|
|
conflicts can still occur which could lead to a serialization anomaly.
|
|
In fact, <literal>READ ONLY</> transactions will often be able to
|
|
establish that fact at startup and avoid taking any predicate locks.
|
|
If you explicitly request a <literal>SERIALIZABLE READ ONLY DEFERRABLE</>
|
|
transaction, it will block until it can establish this fact. (This is
|
|
the <emphasis>only</> case where Serializable transactions block but
|
|
Repeatable Read transactions don't.) On the other hand, SIRead locks
|
|
often need to be kept past transaction commit, until overlapping read
|
|
write transactions complete.
|
|
</para>
|
|
|
|
<para>
|
|
Consistent use of Serializable transactions can simplify development.
|
|
The guarantee that any set of successfully committed concurrent
|
|
Serializable transactions will have the same effect as if they were run
|
|
one at a time means that if you can demonstrate that a single transaction,
|
|
as written, will do the right thing when run by itself, you can have
|
|
confidence that it will do the right thing in any mix of Serializable
|
|
transactions, even without any information about what those other
|
|
transactions might do, or it will not successfully commit. It is
|
|
important that an environment which uses this technique have a
|
|
generalized way of handling serialization failures (which always return
|
|
with a SQLSTATE value of '40001'), because it will be very hard to
|
|
predict exactly which transactions might contribute to the read/write
|
|
dependencies and need to be rolled back to prevent serialization
|
|
anomalies. The monitoring of read/write dependencies has a cost, as does
|
|
the restart of transactions which are terminated with a serialization
|
|
failure, but balanced against the cost and blocking involved in use of
|
|
explicit locks and <literal>SELECT FOR UPDATE</> or <literal>SELECT FOR
|
|
SHARE</>, Serializable transactions are the best performance choice
|
|
for some environments.
|
|
</para>
|
|
|
|
<para>
|
|
While <productname>PostgreSQL</>'s Serializable transaction isolation
|
|
level only allows concurrent transactions to commit if it can prove there
|
|
is a serial order of execution that would produce the same effect, it
|
|
doesn't always prevent errors from being raised that would not occur in
|
|
true serial execution. In particular, it is possible to see unique
|
|
constraint violations caused by conflicts with overlapping Serializable
|
|
transactions even after explicitly checking that the key isn't present
|
|
before attempting to insert it. This can be avoided by making sure
|
|
that <emphasis>all</> Serializable transactions that insert potentially
|
|
conflicting keys explicitly check if they can do so first. For example,
|
|
imagine an application that asks the user for a new key and then checks
|
|
that it doesn't exist already by trying to select it first, or generates
|
|
a new key by selecting the maximum existing key and adding one. If some
|
|
Serializable transactions insert new keys directly without following this
|
|
protocol, unique constraints violations might be reported even in cases
|
|
where they could not occur in a serial execution of the concurrent
|
|
transactions.
|
|
</para>
|
|
|
|
<para>
|
|
For optimal performance when relying on Serializable transactions for
|
|
concurrency control, these issues should be considered:
|
|
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
Declare transactions as <literal>READ ONLY</> when possible.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Control the number of active connections, using a connection pool if
|
|
needed. This is always an important performance consideration, but
|
|
it can be particularly important in a busy system using Serializable
|
|
transactions.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Don't put more into a single transaction than needed for integrity
|
|
purposes.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Don't leave connections dangling <quote>idle in transaction</quote>
|
|
longer than necessary. The configuration parameter
|
|
<xref linkend="guc-idle-in-transaction-session-timeout"> may be used to
|
|
automatically disconnect lingering sessions.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
Eliminate explicit locks, <literal>SELECT FOR UPDATE</>, and
|
|
<literal>SELECT FOR SHARE</> where no longer needed due to the
|
|
protections automatically provided by Serializable transactions.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
When the system is forced to combine multiple page-level predicate
|
|
locks into a single relation-level predicate lock because the predicate
|
|
lock table is short of memory, an increase in the rate of serialization
|
|
failures may occur. You can avoid this by increasing
|
|
<xref linkend="guc-max-pred-locks-per-transaction">,
|
|
<xref linkend="guc-max-pred-locks-per-relation">, and/or
|
|
<xref linkend="guc-max-pred-locks-per-page">.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
A sequential scan will always necessitate a relation-level predicate
|
|
lock. This can result in an increased rate of serialization failures.
|
|
It may be helpful to encourage the use of index scans by reducing
|
|
<xref linkend="guc-random-page-cost"> and/or increasing
|
|
<xref linkend="guc-cpu-tuple-cost">. Be sure to weigh any decrease
|
|
in transaction rollbacks and restarts against any overall change in
|
|
query execution time.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="explicit-locking">
|
|
<title>Explicit Locking</title>
|
|
|
|
<indexterm>
|
|
<primary>lock</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides various lock modes
|
|
to control concurrent access to data in tables. These modes can
|
|
be used for application-controlled locking in situations where
|
|
<acronym>MVCC</acronym> does not give the desired behavior. Also,
|
|
most <productname>PostgreSQL</productname> commands automatically
|
|
acquire locks of appropriate modes to ensure that referenced
|
|
tables are not dropped or modified in incompatible ways while the
|
|
command executes. (For example, <command>TRUNCATE</> cannot safely be
|
|
executed concurrently with other operations on the same table, so it
|
|
obtains an exclusive lock on the table to enforce that.)
|
|
</para>
|
|
|
|
<para>
|
|
To examine a list of the currently outstanding locks in a database
|
|
server, use the
|
|
<link linkend="view-pg-locks"><structname>pg_locks</structname></link>
|
|
system view. For more information on monitoring the status of the lock
|
|
manager subsystem, refer to <xref linkend="monitoring">.
|
|
</para>
|
|
|
|
<sect2 id="locking-tables">
|
|
<title>Table-level Locks</title>
|
|
|
|
<indexterm zone="locking-tables">
|
|
<primary>LOCK</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The list below shows the available lock modes and the contexts in
|
|
which they are used automatically by
|
|
<productname>PostgreSQL</productname>. You can also acquire any
|
|
of these locks explicitly with the command <xref
|
|
linkend="sql-lock">.
|
|
Remember that all of these lock modes are table-level locks,
|
|
even if the name contains the word
|
|
<quote>row</quote>; the names of the lock modes are historical.
|
|
To some extent the names reflect the typical usage of each lock
|
|
mode — but the semantics are all the same. The only real difference
|
|
between one lock mode and another is the set of lock modes with
|
|
which each conflicts (see <xref linkend="table-lock-compatibility">).
|
|
Two transactions cannot hold locks of conflicting
|
|
modes on the same table at the same time. (However, a transaction
|
|
never conflicts with itself. For example, it might acquire
|
|
<literal>ACCESS EXCLUSIVE</literal> lock and later acquire
|
|
<literal>ACCESS SHARE</literal> lock on the same table.) Non-conflicting
|
|
lock modes can be held concurrently by many transactions. Notice in
|
|
particular that some lock modes are self-conflicting (for example,
|
|
an <literal>ACCESS EXCLUSIVE</literal> lock cannot be held by more than one
|
|
transaction at a time) while others are not self-conflicting (for example,
|
|
an <literal>ACCESS SHARE</literal> lock can be held by multiple transactions).
|
|
</para>
|
|
|
|
<variablelist>
|
|
<title>Table-level Lock Modes</title>
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ACCESS SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ACCESS EXCLUSIVE</literal> lock
|
|
mode only.
|
|
</para>
|
|
|
|
<para>
|
|
The <command>SELECT</command> command acquires a lock of this mode on
|
|
referenced tables. In general, any query that only <emphasis>reads</> a table
|
|
and does not modify it will acquire this lock mode.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ROW SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>EXCLUSIVE</literal> and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
</para>
|
|
|
|
<para>
|
|
The <command>SELECT FOR UPDATE</command> and
|
|
<command>SELECT FOR SHARE</command> commands acquire a
|
|
lock of this mode on the target table(s) (in addition to
|
|
<literal>ACCESS SHARE</literal> locks on any other tables
|
|
that are referenced but not selected
|
|
<option>FOR UPDATE/FOR SHARE</option>).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ROW EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>SHARE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
</para>
|
|
|
|
<para>
|
|
The commands <command>UPDATE</command>,
|
|
<command>DELETE</command>, and <command>INSERT</command>
|
|
acquire this lock mode on the target table (in addition to
|
|
<literal>ACCESS SHARE</literal> locks on any other referenced
|
|
tables). In general, this lock mode will be acquired by any
|
|
command that <emphasis>modifies data</> in a table.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>SHARE UPDATE EXCLUSIVE</literal>,
|
|
<literal>SHARE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode protects a table against
|
|
concurrent schema changes and <command>VACUUM</> runs.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by <command>VACUUM</command> (without <option>FULL</option>),
|
|
<command>ANALYZE</>, <command>CREATE INDEX CONCURRENTLY</>,
|
|
<command>CREATE STATISTICS</> and
|
|
<command>ALTER TABLE VALIDATE</command> and other
|
|
<command>ALTER TABLE</command> variants (for full details see
|
|
<xref linkend="SQL-ALTERTABLE">).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode protects a table against concurrent data changes.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by <command>CREATE INDEX</command>
|
|
(without <option>CONCURRENTLY</option>).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>SHARE ROW EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ROW EXCLUSIVE</literal>,
|
|
<literal>SHARE UPDATE EXCLUSIVE</literal>,
|
|
<literal>SHARE</literal>, <literal>SHARE ROW
|
|
EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode protects a table against concurrent data changes, and
|
|
is self-exclusive so that only one session can hold it at a time.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by <command>CREATE COLLATION</command>,
|
|
<command>CREATE TRIGGER</command>, and many forms of
|
|
<command>ALTER TABLE</command> (see <xref linkend="SQL-ALTERTABLE">).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with the <literal>ROW SHARE</literal>, <literal>ROW
|
|
EXCLUSIVE</literal>, <literal>SHARE UPDATE
|
|
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
|
|
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal> lock modes.
|
|
This mode allows only concurrent <literal>ACCESS SHARE</literal> locks,
|
|
i.e., only reads from the table can proceed in parallel with a
|
|
transaction holding this lock mode.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by <command>REFRESH MATERIALIZED VIEW CONCURRENTLY</command>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>ACCESS EXCLUSIVE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Conflicts with locks of all modes (<literal>ACCESS
|
|
SHARE</literal>, <literal>ROW SHARE</literal>, <literal>ROW
|
|
EXCLUSIVE</literal>, <literal>SHARE UPDATE
|
|
EXCLUSIVE</literal>, <literal>SHARE</literal>, <literal>SHARE
|
|
ROW EXCLUSIVE</literal>, <literal>EXCLUSIVE</literal>, and
|
|
<literal>ACCESS EXCLUSIVE</literal>).
|
|
This mode guarantees that the
|
|
holder is the only transaction accessing the table in any way.
|
|
</para>
|
|
|
|
<para>
|
|
Acquired by the <command>DROP TABLE</>,
|
|
<command>TRUNCATE</command>, <command>REINDEX</command>,
|
|
<command>CLUSTER</command>, <command>VACUUM FULL</command>,
|
|
and <command>REFRESH MATERIALIZED VIEW</command> (without
|
|
<option>CONCURRENTLY</option>)
|
|
commands. Many forms of <command>ALTER TABLE</> also acquire
|
|
a lock at this level. This is also the default lock mode for
|
|
<command>LOCK TABLE</command> statements that do not specify
|
|
a mode explicitly.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
|
|
<tip>
|
|
<para>
|
|
Only an <literal>ACCESS EXCLUSIVE</literal> lock blocks a
|
|
<command>SELECT</command> (without <option>FOR UPDATE/SHARE</option>)
|
|
statement.
|
|
</para>
|
|
</tip>
|
|
|
|
<para>
|
|
Once acquired, a lock is normally held until the end of the transaction. But if a
|
|
lock is acquired after establishing a savepoint, the lock is released
|
|
immediately if the savepoint is rolled back to. This is consistent with
|
|
the principle that <command>ROLLBACK</> cancels all effects of the
|
|
commands since the savepoint. The same holds for locks acquired within a
|
|
<application>PL/pgSQL</> exception block: an error escape from the block
|
|
releases locks acquired within it.
|
|
</para>
|
|
|
|
|
|
|
|
<table tocentry="1" id="table-lock-compatibility">
|
|
<title> Conflicting Lock Modes</title>
|
|
<tgroup cols="9">
|
|
<colspec colnum="2" colname="lockst">
|
|
<colspec colnum="9" colname="lockend">
|
|
<spanspec namest="lockst" nameend="lockend" spanname="lockreq">
|
|
<thead>
|
|
<row>
|
|
<entry morerows="1">Requested Lock Mode</entry>
|
|
<entry spanname="lockreq">Current Lock Mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ACCESS SHARE</entry>
|
|
<entry>ROW SHARE</entry>
|
|
<entry>ROW EXCLUSIVE</entry>
|
|
<entry>SHARE UPDATE EXCLUSIVE</entry>
|
|
<entry>SHARE</entry>
|
|
<entry>SHARE ROW EXCLUSIVE</entry>
|
|
<entry>EXCLUSIVE</entry>
|
|
<entry>ACCESS EXCLUSIVE</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>ACCESS SHARE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ROW SHARE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ROW EXCLUSIVE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHARE UPDATE EXCLUSIVE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHARE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>SHARE ROW EXCLUSIVE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>EXCLUSIVE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>ACCESS EXCLUSIVE</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
</sect2>
|
|
|
|
<sect2 id="locking-rows">
|
|
<title>Row-level Locks</title>
|
|
|
|
<para>
|
|
In addition to table-level locks, there are row-level locks, which
|
|
are listed as below with the contexts in which they are used
|
|
automatically by <productname>PostgreSQL</productname>. See
|
|
<xref linkend="row-lock-compatibility"> for a complete table of
|
|
row-level lock conflicts. Note that a transaction can hold
|
|
conflicting locks on the same row, even in different subtransactions;
|
|
but other than that, two transactions can never hold conflicting locks
|
|
on the same row. Row-level locks do not affect data querying; they
|
|
block only <emphasis>writers and lockers</emphasis> to the same
|
|
row. Row-level locks are released at transaction end or during
|
|
savepoint rollback, just like table-level locks.
|
|
|
|
</para>
|
|
|
|
<variablelist>
|
|
<title>Row-level Lock Modes</title>
|
|
<varlistentry>
|
|
<term>
|
|
<literal>FOR UPDATE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
<literal>FOR UPDATE</literal> causes the rows retrieved by the
|
|
<command>SELECT</command> statement to be locked as though for
|
|
update. This prevents them from being locked, modified or deleted by
|
|
other transactions until the current transaction ends. That is,
|
|
other transactions that attempt <command>UPDATE</command>,
|
|
<command>DELETE</command>,
|
|
<command>SELECT FOR UPDATE</command>,
|
|
<command>SELECT FOR NO KEY UPDATE</command>,
|
|
<command>SELECT FOR SHARE</command> or
|
|
<command>SELECT FOR KEY SHARE</command>
|
|
of these rows will be blocked until the current transaction ends;
|
|
conversely, <command>SELECT FOR UPDATE</command> will wait for a
|
|
concurrent transaction that has run any of those commands on the
|
|
same row,
|
|
and will then lock and return the updated row (or no row, if the
|
|
row was deleted). Within a <literal>REPEATABLE READ</> or
|
|
<literal>SERIALIZABLE</> transaction,
|
|
however, an error will be thrown if a row to be locked has changed
|
|
since the transaction started. For further discussion see
|
|
<xref linkend="applevel-consistency">.
|
|
</para>
|
|
<para>
|
|
The <literal>FOR UPDATE</> lock mode
|
|
is also acquired by any <command>DELETE</> on a row, and also by an
|
|
<command>UPDATE</> that modifies the values on certain columns. Currently,
|
|
the set of columns considered for the <command>UPDATE</> case are those that
|
|
have a unique index on them that can be used in a foreign key (so partial
|
|
indexes and expressional indexes are not considered), but this may change
|
|
in the future.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>FOR NO KEY UPDATE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Behaves similarly to <literal>FOR UPDATE</>, except that the lock
|
|
acquired is weaker: this lock will not block
|
|
<literal>SELECT FOR KEY SHARE</> commands that attempt to acquire
|
|
a lock on the same rows. This lock mode is also acquired by any
|
|
<command>UPDATE</> that does not acquire a <literal>FOR UPDATE</> lock.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>FOR SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Behaves similarly to <literal>FOR NO KEY UPDATE</>, except that it
|
|
acquires a shared lock rather than exclusive lock on each retrieved
|
|
row. A shared lock blocks other transactions from performing
|
|
<command>UPDATE</command>, <command>DELETE</command>,
|
|
<command>SELECT FOR UPDATE</command> or
|
|
<command>SELECT FOR NO KEY UPDATE</> on these rows, but it does not
|
|
prevent them from performing <command>SELECT FOR SHARE</command> or
|
|
<command>SELECT FOR KEY SHARE</command>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<literal>FOR KEY SHARE</literal>
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Behaves similarly to <literal>FOR SHARE</literal>, except that the
|
|
lock is weaker: <literal>SELECT FOR UPDATE</> is blocked, but not
|
|
<literal>SELECT FOR NO KEY UPDATE</>. A key-shared lock blocks
|
|
other transactions from performing <command>DELETE</command> or
|
|
any <command>UPDATE</command> that changes the key values, but not
|
|
other <command>UPDATE</>, and neither does it prevent
|
|
<command>SELECT FOR NO KEY UPDATE</>, <command>SELECT FOR SHARE</>,
|
|
or <command>SELECT FOR KEY SHARE</>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> doesn't remember any
|
|
information about modified rows in memory, so there is no limit on
|
|
the number of rows locked at one time. However, locking a row
|
|
might cause a disk write, e.g., <command>SELECT FOR
|
|
UPDATE</command> modifies selected rows to mark them locked, and so
|
|
will result in disk writes.
|
|
</para>
|
|
|
|
<table tocentry="1" id="row-lock-compatibility">
|
|
<title>Conflicting Row-level Locks</title>
|
|
<tgroup cols="5">
|
|
<colspec colnum="2" colname="lockst">
|
|
<colspec colnum="5" colname="lockend">
|
|
<spanspec namest="lockst" nameend="lockend" spanname="lockreq">
|
|
<thead>
|
|
<row>
|
|
<entry morerows="1">Requested Lock Mode</entry>
|
|
<entry spanname="lockreq">Current Lock Mode</entry>
|
|
</row>
|
|
<row>
|
|
<entry>FOR KEY SHARE</entry>
|
|
<entry>FOR SHARE</entry>
|
|
<entry>FOR NO KEY UPDATE</entry>
|
|
<entry>FOR UPDATE</entry>
|
|
</row>
|
|
</thead>
|
|
<tbody>
|
|
<row>
|
|
<entry>FOR KEY SHARE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>FOR SHARE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>FOR NO KEY UPDATE</entry>
|
|
<entry align="center"></entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
<row>
|
|
<entry>FOR UPDATE</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
<entry align="center">X</entry>
|
|
</row>
|
|
</tbody>
|
|
</tgroup>
|
|
</table>
|
|
</sect2>
|
|
|
|
<sect2 id="locking-pages">
|
|
<title>Page-level Locks</title>
|
|
|
|
<para>
|
|
In addition to table and row locks, page-level share/exclusive locks are
|
|
used to control read/write access to table pages in the shared buffer
|
|
pool. These locks are released immediately after a row is fetched or
|
|
updated. Application developers normally need not be concerned with
|
|
page-level locks, but they are mentioned here for completeness.
|
|
</para>
|
|
|
|
</sect2>
|
|
|
|
<sect2 id="locking-deadlocks">
|
|
<title>Deadlocks</title>
|
|
|
|
<indexterm zone="locking-deadlocks">
|
|
<primary>deadlock</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
The use of explicit locking can increase the likelihood of
|
|
<firstterm>deadlocks</>, wherein two (or more) transactions each
|
|
hold locks that the other wants. For example, if transaction 1
|
|
acquires an exclusive lock on table A and then tries to acquire
|
|
an exclusive lock on table B, while transaction 2 has already
|
|
exclusive-locked table B and now wants an exclusive lock on table
|
|
A, then neither one can proceed.
|
|
<productname>PostgreSQL</productname> automatically detects
|
|
deadlock situations and resolves them by aborting one of the
|
|
transactions involved, allowing the other(s) to complete.
|
|
(Exactly which transaction will be aborted is difficult to
|
|
predict and should not be relied upon.)
|
|
</para>
|
|
|
|
<para>
|
|
Note that deadlocks can also occur as the result of row-level
|
|
locks (and thus, they can occur even if explicit locking is not
|
|
used). Consider the case in which two concurrent
|
|
transactions modify a table. The first transaction executes:
|
|
|
|
<screen>
|
|
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 11111;
|
|
</screen>
|
|
|
|
This acquires a row-level lock on the row with the specified
|
|
account number. Then, the second transaction executes:
|
|
|
|
<screen>
|
|
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 22222;
|
|
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 11111;
|
|
</screen>
|
|
|
|
The first <command>UPDATE</command> statement successfully
|
|
acquires a row-level lock on the specified row, so it succeeds in
|
|
updating that row. However, the second <command>UPDATE</command>
|
|
statement finds that the row it is attempting to update has
|
|
already been locked, so it waits for the transaction that
|
|
acquired the lock to complete. Transaction two is now waiting on
|
|
transaction one to complete before it continues execution. Now,
|
|
transaction one executes:
|
|
|
|
<screen>
|
|
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
|
|
</screen>
|
|
|
|
Transaction one attempts to acquire a row-level lock on the
|
|
specified row, but it cannot: transaction two already holds such
|
|
a lock. So it waits for transaction two to complete. Thus,
|
|
transaction one is blocked on transaction two, and transaction
|
|
two is blocked on transaction one: a deadlock
|
|
condition. <productname>PostgreSQL</productname> will detect this
|
|
situation and abort one of the transactions.
|
|
</para>
|
|
|
|
<para>
|
|
The best defense against deadlocks is generally to avoid them by
|
|
being certain that all applications using a database acquire
|
|
locks on multiple objects in a consistent order. In the example
|
|
above, if both transactions
|
|
had updated the rows in the same order, no deadlock would have
|
|
occurred. One should also ensure that the first lock acquired on
|
|
an object in a transaction is the most restrictive mode that will be
|
|
needed for that object. If it is not feasible to verify this in
|
|
advance, then deadlocks can be handled on-the-fly by retrying
|
|
transactions that abort due to deadlocks.
|
|
</para>
|
|
|
|
<para>
|
|
So long as no deadlock situation is detected, a transaction seeking
|
|
either a table-level or row-level lock will wait indefinitely for
|
|
conflicting locks to be released. This means it is a bad idea for
|
|
applications to hold transactions open for long periods of time
|
|
(e.g., while waiting for user input).
|
|
</para>
|
|
</sect2>
|
|
|
|
<sect2 id="advisory-locks">
|
|
<title>Advisory Locks</title>
|
|
|
|
<indexterm zone="advisory-locks">
|
|
<primary>advisory lock</primary>
|
|
</indexterm>
|
|
|
|
<indexterm zone="advisory-locks">
|
|
<primary>lock</primary>
|
|
<secondary>advisory</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides a means for
|
|
creating locks that have application-defined meanings. These are
|
|
called <firstterm>advisory locks</>, because the system does not
|
|
enforce their use — it is up to the application to use them
|
|
correctly. Advisory locks can be useful for locking strategies
|
|
that are an awkward fit for the MVCC model.
|
|
For example, a common use of advisory locks is to emulate pessimistic
|
|
locking strategies typical of so-called <quote>flat file</> data
|
|
management systems.
|
|
While a flag stored in a table could be used for the same purpose,
|
|
advisory locks are faster, avoid table bloat, and are automatically
|
|
cleaned up by the server at the end of the session.
|
|
</para>
|
|
|
|
<para>
|
|
There are two ways to acquire an advisory lock in
|
|
<productname>PostgreSQL</productname>: at session level or at
|
|
transaction level.
|
|
Once acquired at session level, an advisory lock is held until
|
|
explicitly released or the session ends. Unlike standard lock requests,
|
|
session-level advisory lock requests do not honor transaction semantics:
|
|
a lock acquired during a transaction that is later rolled back will still
|
|
be held following the rollback, and likewise an unlock is effective even
|
|
if the calling transaction fails later. A lock can be acquired multiple
|
|
times by its owning process; for each completed lock request there must
|
|
be a corresponding unlock request before the lock is actually released.
|
|
Transaction-level lock requests, on the other hand, behave more like
|
|
regular lock requests: they are automatically released at the end of the
|
|
transaction, and there is no explicit unlock operation. This behavior
|
|
is often more convenient than the session-level behavior for short-term
|
|
usage of an advisory lock.
|
|
Session-level and transaction-level lock requests for the same advisory
|
|
lock identifier will block each other in the expected way.
|
|
If a session already holds a given advisory lock, additional requests by
|
|
it will always succeed, even if other sessions are awaiting the lock; this
|
|
statement is true regardless of whether the existing lock hold and new
|
|
request are at session level or transaction level.
|
|
</para>
|
|
|
|
<para>
|
|
Like all locks in
|
|
<productname>PostgreSQL</productname>, a complete list of advisory locks
|
|
currently held by any session can be found in the <link
|
|
linkend="view-pg-locks"><structname>pg_locks</structname></link> system
|
|
view.
|
|
</para>
|
|
|
|
<para>
|
|
Both advisory locks and regular locks are stored in a shared memory
|
|
pool whose size is defined by the configuration variables
|
|
<xref linkend="guc-max-locks-per-transaction"> and
|
|
<xref linkend="guc-max-connections">.
|
|
Care must be taken not to exhaust this
|
|
memory or the server will be unable to grant any locks at all.
|
|
This imposes an upper limit on the number of advisory locks
|
|
grantable by the server, typically in the tens to hundreds of thousands
|
|
depending on how the server is configured.
|
|
</para>
|
|
|
|
<para>
|
|
In certain cases using advisory locking methods, especially in queries
|
|
involving explicit ordering and <literal>LIMIT</> clauses, care must be
|
|
taken to control the locks acquired because of the order in which SQL
|
|
expressions are evaluated. For example:
|
|
<screen>
|
|
SELECT pg_advisory_lock(id) FROM foo WHERE id = 12345; -- ok
|
|
SELECT pg_advisory_lock(id) FROM foo WHERE id > 12345 LIMIT 100; -- danger!
|
|
SELECT pg_advisory_lock(q.id) FROM
|
|
(
|
|
SELECT id FROM foo WHERE id > 12345 LIMIT 100
|
|
) q; -- ok
|
|
</screen>
|
|
In the above queries, the second form is dangerous because the
|
|
<literal>LIMIT</> is not guaranteed to be applied before the locking
|
|
function is executed. This might cause some locks to be acquired
|
|
that the application was not expecting, and hence would fail to release
|
|
(until it ends the session).
|
|
From the point of view of the application, such locks
|
|
would be dangling, although still viewable in
|
|
<structname>pg_locks</structname>.
|
|
</para>
|
|
|
|
<para>
|
|
The functions provided to manipulate advisory locks are described in
|
|
<xref linkend="functions-advisory-locks">.
|
|
</para>
|
|
</sect2>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="applevel-consistency">
|
|
<title>Data Consistency Checks at the Application Level</title>
|
|
|
|
<para>
|
|
It is very difficult to enforce business rules regarding data integrity
|
|
using Read Committed transactions because the view of the data is
|
|
shifting with each statement, and even a single statement may not
|
|
restrict itself to the statement's snapshot if a write conflict occurs.
|
|
</para>
|
|
|
|
<para>
|
|
While a Repeatable Read transaction has a stable view of the data
|
|
throughout its execution, there is a subtle issue with using
|
|
<acronym>MVCC</acronym> snapshots for data consistency checks, involving
|
|
something known as <firstterm>read/write conflicts</firstterm>.
|
|
If one transaction writes data and a concurrent transaction attempts
|
|
to read the same data (whether before or after the write), it cannot
|
|
see the work of the other transaction. The reader then appears to have
|
|
executed first regardless of which started first or which committed
|
|
first. If that is as far as it goes, there is no problem, but
|
|
if the reader also writes data which is read by a concurrent transaction
|
|
there is now a transaction which appears to have run before either of
|
|
the previously mentioned transactions. If the transaction which appears
|
|
to have executed last actually commits first, it is very easy for a
|
|
cycle to appear in a graph of the order of execution of the transactions.
|
|
When such a cycle appears, integrity checks will not work correctly
|
|
without some help.
|
|
</para>
|
|
|
|
<para>
|
|
As mentioned in <xref linkend="xact-serializable">, Serializable
|
|
transactions are just Repeatable Read transactions which add
|
|
nonblocking monitoring for dangerous patterns of read/write conflicts.
|
|
When a pattern is detected which could cause a cycle in the apparent
|
|
order of execution, one of the transactions involved is rolled back to
|
|
break the cycle.
|
|
</para>
|
|
|
|
<sect2 id="serializable-consistency">
|
|
<title>Enforcing Consistency With Serializable Transactions</title>
|
|
|
|
<para>
|
|
If the Serializable transaction isolation level is used for all writes
|
|
and for all reads which need a consistent view of the data, no other
|
|
effort is required to ensure consistency. Software from other
|
|
environments which is written to use serializable transactions to
|
|
ensure consistency should <quote>just work</quote> in this regard in
|
|
<productname>PostgreSQL</productname>.
|
|
</para>
|
|
|
|
<para>
|
|
When using this technique, it will avoid creating an unnecessary burden
|
|
for application programmers if the application software goes through a
|
|
framework which automatically retries transactions which are rolled
|
|
back with a serialization failure. It may be a good idea to set
|
|
<literal>default_transaction_isolation</> to <literal>serializable</>.
|
|
It would also be wise to take some action to ensure that no other
|
|
transaction isolation level is used, either inadvertently or to
|
|
subvert integrity checks, through checks of the transaction isolation
|
|
level in triggers.
|
|
</para>
|
|
|
|
<para>
|
|
See <xref linkend="xact-serializable"> for performance suggestions.
|
|
</para>
|
|
|
|
<warning>
|
|
<para>
|
|
This level of integrity protection using Serializable transactions
|
|
does not yet extend to hot standby mode (<xref linkend="hot-standby">).
|
|
Because of that, those using hot standby may want to use Repeatable
|
|
Read and explicit locking on the master.
|
|
</para>
|
|
</warning>
|
|
</sect2>
|
|
|
|
<sect2 id="non-serializable-consistency">
|
|
<title>Enforcing Consistency With Explicit Blocking Locks</title>
|
|
|
|
<para>
|
|
When non-serializable writes are possible,
|
|
to ensure the current validity of a row and protect it against
|
|
concurrent updates one must use <command>SELECT FOR UPDATE</command>,
|
|
<command>SELECT FOR SHARE</command>, or an appropriate <command>LOCK
|
|
TABLE</command> statement. (<command>SELECT FOR UPDATE</command>
|
|
and <command>SELECT FOR SHARE</command> lock just the
|
|
returned rows against concurrent updates, while <command>LOCK
|
|
TABLE</command> locks the whole table.) This should be taken into
|
|
account when porting applications to
|
|
<productname>PostgreSQL</productname> from other environments.
|
|
</para>
|
|
|
|
<para>
|
|
Also of note to those converting from other environments is the fact
|
|
that <command>SELECT FOR UPDATE</command> does not ensure that a
|
|
concurrent transaction will not update or delete a selected row.
|
|
To do that in <productname>PostgreSQL</productname> you must actually
|
|
update the row, even if no values need to be changed.
|
|
<command>SELECT FOR UPDATE</command> <emphasis>temporarily blocks</emphasis>
|
|
other transactions from acquiring the same lock or executing an
|
|
<command>UPDATE</command> or <command>DELETE</command> which would
|
|
affect the locked row, but once the transaction holding this lock
|
|
commits or rolls back, a blocked transaction will proceed with the
|
|
conflicting operation unless an actual <command>UPDATE</command> of
|
|
the row was performed while the lock was held.
|
|
</para>
|
|
|
|
<para>
|
|
Global validity checks require extra thought under
|
|
non-serializable <acronym>MVCC</acronym>.
|
|
For example, a banking application might wish to check that the sum of
|
|
all credits in one table equals the sum of debits in another table,
|
|
when both tables are being actively updated. Comparing the results of two
|
|
successive <literal>SELECT sum(...)</literal> commands will not work reliably in
|
|
Read Committed mode, since the second query will likely include the results
|
|
of transactions not counted by the first. Doing the two sums in a
|
|
single repeatable read transaction will give an accurate picture of only the
|
|
effects of transactions that committed before the repeatable read transaction
|
|
started — but one might legitimately wonder whether the answer is still
|
|
relevant by the time it is delivered. If the repeatable read transaction
|
|
itself applied some changes before trying to make the consistency check,
|
|
the usefulness of the check becomes even more debatable, since now it
|
|
includes some but not all post-transaction-start changes. In such cases
|
|
a careful person might wish to lock all tables needed for the check,
|
|
in order to get an indisputable picture of current reality. A
|
|
<literal>SHARE</> mode (or higher) lock guarantees that there are no
|
|
uncommitted changes in the locked table, other than those of the current
|
|
transaction.
|
|
</para>
|
|
|
|
<para>
|
|
Note also that if one is relying on explicit locking to prevent concurrent
|
|
changes, one should either use Read Committed mode, or in Repeatable Read
|
|
mode be careful to obtain
|
|
locks before performing queries. A lock obtained by a
|
|
repeatable read transaction guarantees that no other transactions modifying
|
|
the table are still running, but if the snapshot seen by the
|
|
transaction predates obtaining the lock, it might predate some now-committed
|
|
changes in the table. A repeatable read transaction's snapshot is actually
|
|
frozen at the start of its first query or data-modification command
|
|
(<literal>SELECT</>, <literal>INSERT</>,
|
|
<literal>UPDATE</>, or <literal>DELETE</>), so
|
|
it is possible to obtain locks explicitly before the snapshot is
|
|
frozen.
|
|
</para>
|
|
</sect2>
|
|
</sect1>
|
|
|
|
<sect1 id="mvcc-caveats">
|
|
<title>Caveats</title>
|
|
|
|
<para>
|
|
Some DDL commands, currently only <xref linkend="sql-truncate"> and the
|
|
table-rewriting forms of <xref linkend="sql-altertable">, are not
|
|
MVCC-safe. This means that after the truncation or rewrite commits, the
|
|
table will appear empty to concurrent transactions, if they are using a
|
|
snapshot taken before the DDL command committed. This will only be an
|
|
issue for a transaction that did not access the table in question
|
|
before the DDL command started — any transaction that has done so
|
|
would hold at least an <literal>ACCESS SHARE</literal> table lock,
|
|
which would block the DDL command until that transaction completes.
|
|
So these commands will not cause any apparent inconsistency in the
|
|
table contents for successive queries on the target table, but they
|
|
could cause visible inconsistency between the contents of the target
|
|
table and other tables in the database.
|
|
</para>
|
|
|
|
<para>
|
|
Support for the Serializable transaction isolation level has not yet
|
|
been added to Hot Standby replication targets (described in
|
|
<xref linkend="hot-standby">). The strictest isolation level currently
|
|
supported in hot standby mode is Repeatable Read. While performing all
|
|
permanent database writes within Serializable transactions on the
|
|
master will ensure that all standbys will eventually reach a consistent
|
|
state, a Repeatable Read transaction run on the standby can sometimes
|
|
see a transient state that is inconsistent with any serial execution
|
|
of the transactions on the master.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="locking-indexes">
|
|
<title>Locking and Indexes</title>
|
|
|
|
<indexterm zone="locking-indexes">
|
|
<primary>index</primary>
|
|
<secondary>locks</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Though <productname>PostgreSQL</productname>
|
|
provides nonblocking read/write access to table
|
|
data, nonblocking read/write access is not currently offered for every
|
|
index access method implemented
|
|
in <productname>PostgreSQL</productname>.
|
|
The various index types are handled as follows:
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term>
|
|
B-tree, <acronym>GiST</acronym> and <acronym>SP-GiST</acronym> indexes
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Short-term share/exclusive page-level locks are used for
|
|
read/write access. Locks are released immediately after each
|
|
index row is fetched or inserted. These index types provide
|
|
the highest concurrency without deadlock conditions.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
Hash indexes
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Share/exclusive hash-bucket-level locks are used for read/write
|
|
access. Locks are released after the whole bucket is processed.
|
|
Bucket-level locks provide better concurrency than index-level
|
|
ones, but deadlock is possible since the locks are held longer
|
|
than one index operation.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term>
|
|
<acronym>GIN</acronym> indexes
|
|
</term>
|
|
<listitem>
|
|
<para>
|
|
Short-term share/exclusive page-level locks are used for
|
|
read/write access. Locks are released immediately after each
|
|
index row is fetched or inserted. But note that insertion of a
|
|
GIN-indexed value usually produces several index key insertions
|
|
per row, so GIN might do substantial work for a single value's
|
|
insertion.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
</variablelist>
|
|
</para>
|
|
|
|
<para>
|
|
Currently, B-tree indexes offer the best performance for concurrent
|
|
applications; since they also have more features than hash
|
|
indexes, they are the recommended index type for concurrent
|
|
applications that need to index scalar data. When dealing with
|
|
non-scalar data, B-trees are not useful, and GiST, SP-GiST or GIN
|
|
indexes should be used instead.
|
|
</para>
|
|
</sect1>
|
|
</chapter>
|