mirror of
https://github.com/postgres/postgres.git
synced 2025-04-22 23:02:54 +03:00
Chapter on multi-version concurrency control from Vadim.
Some wording changes from Vadim's original text doc. Processes cleanly, but may need fixup.
This commit is contained in:
parent
bb0fc46a90
commit
0807dbb294
545
doc/src/sgml/mvcc.sgml
Normal file
545
doc/src/sgml/mvcc.sgml
Normal file
@ -0,0 +1,545 @@
|
||||
<chapter id="mvcc">
|
||||
<title>Multi-Version Concurrency Control</title>
|
||||
|
||||
<abstract>
|
||||
<para>
|
||||
Multi-Version Concurrency Control
|
||||
(MVCC)
|
||||
is an advanced technique for improving database performance in a
|
||||
multi-user environment.
|
||||
<ulink url="mailto:vadim@krs.ru">Vadim Mikheev</ulink> provided
|
||||
the implementation for <productname>Postgres</productname>.
|
||||
</para>
|
||||
</abstract>
|
||||
|
||||
<sect1>
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>
|
||||
Unlike most other database systems which use locks for concurrency control,
|
||||
<productname>Postgres</productname>
|
||||
maintains data consistency by using a multiversion model.
|
||||
This means that while querying database each transaction sees
|
||||
a snapshot of data (a <firstterm>database version</firstterm>)
|
||||
as it was some
|
||||
time ago, regardless of the current state of data queried.
|
||||
This protects the transaction from viewing inconsistent data that
|
||||
could be caused by (other) concurrent transaction updates on the same
|
||||
data rows, providing <firstterm>transaction isolation</firstterm>
|
||||
for each database session.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The main difference between multiversion and lock models is that
|
||||
in MVCC locks acquired for querying (reading) data don't conflict
|
||||
with locks acquired for writing data and so reading never blocks
|
||||
writing and writing never blocks reading.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Transaction Isolation</title>
|
||||
|
||||
<para>
|
||||
The <acronym>ANSI</acronym>/<acronym>ISO</acronym> <acronym>SQL</acronym>
|
||||
standard defines four levels of transaction
|
||||
isolation in terms of three phenomena that must be prevented
|
||||
between concurrent transactions.
|
||||
These undesirable phenomena are:
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>
|
||||
dirty reads
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
A transaction reads data written by concurrent uncommitted transaction.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
non-repeatable reads
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
A transaction re-reads data it has previously read and finds that data
|
||||
has been modified by another committed transaction.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
phantom read
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
A transaction re-executes a query returning a set of rows that satisfy a
|
||||
search condition and finds that additional rows satisfying the condition
|
||||
has been inserted by another committed transaction.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Accordingly, the four isolation levels are defined to be:
|
||||
|
||||
<segmentedlist>
|
||||
<segtitle>
|
||||
Isolation Level
|
||||
</segtitle>
|
||||
<segtitle>
|
||||
Dirty Read
|
||||
</segtitle>
|
||||
<segtitle>
|
||||
Non-Repeatable Read
|
||||
</segtitle>
|
||||
<segtitle>
|
||||
Phantom Read
|
||||
</segtitle>
|
||||
<seglistitem>
|
||||
<seg>
|
||||
Read uncommitted
|
||||
</seg>
|
||||
<seg>
|
||||
Possible
|
||||
</seg>
|
||||
<seg>
|
||||
Possible
|
||||
</seg>
|
||||
<seg>
|
||||
Possible
|
||||
</seg>
|
||||
</seglistitem>
|
||||
|
||||
<seglistitem>
|
||||
<seg>
|
||||
Read committed
|
||||
</seg>
|
||||
<seg>
|
||||
Not possible
|
||||
</seg>
|
||||
<seg>
|
||||
Possible
|
||||
</seg>
|
||||
<seg>
|
||||
Possible
|
||||
</seg>
|
||||
</seglistitem>
|
||||
|
||||
<seglistitem>
|
||||
<seg>
|
||||
Repeatable read
|
||||
</seg>
|
||||
<seg>
|
||||
Not possible
|
||||
</seg>
|
||||
<seg>
|
||||
Not possible
|
||||
</seg>
|
||||
<seg>
|
||||
Possible
|
||||
</seg>
|
||||
</seglistitem>
|
||||
|
||||
<seglistitem>
|
||||
<seg>
|
||||
Serializable
|
||||
</seg>
|
||||
<seg>
|
||||
Not possible
|
||||
</seg>
|
||||
<seg>
|
||||
Not possible
|
||||
</seg>
|
||||
<seg>
|
||||
Not possible
|
||||
</seg>
|
||||
</seglistitem>
|
||||
</segmentedlist>
|
||||
|
||||
<productname>Postgres</productname>
|
||||
offers the read committed and serializable isolation levels.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Read Committed Isolation Level</title>
|
||||
|
||||
<para>
|
||||
This is the default isolation level in <productname>Postgres</productname>.
|
||||
When a transaction runs on this isolation level, a query sees only
|
||||
data committed before the query began and never sees either dirty data or
|
||||
concurrent transaction changes committed during query execution.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a row returned by a query while executing an
|
||||
<command>UPDATE</command> statement
|
||||
(or <command>DELETE</command>
|
||||
or <command>SELECT FOR UPDATE</command>)
|
||||
is being updated by a
|
||||
concurrent uncommitted transaction then the second transaction
|
||||
that tries to update this row will wait for the other transaction to
|
||||
commit or rollback. In the case of rollback, the waiting transaction
|
||||
can proceed to change the row. In the case of commit (and if the
|
||||
row still exists; i.e. was not deleted by the other transaction), the
|
||||
query will be re-executed for this row to check that new row
|
||||
version satisfies query search condition. If the new row version
|
||||
satisfies the query search condition then row will be
|
||||
updated (or deleted or marked for update).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Note that the results of execution of SELECT or INSERT (with a query)
|
||||
statements will not be affected by concurrent transactions.
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Serializable Isolation Level</title>
|
||||
|
||||
<para>
|
||||
This level provides the highest transaction isolation. When a
|
||||
transaction is on the <firstterm>serializable</firstterm> level,
|
||||
a query sees only data
|
||||
committed before the transaction began and never see either dirty data
|
||||
or concurrent transaction changes committed during transaction
|
||||
execution. So, this level emulates serial transaction execution,
|
||||
as if transactions would be executed one after another, serially,
|
||||
rather than concurrently.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
If a row returned by query while executing
|
||||
<command>UPDATE</command>/<command>DELETE</command>/<command>SELECT FOR UPDATE</command>
|
||||
statement is being updated by
|
||||
a concurrent uncommitted transaction then the second transaction
|
||||
that tries to update this row will wait for the other transaction to
|
||||
commit or rollback. In the case of rollback, the waiting transaction
|
||||
can proceed to change the row. In the case of a concurrent
|
||||
transaction commit, a serializable transaction will be rolled back
|
||||
with the message
|
||||
|
||||
<programlisting>
|
||||
ERROR: Can't serialize access due to concurrent update
|
||||
</programlisting>
|
||||
|
||||
because a serializable transaction cannot modify rows changed by
|
||||
other transactions after the serializable transaction began.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Note that results of execution of <command>SELECT</command>
|
||||
or <command>INSERT</command> (with a query)
|
||||
will not be affected by concurrent transactions.
|
||||
</para>
|
||||
</note>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Locking and Tables</title>
|
||||
|
||||
<para>
|
||||
<productname>Postgres</productname>
|
||||
provides various lock modes to control concurrent
|
||||
access to data in tables. Some of these lock modes are acquired by
|
||||
<productname>Postgres</productname>
|
||||
automatically before statement execution, while others are
|
||||
provided to be used by applications. All lock modes (except for
|
||||
AccessShareLock) acquired in a transaction are held for the duration
|
||||
of the transaction.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
In addition to locks, short-term share/exclusive latches are used
|
||||
to control read/write access to table pages in shared buffer pool.
|
||||
Latches are released immediately after a tuple is fetched or updated.
|
||||
</para>
|
||||
|
||||
<sect2>
|
||||
<title>Table-level locks</title>
|
||||
|
||||
<para>
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>
|
||||
AccessShareLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
An internal lock mode acquiring automatically over tables
|
||||
being queried. <productname>Postgres</productname>
|
||||
releases these locks after statement is
|
||||
done.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with AccessExclusiveLock only.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
RowShareLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Acquired by <command>SELECT FOR UPDATE</command>
|
||||
and <command>LOCK TABLE</command>
|
||||
for <option>IN ROW SHARE MODE</option> statements.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with ExclusiveLock and AccessExclusiveLock modes.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
RowExclusiveLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Acquired by <command>UPDATE</command>, <command>DELETE</command>,
|
||||
<command>INSERT</command> and <command>LOCK TABLE</command>
|
||||
for <option>IN ROW EXCLUSIVE MODE</option> statements.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with ShareLock, ShareRowExclusiveLock, ExclusiveLock and
|
||||
AccessExclusiveLock modes.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
ShareLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Acquired by <command>CREATE INDEX</command>
|
||||
and <command>LOCK TABLE</command> table
|
||||
for <option>IN SHARE MODE</option>
|
||||
statements.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with RowExclusiveLock, ShareRowExclusiveLock,
|
||||
ExclusiveLock and AccessExclusiveLock modes.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
ShareRowExclusiveLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Acquired by <command>LOCK TABLE</command> for
|
||||
<option>IN SHARE ROW EXCLUSIVE MODE</option> statements.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with RowExclusiveLock, ShareLock, ShareRowExclusiveLock,
|
||||
ExclusiveLock and AccessExclusiveLock modes.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
ExclusiveLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Acquired by <command>LOCK TABLE</command> table
|
||||
for <option>IN EXCLUSIVE MODE</option> statements.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with RowShareLock, RowExclusiveLock, ShareLock,
|
||||
ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock
|
||||
modes.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
AccessExclusiveLock
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Acquired by <command>ALTER TABLE</command>,
|
||||
<command>DROP TABLE</command>,
|
||||
<command>VACUUM</command> and <command>LOCK TABLE</command>
|
||||
statements.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Conflicts with RowShareLock, RowExclusiveLock, ShareLock,
|
||||
ShareRowExclusiveLock, ExclusiveLock and AccessExclusiveLock
|
||||
modes.
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Note that only AccessExclusiveLock blocks <command>SELECT</command> (without FOR
|
||||
UPDATE) statement.
|
||||
</para>
|
||||
</note>
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
<sect2>
|
||||
<title>Row-level locks</title>
|
||||
|
||||
<para>
|
||||
These locks are acquired by means of modification of internal
|
||||
fields of row being updated/deleted/marked for update.
|
||||
<productname>Postgres</productname>
|
||||
doesn't remember any information about modified rows in memory and
|
||||
so hasn't limit for locked rows without lock escalation.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
However, take into account that <command>SELECT FOR UPDATE</command> will modify
|
||||
selected rows to mark them and so will results in disk writes.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Row-level locks don't affect data querying. They are used to block
|
||||
writers to <emphasis>the same row</emphasis> only.
|
||||
</para>
|
||||
</sect2>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Locking and Indices</title>
|
||||
|
||||
<para>
|
||||
Though <productname>Postgres</productname>
|
||||
provides unblocking read/write access to table
|
||||
data, it is not the case for all index access methods implemented
|
||||
in <productname>Postgres</productname>.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The various index types are handled as follows:
|
||||
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>
|
||||
GiST and R-Tree indices
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Share/exclusive INDEX-level locks are used for read/write access.
|
||||
Locks are released after statement is done.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
Hash indices
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Share/exclusive PAGE-level locks are used for read/write access.
|
||||
Locks are released after page is processed.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Page-level locks produces better concurrency than index-level ones
|
||||
but are subject to deadlocks.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
Btree
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Short-term share/exclusive PAGE-level latches are used for
|
||||
read/write access. Latches are released immediately after the index
|
||||
tuple is inserted/fetched.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Btree indices provide highest concurrency without deadlock
|
||||
conditions.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
</sect1>
|
||||
|
||||
<sect1>
|
||||
<title>Data consistency checks at the application level</title>
|
||||
|
||||
<para>
|
||||
Because readers in <productname>Postgres</productname>
|
||||
don't lock data, regardless of
|
||||
transaction isolation level, data read by one transaction can be
|
||||
overwritten by another. In the other words, if a row is returned
|
||||
by <command>SELECT</command> it doesn't mean that this row really
|
||||
exists at the time it is returned (i.e. sometime after the
|
||||
statement or transaction began) nor
|
||||
that the row is protected from deletion/updation by concurrent
|
||||
transactions before the current transaction commit or rollback.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
To ensure the actual existance of a row and protect it against
|
||||
concurrent updates one must use <command>SELECT FOR UPDATE</command> or
|
||||
an appropriate <command>LOCK TABLE</command> statement.
|
||||
This should be taken into account when porting applications using
|
||||
serializable mode to <productname>Postgres</productname> from other environments.
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Before version 6.5 <productname>Postgres</productname>
|
||||
used read-locks and so the
|
||||
above consideration is also the case
|
||||
when upgrading to 6.5 (or higher) from previous
|
||||
<productname>Postgres</productname> versions.
|
||||
</para>
|
||||
</note>
|
||||
</para>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
||||
<!-- Keep this comment at the end of the file
|
||||
Local variables:
|
||||
mode: sgml
|
||||
sgml-omittag:nil
|
||||
sgml-shorttag:t
|
||||
sgml-minimize-attributes:nil
|
||||
sgml-always-quote-attributes:t
|
||||
sgml-indent-step:1
|
||||
sgml-indent-data:t
|
||||
sgml-parent-document:nil
|
||||
sgml-default-dtd-file:"./reference.ced"
|
||||
sgml-exposed-tags:nil
|
||||
sgml-local-catalogs:"/usr/lib/sgml/catalog"
|
||||
sgml-local-ecat-files:nil
|
||||
End:
|
||||
-->
|
Loading…
x
Reference in New Issue
Block a user