1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

When FOR UPDATE/SHARE is used with LIMIT, put the LockRows plan node

underneath the Limit node, not atop it.  This fixes the old problem that such
a query might unexpectedly return fewer rows than the LIMIT says, due to
LockRows discarding updated rows.

There is a related problem that LockRows might destroy the sort ordering
produced by earlier steps; but fixing that by pushing LockRows below Sort
would create serious performance problems that are unjustified in many
real-world applications, as well as potential deadlock problems from locking
many more rows than expected.  Instead, keep the present semantics of applying
FOR UPDATE after ORDER BY within a single query level; but allow the user to
specify the other way by writing FOR UPDATE in a sub-select.  To make that
work, track whether FOR UPDATE appeared explicitly in sub-selects or got
pushed down from the parent, and don't flatten a sub-select that contained an
explicit FOR UPDATE.
This commit is contained in:
Tom Lane
2009-10-28 14:55:47 +00:00
parent 44956c52c5
commit 46e3a16b05
13 changed files with 225 additions and 126 deletions

View File

@ -1,5 +1,5 @@
<!--
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.127 2009/10/27 17:11:18 tgl Exp $
$PostgreSQL: pgsql/doc/src/sgml/ref/select.sgml,v 1.128 2009/10/28 14:55:37 tgl Exp $
PostgreSQL documentation
-->
@ -1092,22 +1092,12 @@ FOR SHARE [ OF <replaceable class="parameter">table_name</replaceable> [, ...] ]
has already locked a selected row or rows, <command>SELECT FOR
UPDATE</command> will wait for the other transaction to complete,
and will then lock and return the updated row (or no row, if the
row was deleted). For further discussion see <xref
row was deleted). Within a <literal>SERIALIZABLE</> transaction,
however, an error will be thrown if a row to be locked has changed
since the transaction started. For further discussion see <xref
linkend="mvcc">.
</para>
<para>
To prevent the operation from waiting for other transactions to commit,
use the <literal>NOWAIT</> option. <command>SELECT FOR UPDATE
NOWAIT</command> reports an error, rather than waiting, if a selected row
cannot be locked immediately. Note that <literal>NOWAIT</> applies only
to the row-level lock(s) &mdash; the required <literal>ROW SHARE</literal>
table-level lock is still taken in the ordinary way (see
<xref linkend="mvcc">). You can use the <literal>NOWAIT</> option of
<xref linkend="sql-lock" endterm="sql-lock-title">
if you need to acquire the table-level lock without waiting.
</para>
<para>
<literal>FOR SHARE</literal> behaves similarly, except that it
acquires a shared rather than exclusive lock on each retrieved
@ -1117,13 +1107,26 @@ FOR SHARE [ OF <replaceable class="parameter">table_name</replaceable> [, ...] ]
from performing <command>SELECT FOR SHARE</command>.
</para>
<para>
To prevent the operation from waiting for other transactions to commit,
use the <literal>NOWAIT</> option. With <literal>NOWAIT</>, the statement
reports an error, rather than waiting, if a selected row
cannot be locked immediately. Note that <literal>NOWAIT</> applies only
to the row-level lock(s) &mdash; the required <literal>ROW SHARE</literal>
table-level lock is still taken in the ordinary way (see
<xref linkend="mvcc">). You can use
<xref linkend="sql-lock" endterm="sql-lock-title">
with the <literal>NOWAIT</> option first,
if you need to acquire the table-level lock without waiting.
</para>
<para>
If specific tables are named in <literal>FOR UPDATE</literal>
or <literal>FOR SHARE</literal>,
then only rows coming from those tables are locked; any other
tables used in the <command>SELECT</command> are simply read as
usual. A <literal>FOR UPDATE</literal> or <literal>FOR SHARE</literal>
clause without a table list affects all tables used in the command.
clause without a table list affects all tables used in the statement.
If <literal>FOR UPDATE</literal> or <literal>FOR SHARE</literal> is
applied to a view or sub-query, it affects all tables used in
the view or sub-query.
@ -1151,6 +1154,36 @@ FOR SHARE [ OF <replaceable class="parameter">table_name</replaceable> [, ...] ]
individual table rows; for example they cannot be used with aggregation.
</para>
<para>
When <literal>FOR UPDATE</literal> or <literal>FOR SHARE</literal>
appears at the top level of a <command>SELECT</> query, the rows that
are locked are exactly those that are returned by the query; in the
case of a join query, the rows locked are those that contribute to
returned join rows. In addition, rows that satisfied the query
conditions as of the query snapshot will be locked, although they
will not be returned if they have since been updated to not satisfy
the query conditions. If a <literal>LIMIT</> is used, locking stops
once enough rows have been returned to satisfy the limit (but note that
rows skipped over by <literal>OFFSET</> will get locked). Similarly,
if <literal>FOR UPDATE</literal> or <literal>FOR SHARE</literal>
is used in a cursor's query, only rows actually fetched or stepped past
by the cursor will be locked.
</para>
<para>
When <literal>FOR UPDATE</literal> or <literal>FOR SHARE</literal>
appears in a sub-<command>SELECT</>, the rows locked are those
returned to the outer query by the sub-query. This might involve
fewer rows than inspection of the sub-query alone would suggest,
since conditions from the outer query might be used to optimize
execution of the sub-query. For example,
<programlisting>
SELECT * FROM (SELECT * FROM mytable FOR UPDATE) ss WHERE col1 = 5;
</programlisting>
will lock only rows having <literal>col1 = 5</>, even though that
condition is not textually within the sub-query.
</para>
<caution>
<para>
Avoid locking a row and then modifying it within a later savepoint or
@ -1177,30 +1210,26 @@ ROLLBACK TO s;
<caution>
<para>
It is possible for a <command>SELECT</> command using both
<literal>LIMIT</literal> and <literal>FOR UPDATE/SHARE</literal>
clauses to return fewer rows than specified by <literal>LIMIT</literal>.
This is because <literal>LIMIT</> is applied first. The command
selects the specified number of rows,
but might then block trying to obtain a lock on one or more of them.
Once the <literal>SELECT</> unblocks, the row might have been deleted
or updated so that it does not meet the query <literal>WHERE</> condition
anymore, in which case it will not be returned.
</para>
</caution>
<caution>
<para>
Similarly, it is possible for a <command>SELECT</> command
using <literal>ORDER BY</literal> and <literal>FOR
UPDATE/SHARE</literal> to return rows out of order. This is
because <literal>ORDER BY</> is applied first. The command
orders the result, but might then block trying to obtain a lock
on one or more of the rows. Once the <literal>SELECT</>
unblocks, one of the ordered columns might have been modified
and be returned out of order. A workaround is to perform
<command>SELECT ... FOR UPDATE/SHARE</> and then <command>SELECT
... ORDER BY</>.
It is possible for a <command>SELECT</> command using <literal>ORDER
BY</literal> and <literal>FOR UPDATE/SHARE</literal> to return rows out of
order. This is because <literal>ORDER BY</> is applied first.
The command sorts the result, but might then block trying to obtain a lock
on one or more of the rows. Once the <literal>SELECT</> unblocks, some
of the ordering column values might have been modified, leading to those
rows appearing to be out of order (though they are in order in terms
of the original column values). This can be worked around at need by
placing the <literal>FOR UPDATE/SHARE</literal> clause in a sub-query,
for example
<programlisting>
SELECT * FROM (SELECT * FROM mytable FOR UPDATE) ss ORDER BY column1;
</programlisting>
Note that this will result in locking all rows of <structname>mytable</>,
whereas <literal>FOR UPDATE</> at the top level would lock only the
actually returned rows. This can make for a significant performance
difference, particularly if the <literal>ORDER BY</> is combined with
<literal>LIMIT</> or other restrictions. So this technique is recommended
only if concurrent updates of the ordering columns are expected and a
strictly sorted result is required.
</para>
</caution>
</refsect2>
@ -1541,15 +1570,28 @@ SELECT distributors.* WHERE distributors.name = 'Westward';
used by <productname>MySQL</productname>. The SQL:2008 standard
has introduced the clauses <literal>OFFSET ... FETCH {FIRST|NEXT}
...</literal> for the same functionality, as shown above
in <xref linkend="sql-limit" endterm="sql-limit-title">, and this
in <xref linkend="sql-limit" endterm="sql-limit-title">. This
syntax is also used by <productname>IBM DB2</productname>.
(Applications written for <productname>Oracle</productname>
frequently use a workaround involving the automatically
generated <literal>rownum</literal> column, not available in
generated <literal>rownum</literal> column, which is not available in
PostgreSQL, to implement the effects of these clauses.)
</para>
</refsect2>
<refsect2>
<title><literal>FOR UPDATE</> and <literal>FOR SHARE</></title>
<para>
Although <literal>FOR UPDATE</> appears in the SQL standard, the
standard allows it only as an option of <command>DECLARE CURSOR</>.
<productname>PostgreSQL</productname> allows it in any <command>SELECT</>
query as well as in sub-<command>SELECT</>s, but this is an extension.
The <literal>FOR SHARE</> variant, and the <literal>NOWAIT</> option,
do not appear in the standard.
</para>
</refsect2>
<refsect2>
<title>Nonstandard Clauses</title>