Detect SSI conflicts before reporting constraint violations

While prior to this patch the user-visible effect on the database of any set of successfully committed serializable transactions was always consistent with some one-at-a-time order of execution of those transactions, the presence of declarative constraints could allow errors to occur which were not possible in any such ordering, and developers had no good workarounds to prevent user-facing errors where they were not necessary or desired. This patch adds a check for serialization failure ahead of duplicate key checking so that if a developer explicitly (redundantly) checks for the pre-existing value they will get the desired serialization failure where the problem is caused by a concurrent serializable transaction; otherwise they will get a duplicate key error.

While it would be better if the reads performed by the constraints could count as part of the work of the transaction for serialization failure checking, and we will hopefully get there some day, this patch allows a clean and reliable way for developers to work around the issue. In many cases existing code will already be doing the right thing for this to "just work".

Author: Thomas Munro, with minor editing of docs by me
Reviewed-by: Marko Tiikkaja, Kevin Grittner
2025-10-16 17:07:43 +03:00 · 2016-04-07 11:12:35 -05:00
parent bb140506df
commit fcff8a5751
11 changed files with 307 additions and 7 deletions
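
The commit message describes a client-side pattern: each Serializable transaction explicitly checks for the key before inserting it, and the application retries when the server reports a serialization failure (SQLSTATE '40001'). The following is a minimal sketch of that retry loop, not code from this patch. `DatabaseError`, `run_with_retry`, and `make_check_then_insert` are hypothetical names introduced for illustration, and the "transaction" is simulated with an in-memory set rather than a real database connection.

```python
import time

# SQLSTATE that the documentation says is always returned for serialization failures.
SERIALIZATION_FAILURE = "40001"


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries a SQLSTATE."""

    def __init__(self, sqlstate, message=""):
        super().__init__(message)
        self.sqlstate = sqlstate


def run_with_retry(txn, max_attempts=5, base_delay=0.0):
    """Call txn() and retry it only when it fails with SQLSTATE 40001.

    Any other error, or a serialization failure on the last attempt,
    is re-raised to the caller unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DatabaseError as exc:
            if exc.sqlstate != SERIALIZATION_FAILURE or attempt == max_attempts:
                raise
            time.sleep(base_delay * attempt)  # simple linear backoff (assumption)


def make_check_then_insert(store, key, simulated_failures=0):
    """Build a simulated transaction: check that key is absent, then insert it.

    The first `simulated_failures` calls raise a simulated serialization
    failure, standing in for a conflict with a concurrent transaction.
    """
    remaining = {"count": simulated_failures}

    def txn():
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise DatabaseError(SERIALIZATION_FAILURE, "could not serialize access")
        if key in store:   # the explicit pre-check (a SELECT in real code)
            return False   # key already present: nothing inserted
        store.add(key)     # the INSERT
        return True

    return txn
```

In real code the body of `txn` would open a Serializable transaction, run the SELECT and the INSERT, and commit. The point of the sketch is only that the whole check-then-insert unit is re-run from the start after a serialization failure, while other errors (such as a genuine duplicate key) are surfaced to the caller.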
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -644,7 +644,7 @@ ERROR:  could not serialize access due to read/write dependencies among transact
    first.  In <productname>PostgreSQL</productname> these locks do not
    cause any blocking and therefore can <emphasis>not</> play any part in
    causing a deadlock.  They are used to identify and flag dependencies
-    among concurrent serializable transactions which in certain combinations
+    among concurrent Serializable transactions which in certain combinations
    can lead to serialization anomalies.  In contrast, a Read Committed or
    Repeatable Read transaction which wants to ensure data consistency may
    need to take out a lock on an entire table, which could block other
@@ -679,12 +679,13 @@ ERROR:  could not serialize access due to read/write dependencies among transact

   <para>
    Consistent use of Serializable transactions can simplify development.
-    The guarantee that any set of concurrent serializable transactions will
-    have the same effect as if they were run one at a time means that if
-    you can demonstrate that a single transaction, as written, will do the
-    right thing when run by itself, you can have confidence that it will
-    do the right thing in any mix of serializable transactions, even without
-    any information about what those other transactions might do.  It is
+    The guarantee that any set of successfully committed concurrent
+    Serializable transactions will have the same effect as if they were run
+    one at a time means that if you can demonstrate that a single transaction,
+    as written, will do the right thing when run by itself, you can have
+    confidence that it will do the right thing in any mix of Serializable
+    transactions, even without any information about what those other
+    transactions might do, or it will not successfully commit.  It is
    important that an environment which uses this technique have a
    generalized way of handling serialization failures (which always return
    with a SQLSTATE value of '40001'), because it will be very hard to
@@ -698,6 +699,26 @@ ERROR:  could not serialize access due to read/write dependencies among transact
    for some environments.
   </para>

+   <para>
+    While <productname>PostgreSQL</>'s Serializable transaction isolation
+    level only allows concurrent transactions to commit if it can prove there
+    is a serial order of execution that would produce the same effect, it
+    doesn't always prevent errors from being raised that would not occur in
+    true serial execution.  In particular, it is possible to see unique
+    constraint violations caused by conflicts with overlapping Serializable
+    transactions even after explicitly checking that the key isn't present
+    before attempting to insert it.  This can be avoided by making sure
+    that <emphasis>all</> Serializable transactions that insert potentially
+    conflicting keys explicitly check if they can do so first.  For example,
+    imagine an application that asks the user for a new key and then checks
+    that it doesn't exist already by trying to select it first, or generates
+    a new key by selecting the maximum existing key and adding one.  If some
+    Serializable transactions insert new keys directly without following this
+    protocol, unique constraint violations might be reported even in cases
+    where they could not occur in a serial execution of the concurrent
+    transactions.
+   </para>
+
   <para>
    For optimal performance when relying on Serializable transactions for
    concurrency control, these issues should be considered: