Re-think predicate locking on GIN indexes.

The principle behind the locking was not very well thought-out, and not documented. Add a section in the README to explain how it's supposed to work, and change the code so that it actually works that way. This fixes two bugs: 1. If fast update was turned on concurrently, subsequent inserts to the pending list would not conflict with predicate locks that were acquired earlier, on entry pages. The included 'predicate-gin-fastupdate' test demonstrates that. To fix, make all scans acquire a predicate lock on the metapage. That lock represents a scan of the pending list, whether or not there is a pending list at the moment. Forget about the optimization to skip locking/checking for locks, when fastupdate=off. 2. If a scan finds no match, it still needs to lock the entry page. The point of predicate locks is to lock the gabs between values, whether or not there is a match. The included 'predicate-gin-nomatch' test tests that case. In addition to those two bug fixes, this removes some unnecessary locking, following the principle laid out in the README. Because all items in a posting tree have the same key value, a lock on the posting tree root is enough to cover all the items. (With a very large posting tree, it would possibly be better to lock the posting tree leaf pages instead, so that a "skip scan" with a query like "A & B", you could avoid unnecessary conflict if a new tuple is inserted with A but !B. But let's keep this simple.) Also, some spelling fixes. Author: Heikki Linnakangas with some editorization by me Review: Andrey Borodin, Alexander Korotkov Discussion: https://www.postgresql.org/message-id/0b3ad2c2-2692-62a9-3a04-5724f2af9114@iki.fi
2025-08-17 01:02:17 +03:00 · 2018-05-04 11:27:50 +03:00
parent 7d8679975f
commit 0bef1c0678
18 changed files with 251 additions and 117 deletions
--- a/src/backend/storage/lmgr/README-SSI
+++ b/src/backend/storage/lmgr/README-SSI
@@ -373,21 +373,22 @@ index *leaf* pages needed to lock the appropriate index range. If,
 however, a search discovers that no root page has yet been created, a
 predicate lock on the index relation is required.

+    * Like a B-tree, GIN searches acquire predicate locks only on the
+leaf pages of entry tree. When performing an equality scan, and an
+entry has a posting tree, the posting tree root is locked instead, to
+lock only that key value. However, fastupdate=on postpones the
+insertion of tuples into index structure by temporarily storing them
+into pending list. That makes us unable to detect r-w conflicts using
+page-level locks. To cope with that, insertions to the pending list
+conflict with all scans.
+
    * GiST searches can determine that there are no matches at any
 level of the index, so we acquire predicate lock at each index
 level during a GiST search. An index insert at the leaf level can
 then be trusted to ripple up to all levels and locations where
 conflicting predicate locks may exist. In case there is a page split,
-we need to copy predicate lock from an original page to all new pages.
-
-    * GIN searches acquire predicate locks only on the leaf pages
-of entry tree and posting tree. During a page split, a predicate locks are
-copied from the original page to the new page. In the same way predicate locks
-are copied from entry tree leaf page to freshly created posting tree root.
-However, when fast update is enabled, a predicate lock on the whole index
-relation is required. Fast update postpones the insertion of tuples into index
-structure by temporarily storing them into pending list. That makes us unable
-to detect r-w conflicts using page-level locks.
+we need to copy predicate lock from the original page to all the new
+pages.

    * Hash index searches acquire predicate locks on the primary
 page of a bucket. It acquires a lock on both the old and new buckets
@@ -395,7 +396,6 @@ for scans that happen concurrently with page splits. During a bucket
 split, a predicate lock is copied from the primary page of an old
 bucket to the primary page of a new bucket.

-
    * The effects of page splits, overflows, consolidations, and
 removals must be carefully reviewed to ensure that predicate locks
 aren't "lost" during those operations, or kept with pages which could