mirror of https://github.com/postgres/postgres.git synced 2025-11-12 05:01:15 +03:00

Fix two serious bugs introduced into hash indexes by the 8.4 patch that made
hash indexes keep entries sorted by hash value.  First, the original plans for
concurrency assumed that insertions would happen only at the end of a page,
which is no longer true; this could cause scans to transiently fail to find
index entries in the presence of concurrent insertions.  We can compensate
by teaching scans to re-find their position after re-acquiring read locks.
Second, neither the bucket split nor the bucket compaction logic had been
fixed to preserve hashvalue ordering, so application of either of those
processes could lead to permanent corruption of an index, in the sense
that searches might fail to find entries that are present.
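
As a rough illustration of the first fix (scans re-finding their position),
here is a minimal sketch. It is not the code from this patch. SketchItem and
sketch_refind are hypothetical names, the page is modelled as a plain array,
and the sketch assumes that a concurrent insertion can only shift the
remembered entry toward higher offsets while the scan had dropped its lock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an index entry: hash value plus heap tuple id. */
typedef struct SketchItem
{
    uint32_t hash;
    uint64_t heap_tid;
} SketchItem;

/*
 * After re-acquiring the read lock, look for the entry the scan was last
 * positioned on.  Assumption: entries at or after last_pos can only have
 * moved to higher offsets, so a forward search from last_pos is enough.
 * Returns the new offset, or -1 if the entry is no longer on the page.
 */
static ptrdiff_t
sketch_refind(const SketchItem *items, size_t nitems,
              size_t last_pos, uint64_t last_tid)
{
    assert(items != NULL);

    for (size_t i = last_pos; i < nitems; i++)
    {
        if (items[i].heap_tid == last_tid)
            return (ptrdiff_t) i;
    }
    return -1;
}
```

Matching on the heap tuple id rather than on the hash value matters in this
sketch because several entries on a page can share one hash value.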

This patch fixes the split and compaction logic to preserve hashvalue
ordering, but it cannot do anything about pre-existing corruption.  We will
need to recommend reindexing all hash indexes in the 8.4.2 release notes.

To buy back the performance loss hereby induced in split and compaction,
fix them to use PageIndexMultiDelete instead of retail PageIndexDelete
operations.  We might later want to do something with qsort'ing the
page contents rather than doing a binary search for each insertion,
but that seemed more invasive than I cared to risk in a back-patch.
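
For readers unfamiliar with the "binary search for each insertion" approach
mentioned above, the following is a minimal sketch of the idea, not the code
from this patch. SketchPage, sketch_binsearch and sketch_pgaddtup are
hypothetical names, and a fixed-size array of hash values stands in for a
real index page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical capacity; a real page is bounded by free space instead. */
#define SKETCH_PAGE_CAPACITY 8

/* Hypothetical stand-in for a hash index page: hash values kept sorted. */
typedef struct SketchPage
{
    size_t   nitems;
    uint32_t hashes[SKETCH_PAGE_CAPACITY];
} SketchPage;

/* Return the first offset whose hash value is >= target (lower bound). */
static size_t
sketch_binsearch(const SketchPage *page, uint32_t target)
{
    size_t lo = 0;
    size_t hi = page->nitems;

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (page->hashes[mid] < target)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Insert a hash value so that the page stays sorted, and return the offset
 * at which it was placed.  The caller must ensure the page is not full.
 */
static size_t
sketch_pgaddtup(SketchPage *page, uint32_t hash)
{
    size_t pos;

    assert(page->nitems < SKETCH_PAGE_CAPACITY);

    pos = sketch_binsearch(page, hash);
    /* Shift the tail up by one slot to open a gap at pos. */
    memmove(&page->hashes[pos + 1], &page->hashes[pos],
            (page->nitems - pos) * sizeof(uint32_t));
    page->hashes[pos] = hash;
    page->nitems++;
    return pos;
}
```

Each insertion costs one binary search plus a shift of the tail, which is why
bulk-loading a page during split or compaction and sorting it once could be
cheaper, as the message notes.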

Per bug #5157 from Jeff Janes and subsequent investigation.
Tom Lane
2009-11-01 21:25:25 +00:00
parent ef59fa0453
commit c4afdca4c2
6 changed files with 250 additions and 190 deletions

@@ -8,7 +8,7 @@
  *
  *
  * IDENTIFICATION
- * $PostgreSQL: pgsql/src/backend/access/hash/hashinsert.c,v 1.52 2009/01/01 17:23:35 momjian Exp $
+ * $PostgreSQL: pgsql/src/backend/access/hash/hashinsert.c,v 1.53 2009/11/01 21:25:25 tgl Exp $
  *
  *-------------------------------------------------------------------------
  */
@@ -20,10 +20,6 @@
 #include "utils/rel.h"
-static OffsetNumber _hash_pgaddtup(Relation rel, Buffer buf,
-               Size itemsize, IndexTuple itup);
 /*
  * _hash_doinsert() -- Handle insertion of a single index tuple.
  *
@@ -180,15 +176,16 @@ _hash_doinsert(Relation rel, IndexTuple itup)
 /*
  * _hash_pgaddtup() -- add a tuple to a particular page in the index.
  *
- * This routine adds the tuple to the page as requested; it does
- * not write out the page. It is an error to call pgaddtup() without
- * a write lock and pin.
+ * This routine adds the tuple to the page as requested; it does not write out
+ * the page. It is an error to call pgaddtup() without pin and write lock on
+ * the target buffer.
+ *
+ * Returns the offset number at which the tuple was inserted. This function
+ * is responsible for preserving the condition that tuples in a hash index
+ * page are sorted by hashkey value.
  */
-static OffsetNumber
-_hash_pgaddtup(Relation rel,
-               Buffer buf,
-               Size itemsize,
-               IndexTuple itup)
+OffsetNumber
+_hash_pgaddtup(Relation rel, Buffer buf, Size itemsize, IndexTuple itup)
 {
     OffsetNumber itup_off;
     Page page;