mirror of
https://github.com/postgres/postgres.git
synced 2025-07-28 23:42:10 +03:00
R-tree is dead ... long live GiST.
This commit is contained in:
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.75 2005/11/04 23:13:59 petere Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.76 2005/11/07 17:36:44 tgl Exp $
|
||||
-->
|
||||
<chapter id="backup">
|
||||
<title>Backup and Restore</title>
|
||||
@ -1129,8 +1129,8 @@ restore_command = 'copy /mnt/server/archivedir/%f "%p"' # Windows
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
Operations on hash and R-tree indexes are
|
||||
not presently WAL-logged, so replay will not update these index types.
|
||||
Operations on hash indexes are
|
||||
not presently WAL-logged, so replay will not update these indexes.
|
||||
The recommended workaround is to manually <command>REINDEX</> each
|
||||
such index after completing a recovery operation.
|
||||
</para>
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.33 2005/10/25 13:38:09 momjian Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.34 2005/11/07 17:36:44 tgl Exp $
|
||||
Genetic Optimizer
|
||||
-->
|
||||
|
||||
@ -51,8 +51,8 @@ Genetic Optimizer
|
||||
caused by the support of a variety of <firstterm>join
|
||||
methods</firstterm> (e.g., nested loop, hash join, merge join in
|
||||
<productname>PostgreSQL</productname>) to process individual joins
|
||||
and a diversity of <firstterm>indexes</firstterm> (e.g., R-tree,
|
||||
B-tree, hash in <productname>PostgreSQL</productname>) as access
|
||||
and a diversity of <firstterm>indexes</firstterm> (e.g.,
|
||||
B-tree, hash, GiST in <productname>PostgreSQL</productname>) as access
|
||||
paths for relations.
|
||||
</para>
|
||||
|
||||
|
@ -1,25 +1,22 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.24 2005/11/04 23:14:00 petere Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.25 2005/11/07 17:36:44 tgl Exp $
|
||||
-->
|
||||
|
||||
<chapter id="GiST">
|
||||
<title>GiST Indexes</title>
|
||||
|
||||
<sect1 id="gist-intro">
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>index</primary>
|
||||
<secondary>GiST</secondary>
|
||||
</indexterm>
|
||||
<indexterm>
|
||||
<primary>GiST</primary>
|
||||
<see>index</see>
|
||||
</indexterm>
|
||||
|
||||
<sect1 id="gist-intro">
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>
|
||||
<acronym>GiST</acronym> stands for Generalized Search Tree. It is a
|
||||
balanced, tree-structured access method, that acts as a base template in
|
||||
which to implement arbitrary indexing schemes. B+-trees, R-trees and many
|
||||
which to implement arbitrary indexing schemes. B-trees, R-trees and many
|
||||
other indexing schemes can be implemented in <acronym>GiST</acronym>.
|
||||
</para>
|
||||
|
||||
@ -60,17 +57,17 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.24 2005/11/04 23:14:00 petere Exp
|
||||
<para>
|
||||
This extensibility should not be confused with the extensibility of the
|
||||
other standard search trees in terms of the data they can handle. For
|
||||
example, <productname>PostgreSQL</productname> supports extensible B+-trees
|
||||
and R-trees. That means that you can use
|
||||
<productname>PostgreSQL</productname> to build a B+-tree or R-tree over any
|
||||
data type you want. But B+-trees only support range predicates
|
||||
example, <productname>PostgreSQL</productname> supports extensible B-trees
|
||||
and hash indexes. That means that you can use
|
||||
<productname>PostgreSQL</productname> to build a B-tree or hash over any
|
||||
data type you want. But B-trees only support range predicates
|
||||
(<literal><</literal>, <literal>=</literal>, <literal>></literal>),
|
||||
and R-trees only support n-D range queries (contains, contained, equals).
|
||||
and hash indexes only support equality queries.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
So if you index, say, an image collection with a
|
||||
<productname>PostgreSQL</productname> B+-tree, you can only issue queries
|
||||
<productname>PostgreSQL</productname> B-tree, you can only issue queries
|
||||
such as <quote>is imagex equal to imagey</quote>, <quote>is imagex less
|
||||
than imagey</quote> and <quote>is imagex greater than imagey</quote>?
|
||||
Depending on how you define <quote>equals</quote>, <quote>less than</quote>
|
||||
@ -84,7 +81,7 @@ $PostgreSQL: pgsql/doc/src/sgml/gist.sgml,v 1.24 2005/11/04 23:14:00 petere Exp
|
||||
All it takes to get a <acronym>GiST</acronym> access method up and running
|
||||
is to implement seven user-defined methods, which define the behavior of
|
||||
keys in the tree. Of course these methods have to be pretty fancy to
|
||||
support fancy queries, but for all the standard queries (B+-trees,
|
||||
support fancy queries, but for all the standard queries (B-trees,
|
||||
R-trees, etc.) they're relatively straightforward. In short,
|
||||
<acronym>GiST</acronym> combines extensibility along with generality, code
|
||||
reuse, and a clean interface.
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.54 2005/11/04 23:14:00 petere Exp $ -->
|
||||
<!-- $PostgreSQL: pgsql/doc/src/sgml/indices.sgml,v 1.55 2005/11/07 17:36:44 tgl Exp $ -->
|
||||
|
||||
<chapter id="indexes">
|
||||
<title id="indexes-title">Indexes</title>
|
||||
@ -104,7 +104,7 @@ CREATE INDEX test1_id_index ON test1 (id);
|
||||
|
||||
<para>
|
||||
<productname>PostgreSQL</productname> provides several index types:
|
||||
B-tree, R-tree, Hash, and GiST. Each index type uses a different
|
||||
B-tree, Hash, and GiST. Each index type uses a different
|
||||
algorithm that is best suited to different types of queries.
|
||||
By default, the <command>CREATE INDEX</command> command will create a
|
||||
B-tree index, which fits the most common situations.
|
||||
@ -155,20 +155,51 @@ CREATE INDEX test1_id_index ON test1 (id);
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>index</primary>
|
||||
<secondary>R-tree</secondary>
|
||||
<secondary>hash</secondary>
|
||||
</indexterm>
|
||||
<indexterm>
|
||||
<primary>R-tree</primary>
|
||||
<primary>hash</primary>
|
||||
<see>index</see>
|
||||
</indexterm>
|
||||
R-tree indexes are suited for queries on two-dimensional spatial data.
|
||||
To create an R-tree index, use a command of the form
|
||||
Hash indexes can only handle simple equality comparisons.
|
||||
The query planner will consider using a hash index whenever an
|
||||
indexed column is involved in a comparison using the
|
||||
<literal>=</literal> operator. The following command is used to
|
||||
create a hash index:
|
||||
<synopsis>
|
||||
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING rtree (<replaceable>column</replaceable>);
|
||||
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING hash (<replaceable>column</replaceable>);
|
||||
</synopsis>
|
||||
The <productname>PostgreSQL</productname> query planner will
|
||||
consider using an R-tree index whenever an indexed column is
|
||||
involved in a comparison using one of these operators:
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Testing has shown <productname>PostgreSQL</productname>'s hash
|
||||
indexes to perform no better than B-tree indexes, and the
|
||||
index size and build time for hash indexes is much worse.
|
||||
Furthermore, hash index operations are not presently WAL-logged,
|
||||
so hash indexes may need to be rebuilt with <command>REINDEX</>
|
||||
after a database crash.
|
||||
For these reasons, hash index use is presently discouraged.
|
||||
</para>
|
||||
</note>
|
||||
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>index</primary>
|
||||
<secondary>GiST</secondary>
|
||||
</indexterm>
|
||||
<indexterm>
|
||||
<primary>GiST</primary>
|
||||
<see>index</see>
|
||||
</indexterm>
|
||||
GiST indexes are not a single kind of index, but rather an infrastructure
|
||||
within which many different indexing strategies can be implemented.
|
||||
Accordingly, the particular operators with which a GiST index can be
|
||||
used vary depending on the indexing strategy (the <firstterm>operator
|
||||
class</>). As an example, the standard distribution of
|
||||
<productname>PostgreSQL</productname> includes GiST operator classes
|
||||
for several two-dimensional geometric data types, which support indexed
|
||||
queries using these operators:
|
||||
|
||||
<simplelist>
|
||||
<member><literal><<</literal></member>
|
||||
@ -187,64 +218,10 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
|
||||
|
||||
(See <xref linkend="functions-geometry"> for the meaning of
|
||||
these operators.)
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>index</primary>
|
||||
<secondary>hash</secondary>
|
||||
</indexterm>
|
||||
<indexterm>
|
||||
<primary>hash</primary>
|
||||
<see>index</see>
|
||||
</indexterm>
|
||||
Hash indexes can only handle simple equality comparisons.
|
||||
The query planner will consider using a hash index whenever an
|
||||
indexed column is involved in a comparison using the
|
||||
<literal>=</literal> operator. The following command is used to
|
||||
create a hash index:
|
||||
<synopsis>
|
||||
CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable> USING hash (<replaceable>column</replaceable>);
|
||||
</synopsis>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
GiST indexes are not a single kind of index, but rather an infrastructure
|
||||
within which many different indexing strategies can be implemented.
|
||||
Accordingly, the particular operators with which a GiST index can be
|
||||
used vary depending on the indexing strategy (the <firstterm>operator
|
||||
class</>). The standard distribution of
|
||||
<productname>PostgreSQL</productname> includes GiST operator classes
|
||||
equivalent to the R-tree operator classes, and many other GiST operator
|
||||
Many other GiST operator
|
||||
classes are available in the <literal>contrib</> collection or as separate
|
||||
projects. For more information see <xref linkend="GiST">.
|
||||
</para>
|
||||
|
||||
<note>
|
||||
<para>
|
||||
Testing has shown <productname>PostgreSQL</productname>'s hash
|
||||
indexes to perform no better than B-tree indexes, and the
|
||||
index size and build time for hash indexes is much worse.
|
||||
Furthermore, hash index operations are not presently WAL-logged,
|
||||
so hash indexes may need to be rebuilt with <command>REINDEX</>
|
||||
after a database crash.
|
||||
For these reasons, hash index use is presently discouraged.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Similarly, R-tree indexes do not seem to have any performance
|
||||
advantages compared to the equivalent operations of GiST indexes.
|
||||
Like hash indexes, they are not WAL-logged and may need
|
||||
reindexing after a database crash.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
While the problems with hash indexes may be fixed eventually,
|
||||
it is likely that the R-tree index type will be retired in a future
|
||||
release. Users are encouraged to migrate applications that use R-tree
|
||||
indexes to GiST indexes.
|
||||
</para>
|
||||
</note>
|
||||
</sect1>
|
||||
|
||||
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.52 2005/10/21 01:41:28 tgl Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/mvcc.sgml,v 2.53 2005/11/07 17:36:44 tgl Exp $
|
||||
-->
|
||||
|
||||
<chapter id="mvcc">
|
||||
@ -991,18 +991,6 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
|
||||
<varlistentry>
|
||||
<term>
|
||||
R-tree indexes
|
||||
</term>
|
||||
<listitem>
|
||||
<para>
|
||||
Share/exclusive index-level locks are used for read/write access.
|
||||
Locks are released after the entire command is done.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
|
||||
@ -1012,8 +1000,7 @@ UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 22222;
|
||||
indexes, they are the recommended index type for concurrent
|
||||
applications that need to index scalar data. When dealing with
|
||||
non-scalar data, B-trees are not useful, and GiST indexes should
|
||||
be used instead. R-tree indexes are deprecated and are likely
|
||||
to disappear entirely in a future release.
|
||||
be used instead.
|
||||
</para>
|
||||
</sect1>
|
||||
</chapter>
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/ref/create_index.sgml,v 1.51 2005/01/04 00:39:53 tgl Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/ref/create_index.sgml,v 1.52 2005/11/07 17:36:44 tgl Exp $
|
||||
PostgreSQL documentation
|
||||
-->
|
||||
|
||||
@ -34,7 +34,7 @@ CREATE [ UNIQUE ] INDEX <replaceable class="parameter">name</replaceable> ON <re
|
||||
<command>CREATE INDEX</command> constructs an index <replaceable
|
||||
class="parameter">index_name</replaceable> on the specified table.
|
||||
Indexes are primarily used to enhance database performance (though
|
||||
inappropriate use will result in slower performance).
|
||||
inappropriate use can result in slower performance).
|
||||
</para>
|
||||
|
||||
<para>
|
||||
@ -55,11 +55,7 @@ CREATE [ UNIQUE ] INDEX <replaceable class="parameter">name</replaceable> ON <re
|
||||
|
||||
<para>
|
||||
<productname>PostgreSQL</productname> provides the index methods
|
||||
B-tree, R-tree, hash, and GiST. The B-tree index method is an
|
||||
implementation of Lehman-Yao high-concurrency B-trees. The R-tree
|
||||
index method implements standard R-trees using Guttman's quadratic
|
||||
split algorithm. The hash index method is an implementation of
|
||||
Litwin's linear hashing. Users can also define their own index
|
||||
B-tree, hash, and GiST. Users can also define their own index
|
||||
methods, but that is fairly complicated.
|
||||
</para>
|
||||
|
||||
@ -137,9 +133,9 @@ CREATE [ UNIQUE ] INDEX <replaceable class="parameter">name</replaceable> ON <re
|
||||
<term><replaceable class="parameter">method</replaceable></term>
|
||||
<listitem>
|
||||
<para>
|
||||
The name of the method to be used for the index. Choices are
|
||||
The name of the index method to be used. Choices are
|
||||
<literal>btree</literal>, <literal>hash</literal>,
|
||||
<literal>rtree</literal>, and <literal>gist</literal>. The
|
||||
and <literal>gist</literal>. The
|
||||
default method is <literal>btree</literal>.
|
||||
</para>
|
||||
</listitem>
|
||||
@ -243,6 +239,15 @@ CREATE [ UNIQUE ] INDEX <replaceable class="parameter">name</replaceable> ON <re
|
||||
The best way to use indexes in such cases is to create a partial index
|
||||
using an <literal>IS NULL</> predicate.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Prior releases of <productname>PostgreSQL</productname> also had an
|
||||
R-tree index method. This method has been removed because
|
||||
it had no significant advantages over the GiST method.
|
||||
If <literal>USING rtree</> is specified, <command>CREATE INDEX</>
|
||||
will interpret it as <literal>USING gist</>, to simplify conversion
|
||||
of old databases to GiST.
|
||||
</para>
|
||||
</refsect1>
|
||||
|
||||
<refsect1>
|
||||
@ -270,13 +275,13 @@ CREATE INDEX code_idx ON films(code) TABLESPACE indexspace;
|
||||
Is this example correct?
|
||||
</comment>
|
||||
<para>
|
||||
To create a R-tree index on a point attribute so that we
|
||||
To create a GiST index on a point attribute so that we
|
||||
can efficiently use box operators on the result of the
|
||||
conversion function:
|
||||
</para>
|
||||
<programlisting>
|
||||
CREATE INDEX pointloc
|
||||
ON points USING RTREE (point2box(location) box_ops);
|
||||
ON points USING GIST (point2box(location) box_ops);
|
||||
SELECT * FROM points
|
||||
WHERE point2box(points.pointloc) = boxes.box;
|
||||
</programlisting>
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.41 2005/07/19 01:27:59 neilc Exp $
|
||||
$PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.42 2005/11/07 17:36:44 tgl Exp $
|
||||
-->
|
||||
|
||||
<sect1 id="xindex">
|
||||
@ -170,8 +170,12 @@ $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.41 2005/07/19 01:27:59 neilc Exp
|
||||
</table>
|
||||
|
||||
<para>
|
||||
R-tree indexes express relationships in two-dimensional space.
|
||||
They use twelve strategies, shown in
|
||||
GiST indexes are even more flexible: they do not have a fixed set of
|
||||
strategies at all. Instead, the <quote>consistency</> support routine
|
||||
of each particular GiST operator class interprets the strategy numbers
|
||||
however it likes. As an example, several of the built-in GiST index
|
||||
operator classes index two-dimensional geometric objects, providing
|
||||
the <quote>R-tree</> strategies shown in
|
||||
<xref linkend="xindex-rtree-strat-table">. Four of these are true
|
||||
two-dimensional tests (overlaps, same, contains, contained by);
|
||||
four of them consider only the X direction; and the other four
|
||||
@ -179,7 +183,7 @@ $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.41 2005/07/19 01:27:59 neilc Exp
|
||||
</para>
|
||||
|
||||
<table tocentry="1" id="xindex-rtree-strat-table">
|
||||
<title>R-tree Strategies</title>
|
||||
<title>GiST Two-Dimensional <quote>R-tree</> Strategies</title>
|
||||
<tgroup cols="2">
|
||||
<thead>
|
||||
<row>
|
||||
@ -240,13 +244,6 @@ $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.41 2005/07/19 01:27:59 neilc Exp
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
GiST indexes are even more flexible: they do not have a fixed set of
|
||||
strategies at all. Instead, the <quote>consistency</> support routine
|
||||
of each particular GiST operator class interprets the strategy numbers
|
||||
however it likes.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
Note that all strategy operators return Boolean values. In
|
||||
practice, all operators defined as index method strategies must
|
||||
@ -274,9 +271,8 @@ $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.41 2005/07/19 01:27:59 neilc Exp
|
||||
additional support routines in order to work. For example, the B-tree
|
||||
index method must be able to compare two keys and determine whether one
|
||||
is greater than, equal to, or less than the other. Similarly, the
|
||||
R-tree index method must be able to compute
|
||||
intersections, unions, and sizes of rectangles. These
|
||||
operations do not correspond to operators used in qualifications in
|
||||
hash index method must be able to compute hash codes for key values.
|
||||
These operations do not correspond to operators used in qualifications in
|
||||
SQL commands; they are administrative routines used by
|
||||
the index methods, internally.
|
||||
</para>
|
||||
@ -339,37 +335,6 @@ $PostgreSQL: pgsql/doc/src/sgml/xindex.sgml,v 1.41 2005/07/19 01:27:59 neilc Exp
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
R-tree indexes require three support functions,
|
||||
shown in <xref linkend="xindex-rtree-support-table">.
|
||||
</para>
|
||||
|
||||
<table tocentry="1" id="xindex-rtree-support-table">
|
||||
<title>R-tree Support Functions</title>
|
||||
<tgroup cols="2">
|
||||
<thead>
|
||||
<row>
|
||||
<entry>Function</entry>
|
||||
<entry>Support Number</entry>
|
||||
</row>
|
||||
</thead>
|
||||
<tbody>
|
||||
<row>
|
||||
<entry>union</entry>
|
||||
<entry>1</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>intersection</entry>
|
||||
<entry>2</entry>
|
||||
</row>
|
||||
<row>
|
||||
<entry>size</entry>
|
||||
<entry>3</entry>
|
||||
</row>
|
||||
</tbody>
|
||||
</tgroup>
|
||||
</table>
|
||||
|
||||
<para>
|
||||
GiST indexes require seven support functions,
|
||||
shown in <xref linkend="xindex-gist-support-table">.
|
||||
@ -746,7 +711,7 @@ SELECT * FROM table WHERE integer_column < 4;
|
||||
</programlisting>
|
||||
can be satisfied exactly by a B-tree index on the integer column.
|
||||
But there are cases where an index is useful as an inexact guide to
|
||||
the matching rows. For example, if an R-tree index stores only
|
||||
the matching rows. For example, if a GiST index stores only
|
||||
bounding boxes for objects, then it cannot exactly satisfy a <literal>WHERE</>
|
||||
condition that tests overlap between nonrectangular objects such as
|
||||
polygons. Yet we could use the index to find objects whose bounding
|
||||
|
Reference in New Issue
Block a user