mirror of
https://github.com/postgres/postgres.git
synced 2025-05-08 07:21:33 +03:00
columns, and the new version can be stored on the same heap page, we no longer generate extra index entries for the new version. Instead, index searches follow the HOT-chain links to ensure they find the correct tuple version. In addition, this patch introduces the ability to "prune" dead tuples on a per-page basis, without having to do a complete VACUUM pass to recover space. VACUUM is still needed to clean up dead index entries, however. Pavan Deolasee, with help from a bunch of other people.
538 lines
19 KiB
Plaintext
538 lines
19 KiB
Plaintext
<!--
|
|
$PostgreSQL: pgsql/doc/src/sgml/ref/create_index.sgml,v 1.65 2007/09/20 17:56:30 tgl Exp $
|
|
PostgreSQL documentation
|
|
-->
|
|
|
|
<refentry id="SQL-CREATEINDEX">
|
|
<refmeta>
|
|
<refentrytitle id="sql-createindex-title">CREATE INDEX</refentrytitle>
|
|
<refmiscinfo>SQL - Language Statements</refmiscinfo>
|
|
</refmeta>
|
|
|
|
<refnamediv>
|
|
<refname>CREATE INDEX</refname>
|
|
<refpurpose>define a new index</refpurpose>
|
|
</refnamediv>
|
|
|
|
<indexterm zone="sql-createindex">
|
|
<primary>CREATE INDEX</primary>
|
|
</indexterm>
|
|
|
|
<refsynopsisdiv>
|
|
<synopsis>
|
|
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] <replaceable class="parameter">name</replaceable> ON <replaceable class="parameter">table</replaceable> [ USING <replaceable class="parameter">method</replaceable> ]
|
|
( { <replaceable class="parameter">column</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ <replaceable class="parameter">opclass</replaceable> ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] )
|
|
[ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] ) ]
|
|
[ TABLESPACE <replaceable class="parameter">tablespace</replaceable> ]
|
|
[ WHERE <replaceable class="parameter">predicate</replaceable> ]
|
|
</synopsis>
|
|
</refsynopsisdiv>
|
|
|
|
<refsect1>
|
|
<title>Description</title>
|
|
|
|
<para>
|
|
<command>CREATE INDEX</command> constructs an index <replaceable
|
|
class="parameter">index_name</replaceable> on the specified table.
|
|
Indexes are primarily used to enhance database performance (though
|
|
inappropriate use can result in slower performance).
|
|
</para>
|
|
|
|
<para>
|
|
The key field(s) for the index are specified as column names,
|
|
or alternatively as expressions written in parentheses.
|
|
Multiple fields can be specified if the index method supports
|
|
multicolumn indexes.
|
|
</para>
|
|
|
|
<para>
|
|
An index field can be an expression computed from the values of
|
|
one or more columns of the table row. This feature can be used
|
|
to obtain fast access to data based on some transformation of
|
|
the basic data. For example, an index computed on
|
|
<literal>upper(col)</> would allow the clause
|
|
<literal>WHERE upper(col) = 'JIM'</> to use an index.
|
|
</para>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</productname> provides the index methods
|
|
B-tree, hash, GiST, and GIN. Users can also define their own index
|
|
methods, but that is fairly complicated.
|
|
</para>
|
|
|
|
<para>
|
|
When the <literal>WHERE</literal> clause is present, a
|
|
<firstterm>partial index</firstterm> is created.
|
|
A partial index is an index that contains entries for only a portion of
|
|
a table, usually a portion that is more useful for indexing than the
|
|
rest of the table. For example, if you have a table that contains both
|
|
billed and unbilled orders where the unbilled orders take up a small
|
|
fraction of the total table and yet that is an often used section, you
|
|
can improve performance by creating an index on just that portion.
|
|
Another possible application is to use <literal>WHERE</literal> with
|
|
<literal>UNIQUE</literal> to enforce uniqueness over a subset of a
|
|
table. See <xref linkend="indexes-partial"> for more discussion.
|
|
</para>
|
|
|
|
<para>
|
|
The expression used in the <literal>WHERE</literal> clause can refer
|
|
only to columns of the underlying table, but it can use all columns,
|
|
not just the ones being indexed. Presently, subqueries and
|
|
aggregate expressions are also forbidden in <literal>WHERE</literal>.
|
|
The same restrictions apply to index fields that are expressions.
|
|
</para>
|
|
|
|
<para>
|
|
All functions and operators used in an index definition must be
|
|
<quote>immutable</>, that is, their results must depend only on
|
|
their arguments and never on any outside influence (such as
|
|
the contents of another table or the current time). This restriction
|
|
ensures that the behavior of the index is well-defined. To use a
|
|
user-defined function in an index expression or <literal>WHERE</literal>
|
|
clause, remember to mark the function immutable when you create it.
|
|
</para>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Parameters</title>
|
|
|
|
<variablelist>
|
|
<varlistentry>
|
|
<term><literal>UNIQUE</literal></term>
|
|
<listitem>
|
|
<para>
|
|
Causes the system to check for
|
|
duplicate values in the table when the index is created (if data
|
|
already exist) and each time data is added. Attempts to
|
|
insert or update data which would result in duplicate entries
|
|
will generate an error.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>CONCURRENTLY</literal></term>
|
|
<listitem>
|
|
<para>
|
|
When this option is used, <productname>PostgreSQL</> will build the
|
|
index without taking any locks that prevent concurrent inserts,
|
|
updates, or deletes on the table; whereas a standard index build
|
|
locks out writes (but not reads) on the table until it's done.
|
|
There are several caveats to be aware of when using this option
|
|
— see <xref linkend="SQL-CREATEINDEX-CONCURRENTLY"
|
|
endterm="SQL-CREATEINDEX-CONCURRENTLY-title">.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">name</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The name of the index to be created. No schema name can be included
|
|
here; the index is always created in the same schema as its parent
|
|
table.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">table</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The name (possibly schema-qualified) of the table to be indexed.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">method</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The name of the index method to be used. Choices are
|
|
<literal>btree</literal>, <literal>hash</literal>,
|
|
<literal>gist</literal>, and <literal>gin</>. The
|
|
default method is <literal>btree</literal>.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">column</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The name of a column of the table.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">expression</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
An expression based on one or more columns of the table. The
|
|
expression usually must be written with surrounding parentheses,
|
|
as shown in the syntax. However, the parentheses can be omitted
|
|
if the expression has the form of a function call.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">opclass</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The name of an operator class. See below for details.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>ASC</></term>
|
|
<listitem>
|
|
<para>
|
|
Specifies ascending sort order (which is the default).
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>DESC</></term>
|
|
<listitem>
|
|
<para>
|
|
Specifies descending sort order.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>NULLS FIRST</></term>
|
|
<listitem>
|
|
<para>
|
|
Specifies that nulls sort before non-nulls. This is the default
|
|
when <literal>DESC</> is specified.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><literal>NULLS LAST</></term>
|
|
<listitem>
|
|
<para>
|
|
Specifies that nulls sort after non-nulls. This is the default
|
|
when <literal>DESC</> is not specified.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">storage_parameter</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The name of an index-method-specific storage parameter. See
|
|
below for details.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">tablespace</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The tablespace in which to create the index. If not specified,
|
|
<xref linkend="guc-default-tablespace"> is consulted, or
|
|
<xref linkend="guc-temp-tablespaces"> for indexes on temporary
|
|
tables.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
<varlistentry>
|
|
<term><replaceable class="parameter">predicate</replaceable></term>
|
|
<listitem>
|
|
<para>
|
|
The constraint expression for a partial index.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
<refsect2 id="SQL-CREATEINDEX-storage-parameters">
|
|
<title id="SQL-CREATEINDEX-storage-parameters-title">Index Storage Parameters</title>
|
|
|
|
<para>
|
|
The <literal>WITH</> clause can specify <firstterm>storage parameters</>
|
|
for indexes. Each index method can have its own set of allowed storage
|
|
parameters. The built-in index methods all accept a single parameter:
|
|
</para>
|
|
|
|
<variablelist>
|
|
|
|
<varlistentry>
|
|
<term><literal>FILLFACTOR</></term>
|
|
<listitem>
|
|
<para>
|
|
The fillfactor for an index is a percentage that determines how full
|
|
the index method will try to pack index pages. For B-trees, leaf pages
|
|
are filled to this percentage during initial index build, and also
|
|
when extending the index at the right (largest key values). If pages
|
|
subsequently become completely full, they will be split, leading to
|
|
gradual degradation in the index's efficiency. B-trees use a default
|
|
fillfactor of 90, but any value from 10 to 100 can be selected.
|
|
If the table is static then fillfactor 100 is best to minimize the
|
|
index's physical size, but for heavily updated tables a smaller
|
|
fillfactor is better to minimize the need for page splits. The
|
|
other index methods use fillfactor in different but roughly analogous
|
|
ways; the default fillfactor varies between methods.
|
|
</para>
|
|
</listitem>
|
|
</varlistentry>
|
|
|
|
</variablelist>
|
|
|
|
</refsect2>
|
|
|
|
<refsect2 id="SQL-CREATEINDEX-CONCURRENTLY">
|
|
<title id="SQL-CREATEINDEX-CONCURRENTLY-title">Building Indexes Concurrently</title>
|
|
|
|
<indexterm zone="SQL-CREATEINDEX-CONCURRENTLY">
|
|
<primary>index</primary>
|
|
<secondary>building concurrently</secondary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Creating an index can interfere with regular operation of a database.
|
|
Normally <productname>PostgreSQL</> locks the table to be indexed against
|
|
writes and performs the entire index build with a single scan of the
|
|
table. Other transactions can still read the table, but if they try to
|
|
insert, update, or delete rows in the table they will block until the
|
|
index build is finished. This could have a severe effect if the system is
|
|
a live production database. Very large tables can take many hours to be
|
|
indexed, and even for smaller tables, an index build can lock out writers
|
|
for periods that are unacceptably long for a production system.
|
|
</para>
|
|
|
|
<para>
|
|
<productname>PostgreSQL</> supports building indexes without locking
|
|
out writes. This method is invoked by specifying the
|
|
<literal>CONCURRENTLY</> option of <command>CREATE INDEX</>.
|
|
When this option is used,
|
|
<productname>PostgreSQL</> must perform two scans of the table, and in
|
|
addition it must wait for all existing transactions that could potentially
|
|
use the index to terminate. Thus
|
|
this method requires more total work than a standard index build and takes
|
|
significantly longer to complete. However, since it allows normal
|
|
operations to continue while the index is built, this method is useful for
|
|
adding new indexes in a production environment. Of course, the extra CPU
|
|
and I/O load imposed by the index creation might slow other operations.
|
|
</para>
|
|
|
|
<para>
|
|
In a concurrent index build, the index is actually entered into the
|
|
system catalogs in one transaction, then the two table scans occur in a
|
|
second and third transaction.
|
|
If a problem arises while scanning the table, such as a
|
|
uniqueness violation in a unique index, the <command>CREATE INDEX</>
|
|
command will fail but leave behind an <quote>invalid</> index. This index
|
|
will be ignored for querying purposes because it might be incomplete;
|
|
however it will still consume update overhead. The <application>psql</>
|
|
<command>\d</> command will mark such an index as <literal>INVALID</>:
|
|
|
|
<programlisting>
|
|
postgres=# \d tab
|
|
Table "public.tab"
|
|
Column | Type | Modifiers
|
|
--------+---------+-----------
|
|
col | integer |
|
|
Indexes:
|
|
"idx" btree (col) INVALID
|
|
</programlisting>
|
|
|
|
The recommended recovery
|
|
method in such cases is to drop the index and try again to perform
|
|
<command>CREATE INDEX CONCURRENTLY</>. (Another possibility is to rebuild
|
|
the index with <command>REINDEX</>. However, since <command>REINDEX</>
|
|
does not support concurrent builds, this option is unlikely to seem
|
|
attractive.)
|
|
</para>
|
|
|
|
<para>
|
|
Another caveat when building a unique index concurrently is that the
|
|
uniqueness constraint is already being enforced against other transactions
|
|
when the second table scan begins. This means that constraint violations
|
|
could be reported in other queries prior to the index becoming available
|
|
for use, or even in cases where the index build eventually fails. Also,
|
|
if a failure does occur in the second scan, the <quote>invalid</> index
|
|
continues to enforce its uniqueness constraint afterwards.
|
|
</para>
|
|
|
|
<para>
|
|
Concurrent builds of expression indexes and partial indexes are supported.
|
|
Errors occurring in the evaluation of these expressions could cause
|
|
behavior similar to that described above for unique constraint violations.
|
|
</para>
|
|
|
|
<para>
|
|
Regular index builds permit other regular index builds on the
|
|
same table to occur in parallel, but only one concurrent index build
|
|
can occur on a table at a time. In both cases, no other types of schema
|
|
modification on the table are allowed meanwhile. Another difference
|
|
is that a regular <command>CREATE INDEX</> command can be performed within
|
|
a transaction block, but <command>CREATE INDEX CONCURRENTLY</> cannot.
|
|
</para>
|
|
</refsect2>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Notes</title>
|
|
|
|
<para>
|
|
See <xref linkend="indexes"> for information about when indexes can
|
|
be used, when they are not used, and in which particular situations
|
|
they can be useful.
|
|
</para>
|
|
|
|
<para>
|
|
Currently, only the B-tree and GiST index methods support
|
|
multicolumn indexes. Up to 32 fields can be specified by default.
|
|
(This limit can be altered when building
|
|
<productname>PostgreSQL</productname>.) Only B-tree currently
|
|
supports unique indexes.
|
|
</para>
|
|
|
|
<para>
|
|
An <firstterm>operator class</firstterm> can be specified for each
|
|
column of an index. The operator class identifies the operators to be
|
|
used by the index for that column. For example, a B-tree index on
|
|
four-byte integers would use the <literal>int4_ops</literal> class;
|
|
this operator class includes comparison functions for four-byte
|
|
integers. In practice the default operator class for the column's data
|
|
type is usually sufficient. The main point of having operator classes
|
|
is that for some data types, there could be more than one meaningful
|
|
ordering. For example, we might want to sort a complex-number data
|
|
type either by absolute value or by real part. We could do this by
|
|
defining two operator classes for the data type and then selecting
|
|
the proper class when making an index. More information about
|
|
operator classes is in <xref linkend="indexes-opclass"> and in <xref
|
|
linkend="xindex">.
|
|
</para>
|
|
|
|
<para>
|
|
For index methods that support ordered scans (currently, only B-tree),
|
|
the optional clauses <literal>ASC</>, <literal>DESC</>, <literal>NULLS
|
|
FIRST</>, and/or <literal>NULLS LAST</> can be specified to reverse
|
|
the normal sort direction of the index. Since an ordered index can be
|
|
scanned either forward or backward, it is not normally useful to create a
|
|
single-column <literal>DESC</> index — that sort ordering is already
|
|
available with a regular index. The value of these options is that
|
|
multicolumn indexes can be created that match the sort ordering requested
|
|
by a mixed-ordering query, such as <literal>SELECT ... ORDER BY x ASC, y
|
|
DESC</>. The <literal>NULLS</> options are useful if you need to support
|
|
<quote>nulls sort low</> behavior, rather than the default <quote>nulls
|
|
sort high</>, in queries that depend on indexes to avoid sorting steps.
|
|
</para>
|
|
|
|
<para>
|
|
Use <xref linkend="sql-dropindex" endterm="sql-dropindex-title">
|
|
to remove an index.
|
|
</para>
|
|
|
|
<para>
|
|
Prior releases of <productname>PostgreSQL</productname> also had an
|
|
R-tree index method. This method has been removed because
|
|
it had no significant advantages over the GiST method.
|
|
If <literal>USING rtree</> is specified, <command>CREATE INDEX</>
|
|
will interpret it as <literal>USING gist</>, to simplify conversion
|
|
of old databases to GiST.
|
|
</para>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Examples</title>
|
|
|
|
<para>
|
|
To create a B-tree index on the column <literal>title</literal> in
|
|
the table <literal>films</literal>:
|
|
<programlisting>
|
|
CREATE UNIQUE INDEX title_idx ON films (title);
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To create an index on the expression <literal>lower(title)</>,
|
|
allowing efficient case-insensitive searches:
|
|
<programlisting>
|
|
CREATE INDEX lower_title_idx ON films ((lower(title)));
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To create an index with non-default sort ordering of nulls:
|
|
<programlisting>
|
|
CREATE INDEX title_idx_nulls_low ON films (title NULLS FIRST);
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To create an index with non-default fill factor:
|
|
<programlisting>
|
|
CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70);
|
|
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
To create an index on the column <literal>code</> in the table
|
|
<literal>films</> and have the index reside in the tablespace
|
|
<literal>indexspace</>:
|
|
<programlisting>
|
|
CREATE INDEX code_idx ON films(code) TABLESPACE indexspace;
|
|
</programlisting>
|
|
</para>
|
|
|
|
<!--
|
|
<comment>
|
|
Is this example correct?
|
|
</comment>
|
|
<para>
|
|
To create a GiST index on a point attribute so that we
|
|
can efficiently use box operators on the result of the
|
|
conversion function:
|
|
<programlisting>
|
|
CREATE INDEX pointloc
|
|
ON points USING GIST (point2box(location) box_ops);
|
|
SELECT * FROM points
|
|
WHERE point2box(points.pointloc) = boxes.box;
|
|
</programlisting>
|
|
</para>
|
|
-->
|
|
|
|
<para>
|
|
To create an index without locking out writes to the table:
|
|
<programlisting>
|
|
CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity);
|
|
</programlisting>
|
|
</para>
|
|
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>Compatibility</title>
|
|
|
|
<para>
|
|
<command>CREATE INDEX</command> is a
|
|
<productname>PostgreSQL</productname> language extension. There
|
|
are no provisions for indexes in the SQL standard.
|
|
</para>
|
|
</refsect1>
|
|
|
|
<refsect1>
|
|
<title>See Also</title>
|
|
|
|
<simplelist type="inline">
|
|
<member><xref linkend="sql-alterindex" endterm="sql-alterindex-title"></member>
|
|
<member><xref linkend="sql-dropindex" endterm="sql-dropindex-title"></member>
|
|
</simplelist>
|
|
</refsect1>
|
|
</refentry>
|