mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00

Replace the built-in GIN array opclasses with a single polymorphic opclass.

We had thirty different GIN array opclasses sharing the same operators and
support functions.  That still didn't cover all the built-in types, nor
did it cover arrays of extension-added types.  What we want is a single
polymorphic opclass for "anyarray".  There were two missing features needed
to make this possible:

1. We have to be able to declare the index storage type as ANYELEMENT
when the opclass is declared to index ANYARRAY.  This just takes a few
more lines in index_create().  Although this currently seems of use only
for GIN, there's no reason to make index_create() restrict it to that.

2. We have to be able to identify the proper GIN compare function for
the index storage type.  This patch proceeds by making the compare function
optional in GIN opclass definitions, and specifying that the default btree
comparison function for the index storage type will be looked up when the
opclass omits it.  Again, that seems pretty generically useful.
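
As a rough illustration only, the two features above can be sketched with toy
lookup tables in place of the system catalogs.  This is not the code in
index_create() or initGinState(); every name below (TypeInfo, OpClass,
type_table, resolve_storage_type, resolve_compare) is a hypothetical stand-in
invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for catalog lookups; none of these names are PostgreSQL APIs. */
typedef int (*compare_fn)(const void *a, const void *b);

typedef struct TypeInfo {
    const char *name;          /* type name, e.g. "int4" or "_int4" */
    const char *element_name;  /* element type for array types, else NULL */
    compare_fn  btree_compare; /* default btree comparison; NULL where this sketch never needs one */
} TypeInfo;

typedef struct OpClass {
    const char *input_type;    /* "anyarray" for a polymorphic opclass */
    const char *storage_type;  /* "anyelement", a concrete name, or NULL (same as input) */
    compare_fn  compare;       /* NULL when the opclass omits compare */
} OpClass;

static int cmp_int4(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

static int cmp_text(const void *a, const void *b)
{
    return strcmp((const char *) a, (const char *) b);
}

/* Assumed toy type table: two scalar types and their array types. */
static const TypeInfo type_table[] = {
    {"int4", NULL, cmp_int4},
    {"text", NULL, cmp_text},
    {"_int4", "int4", NULL},
    {"_text", "text", NULL},
};

static const TypeInfo *lookup_type(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(type_table) / sizeof(type_table[0]); i++)
        if (strcmp(type_table[i].name, name) == 0)
            return &type_table[i];
    return NULL;
}

/* Feature 1: map ANYELEMENT storage to the element type of the actual column. */
static const char *resolve_storage_type(const OpClass *opc, const char *column_type)
{
    const TypeInfo *col;

    assert(opc != NULL && column_type != NULL);
    if (opc->storage_type == NULL)
        return column_type;             /* storage is the same as the input type */
    if (strcmp(opc->storage_type, "anyelement") != 0)
        return opc->storage_type;       /* concrete storage type, use as declared */
    if (strcmp(opc->input_type, "anyarray") != 0)
        return NULL;                    /* ANYELEMENT storage requires ANYARRAY input */
    col = lookup_type(column_type);
    return col ? col->element_name : NULL;  /* NULL: not a known array type */
}

/* Feature 2: use the opclass's compare if present, else the btree default. */
static compare_fn resolve_compare(const OpClass *opc, const char *storage_type)
{
    const TypeInfo *st;

    assert(opc != NULL);
    if (opc->compare != NULL)
        return opc->compare;
    st = storage_type ? lookup_type(storage_type) : NULL;
    return st ? st->btree_compare : NULL;
}
```

With an opclass declared as {"anyarray", "anyelement", NULL} and a column of
the toy type "_int4", the sketch resolves the storage type to "int4" and then
falls back to that type's btree comparison, which mirrors the behavior
described above for the polymorphic opclass.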

Since the comparison function lookup is done in initGinState(), making
use of the second feature adds an additional cache lookup to GIN index
access setup.  It seems unlikely that that would be very noticeable given
the other costs involved, but maybe at some point we should consider
making GinState data persist longer than it now does --- we could keep it
in the index relcache entry, perhaps.

Rather fortuitously, we don't seem to need to do anything to get this
change to play nice with dump/reload or pg_upgrade scenarios: the new
opclass definition is automatically selected to replace existing index
definitions, and the on-disk data remains compatible.  Also, if a user has
created a custom opclass definition for a non-builtin type, this doesn't
break that, since CREATE INDEX will prefer an exact match to opcintype
over a match to ANYARRAY.  However, if there's anyone out there with
handwritten DDL that explicitly specifies _bool_ops or one of the other
replaced opclass names, they'll need to adjust that.
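
The exact-match preference can be pictured with the following sketch.  It is a
deliberate simplification and not the actual opclass lookup code: the
OpClassEntry struct, the choose_opclass function, and the column_is_array flag
are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a default opclass candidate. */
typedef struct OpClassEntry {
    const char *name;        /* opclass name */
    const char *input_type;  /* declared input type, possibly "anyarray" */
} OpClassEntry;

/*
 * Pick an opclass for a column: an exact match on the input type wins,
 * and an ANYARRAY opclass is used only as a fallback for array columns.
 * Returns NULL when nothing applies.
 */
static const OpClassEntry *choose_opclass(const OpClassEntry *classes, size_t n,
                                          const char *column_type,
                                          int column_is_array)
{
    const OpClassEntry *fallback = NULL;
    size_t i;

    assert(classes != NULL || n == 0);
    assert(column_type != NULL);
    for (i = 0; i < n; i++)
    {
        if (strcmp(classes[i].input_type, column_type) == 0)
            return &classes[i];     /* exact match wins immediately */
        if (column_is_array && fallback == NULL &&
            strcmp(classes[i].input_type, "anyarray") == 0)
            fallback = &classes[i]; /* remember the polymorphic candidate */
    }
    return fallback;
}
```

Given candidates "array_ops" (for anyarray) and a made-up user opclass
"widget_array_ops" declared for a made-up type "_widget", the sketch returns
the user opclass for a "_widget" column and "array_ops" for any other array
column, so a custom opclass keeps taking precedence.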

Tom Lane, reviewed by Enrique Meneses

Discussion: <14436.1470940379@sss.pgh.pa.us>
Tom Lane
2016-09-26 14:52:44 -04:00
parent a4afb2b5c0
commit fdc9186f7e
11 changed files with 101 additions and 503 deletions


@@ -85,298 +85,8 @@
</thead>
<tbody>
<row>
<entry><literal>_abstime_ops</></entry>
<entry><type>abstime[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_bit_ops</></entry>
<entry><type>bit[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_bool_ops</></entry>
<entry><type>boolean[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_bpchar_ops</></entry>
<entry><type>character[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_bytea_ops</></entry>
<entry><type>bytea[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_char_ops</></entry>
<entry><type>"char"[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_cidr_ops</></entry>
<entry><type>cidr[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_date_ops</></entry>
<entry><type>date[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_float4_ops</></entry>
<entry><type>float4[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_float8_ops</></entry>
<entry><type>float8[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_inet_ops</></entry>
<entry><type>inet[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_int2_ops</></entry>
<entry><type>smallint[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_int4_ops</></entry>
<entry><type>integer[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_int8_ops</></entry>
<entry><type>bigint[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_interval_ops</></entry>
<entry><type>interval[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_macaddr_ops</></entry>
<entry><type>macaddr[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_money_ops</></entry>
<entry><type>money[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_name_ops</></entry>
<entry><type>name[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_numeric_ops</></entry>
<entry><type>numeric[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_oid_ops</></entry>
<entry><type>oid[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_oidvector_ops</></entry>
<entry><type>oidvector[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_reltime_ops</></entry>
<entry><type>reltime[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_text_ops</></entry>
<entry><type>text[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_time_ops</></entry>
<entry><type>time[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_timestamp_ops</></entry>
<entry><type>timestamp[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_timestamptz_ops</></entry>
<entry><type>timestamp with time zone[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_timetz_ops</></entry>
<entry><type>time with time zone[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_tinterval_ops</></entry>
<entry><type>tinterval[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_varbit_ops</></entry>
<entry><type>bit varying[]</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
<literal>=</>
<literal>@&gt;</>
</entry>
</row>
<row>
<entry><literal>_varchar_ops</></entry>
<entry><type>character varying[]</></entry>
<entry><literal>array_ops</></entry>
<entry><type>anyarray</></entry>
<entry>
<literal>&amp;&amp;</>
<literal>&lt;@</>
@@ -441,22 +151,10 @@
</para>
<para>
There are three methods that an operator class for
There are two methods that an operator class for
<acronym>GIN</acronym> must provide:
<variablelist>
<varlistentry>
<term><function>int compare(Datum a, Datum b)</></term>
<listitem>
<para>
Compares two keys (not indexed items!) and returns an integer less than
zero, zero, or greater than zero, indicating whether the first key is
less than, equal to, or greater than the second. Null keys are never
passed to this function.
</para>
</listitem>
</varlistentry>
<variablelist>
<varlistentry>
<term><function>Datum *extractValue(Datum itemValue, int32 *nkeys,
bool **nullFlags)</></term>
@@ -645,7 +343,38 @@
</listitem>
</varlistentry>
</variablelist>
</para>
<para>
In addition, GIN must have a way to sort the key values stored in the index.
The operator class can define the sort ordering by specifying a comparison
method:
<variablelist>
<varlistentry>
<term><function>int compare(Datum a, Datum b)</></term>
<listitem>
<para>
Compares two keys (not indexed items!) and returns an integer less than
zero, zero, or greater than zero, indicating whether the first key is
less than, equal to, or greater than the second. Null keys are never
passed to this function.
</para>
</listitem>
</varlistentry>
</variablelist>
Alternatively, if the operator class does not provide a <function>compare</>
method, GIN will look up the default btree operator class for the index
key data type, and use its comparison function. It is recommended to
specify the comparison function in a GIN operator class that is meant for
just one data type, as looking up the btree operator class costs a few
cycles. However, polymorphic GIN operator classes (such
as <literal>array_ops</>) typically cannot specify a single comparison
function.
</para>
<para>
Optionally, an operator class for <acronym>GIN</acronym> can supply the
following method:
@@ -900,11 +629,9 @@
<title>Examples</title>
<para>
The <productname>PostgreSQL</productname> source distribution includes
<acronym>GIN</acronym> operator classes for <type>tsvector</> and
for one-dimensional arrays of all internal types. Prefix searching in
<type>tsvector</> is implemented using the <acronym>GIN</> partial match
feature.
The core <productname>PostgreSQL</> distribution
includes the <acronym>GIN</acronym> operator classes previously shown in
<xref linkend="gin-builtin-opclasses-table">.
The following <filename>contrib</> modules also contain
<acronym>GIN</acronym> operator classes:


@@ -315,9 +315,8 @@ SELECT * FROM places ORDER BY location <-> point '(101,456)' LIMIT 10;
operators with which a GIN index can be used vary depending on the
indexing strategy.
As an example, the standard distribution of
<productname>PostgreSQL</productname> includes GIN operator classes
for one-dimensional arrays, which support indexed
queries using these operators:
<productname>PostgreSQL</productname> includes a GIN operator class
for arrays, which supports indexed queries using these operators:
<simplelist>
<member><literal>&lt;@</literal></member>


@@ -235,6 +235,11 @@ CREATE OPERATOR CLASS <replaceable class="parameter">name</replaceable> [ DEFAUL
(currently GiST, GIN and BRIN) allow it to be different. The
<literal>STORAGE</> clause must be omitted unless the index
method allows a different type to be used.
If the column <replaceable class="parameter">data_type</> is specified
as <type>anyarray</>, the <replaceable class="parameter">storage_type</>
can be declared as <type>anyelement</> to indicate that the index
entries are members of the element type belonging to the actual array
type that each particular index is created for.
</para>
</listitem>
</varlistentry>


@@ -288,7 +288,7 @@
have a fixed set of strategies either. Instead the support routines of
each operator class interpret the strategy numbers according to the
operator class's definition. As an example, the strategy numbers used by
the built-in operator classes for arrays are shown in
the built-in operator class for arrays are shown in
<xref linkend="xindex-gin-array-strat-table">.
</para>