mirror of
https://github.com/postgres/postgres.git
synced 2025-07-02 09:02:37 +03:00
Indexing support for pattern matching operations via separate operator
class when lc_collate is not C.
This commit is contained in:
@ -1,4 +1,4 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.35 2003/04/15 13:26:54 petere Exp $ -->
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/charset.sgml,v 2.36 2003/05/15 15:50:18 petere Exp $ -->
|
||||
|
||||
<chapter id="charset">
|
||||
<title>Localization</>
|
||||
@ -213,23 +213,13 @@ initdb --locale=sv_SE
|
||||
The <function>to_char</> family of functions
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The <literal>LIKE</> and <literal>~</> operators for pattern
|
||||
matching
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The only severe drawback of using the locale support in
|
||||
<productname>PostgreSQL</> is its speed. So use locales only if you
|
||||
actually need it. It should be noted in particular that selecting
|
||||
a non-C locale disables index optimizations for <literal>LIKE</> and
|
||||
<literal>~</> operators, which can make a huge difference in the
|
||||
speed of searches that use those operators.
|
||||
<productname>PostgreSQL</> is its speed. So use locales only if
|
||||
you actually need them.
|
||||
</para>
|
||||
</sect2>
|
||||
|
||||
|
@ -1,4 +1,4 @@
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.40 2003/03/25 16:15:36 petere Exp $ -->
|
||||
<!-- $Header: /cvsroot/pgsql/doc/src/sgml/indices.sgml,v 1.41 2003/05/15 15:50:18 petere Exp $ -->
|
||||
|
||||
<chapter id="indexes">
|
||||
<title id="indexes-title">Indexes</title>
|
||||
@ -132,6 +132,19 @@ CREATE INDEX test1_id_index ON test1 (id);
|
||||
</simplelist>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
The optimizer can also use a B-tree index for queries involving the
|
||||
pattern matching operators <literal>LIKE</>,
|
||||
<literal>ILIKE</literal>, <literal>~</literal>, and
|
||||
<literal>~*</literal>, <emphasis>if</emphasis> the pattern is
|
||||
anchored to the beginning of the string, e.g., <literal>col LIKE
|
||||
'foo%'</literal> or <literal>col ~ '^foo'</literal>, but not
|
||||
<literal>col LIKE '%bar'</literal>. However, if your server does
|
||||
not use the C locale you will need to create the index with a
|
||||
special operator class. See <xref linkend="indexes-opclass">
|
||||
below.
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<indexterm>
|
||||
<primary>indexes</primary>
|
||||
@ -405,6 +418,36 @@ CREATE INDEX <replaceable>name</replaceable> ON <replaceable>table</replaceable>
|
||||
<literal>bigbox_ops</literal>.
|
||||
</para>
|
||||
</listitem>
|
||||
|
||||
<listitem>
|
||||
<para>
|
||||
The operator classes <literal>text_pattern_ops</literal>,
|
||||
<literal>varchar_pattern_ops</literal>,
|
||||
<literal>bpchar_pattern_ops</literal>, and
|
||||
<literal>name_pattern_ops</literal> support B-tree indexes on
|
||||
the types <type>text</type>, <type>varchar</type>,
|
||||
<type>char</type>, and <type>name</type>, respectively. The
|
||||
difference to the ordinary operator classes is that the values
|
||||
are compared strictly character by character rather than
|
||||
according to the locale-specific collation rules. This makes
|
||||
these operator classes suitable for use by queries involving
|
||||
pattern matching expressions (<literal>LIKE</literal> or POSIX
|
||||
regular expressions) if the server does not use the standard
|
||||
<quote>C</quote> locale. As an example, to index a
|
||||
<type>varchar</type> column like this:
|
||||
<programlisting>
|
||||
CREATE INDEX test_index ON test_table (col varchar_pattern_ops);
|
||||
</programlisting>
|
||||
If you do use the C locale, you should instead create an index
|
||||
with the default operator class. Also note that you should
|
||||
create an index with the default operator class if you want
|
||||
queries involving ordinary comparisons to use an index. Such
|
||||
queries cannot use the
|
||||
<literal><replaceable>xxx</replaceable>_pattern_ops</literal>
|
||||
operator classes. It is possible, however, to create multiple
|
||||
indexes on the same column with different operator classes.
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/release.sgml,v 1.187 2003/05/14 03:25:59 tgl Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/release.sgml,v 1.188 2003/05/15 15:50:18 petere Exp $
|
||||
-->
|
||||
|
||||
<appendix id="release">
|
||||
@ -24,6 +24,7 @@ CDATA means the content is "SGML-free", so you can write without
|
||||
worries about funny characters.
|
||||
-->
|
||||
<literallayout><![CDATA[
|
||||
Pattern matching operations can use indexes regardless of locale
|
||||
New frontend/backend protocol supports many long-requested features
|
||||
SET AUTOCOMMIT TO OFF is no longer supported
|
||||
Reimplementation of NUMERIC datatype for more speed
|
||||
|
@ -1,5 +1,5 @@
|
||||
<!--
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.179 2003/05/14 03:26:00 tgl Exp $
|
||||
$Header: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v 1.180 2003/05/15 15:50:18 petere Exp $
|
||||
-->
|
||||
|
||||
<Chapter Id="runtime">
|
||||
@ -133,26 +133,13 @@ postgres$ <userinput>initdb -D /usr/local/pgsql/data</userinput>
|
||||
</para>
|
||||
|
||||
<para>
|
||||
<command>initdb</command> also initializes the default locale<indexterm><primary>locale</></> for
|
||||
the database cluster. Normally, it will just take the locale
|
||||
settings in the environment and apply them to the initialized
|
||||
database. It is possible to specify a different locale for the
|
||||
database; more information about that can be found in <xref
|
||||
linkend="locale">. One surprise you might encounter while running
|
||||
<command>initdb</command> is a notice similar to this:
|
||||
<screen>
|
||||
The database cluster will be initialized with locale de_DE.
|
||||
This locale setting will prevent the use of indexes for pattern matching
|
||||
operations. If that is a concern, rerun initdb with the collation order
|
||||
set to "C". For more information see the documentation.
|
||||
</screen>
|
||||
This is intended to warn you that the currently selected locale
|
||||
will cause indexes to be sorted in an order that prevents them from
|
||||
being used for <literal>LIKE</> and regular-expression searches. If you need
|
||||
good performance in such searches, you should set your current
|
||||
locale to <literal>C</> and re-run <command>initdb</command>, e.g.,
|
||||
by running <literal>initdb --lc-collate=C</literal>. The sort
|
||||
order used within a particular database cluster is set by
|
||||
<command>initdb</command> also initializes the default
|
||||
locale<indexterm><primary>locale</></> for the database cluster.
|
||||
Normally, it will just take the locale settings in the environment
|
||||
and apply them to the initialized database. It is possible to
|
||||
specify a different locale for the database; more information about
|
||||
that can be found in <xref linkend="locale">. The sort order used
|
||||
within a particular database cluster is set by
|
||||
<command>initdb</command> and cannot be changed later, short of
|
||||
dumping all data, rerunning <command>initdb</command>, and
|
||||
reloading the data. So it's important to make this choice correctly
|
||||
|
Reference in New Issue
Block a user