Mirror of https://github.com/postgres/postgres.git (synced 2025-06-11 20:28:21 +03:00)
Remove contrib/tsearch2.

This module was intended to ease migrations of applications that used the pre-8.3 version of text search to the in-core version introduced in that release. However, since all pre-8.3 releases of the database have been out of support for more than 5 years at this point, we expect that few people are depending on it at this point. If some people still need it, nothing prevents it from being maintained as a separate extension, outside of core.

Discussion: http://postgr.es/m/CA+Tgmob5R8aDHiFRTQsSJbT1oreKg2FOSBrC=2f4tqEH3dOMAg@mail.gmail.com
@@ -142,7 +142,6 @@ CREATE EXTENSION <replaceable>module_name</> FROM unpackaged;
 &tablefunc;
 &tcn;
 &test-decoding;
-&tsearch2;
 &tsm-system-rows;
 &tsm-system-time;
 &unaccent;
@@ -151,7 +151,6 @@
 <!ENTITY test-decoding SYSTEM "test-decoding.sgml">
 <!ENTITY test-parser SYSTEM "test-parser.sgml">
 <!ENTITY test-shm-mq SYSTEM "test-shm-mq.sgml">
-<!ENTITY tsearch2 SYSTEM "tsearch2.sgml">
 <!ENTITY tsm-system-rows SYSTEM "tsm-system-rows.sgml">
 <!ENTITY tsm-system-time SYSTEM "tsm-system-time.sgml">
 <!ENTITY unaccent SYSTEM "unaccent.sgml">
@@ -3864,87 +3864,4 @@ Parser: "pg_catalog.default"

 </sect1>

-<sect1 id="textsearch-migration">
-<title>Migration from Pre-8.3 Text Search</title>
-
-<para>
-Applications that use the <xref linkend="tsearch2">
-module for text searching will need some adjustments to work
-with the
-built-in features:
-</para>
-
-<itemizedlist>
-<listitem>
-<para>
-Some functions have been renamed or had small adjustments in their
-argument lists, and all of them are now in the <literal>pg_catalog</>
-schema, whereas in a previous installation they would have been in
-<literal>public</> or another non-system schema. There is a new
-version of <application>tsearch2</>
-that provides a compatibility layer to solve most problems in this
-area.
-</para>
-</listitem>
-
-<listitem>
-<para>
-The old <application>tsearch2</> functions and other objects
-<emphasis>must</> be suppressed when loading <application>pg_dump</>
-output from a pre-8.3 database. While many of them won't load anyway,
-a few will and then cause problems. One simple way to deal with this
-is to load the new <application>tsearch2</> module before restoring
-the dump; then it will block the old objects from being loaded.
-</para>
-</listitem>
-
-<listitem>
-<para>
-Text search configuration setup is completely different now.
-Instead of manually inserting rows into configuration tables,
-search is configured through the specialized SQL commands shown
-earlier in this chapter. There is no automated
-support for converting an existing custom configuration for 8.3;
-you're on your own here.
-</para>
-</listitem>
-
-<listitem>
-<para>
-Most types of dictionaries rely on some outside-the-database
-configuration files. These are largely compatible with pre-8.3
-usage, but note the following differences:
-
-<itemizedlist spacing="compact" mark="bullet">
-<listitem>
-<para>
-Configuration files now must be placed in a single specified
-directory (<filename>$SHAREDIR/tsearch_data</>), and must have
-a specific extension depending on the type of file, as noted
-previously in the descriptions of the various dictionary types.
-This restriction was added to forestall security problems.
-</para>
-</listitem>
-
-<listitem>
-<para>
-Configuration files must be encoded in UTF-8 encoding,
-regardless of what database encoding is used.
-</para>
-</listitem>
-
-<listitem>
-<para>
-In thesaurus configuration files, stop words must be marked with
-<literal>?</>.
-</para>
-</listitem>
-</itemizedlist>
-</para>
-</listitem>
-
-</itemizedlist>
-
-</sect1>
-
 </chapter>
@@ -1,203 +0,0 @@
-<!-- doc/src/sgml/tsearch2.sgml -->
-
-<sect1 id="tsearch2" xreflabel="tsearch2">
-<title>tsearch2</title>
-
-<indexterm zone="tsearch2">
-<primary>tsearch2</primary>
-</indexterm>
-
-<para>
-The <application>tsearch2</> module provides backwards-compatible
-text search functionality for applications that used
-<application>tsearch2</> before text searching was integrated
-into core <productname>PostgreSQL</productname> in release 8.3.
-</para>
-
-<sect2>
-<title>Portability Issues</title>
-
-<para>
-Although the built-in text search features were based on
-<application>tsearch2</> and are largely similar to it,
-there are numerous small differences that will create portability
-issues for existing applications:
-</para>
-
-<itemizedlist mark="bullet">
-<listitem>
-<para>
-Some functions' names were changed, for example <function>rank</>
-to <function>ts_rank</>.
-The replacement <literal>tsearch2</literal> module
-provides aliases having the old names.
-</para>
-</listitem>
-
-<listitem>
-<para>
-The built-in text search data types and functions all exist within
-the system schema <literal>pg_catalog</>. In an installation using
-<application>tsearch2</>, these objects would usually have been in
-the <literal>public</> schema, though some users chose to place them
-in a separate schema of their own. Explicitly schema-qualified
-references to the objects will therefore fail in either case.
-The replacement <literal>tsearch2</literal> module
-provides alias objects that are stored in <literal>public</>
-(or another schema if necessary) so that such references will still work.
-</para>
-</listitem>
-
-<listitem>
-<para>
-There is no concept of a <quote>current parser</> or <quote>current
-dictionary</> in the built-in text search features, only of a current
-search configuration (set by the <varname>default_text_search_config</>
-parameter). While the current parser and current dictionary were used
-only by functions intended for debugging, this might still pose
-a porting obstacle in some cases.
-The replacement <literal>tsearch2</literal> module emulates these
-additional state variables and provides backwards-compatible functions
-for setting and retrieving them.
-</para>
-</listitem>
-</itemizedlist>
-
-<para>
-There are some issues that are not addressed by the replacement
-<literal>tsearch2</literal> module, and will therefore require
-application code changes in any case:
-</para>
-
-<itemizedlist mark="bullet">
-<listitem>
-<para>
-The old <function>tsearch2</> trigger function allowed items in its
-argument list to be names of functions to be invoked on the text data
-before it was converted to <type>tsvector</> format. This was removed
-as being a security hole, since it was not possible to guarantee that
-the function invoked was the one intended. The recommended approach
-if the data must be massaged before being indexed is to write a custom
-trigger that does the work for itself.
-</para>
-</listitem>
-
-<listitem>
-<para>
-Text search configuration information has been moved into core
-system catalogs that are noticeably different from the tables used
-by <application>tsearch2</>. Any applications that examined
-or modified those tables will need adjustment.
-</para>
-</listitem>
-
-<listitem>
-<para>
-If an application used any custom text search configurations,
-those will need to be set up in the core
-catalogs using the new text search configuration SQL commands.
-The replacement <literal>tsearch2</literal> module offers a little
-bit of support for this by making it possible to load an old set
-of <application>tsearch2</> configuration tables into
-<productname>PostgreSQL</productname> 8.3. (Without the module,
-it is not possible to load the configuration data because values in the
-<type>regprocedure</> columns cannot be resolved to functions.)
-While those configuration tables won't actually <emphasis>do</>
-anything, at least their contents will be available to be consulted
-while setting up an equivalent custom configuration in 8.3.
-</para>
-</listitem>
-
-<listitem>
-<para>
-The old <function>reset_tsearch()</> and <function>get_covers()</>
-functions are not supported.
-</para>
-</listitem>
-
-<listitem>
-<para>
-The replacement <literal>tsearch2</literal> module does not define
-any alias operators, relying entirely on the built-in ones.
-This would only pose an issue if an application used explicitly
-schema-qualified operator names, which is very uncommon.
-</para>
-</listitem>
-</itemizedlist>
-
-</sect2>
-
-<sect2>
-<title>Converting a pre-8.3 Installation</title>
-
-<para>
-The recommended way to update a pre-8.3 installation that uses
-<application>tsearch2</> is:
-</para>
-
-<procedure>
-<step>
-<para>
-Make a dump from the old installation in the usual way,
-but be sure not to use <literal>-c</> (<literal>--clean</>)
-option of <application>pg_dump</> or <application>pg_dumpall</>.
-</para>
-</step>
-
-<step>
-<para>
-In the new installation, create empty database(s) and install
-the replacement <literal>tsearch2</literal> module into each
-database that will use text search. This must be done
-<emphasis>before</> loading the dump data! If your old installation
-had the <application>tsearch2</> objects in a schema other
-than <literal>public</>, be sure to adjust the
-<command>CREATE EXTENSION</> command so that the replacement
-objects are created in that same schema.
-</para>
-</step>
-
-<step>
-<para>
-Load the dump data. There will be quite a few errors reported
-due to failure to recreate the original <application>tsearch2</>
-objects. These errors can be ignored, but this means you cannot
-restore the dump in a single transaction (eg, you cannot use
-<application>pg_restore</>'s <option>-1</> switch).
-</para>
-</step>
-
-<step>
-<para>
-Examine the contents of the restored <application>tsearch2</>
-configuration tables (<structname>pg_ts_cfg</> and so on), and
-create equivalent built-in text search configurations as needed.
-You may drop the old configuration tables once you've extracted
-all the useful information from them.
-</para>
-</step>
-
-<step>
-<para>
-Test your application.
-</para>
-</step>
-</procedure>
-
-<para>
-At a later time you may wish to rename application references
-to the alias text search objects, so that you can eventually
-uninstall the replacement <literal>tsearch2</literal> module.
-</para>
-
-</sect2>
-
-<sect2>
-<title>References</title>
-<para>
-Tsearch2 Development Site
-<ulink url="http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/"></ulink>
-</para>
-</sect2>
-
-</sect1>
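
Note for applications that still call the alias names this module provided: the removed documentation suggests eventually renaming those references to the built-in objects. The following is a minimal sketch of a helper that lists candidate call sites in SQL text. It is not part of this commit. The function name `find_old_calls` and the `RENAMES` table are hypothetical, and only the `rank` to `ts_rank` pair is taken from the removed text; the other two entries are assumptions that should be checked against the text search function reference before use.

```python
import re

# Old tsearch2 alias -> built-in name.
# Only "rank" -> "ts_rank" is stated in the removed documentation;
# the remaining entries are assumptions for illustration.
RENAMES = {
    "rank": "ts_rank",
    "rank_cd": "ts_rank_cd",    # assumption, verify before relying on it
    "headline": "ts_headline",  # assumption, verify before relying on it
}

# Match an old name only when it is a whole identifier used as a call:
# not preceded by a word character or a dot (so "ts_rank(" and
# schema-qualified "public.rank(" are skipped), and followed by "(".
_CALL = re.compile(
    r"(?<![\w.])(" + "|".join(RENAMES) + r")(?=\s*\()",
    re.IGNORECASE,
)


def find_old_calls(sql):
    """Return (line_number, old_name, suggested_name) for each candidate call."""
    hits = []
    for lineno, line in enumerate(sql.splitlines(), start=1):
        for match in _CALL.finditer(line):
            old = match.group(1).lower()
            hits.append((lineno, old, RENAMES[old]))
    return hits
```

The helper only reports candidates and never rewrites anything. A hit is not proof of an old alias: the built-in window function `rank()` is spelled the same way, and the pattern does not understand string literals or comments, so each reported line needs manual review. Schema-qualified references are skipped on purpose, since they need a schema change as well as a rename.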