Mirror of https://github.com/postgres/postgres.git (synced 2025-04-29 13:56:47 +03:00)
Minor wording improvements per suggestion from Jeff Davis. Also tweak
hyphenated-word parser examples per earlier discussion with Alvaro.
parent 8a8bcb447a
commit 2aac6f10f6
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.26 2007/10/25 13:06:35 alvherre Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/textsearch.sgml,v 1.27 2007/10/27 00:19:45 tgl Exp $ -->
 
 <chapter id="textsearch">
 <title id="textsearch-title">Full Text Search</title>
@@ -1770,7 +1770,7 @@ LIMIT 10;
 <row>
 <entry><literal>hword</></entry>
 <entry>Hyphenated word, all letters</entry>
-<entry><literal>político-militar</literal></entry>
+<entry><literal>lógico-matemática</literal></entry>
 </row>
 <row>
 <entry><literal>numhword</></entry>
@@ -1780,14 +1780,13 @@ LIMIT 10;
 <row>
 <entry><literal>hword_asciipart</></entry>
 <entry>Hyphenated word part, all ASCII</entry>
-<entry><literal>militar</literal> in the context
-<literal>político-militar</literal>, or <literal>postgresql</literal> in the context <literal>postgresql-beta1</literal></entry>
+<entry><literal>postgresql</literal> in the context <literal>postgresql-beta1</literal></entry>
 </row>
 <row>
 <entry><literal>hword_part</></entry>
 <entry>Hyphenated word part, all letters</entry>
-<entry><literal>físico</literal> or <literal>químico</literal>
-in the context <literal>físico-químico</literal></entry>
+<entry><literal>lógico</literal> or <literal>matemática</literal>
+in the context <literal>lógico-matemática</literal></entry>
 </row>
 <row>
 <entry><literal>hword_numpart</></entry>
@@ -1902,12 +1901,12 @@ SELECT alias, description, token FROM ts_debug('foo-bar-beta1');
 instructive example:
 
 <programlisting>
-SELECT alias, description, token FROM ts_debug('http://foo.com/stuff/index.html');
+SELECT alias, description, token FROM ts_debug('http://example.com/stuff/index.html');
 alias | description | token
-----------+---------------+--------------------------
+----------+---------------+------------------------------
 protocol | Protocol head | http://
-url | URL | foo.com/stuff/index.html
+url | URL | example.com/stuff/index.html
-host | Host | foo.com
+host | Host | example.com
 uri | URI | /stuff/index.html
 </programlisting>
 </para>
@@ -3093,8 +3092,9 @@ SELECT plainto_tsquery('supernovae stars');
 </para>
 
 <para>
-A GiST index is <firstterm>lossy</firstterm>, meaning it is necessary
-to check the actual table row to eliminate false matches.
+A GiST index is <firstterm>lossy</firstterm>, meaning that the index
+may produce false matches, and it is necessary
+to check the actual table row to eliminate such false matches.
 <productname>PostgreSQL</productname> does this automatically; for
 example, in the query plan below, the <literal>Filter:</literal>
 line indicates the index output will be rechecked:
@@ -3112,14 +3112,15 @@ EXPLAIN SELECT * FROM apod WHERE textsearch @@ to_tsquery('supernovae');
 index by a fixed-length signature. The signature is generated by hashing
 each word into a random bit in an n-bit string, with all these bits OR-ed
 together to produce an n-bit document signature. When two words hash to
-the same bit position there will be a false match, and if all words in
+the same bit position there will be a false match. If all words in
 the query have matches (real or false) then the table row must be
 retrieved to see if the match is correct.
 </para>
 
 <para>
-Lossiness causes performance degradation since random access to table
-records is slow; this limits the usefulness of GiST indexes. The
+Lossiness causes performance degradation due to useless fetches of table
+records that turn out to be false matches. Since random access to table
+records is slow, this limits the usefulness of GiST indexes. The
 likelihood of false matches depends on several factors, in particular the
 number of unique words, so using dictionaries to reduce this number is
 recommended.