mirror of
https://github.com/postgres/postgres.git
synced 2025-04-24 10:47:04 +03:00
Doc: improve documentation about ts_headline() function.
Now that I've had my nose in that code, I thought the docs about it left something to be desired.
This commit is contained in:
parent
c9b0c678d3
commit
a4d4f59196
@ -1295,64 +1295,75 @@ ts_headline(<optional> <replaceable class="parameter">config</replaceable> <type
|
|||||||
<itemizedlist spacing="compact" mark="bullet">
|
<itemizedlist spacing="compact" mark="bullet">
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<literal>StartSel</literal>, <literal>StopSel</literal>: the strings with
|
<literal>MaxWords</literal>, <literal>MinWords</literal> (integers):
|
||||||
which to delimit query words appearing in the document, to distinguish
|
these numbers determine the longest and shortest headlines to output.
|
||||||
them from other excerpted words. You must double-quote these strings
|
The default values are 35 and 15.
|
||||||
if they contain spaces or commas.
|
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<literal>MaxWords</literal>, <literal>MinWords</literal>: these numbers
|
<literal>ShortWord</literal> (integer): words of this length or less
|
||||||
determine the longest and shortest headlines to output.
|
will be dropped at the start and end of a headline, unless they are
|
||||||
|
query terms. The default value of three eliminates common English
|
||||||
|
articles.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<literal>ShortWord</literal>: words of this length or less will be
|
<literal>HighlightAll</literal> (boolean): if
|
||||||
dropped at the start and end of a headline. The default
|
|
||||||
value of three eliminates common English articles.
|
|
||||||
</para>
|
|
||||||
</listitem>
|
|
||||||
<listitem>
|
|
||||||
<para>
|
|
||||||
<literal>HighlightAll</literal>: Boolean flag; if
|
|
||||||
<literal>true</literal> the whole document will be used as the
|
<literal>true</literal> the whole document will be used as the
|
||||||
headline, ignoring the preceding three parameters.
|
headline, ignoring the preceding three parameters. The default
|
||||||
|
is <literal>false</literal>.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<literal>MaxFragments</literal>: maximum number of text excerpts
|
<literal>MaxFragments</literal> (integer): maximum number of text
|
||||||
or fragments to display. The default value of zero selects a
|
fragments to display. The default value of zero selects a
|
||||||
non-fragment-oriented headline generation method. A value greater than
|
non-fragment-based headline generation method. A value greater
|
||||||
zero selects fragment-based headline generation. This method
|
than zero selects fragment-based headline generation (see below).
|
||||||
finds text fragments with as many query words as possible and
|
|
||||||
stretches those fragments around the query words. As a result
|
|
||||||
query words are close to the middle of each fragment and have words on
|
|
||||||
each side. Each fragment will be of at most <literal>MaxWords</literal> and
|
|
||||||
words of length <literal>ShortWord</literal> or less are dropped at the start
|
|
||||||
and end of each fragment. If not all query words are found in the
|
|
||||||
document, then a single fragment of the first <literal>MinWords</literal>
|
|
||||||
in the document will be displayed.
|
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
<listitem>
|
<listitem>
|
||||||
<para>
|
<para>
|
||||||
<literal>FragmentDelimiter</literal>: When more than one fragment is
|
<literal>StartSel</literal>, <literal>StopSel</literal> (strings):
|
||||||
displayed, the fragments will be separated by this string.
|
the strings with which to delimit query words appearing in the
|
||||||
|
document, to distinguish them from other excerpted words. The
|
||||||
|
default values are <quote><literal>&lt;b&gt;</literal></quote> and
|
||||||
|
<quote><literal>&lt;/b&gt;</literal></quote>, which can be suitable
|
||||||
|
for HTML output.
|
||||||
|
</para>
|
||||||
|
</listitem>
|
||||||
|
<listitem>
|
||||||
|
<para>
|
||||||
|
<literal>FragmentDelimiter</literal> (string): When more than one
|
||||||
|
fragment is displayed, the fragments will be separated by this string.
|
||||||
|
The default is <quote><literal> ... </literal></quote>.
|
||||||
</para>
|
</para>
|
||||||
</listitem>
|
</listitem>
|
||||||
</itemizedlist>
|
</itemizedlist>
|
||||||
|
|
||||||
These option names are recognized case-insensitively.
|
These option names are recognized case-insensitively.
|
||||||
Any unspecified options receive these defaults:
|
You must double-quote string values if they contain spaces or commas.
|
||||||
|
</para>
|
||||||
|
|
||||||
<programlisting>
|
<para>
|
||||||
StartSel=<b>, StopSel=</b>,
|
In non-fragment-based headline
|
||||||
MaxWords=35, MinWords=15, ShortWord=3, HighlightAll=FALSE,
|
generation, <function>ts_headline</function> locates matches for the
|
||||||
MaxFragments=0, FragmentDelimiter=" ... "
|
given <replaceable class="parameter">query</replaceable> and chooses a
|
||||||
</programlisting>
|
single one to display, preferring matches that have more query words
|
||||||
|
within the allowed headline length.
|
||||||
|
In fragment-based headline generation, <function>ts_headline</function>
|
||||||
|
locates the query matches and splits each match
|
||||||
|
into <quote>fragments</quote> of no more than <literal>MaxWords</literal>
|
||||||
|
words each, preferring fragments with more query words, and when
|
||||||
|
possible <quote>stretching</quote> fragments to include surrounding
|
||||||
|
words. The fragment-based mode is thus more useful when the query
|
||||||
|
matches span large sections of the document, or when it's desirable to
|
||||||
|
display multiple matches.
|
||||||
|
In either mode, if no query matches can be identified, then a single
|
||||||
|
fragment of the first <literal>MinWords</literal> words in the document
|
||||||
|
will be displayed.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
@ -1364,25 +1375,24 @@ SELECT ts_headline('english',
|
|||||||
is to find all documents containing given query terms
|
is to find all documents containing given query terms
|
||||||
and return them in order of their similarity to the
|
and return them in order of their similarity to the
|
||||||
query.',
|
query.',
|
||||||
to_tsquery('query & similarity'));
|
to_tsquery('english', 'query & similarity'));
|
||||||
ts_headline
|
ts_headline
|
||||||
------------------------------------------------------------
|
------------------------------------------------------------
|
||||||
containing given <b>query</b> terms
|
containing given <b>query</b> terms +
|
||||||
and return them in order of their <b>similarity</b> to the
|
and return them in order of their <b>similarity</b> to the+
|
||||||
<b>query</b>.
|
<b>query</b>.
|
||||||
|
|
||||||
SELECT ts_headline('english',
|
SELECT ts_headline('english',
|
||||||
'The most common type of search
|
'Search terms may occur
|
||||||
is to find all documents containing given query terms
|
many times in a document,
|
||||||
and return them in order of their similarity to the
|
requiring ranking of the search matches to decide which
|
||||||
query.',
|
occurrences to display in the result.',
|
||||||
to_tsquery('query & similarity'),
|
to_tsquery('english', 'search & term'),
|
||||||
'StartSel = <, StopSel = >');
|
'MaxFragments=10, MaxWords=7, MinWords=3, StartSel=<<, StopSel=>>');
|
||||||
ts_headline
|
ts_headline
|
||||||
-------------------------------------------------------
|
------------------------------------------------------------
|
||||||
containing given <query> terms
|
<<Search>> <<terms>> may occur +
|
||||||
and return them in order of their <similarity> to the
|
many times ... ranking of the <<search>> matches to decide
|
||||||
<query>.
|
|
||||||
</screen>
|
</screen>
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
|
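For readers trying out the revised option descriptions, the query below is a minimal sketch and not part of the commit. It reuses the sample text from the new example in the diff; the option values, the column aliases, and the `doc`/`body` names are chosen here for illustration only. Output is omitted because it depends on the text search configuration in use.

```sql
-- Illustrative sketch only: option values and names (doc, body, aliases) are assumptions.
WITH doc(body) AS (
    VALUES ('Search terms may occur many times in a document, requiring ranking of the search matches to decide which occurrences to display in the result.'::text)
)
SELECT
    -- Fragment-based mode: MaxFragments greater than zero.
    -- FragmentDelimiter contains spaces, so its value is double-quoted.
    ts_headline('english', body,
                to_tsquery('english', 'search & term'),
                'MaxFragments=2, MaxWords=7, MinWords=3, FragmentDelimiter=" | ", StartSel=[, StopSel=]')
        AS fragment_based,
    -- Non-fragment-based mode: MaxFragments is left at its default of zero,
    -- so a single excerpt bounded by MinWords and MaxWords is chosen.
    ts_headline('english', body,
                to_tsquery('english', 'search & term'),
                'MaxWords=10, MinWords=5, ShortWord=3')
        AS single_excerpt,
    -- HighlightAll=true uses the whole document as the headline and
    -- ignores MaxWords, MinWords and ShortWord.
    ts_headline('english', body,
                to_tsquery('english', 'search & term'),
                'HighlightAll=true')
        AS whole_document
FROM doc;
```

Comparing the three columns side by side shows how `MaxFragments` switches between the two generation modes the new text describes, and how `HighlightAll` overrides the length-related options.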