Docs review for unaccent: fix grammar, markup, etc.

2025-07-27 12:41:57 +03:00 · 2010-08-25 02:12:00 +00:00
parent 1dab218a69
commit 7fc614c698
1 changed file with 52 additions and 46 deletions
--- a/doc/src/sgml/unaccent.sgml
+++ b/doc/src/sgml/unaccent.sgml
@@ -1,3 +1,5 @@
+<!-- $PostgreSQL: pgsql/doc/src/sgml/unaccent.sgml,v 1.6 2010/08/25 02:12:00 tgl Exp $ -->
+
 <sect1 id="unaccent">
 <title>unaccent</title>

@@ -6,24 +8,24 @@
 </indexterm>

 <para>
-  <filename>unaccent</> removes accents (diacritic signs) from a lexeme.
-  It's a filtering dictionary, that means its output is 
-  always passed to the next dictionary (if any), contrary to the standard 
-  behavior. Currently, it supports most important accents from European 
-  languages. 
+  <filename>unaccent</> is a text search dictionary that removes accents
+  (diacritic signs) from lexemes.
+  It's a filtering dictionary, which means its output is
+  always passed to the next dictionary (if any), unlike the normal
+  behavior of dictionaries.  This allows accent-insensitive processing
+  for full text search.
 </para>

 <para>
-  Limitation: Current implementation of <filename>unaccent</> 
-  dictionary cannot be used as a normalizing dictionary for 
-  <filename>thesaurus</filename> dictionary.
+  The current implementation of <filename>unaccent</> cannot be used as a
+  normalizing dictionary for the <filename>thesaurus</filename> dictionary.
 </para>
- 
+
 <sect2>
  <title>Configuration</title>

  <para>
-   A <literal>unaccent</> dictionary accepts the following options:
+   An <literal>unaccent</> dictionary accepts the following options:
  </para>
  <itemizedlist>
   <listitem>
@@ -43,23 +45,27 @@
  <itemizedlist>
   <listitem>
    <para>
-     Each line represents pair: character_with_accent  character_without_accent
+     Each line represents a pair, consisting of a character with accent
+     followed by a character without accent.  The first is translated into
+     the second.  For example,
 <programlisting>
 &Agrave;        A
 &Aacute;        A
-&Acirc;         A
+&Acirc;        A
 &Atilde;        A
-&Auml;          A
-&Aring;         A
-&AElig;         A
+&Auml;        A
+&Aring;        A
+&AElig;        A
 </programlisting>
    </para>
   </listitem>
  </itemizedlist>

  <para>
-   Look at <filename>unaccent.rules</>, which is installed in
-   <filename>$SHAREDIR/tsearch_data/</>, for an example.
+   A more complete example, which is directly useful for most European
+   languages, can be found in <filename>unaccent.rules</>, which is installed
+   in <filename>$SHAREDIR/tsearch_data/</> when the <filename>unaccent</>
+   module is installed.
  </para>
 </sect2>

@@ -67,66 +73,66 @@
  <title>Usage</title>

  <para>
-   Running the installation script creates a text search template
-   <literal>unaccent</> and a dictionary <literal>unaccent</>
+   Running the installation script <filename>unaccent.sql</> creates a text
+   search template <literal>unaccent</> and a dictionary <literal>unaccent</>
   based on it, with default parameters.  You can alter the
   parameters, for example

 <programlisting>
-=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
+mydb=# ALTER TEXT SEARCH DICTIONARY unaccent (RULES='my_rules');
 </programlisting>

   or create new dictionaries based on the template.
  </para>

  <para>
-   To test the dictionary, you can try
-
+   To test the dictionary, you can try:
 <programlisting>
-=# select ts_lexize('unaccent','Hôtel');
- ts_lexize 
+mydb=# select ts_lexize('unaccent','H&ocirc;tel');
+ ts_lexize
 -----------
 {Hotel}
 (1 row)
 </programlisting>
  </para>
-  
+
  <para>
-  Filtering dictionary are useful for correct work of 
-  <function>ts_headline</function> function.
+   Here is an example showing how to insert the
+   <filename>unaccent</> dictionary into a text search configuration:
 <programlisting>
-=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
-=# ALTER TEXT SEARCH CONFIGURATION fr
+mydb=# CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
+mydb=# ALTER TEXT SEARCH CONFIGURATION fr
        ALTER MAPPING FOR hword, hword_part, word
        WITH unaccent, french_stem;
-=# select to_tsvector('fr','Hôtels de la Mer');
-    to_tsvector    
+mydb=# select to_tsvector('fr','H&ocirc;tels de la Mer');
+    to_tsvector
 -------------------
 'hotel':1 'mer':4
 (1 row)

-=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
- ?column? 
+mydb=# select to_tsvector('fr','H&ocirc;tel de la Mer') @@ to_tsquery('fr','Hotels');
+ ?column?
 ----------
 t
 (1 row)
-=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
-      ts_headline       
------------------------
-  &lt;b&gt;Hôtel&lt;/b&gt;de la Mer
-(1 row)

+mydb=# select ts_headline('fr','H&ocirc;tel de la Mer',to_tsquery('fr','Hotels'));
+      ts_headline
+------------------------
+ &lt;b&gt;H&ocirc;tel&lt;/b&gt; de la Mer
+(1 row)
 </programlisting>
  </para>
 </sect2>

 <sect2>
- <title>Function</title>
+ <title>Functions</title>

 <para>
-  <function>unaccent</> function removes accents (diacritic signs) from
-  argument string. Basically, it's a wrapper around 
-  <filename>unaccent</> dictionary.
+  The <function>unaccent()</> function removes accents (diacritic signs) from
+  a given string.  Basically, it's a wrapper around the
+  <filename>unaccent</> dictionary, but it can be used outside normal
+  text search contexts.
 </para>

 <indexterm>
@@ -134,14 +140,14 @@
 </indexterm>

 <synopsis>
-unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>)
-returns <type>text</type>
+unaccent(<optional><replaceable class="PARAMETER">dictionary</replaceable>, </optional> <replaceable class="PARAMETER">string</replaceable>) returns <type>text</type>
 </synopsis>

 <para>
+  For example:
 <programlisting>
-SELECT unaccent('unaccent', 'Hôtel');
-SELECT unaccent('Hôtel');
+SELECT unaccent('unaccent', 'H&ocirc;tel');
+SELECT unaccent('H&ocirc;tel');
 </programlisting>
 </para>
 </sect2>
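
Note on the rules-file format described in the Configuration hunk: each line is a pair, and the first character is translated into the second. The following Python sketch only illustrates that idea; it is not how the unaccent module is implemented. The helper names `parse_rules` and `apply_rules` are hypothetical, and the `ô o` rule is an assumption added so that the `Hôtel` example from the Usage hunk can be reproduced (the sample listing in the diff covers only variants of `A`).

```python
# Illustrative sketch only: mimics the "accented unaccented" pair format
# described in the documentation, not PostgreSQL's actual implementation.

def parse_rules(text):
    """Parse lines of whitespace-separated pairs into a lookup table."""
    rules = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            rules[parts[0]] = parts[1]
    return rules


def apply_rules(rules, s):
    """Translate each character that has a rule; leave the rest unchanged."""
    return "".join(rules.get(ch, ch) for ch in s)


# The first seven pairs follow the sample in the documentation;
# the last pair (ô -> o) is an assumption added for this example.
SAMPLE_RULES = "À A\nÁ A\nÂ A\nÃ A\nÄ A\nÅ A\nÆ A\nô o\n"
RULES = parse_rules(SAMPLE_RULES)

print(apply_rules(RULES, "Hôtel"))
```

Characters without a matching rule pass through untouched, which loosely mirrors how a filtering dictionary hands its output on to the next dictionary rather than rejecting the token.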