XPath fixes:

- Function renamed to "xpath".
- Function is now strict, per discussion.
- Return empty array in case when XPath expression detects nothing
  (previously, NULL was returned in such case), per discussion.
- (bugfix) Work with fragments with prologue: select xpath('/a',
  '<?xml version="1.0"?><a /><b />'); // now XML datum is always wrapped
  with dummy <x>...</x>, XML prologue simply goes away (if any).
- Some cleanup.

Nikolay Samokhvalov

Some code cleanup and documentation work by myself.
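
The prologue bugfix and the empty-result behavior in the commit message can be illustrated outside the server. The following is a minimal Python sketch, not the C code from this commit: the helper name `xpath_on_fragment` and the regular expression are hypothetical, and the standard library's ElementTree supports only a subset of XPath 1.0. It drops any XML prologue, wraps the datum in a dummy `<x>...</x>` element so that a multi-rooted fragment parses, and returns an empty list (rather than a null value) when nothing matches.

```python
import re
import xml.etree.ElementTree as ET

# Matches a leading XML declaration such as <?xml version="1.0"?>.
# (Assumed pattern for illustration; the real code uses libxml2.)
_PROLOGUE = re.compile(r'^\s*<\?xml[^>]*\?>')


def xpath_on_fragment(path, fragment):
    """Evaluate an absolute child path such as '/a' against an XML fragment.

    Hypothetical helper: shows the wrapping idea only.
    """
    # Drop the prologue, if any; it is not allowed inside an element.
    body = _PROLOGUE.sub('', fragment, count=1)
    # Wrap in a dummy root so a fragment with several top-level
    # elements becomes a single well-formed document.
    root = ET.fromstring('<x>' + body + '</x>')
    # ElementTree paths are relative to the root element, so '/a'
    # against the fragment becomes './a' against the dummy <x>.
    matches = root.findall('.' + path)
    # An empty list (not None) signals "no nodes selected".
    return [ET.tostring(m, encoding='unicode') for m in matches]


if __name__ == '__main__':
    print(xpath_on_fragment('/a', '<?xml version="1.0"?><a /><b />'))
    print(xpath_on_fragment('/c', '<a /><b />'))
```

Wrapping keeps the parsing step uniform for documents and content fragments alike, at the cost of having to rewrite the path relative to the dummy element.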
2025-07-28 23:42:10 +03:00 · 2007-05-21 17:10:29 +00:00
parent 0c644d2c3d
commit 3963574d13
9 changed files with 238 additions and 205 deletions
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.200 2007/05/08 17:02:59 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/datatype.sgml,v 1.201 2007/05/21 17:10:28 petere Exp $ -->

 <chapter id="datatype">
  <title id="datatype-title">Data Types</title>
@@ -3213,7 +3213,7 @@ SELECT * FROM test;
  <sect1 id="datatype-uuid">
   <title><acronym>UUID</acronym> Type</title>

-   <indexterm zone="datatype-xml">
+   <indexterm zone="datatype-uuid">
    <primary>UUID</primary>
   </indexterm>

@@ -3289,6 +3289,8 @@ a0eebc999c0b4ef8bb6d6bb9bd380a11
    value is a full document or only a content fragment.
   </para>

+   <sect2>
+    <title>Creating XML Values</title>
   <para>
    To produce a value of type <type>xml</type> from character data,
    use the function
@@ -3299,7 +3301,7 @@ XMLPARSE ( { DOCUMENT | CONTENT } <replaceable>value</replaceable>)
    Examples:
 <programlisting><![CDATA[
 XMLPARSE (DOCUMENT '<?xml version="1.0"?><book><title>Manual</title><chapter>...</chapter><book>')
-XMLPARSE (CONTENT 'abc<foo>bar</bar><bar>foo</foo>')
+XMLPARSE (CONTENT 'abc<foo>bar</foo><bar>foo</bar>')
 ]]></programlisting>
    While this is the only way to convert character strings into XML
    values according to the SQL standard, the PostgreSQL-specific
@@ -3351,7 +3353,10 @@ SET xmloption TO { DOCUMENT | CONTENT };
    The default is <literal>CONTENT</literal>, so all forms of XML
    data are allowed.
   </para>
+   </sect2>

+   <sect2>
+    <title>Encoding Handling</title>
   <para>
    Care must be taken when dealing with multiple character encodings
    on the client, server, and in the XML data passed through them.
@@ -3398,6 +3403,41 @@ SET xmloption TO { DOCUMENT | CONTENT };
    processed in UTF-8, computations will be most efficient if the
    server encoding is also UTF-8.
   </para>
+   </sect2>
+
+   <sect2>
+   <title>Accessing XML Values</title>
+
+   <para>
+    The <type>xml</type> data type is unusual in that it does not
+    provide any comparison operators.  This is because there is no
+    well-defined and universally useful comparison algorithm for XML
+    data.  One consequence of this is that you cannot retrieve rows by
+    comparing an <type>xml</type> column against a search value.  XML
+    values should therefore typically be accompanied by a separate key
+    field such as an ID.  An alternative solution for comparing XML
+    values is to convert them to character strings first, but note
+    that character string comparison has little to do with a useful
+    XML comparison method.
+   </para>
+
+   <para>
+    Since there are no comparison operators for the <type>xml</type>
+    data type, it is not possible to create an index directly on a
+    column of this type.  If speedy searches in XML data are desired,
+    possible workarounds would be casting the expression to a
+    character string type and indexing that, or indexing an XPath
+    expression.  The actual query would of course have to be adjusted
+    to search by the indexed expression.
+   </para>
+
+   <para>
+    The full-text search module Tsearch2 could also be used to speed
+    up full-document searches in XML data.  The necessary
+    preprocessing support is, however, not available in the PostgreSQL
+    distribution in this release.
+   </para>
+   </sect2>
  </sect1>

  &array;
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.379 2007/05/07 07:53:26 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.380 2007/05/21 17:10:28 petere Exp $ -->

 <chapter id="functions">
  <title>Functions and Operators</title>
@@ -7512,7 +7512,7 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
   type.  The function-like expressions <function>xmlparse</function>
   and <function>xmlserialize</function> for converting to and from
   type <type>xml</type> are not repeated here.  Use of many of these
-   <type>xml</type> functions requires the installation to have been built
+   functions requires the installation to have been built
   with <command>configure --with-libxml</>.
  </para>

@@ -7848,6 +7848,51 @@ SELECT xmlroot(xmlparse(document '<?xml version="1.1"?><content>abc</content>'),
   </sect3>
  </sect2>

+  <sect2 id="functions-xml-processing">
+   <title>Processing XML</title>
+
+   <indexterm>
+    <primary>XPath</primary>
+   </indexterm>
+
+   <para>
+    To process values of data type <type>xml</type>, PostgreSQL offers
+    the function <function>xpath</function>, which evaluates XPath 1.0
+    expressions.
+   </para>
+
+<synopsis>
+<function>xpath</function>(<replaceable>xpath</replaceable>, <replaceable>xml</replaceable><optional>, <replaceable>nsarray</replaceable></optional>)
+</synopsis>
+
+   <para>
+    The function <function>xpath</function> evaluates the XPath
+    expression <replaceable>xpath</replaceable> against the XML value
+    <replaceable>xml</replaceable>.  It returns an array of XML values
+    corresponding to the node set produced by the XPath expression.
+   </para>
+
+   <para>
+    The third argument of the function is an array of namespace
+    mappings.  This array should be a two-dimensional array with the
+    length of the second axis being equal to 2 (i.e., it should be an
+    array of arrays, each of which consists of exactly 2 elements).
+    The first element of each array entry is the namespace name, the
+    second the namespace URI.
+   </para>
+
+   <para>
+    Example:
+<screen><![CDATA[
+SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>', ARRAY[ARRAY['my', 'http://example.com']]);
+ xpath  
+--------
+ {test}
+(1 row)
+]]></screen>
+   </para>
+  </sect2>
+
  <sect2 id="functions-xml-mapping">
   <title>Mapping Tables to XML</title>

@@ -8097,75 +8142,6 @@ table2-mapping
 ]]></programlisting>
   </figure>
  </sect2>
-
-  <sect2>
-   <title>Processing XML</title>
-
-   <para>
-    <acronym>XML</> support is not just the existence of an
-    <type>xml</type> data type, but a variety of features supported by
-    a database system.  These capabilities include import/export,
-    indexing, searching, transforming, and <acronym>XML</> to
-    <acronym>SQL</> mapping.  <productname>PostgreSQL</> supports some
-    but not all of these <acronym>XML</> capabilities.  For an
-    overview of <acronym>XML</> use in databases, see <ulink
-    url="http://www.rpbourret.com/xml/XMLAndDatabases.htm"></>.
-   </para>
-
-   <variablelist>
-   <varlistentry>
-    <term>Indexing</term>
-    <listitem>
-
-     <para>
-      <filename>contrib/xml2/</> functions can be used in expression
-      indexes to index specific <acronym>XML</> fields.  To index the
-      full contents of <acronym>XML</> documents, the full-text
-      indexing tool <filename>contrib/tsearch2/</> can be used.  Of
-      course, Tsearch2 indexes have no <acronym>XML</> awareness so
-      additional <filename>contrib/xml2/</> checks should be added to
-      queries.
-     </para>
-    </listitem>
-   </varlistentry>
-
-   <varlistentry>
-    <term>Searching</term>
-    <listitem>
-
-     <para>
-      XPath searches are implemented using <filename>contrib/xml2/</>.
-      It processes <acronym>XML</> text documents and returns results
-      based on the requested query.
-     </para>
-    </listitem>
-   </varlistentry>
-
-   <varlistentry>
-    <term>Transforming</term>
-    <listitem>
-
-     <para>
-      <filename>contrib/xml2/</> supports <acronym>XSLT</> (Extensible
-      Stylesheet Language Transformation).
-     </para>
-    </listitem>
-   </varlistentry>
-
-   <varlistentry>
-    <term>XML to SQL Mapping</term>
-    <listitem>
-
-     <para>
-      This involves converting <acronym>XML</> data to and from
-      relational structures. <productname>PostgreSQL</> has no
-      internal support for such mapping, and relies on external tools
-      to do such conversions.
-     </para>
-    </listitem>
-   </varlistentry>
-   </variablelist>
-  </sect2>
 </sect1>