diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index cd5f5f0f6e4..0624c7f9a19 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4228,6 +4228,11 @@ a0ee-bc99-9c0b-4ef8-bb6d-6bb9-bd38-0a11
value is a full document or only a content fragment.
+
+ Limits and compatibility notes for the xml data type
+ can be found in .
+
+
Creating XML Values
diff --git a/doc/src/sgml/features.sgml b/doc/src/sgml/features.sgml
index 6c22d698673..df9524aedd9 100644
--- a/doc/src/sgml/features.sgml
+++ b/doc/src/sgml/features.sgml
@@ -16,7 +16,8 @@
Language SQL. A revised version of the standard is released
from time to time; the most recent update appearing in 2011.
The 2011 version is referred to as ISO/IEC 9075:2011, or simply as SQL:2011.
- The versions prior to that were SQL:2008, SQL:2003, SQL:1999, and SQL-92. Each version
+ The versions prior to that were SQL:2008, SQL:2006, SQL:2003, SQL:1999,
+ and SQL-92. Each version
replaces the previous one, so claims of conformance to earlier
versions have no official merit.
PostgreSQL development aims for
@@ -155,4 +156,382 @@
+
+ XML Limits and Conformance to SQL/XML
+
+
+ SQL/XML
+ limits and conformance
+
+
+
+ Significant revisions to the XML-related specifications in ISO/IEC 9075-14
+ (SQL/XML) were introduced with SQL:2006.
+ PostgreSQL's implementation of the XML data
+ type and related functions largely follows the earlier 2003 edition,
+ with some borrowing from later editions. In particular:
+
+
+
+ Where the current standard provides a family of XML data types
+ to hold document or content in
+ untyped or XML Schema-typed variants, and a type
+ XML(SEQUENCE) to hold arbitrary pieces of XML content,
+ PostgreSQL provides the single
+ xml type, which can hold document or
+ content. There is no equivalent of the
+ standard's sequence type.
+
+
+
+
+
+ PostgreSQL provides two functions
+ introduced in SQL:2006, but in variants that use the XPath 1.0
+ language, rather than XML Query as specified for them in the
+ standard.
+
+
+
+
+
+
+ This section presents some of the resulting differences you may encounter.
+
+
+
+ Queries are restricted to XPath 1.0
+
+
+ The PostgreSQL-specific functions
+ xpath() and xpath_exists()
+ query XML documents using the XPath language.
+ PostgreSQL also provides XPath-only variants
+ of the standard functions XMLEXISTS and
+ XMLTABLE, which officially use
+ the XQuery language. For all of these functions,
+ PostgreSQL relies on the
+ libxml2 library, which provides only XPath 1.0.
+
+
+
+ There is a strong connection between the XQuery language and XPath
+ versions 2.0 and later: any expression that is syntactically valid and
+ executes successfully in both produces the same result (with a minor
+ exception for expressions containing numeric character references or
+ predefined entity references, which XQuery replaces with the
+ corresponding character while XPath leaves them alone). But there is
+ no such connection between these languages and XPath 1.0; it was an
+ earlier language and differs in many respects.
+
+
+
+ There are two categories of limitation to keep in mind: the restriction
+ from XQuery to XPath for the functions specified in the SQL standard, and
+ the restriction of XPath to version 1.0 for both the standard and the
+ PostgreSQL-specific functions.
+
+
+
+ Restriction of XQuery to XPath
+
+
+ Features of XQuery beyond those of XPath include:
+
+
+
+
+ XQuery expressions can construct and return new XML nodes, in
+ addition to all possible XPath values. XPath can create and return
+ values of the atomic types (numbers, strings, and so on) but can
+ only return XML nodes that were already present in documents
+ supplied as input to the expression.
+
+
+
+
+
+ XQuery has control constructs for iteration, sorting, and grouping.
+
+
+
+
+
+ XQuery allows declaration and use of local functions.
+
+
+
+
+
+
+ Recent XPath versions begin to offer capabilities overlapping with
+ these (such as functional-style for-each and
+ sort, anonymous functions, and
+ parse-xml to create a node from a string),
+ but such features were not available before XPath 3.0.
+
+
+
+
+ Restriction of XPath to 1.0
+
+
+ For developers familiar with XQuery and XPath 2.0 or later, XPath 1.0
+ presents a number of differences to contend with:
+
+
+
+
+ The fundamental type of an XQuery/XPath expression, the
+ sequence, which can contain XML nodes, atomic values,
+ or both, does not exist in XPath 1.0. A 1.0 expression can only
+ produce a node-set (containing zero or more XML nodes), or a single
+ atomic value.
+
+
+
+
+
+ Unlike an XQuery/XPath sequence, which can contain any desired
+ items in any desired order, an XPath 1.0 node-set has no
+ guaranteed order and, like any set, does not allow multiple
+ appearances of the same item.
+
+
+ The libxml2 library appears to
+ always return node-sets to PostgreSQL
+ with their members in the same relative order they had in the
+ input document. Its documentation does not commit to this
+ behavior, and an XPath 1.0 expression cannot control it.
+
+
+
+
+
+
+
+ While XQuery/XPath provides all of the types defined in XML Schema
+ and many operators and functions over those types, XPath 1.0 has only
+ node-sets and the three atomic types boolean,
+ double, and string.
+
+
+
+
+
+ XPath 1.0 has no conditional operator. An XQuery/XPath expression
+ such as if ( hat ) then hat/@size else "no hat"
+ has no XPath 1.0 equivalent.
+
+
+
+
+
+ XPath 1.0 has no ordering comparison operator for strings. Both
+ "cat" < "dog" and
+ "cat" > "dog" are false, because each is a
+ numeric comparison of two NaNs. In contrast,
+ = and != do compare the strings
+ as strings.
+
+
+
+
+
+ XPath 1.0 blurs the distinction between
+ value comparisons and
+ general comparisons as XQuery/XPath define
+ them. Both sale/@hatsize = 7 and
+ sale/@customer = "alice" are existentially
+ quantified comparisons, true if there is
+ any sale with the given value for the
+ attribute, but sale/@taxable = false() is a
+ value comparison to the
+ effective boolean value of a whole node-set.
+ It is true only if no sale has
+ a taxable attribute at all.
+
+
+
+
+
+ In the XQuery/XPath data model, a document
+ node can have either document form (i.e., exactly one
+ top-level element, with only comments and processing instructions
+ outside of it) or content form (with those constraints
+ relaxed). Its equivalent in XPath 1.0, the
+ root node, can only be in document form.
+ This is part of the reason an xml value passed as the
+ context item to any PostgreSQL
+ XPath-based function must be in document form.
+
+
+
+
+
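+
+ As a rough illustration of the comparison rules listed above, the
+ following sketch uses xpath() with absolute paths. The sample
+ documents and attribute values are hypothetical, and the outcomes noted
+ in the comments are those implied by the XPath 1.0 rules described
+ above, not captured output.
+

```sql
-- Ordering comparisons convert both strings to numbers (NaN here),
-- so XPath 1.0 treats each of these as false.
SELECT xpath('"cat" < "dog"', '<x/>'::xml);
SELECT xpath('"cat" > "dog"', '<x/>'::xml);

-- Equality compares the strings as strings.
SELECT xpath('"cat" = "cat"', '<x/>'::xml);

-- Existentially quantified: true if any sale has hatsize 7.
SELECT xpath('/sales/sale/@hatsize = 7',
             '<sales><sale hatsize="7"/><sale hatsize="8"/></sales>'::xml);

-- Compared with the effective boolean value of the whole node-set:
-- true only if no sale has a taxable attribute at all.
SELECT xpath('/sales/sale/@taxable = false()',
             '<sales><sale hatsize="7"/><sale hatsize="8"/></sales>'::xml);
```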
+
+ The differences highlighted here are not exhaustive. In XQuery and
+ the 2.0 and later versions of XPath, there is an XPath 1.0 compatibility
+ mode, and the W3C lists of
+ function library changes
+ and
+ language changes
+ applied in that mode offer a more complete (but still not exhaustive)
+ account of the differences. The compatibility mode cannot make the
+ later languages exactly equivalent to XPath 1.0.
+
+
+
+
+ Mappings between SQL and XML data types and values
+
+
+ In SQL:2006 and later, both directions of conversion between standard SQL
+ data types and the XML Schema types are specified precisely. However, the
+ rules are expressed using the types and semantics of XQuery/XPath, and
+ have no direct application to the different data model of XPath 1.0.
+
+
+
+ When PostgreSQL maps SQL data values to XML
+ (as in xmlelement), or XML to SQL (as in the output
+ columns of xmltable), except for a few cases
+ treated specially, PostgreSQL simply assumes
+ that the XML data type's XPath 1.0 string form will be valid as the
+ text-input form of the SQL data type, and conversely. This rule has the
+ virtue of simplicity while producing, for many data types, results similar
+ to the mappings specified in the standard. In this release,
+ an explicit cast is needed if an xmltable column
+ expression produces a boolean or double value; see
+ .
+
+
+
+ Where interoperability with other systems is a concern, for some data
+ types, it may be necessary to use data type formatting functions (such
+ as those in ) explicitly to
+ produce the standard mappings.
+
+
+
+
+
+
+ Incidental limits of the implementation
+
+
+
+ This section concerns limits that are not inherent in the
+ libxml2 library, but apply to the current
+ implementation in PostgreSQL.
+
+
+
+
+ Cast needed for xmltable column
+ of boolean or double type
+
+
+
+ An xmltable column expression evaluating to an XPath
+ boolean or number result will produce an unexpected XPath object
+ type error. The workaround is to rewrite the column expression to
+ be inside the XPath string function;
+ PostgreSQL will then assign the string value
+ successfully to an SQL output column of boolean or double type.
+
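+
+ A minimal sketch of the workaround described above, assuming a
+ hypothetical sales document and hypothetical column names. Each column
+ expression is wrapped in the XPath 1.0 string function so that its
+ string form is what gets assigned to the SQL column.
+

```sql
WITH d(doc) AS (
  VALUES ('<sales><sale hatsize="7"/><sale/></sales>'::xml)
)
SELECT t.*
  FROM d,
       xmltable('/sales/sale' PASSING d.doc
                COLUMNS
                  -- PATH 'boolean(@hatsize)' alone yields an XPath boolean,
                  -- which this release rejects; string() turns it into
                  -- 'true' or 'false', valid input for an SQL boolean.
                  has_size boolean PATH 'string(boolean(@hatsize))',
                  -- Likewise for an XPath number such as count(...).
                  n_attrs  float8  PATH 'string(count(@*))') AS t;
```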
+
+
+
+
+ Column path result or SQL result column of XML type
+
+
+
+ In this release, an xmltable column expression
+ that evaluates to an XML node-set can be assigned to an SQL result
+ column of XML type, producing a concatenation of: for most types of
+ node in the node-set, a text node containing the XPath 1.0
+ string-value of the node, but for an element node,
+ a copy of the node itself. Such a node-set may be assigned to an SQL
+ column of non-XML type only if the node-set has a single node, with the
+ string-value of most node types replaced with an empty string, the
+ string-value of an element node replaced with a concatenation of only its
+ direct text-node children (excluding those of descendants), and the
+ string-value of a text or attribute node being as defined in XPath 1.0.
+ An XPath string value assigned to a result column of XML type must be
+ parsable as XML.
+
+
+
+ It is best not to develop code that relies on these behaviors, which have
+ little resemblance to the spec, and are changed in
+ PostgreSQL 12.
+
+
+
+
+ Only BY VALUE passing mechanism is supported
+
+
+ The SQL standard defines two passing mechanisms
+ that apply when passing an XML argument from SQL to an XML function or
+ receiving a result: BY REF, in which a particular XML
+ value retains its node identity, and BY VALUE, in which
+ the content of the XML is passed but node identity is not preserved. A
+ mechanism can be specified before a list of parameters, as the default
+ mechanism for all of them, or after any parameter, to override the
+ default.
+
+
+
+ To illustrate the difference, if
+ x is an XML value, these two queries in
+ an SQL:2006 environment would produce true and false, respectively:
+
+
+SELECT XMLQUERY('$a is $b' PASSING BY REF x AS a, x AS b NULL ON EMPTY);
+SELECT XMLQUERY('$a is $b' PASSING BY VALUE x AS a, x AS b NULL ON EMPTY);
+
+
+
+
+ In this release, PostgreSQL will accept
+ BY REF in an
+ XMLEXISTS or XMLTABLE
+ construct, but will ignore it. The xml data type holds
+ a character-string serialized representation, so there is no node
+ identity to preserve, and passing is always effectively BY
+ VALUE.
+
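+
+ As a small sketch of what this means in practice, the two calls below
+ differ only in the BY REF clause and, per the description above, are
+ expected to behave identically. The town data is hypothetical.
+

```sql
-- BY REF is parsed and then ignored; passing is effectively BY VALUE.
SELECT xmlexists('/towns/town[text() = ''Toronto'']'
                 PASSING BY REF '<towns><town>Toronto</town></towns>');

-- The same query without the clause.
SELECT xmlexists('/towns/town[text() = ''Toronto'']'
                 PASSING '<towns><town>Toronto</town></towns>');
```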
+
+
+
+ Cannot pass named parameters to queries
+
+
+ The XPath-based functions support passing one parameter to serve as the
+ XPath expression's context item, but do not support passing additional
+ values to be available to the expression as named parameters.
+
+
+
+
+ No XML(SEQUENCE) type
+
+
+ The PostgreSQL xml data type
+ can only hold a value in DOCUMENT
+ or CONTENT form. An XQuery/XPath expression
+ context item must be a single XML node or atomic value, but XPath 1.0
+ further restricts it to be only an XML node, and has no node type
+ allowing CONTENT. The upshot is that a
+ well-formed DOCUMENT is the only form of XML value
+ that PostgreSQL can supply as an XPath
+ context item.
+
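+
+ One way to check whether a value can serve as a context item is the
+ IS DOCUMENT predicate. The sketch below uses hypothetical sample
+ values: the first is a content fragment, the second a document.
+

```sql
-- A content fragment is a valid xml value but not a document.
SELECT xmlparse(CONTENT 'text<b/>more') IS DOCUMENT;   -- false

-- A single top-level element is in document form.
SELECT xmlparse(DOCUMENT '<a><b/></a>') IS DOCUMENT;   -- true
```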
+
+
+
+
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index c9478e7f0b1..8bd35173420 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -9903,17 +9903,26 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
+
XML Functions
+
+ XML Functions
+
+
The functions and function-like expressions described in this
- section operate on values of type xml. Check
- for information about the xml
+ section operate on values of type xml.
+ See for information about the xml
type. The function-like expressions xmlparse
and xmlserialize for converting to and from
- type xml are not repeated here. Use of most of these
- functions requires the installation to have been built
- with configure --with-libxml.
+ type xml are documented there, not in this section.
+
+
+
+ Use of most of these functions
+ requires PostgreSQL to have been built
+ with configure --with-libxml.
@@ -10107,8 +10116,8 @@ SELECT xmlelement(name foo, xmlattributes('xyz' as bar),
encoding, depending on the setting of the configuration parameter
. The particular behavior for
individual data types is expected to evolve in order to align the
- SQL and PostgreSQL data types with the XML Schema specification,
- at which point a more precise description will appear.
+ PostgreSQL mappings with those specified in SQL:2006 and later,
+ as discussed in .
@@ -10350,10 +10359,13 @@ SELECT xmlagg(x) FROM (SELECT * FROM test ORDER BY y DESC) AS tab;
- The function xmlexists returns true if the
- XPath expression in the first argument returns any nodes, and
- false otherwise. (If either argument is null, the result is
- null.)
+ The function xmlexists evaluates an XPath 1.0
+ expression (the first argument), with the passed XML value as its context
+ item. The function returns false if that evaluation
+ yields an empty node-set, and true if it yields any other value. The
+ function returns null if any argument is null. A nonnull value
+ passed as the context item must be an XML document, not a content
+ fragment or any non-XML value.
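+
+ The rule above has a consequence that is easy to miss: an expression
+ returning a number, string, or boolean is never an empty node-set.
+ A sketch with hypothetical town data; the outcomes in the comments
+ follow from the rule as stated and are not captured output.
+

```sql
-- Empty node-set: expected to be false.
SELECT xmlexists('/towns/town[text() = ''Ottawa'']'
                 PASSING '<towns><town>Toronto</town></towns>');

-- count() returns a number, not a node-set, so the result is expected
-- to be true even though the count itself is zero.
SELECT xmlexists('count(/towns/town[text() = ''Ottawa''])'
                 PASSING '<towns><town>Toronto</town></towns>');
```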
@@ -10369,14 +10381,14 @@ SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF 'Tor
- The BY REF clauses have no effect in
- PostgreSQL, but are allowed for SQL conformance and compatibility
- with other implementations. Per SQL standard, the
- first BY REF is required, the second is
- optional. Also note that the SQL standard specifies
- the xmlexists construct to take an XQuery
- expression as first argument, but PostgreSQL currently only
- supports XPath, which is a subset of XQuery.
+ The BY REF clauses
+ are accepted in PostgreSQL, but are ignored,
+ as discussed in .
+ In the SQL standard, the xmlexists function
+ evaluates an expression in the XML Query language,
+ but PostgreSQL allows only an XPath 1.0
+ expression, as discussed in
+ .
@@ -10482,12 +10494,12 @@ SELECT xml_is_well_formed_document('test
The function xpath_exists is a specialized form
of the xpath function. Instead of returning the
- individual XML values that satisfy the XPath, this function returns a
- Boolean indicating whether the query was satisfied or not. This
- function is equivalent to the standard XMLEXISTS predicate,
+ individual XML values that satisfy the XPath 1.0 expression, this function
+ returns a Boolean indicating whether the query was satisfied or not
+ (specifically, whether it produced any value other than an empty node-set).
+ This function is equivalent to the XMLEXISTS predicate,
except that it also offers support for a namespace mapping argument.
@@ -10592,8 +10605,8 @@ SELECT xpath_exists('/my:a/text()', 'test
The xmltable function produces a table based
- on the given XML value, an XPath filter to extract rows, and an
- optional set of column definitions.
+ on the given XML value, an XPath filter to extract rows, and a
+ set of column definitions.
@@ -10604,30 +10617,34 @@ SELECT xpath_exists('/my:a/text()', 'test
- The required row_expression argument is an XPath
- expression that is evaluated against the supplied XML document to
- obtain an ordered sequence of XML nodes. This sequence is what
- xmltable transforms into output rows.
+ The required row_expression argument is
+ an XPath 1.0 expression that is evaluated, passing the
+ document_expression as its context item, to
+ obtain a set of XML nodes. These nodes are what
+ xmltable transforms into output rows. No rows
+ will be produced if the document_expression
+ is null, nor if the row_expression produces
+ an empty node-set or any value other than a node-set.
- document_expression provides the XML document to
- operate on.
- The BY REF clauses have no effect in PostgreSQL,
- but are allowed for SQL conformance and compatibility with other
- implementations.
- The argument must be a well-formed XML document; fragments/forests
- are not accepted.
+ document_expression provides the context
+ item for the row_expression. It must be a
+ well-formed XML document; fragments/forests are not accepted.
+ The BY REF clause
+ is accepted but ignored, as discussed in
+ .
+ In the SQL standard, the xmltable function
+ evaluates expressions in the XML Query language,
+ but PostgreSQL allows only XPath 1.0
+ expressions, as discussed in
+ .
The mandatory COLUMNS clause specifies the list
of columns in the output table.
- If the COLUMNS clause is omitted, the rows in the result
- set contain a single column of type xml containing the
- data matched by row_expression.
- If COLUMNS is specified, each entry describes a
- single column.
+ Each entry describes a single column.
See the syntax summary above for the format.
The column name and type are required; the path, default and
nullability clauses are optional.
@@ -10635,48 +10652,57 @@ SELECT xpath_exists('/my:a/text()', 'test
A column marked FOR ORDINALITY will be populated
- with row numbers matching the order in which the
- output rows appeared in the original input XML document.
+ with row numbers, starting with 1, in the order of nodes retrieved from
+ the row_expression's result node-set.
At most one column may be marked FOR ORDINALITY.
+
+
+ XPath 1.0 does not specify an order for nodes in a node-set, so code
+ that relies on a particular order of the results will be
+ implementation-dependent. Details can be found in
+ .
+
+
+
- The column_expression for a column is an XPath expression
- that is evaluated for each row, relative to the result of the
- row_expression, to find the value of the column.
- If no column_expression is given, then the column name
- is used as an implicit path.
+ The column_expression for a column is an
+ XPath 1.0 expression that is evaluated for each row, with the current
+ node from the row_expression result as its
+ context item, to find the value of the column. If
+ no column_expression is given, then the
+ column name is used as an implicit path.
- If a column's XPath expression returns multiple elements, an error
- is raised.
- If the expression matches an empty tag, the result is an
- empty string (not NULL).
- Any xsi:nil attributes are ignored.
+ If a column's XPath expression returns a non-XML value (limited to
+ string, boolean, or double in XPath 1.0) and the column has a
+ PostgreSQL type other than xml, the column will be set
+ as if by assigning the value's string representation to the PostgreSQL
+ type. In this release, an XPath boolean or double result must be explicitly
+ cast to string (that is, the XPath 1.0 string function
+ wrapped around the original column expression);
+ PostgreSQL can then successfully assign the
+ string to an SQL result column of boolean or double type.
+ These conversion rules differ from those of the SQL
+ standard, as discussed in .
- The text body of the XML matched by the column_expression
- is used as the column value. Multiple text() nodes
- within an element are concatenated in order. Any child elements,
- processing instructions, and comments are ignored, but the text contents
- of child elements are concatenated to the result.
- Note that the whitespace-only text() node between two non-text
- elements is preserved, and that leading whitespace on a text()
- node is not flattened.
+ In this release, SQL result columns of xml type, or
+ column XPath expressions evaluating to an XML type, regardless of the
+ output column SQL type, are handled as described in
+ ; the behavior
+ changes significantly in PostgreSQL 12.
- If the path expression does not match for a given row but
- default_expression is specified, the value resulting
- from evaluating that expression is used.
- If no DEFAULT clause is given for the column,
- the field will be set to NULL.
- It is possible for a default_expression to reference
- the value of output columns that appear prior to it in the column list,
- so the default of one column may be based on the value of another
- column.
+ If the path expression returns an empty node-set
+ (typically, when it does not match)
+ for a given row, the column will be set to NULL, unless
+ a default_expression is specified; then the
+ value resulting from evaluating that expression is used.
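+
+ A brief sketch of the default handling described above, assuming a
+ hypothetical sales document in which the second sale has no customer
+ attribute. For that row the column path yields an empty node-set, so
+ the DEFAULT expression is expected to supply the value.
+

```sql
WITH d(doc) AS (
  VALUES ('<sales><sale customer="alice"/><sale/></sales>'::xml)
)
SELECT t.*
  FROM d,
       xmltable('/sales/sale' PASSING d.doc
                COLUMNS
                  -- Row numbers, starting with 1.
                  n        FOR ORDINALITY,
                  -- Falls back to the default when @customer is absent.
                  customer text PATH '@customer' DEFAULT 'unknown') AS t;
```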
@@ -10688,20 +10714,14 @@ SELECT xpath_exists('/my:a/text()', 'test
- Unlike regular PostgreSQL functions, column_expression
- and default_expression are not evaluated to a simple
- value before calling the function.
- column_expression is normally evaluated
- exactly once per input row, and default_expression
- is evaluated each time a default is needed for a field.
- If the expression qualifies as stable or immutable the repeat
+ A default_expression, rather than being
+ evaluated immediately when xmltable is called,
+ is evaluated each time a default is needed for the column.
+ If the expression qualifies as stable or immutable, the repeat
evaluation may be skipped.
- Effectively xmltable behaves more like a subquery than a
- function call.
This means that you can usefully use volatile functions like
- nextval in default_expression, and
- column_expression may depend on other parts of the
- XML document.
+ nextval in
+ default_expression.
diff --git a/src/backend/catalog/sql_features.txt b/src/backend/catalog/sql_features.txt
index 8e746f36d4e..ccd34501e69 100644
--- a/src/backend/catalog/sql_features.txt
+++ b/src/backend/catalog/sql_features.txt
@@ -593,7 +593,7 @@ X085 Predefined namespace prefixes NO
X086 XML namespace declarations in XMLTable NO
X090 XML document predicate YES
X091 XML content predicate NO
-X096 XMLExists NO XPath only
+X096 XMLExists NO XPath 1.0 only
X100 Host language support for XML: CONTENT option NO
X101 Host language support for XML: DOCUMENT option NO
X110 Host language support for XML: VARCHAR mapping NO
@@ -661,11 +661,11 @@ X282 XMLValidate with CONTENT option NO
X283 XMLValidate with SEQUENCE option NO
X284 XMLValidate: NAMESPACE without ELEMENT clause NO
X286 XMLValidate: NO NAMESPACE with ELEMENT clause NO
-X300 XMLTable NO XPath only
+X300 XMLTable NO XPath 1.0 only
X301 XMLTable: derived column list option YES
X302 XMLTable: ordinality column option YES
X303 XMLTable: column default option YES
-X304 XMLTable: passing a context item YES
+X304 XMLTable: passing a context item YES must be XML DOCUMENT
X305 XMLTable: initializing an XQuery variable NO
X400 Name and identifier mapping YES
X410 Alter column data type: XML type YES