mirror of
https://github.com/postgres/postgres.git
synced 2025-07-30 11:03:19 +03:00
Thanks to the generous support of Torchbox (http://www.torchbox.com), I
have been able to significantly improve the contrib/xml XPath integration
code.

New features:

* XPath set-returning function allows multiple results from several
  XPath queries to be used as a virtual table.

* Using libxslt, XSLT transformations (with and without parameters) are
  supported. (Caution: This support allows generic URL fetching from
  within the backend as well.)

I've removed the old code so that it is all libxml based.

Rather than attach as a patch, I've put the tar.gz (10k!) at
http://www.azuli.co.uk/pgxml-1.0.tar.gz (all files in archive are
xml/....).

I think this is worth replacing the contrib version with, even though the
function names have changed (though the same functionality is there),
because it includes an SRF and some SPI usage, in addition to linking to an
external library. And it isn't a big module!

Obviously, I understand that people might prefer to move it elsewhere, or
might have reservations about replacing an existing contrib module with an
incompatible one. I'm open to suggestions.

John Gray
@@ -1,78 +0,0 @@
PGXML TODO List
===============

Some of these items still require much more thought! Since the first
release, the XPath support has improved (because I'm no longer using a
homemade algorithm!).

1. Performance considerations

At present each document is parsed to produce the DOM tree on every query.

Pros:
  Easy
  No persistent memory or storage allocation for parsed trees
  (libxml docs suggest representation of a document might
  be 4 times the size of the text)

Cons:
  Slow/CPU-intensive to parse.
  Makes it difficult for PLs to apply libxml manipulations to create
  new documents or amend existing ones.
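
The trade-off above can be illustrated with a small sketch. It is written in Python with the standard-library xml.etree.ElementTree rather than libxml, and the names DOCSTORE, stats, query_uncached and query_cached are hypothetical; it is only a model of the two strategies, not the module's code. ElementTree supports a limited XPath subset, so the path selects elements and their text is read separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical document store: docid -> XML text (stand-in for a table column).
DOCSTORE = {
    1: "<doc><site><name>Church Farm</name></site></doc>",
    2: "<doc><site><name>Mill Lane</name></site></doc>",
}

stats = {"parses": 0}  # counts how often the parser is invoked


def parse(docid):
    """Parse the stored text into a tree."""
    stats["parses"] += 1
    return ET.fromstring(DOCSTORE[docid])


def query_uncached(docid, path):
    """Re-parse on every query: nothing is retained, but CPU is spent each time."""
    return [el.text for el in parse(docid).findall(path)]


_tree_cache = {}  # docid -> parsed tree, kept for the life of the process


def query_cached(docid, path):
    """Parse once and keep the tree: less CPU, at the cost of retained memory."""
    if docid not in _tree_cache:
        _tree_cache[docid] = parse(docid)
    return [el.text for el in _tree_cache[docid].findall(path)]
```

Calling query_uncached twice on the same docid parses twice, while query_cached parses once and reuses the tree. The cache in this sketch is never invalidated, which a real implementation would have to handle when a document changes.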


2. XQuery

I'm not sure if the addition of XQuery would be best as a function or
as a new front-end parser. This is one to think about, but with a
decent implementation of XPath, one of the prerequisites is covered.

3. DOM Interfaces

Expose more aspects of the DOM to user functions/PLs. This would
allow a procedure in a PL to run some queries and then use exposed
interfaces to libxml to create an XML document out of the query
results. I accept the argument that this might be more properly
performed on the client side.

4. Returning sets of documents from XPath queries.

Although the current implementation allows you to amalgamate the
returned results into a single document, it's quite possible that
you'd like to use the returned set of nodes as a source for FROM.

Is there a good way to optimise/index the results of certain XPath
operations to make them faster?

  select docid, pgxml_xpath(document,'//site/location/text()','','') as location
  from docstore
  where pgxml_xpath(document,'//site/name/text()','','') = 'Church Farm';

and with multiple element occurrences in a document?

  select d.docid, pgxml_xpath(d.document,'//site/location/text()','','')
  from docstore d,
  pgxml_xpaths('docstore','document','//feature/type/text()','docid') ft
  where ft.key = d.docid and ft.value ='Limekiln';

pgxml_xpaths params are relname, attrname, xpath, returnkey. It would
return a set of two-element tuples (key,value) consisting of the value of
returnkey, and the cdata value of the xpath. The XML document would be
defined by relname and attrname.
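
As a rough model of the proposed (key,value) result set, the following Python sketch emits one pair per matching node, so a document with several matches contributes several rows. The function name xpaths and the sample rows are hypothetical, and the standard-library ElementTree stands in for libxml: it has no text() step, so the path selects the element and its text is read from the node.

```python
import xml.etree.ElementTree as ET


def xpaths(rows, path):
    """Yield one (key, value) pair per node matched by path in each row.

    rows is an iterable of (key, xml_text) pairs, standing in for the
    returnkey and attrname columns of relname.
    """
    for key, xml_text in rows:
        root = ET.fromstring(xml_text)
        for node in root.findall(path):
            yield key, node.text


# Hypothetical contents of docstore as (docid, document) pairs.
docstore = [
    (1, "<site><feature><type>Limekiln</type></feature>"
        "<feature><type>Quarry</type></feature></site>"),
    (2, "<site><feature><type>Barn</type></feature></site>"),
    (3, "<site><feature><type>Limekiln</type></feature></site>"),
]

pairs = list(xpaths(docstore, ".//feature/type"))

# Equivalent of: where ft.key = d.docid and ft.value = 'Limekiln'
matching_docids = sorted({key for key, value in pairs if value == "Limekiln"})
```

Document 1 appears twice in pairs, once per feature, which is the behaviour that makes the result usable as a source for FROM.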

The pgxml_xpaths function could be the basis of a functional index,
which could speed up the above query very substantially, working
through the normal query planner mechanism.
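
The idea behind such an index can be sketched as a precomputed mapping from each extracted value to the keys of the documents containing it, so that a lookup no longer parses anything. This is a Python illustration only; build_index and the sample rows are hypothetical, ElementTree stands in for libxml, and nothing here reflects how the PostgreSQL planner or its index machinery actually works.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(rows, path):
    """Map each extracted value to the set of keys whose document contains it."""
    index = defaultdict(set)
    for key, xml_text in rows:
        for node in ET.fromstring(xml_text).findall(path):
            index[node.text].add(key)
    return index


# Hypothetical contents of docstore as (docid, document) pairs.
docstore = [
    (1, "<site><feature><type>Limekiln</type></feature>"
        "<feature><type>Quarry</type></feature></site>"),
    (2, "<site><feature><type>Barn</type></feature></site>"),
    (3, "<site><feature><type>Limekiln</type></feature></site>"),
]

# Every document is parsed once, when the index is built.
feature_type_index = build_index(docstore, ".//feature/type")

# Lookups are then plain dictionary access, with no XML parsing.
limekiln_docids = sorted(feature_type_index.get("Limekiln", set()))
```

The parsing cost moves from query time to build time. A real index would also need to be kept up to date on every insert and update, which this sketch leaves out.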

5. Return type support.

Better support for returning e.g. numeric or boolean values. I need to
get to grips with the returned data from libxml first.


John Gray <jgray@azuli.co.uk> 16 August 2001