As written by Martin Kletzander <mkletzan@redhat.com>:
Since commit 8eb55d782a, when you parse
and save an URI that has no server (or similar) part, two slashes
after the 'schema:' get lost. It means 'uri:///noserver' is turned
into 'uri:/noserver'.
basically
foo:///only/path
means a host of "" while
foo:/only/path
means no host at all
So the best fix IMHO is to fix the URI parser to record the first
case and an empty host string and the second case as a NULL host string
I would not revert the initial patch, we should not 'invent' those
slash, but we should instead when parsing keep the information that
it's a host based path and that foo:/// means the presence of a host
but an empty one.
Once applied the resulting patch below, all cases seems to be saved
properly:
thinkpad:~/XML -> ./testURI uri:/noserver
uri:/noserver
thinkpad:~/XML -> ./testURI uri:///noserver
uri:///noserver
thinkpad:~/XML -> ./testURI uri://server/foo
uri://server/foo
thinkpad:~/XML -> ./testURI uri:/noserver/foo
uri:/noserver/foo
thinkpad:~/XML -> ./testURI uri:///
uri:///
thinkpad:~/XML -> ./testURI uri://
uri://
thinkpad:~/XML -> ./testURI uri:/
uri:/
thinkpad:~/XML ->
If you revert the initial patch that last case fails
The problem is that I don't want to change the xmlURI structure to
minimize ABI breakage, so I could not extend the field. The natural
solution is to denote that uri:/// has an empty host by making
the uri server field an empty string which works very well but breaks
applications (like libvirt ;-) who blindly look at uri->server
not being NULL to try to reach it !
Simplest was to stick the port to -1 in that case, instead of 0
application don't bother looking at the port of there is no server
string, this makes the patch more complex than a 1 liner, but
is better for ABI.
For https://bugzilla.gnome.org/show_bug.cgi?id=731063
xmlSaveUri() of libxml2 (snapshot 2014-05-31 and earlier) returns
bogus values when called with URIs that have rootless paths
(e.g. "urx🅱️b" becomes "urx://b%3Ab" where "urx:b%3Ab" would be
correct)
so we've got this patch to libxml2 2.7.6 in the LibreOffice code base,
inherited from OOo. it fixes a definite problem, which is that Windows
has a rather low maximum path length restriction, and there is a special
trick on NT whereby path names can be prefixed with "\\?\", in which
case the maximum length is 32k, which ought to be sufficient even for
bloated office suites :)
I'll attach the patch to the xmlCanonicPath function. note that i
didn't write this and am by no means an expert on either Microsoftean
platforms or libxml so maybe it's not the best way to do it.
* uri.c: cleanup the code doing the allocations, set up a structured
error handler to report memory errors, and set up an abitrary
limit on URI saving size
* error.c include/libxml/xmlerror.h: add a new FROM_URI indication
for structured error reporting, also adding strings for schematron
and buffer which were missing
clang recently grew a warning on `for (...);`. This patch
fixes all two instances of this pattern in libxml. The changes
don't modify the code semantic.
* uri.c: Ralf Junker pointed out that URI with no path
like http://www.domain.com when parsed ended up with an
empty path value instead of NULL, this fixes the problem
* HTMLparser.c c14n.c debugXML.c entities.c nanohttp.c parser.c
testC14N.c uri.c xmlcatalog.c xmllint.c xmlregexp.c xpath.c:
fix unused variables, or unneeded increments as well as a couple
of space issues
* runtest.c: check for NULL before calling unlink()
* uri.c: allow [ and ] in fragment identifiers, 3986 disallow them
but it's widely used for XPointer, and would break DocBook
processing among others
Daniel
svn path=/trunk/; revision=3765
* uri.c include/libxml/uri.h: rewrite the URI parser to update to
rfc3986 (from 2396)
* test/errors/webdav.xml result/errors/webdav.xml*: removed the
error test, 'DAV:' is a correct URI under 3986
* Makefile.am: small cleanup in make check
Daniel
svn path=/trunk/; revision=3763
* uri.c: applied patch from Ashwin fixing a number of realloc problems
* HTMLparser.c: improve handling for misplaced html/head/body
Daniel
svn path=/trunk/; revision=3740
* uri.c include/libxml/uri.h: add a new function xmlPathToUri()
to provide a clean conversion when setting up a base
* SAX2.c tree.c: use said function when setting up doc->URL
or using the xmlSetBase function. Should fix#346261
Daniel
* test/relaxng/docbook_0.xml: get rid of the dependancy on a locally
installed DTD
* uri.c include/libxml/uri.h xmlIO.c nanoftp.c nanohttp.c: try to
cleanup the Path/URI conversion mess, needed fixing in various
layers and a new API to the uri module which also fixes#306861
* runtest.c: integrated a regression test specific to check the
URI conversions done before calling the I/O handlers.
Daniel
* doc/apibuild.py doc/elfgcchack.xsl: revamped the elfgcchack.h
format to cope with gcc4 change of aliasing allowed scopes, had
to add extra informations to doc/libxml2-api.xml to separate
the header from the c module source.
* *.c: updated all c library files to add a #define bottom_xxx
and reimport elfgcchack.h thereafter, and a bit of cleanups.
* doc//* testapi.c: regenerated when rebuilding the API
Daniel
execution time of testapi.c, which was being delayed by
nameserver requests for non-existent URL's. From there it
just sort of grew, and grew....
* nanohttp.c, nanoftp.c: changed the processing of URL's
to use the uri.c routines instead of custom code.
* include/libxml/xmlerror.h: added code XML_FTP_URL_SYNTAX
* uri.c: added accepting ipV6 addresses, in accordance with
RFC's 2732 and 2373 (TODO: allow ipV4 within ipV6)
* gentest.py, testapi.c: fixed a few problems with the
testing of the nanoftp and nanohttp routines.
* include/libxml/xmlversion.h: minor change to fix a
warning on the docs generation
* regenerated the docs
* uri.c: fixed problem with xmlURIEscape when query part was
empty (actually fixed xmlURIEscapeStr to return an empty
string rather than NULL for empty string input) (bug 163079)
* SAX2.c, encoding.c, error.c, parser.c, tree.c, uri.c, xmlIO.c,
xmlreader.c, include/libxml/tree.h: many further little changes
for OOM problems. Now seems to be getting closer to "ok".
* testOOM.c: added code to intercept more errors, found more
problems with library. Changed method of flagging / counting
errors intercepted.
* SAX2.c, tree.c, uri.c, xmlIO.c, xmlreader.c: further
fixes for out of memory condition, mostly from Olivier
Andrieu.
* testOOM.c: some further improvement by Olivier, with
a further small enhancement for easier debugging.
* uri.c: fixed a problem when base path was "./xxx"
* result/XInclude/*: 5 test results changed by above.
* Makefile.am: fixed a couple of spots where a new
result file used different flags that the testing one.
* uri.c, include/libxml/uri.h: added a new routine
xmlBuildRelativeURI needed for enhancement of xinclude.c
* xinclude.c: changed handling of xml:base (bug 135864)
* result/XInclude/*: results of 5 tests changed as a result
of the above change
* Makefile.am testOOM.c testOOMlib.[ch] : integrated the Out Of
Memory test from Havoc Pennington #109368
* SAX.c parser.c parserInternals.c tree.c uri.c valid.c
xmlmemory.c xmlreader.c xmlregexp.c include/libxml/tree.h
include/libxml/parser.h: a lot of memory allocation cleanups
based on the results of the OOM testing
* check-relaxng-test-suite2.py: seems I forgot to commit the
script.
Daniel
* DOCBparser.c HTMLparser.c c14n.c catalog.c encoding.c globals.c
nanohttp.c parser.c parserInternals.c relaxng.c tree.c uri.c
xmlmemory.c xmlreader.c xmlregexp.c xpath.c xpointer.c
include/libxml/globals.h include/libxml/xmlmemory.h: added
xmlMallocAtomic() to be used when allocating blocks which
do not contains pointers, add xmlGcMemSetup() and xmlGcMemGet()
to allow registering the full set of functions needed by
a garbage collecting allocator like libgc, ref #109944
Daniel
* test/xsdtest/xsdtest.xml uri.c: after and exchange with James
Clark it appeared I had bug in URI parsing code ...
* relaxng.c include/libxml/relaxng.h: completely revamped error
reporting to not loose message from optional parts.
* xmllint.c: added timing for RNG validation steps
* result/relaxng/*: updated the result, all error messages changed
Daniel
* uri.c parser.c: some warning removal on Igor's patch
* tree.c: seems I messed up with #106788 fix
* python/libxml.c: fixed some base problems when Python provides
the resolver.
* relaxng.c: fixed the interleave algorithm
found 373 test schemas: 364 success 9 failures
found 529 test instances: 525 success 4 failures
the resulting failures are bug in the algorithm from 7.3 and
lack of support for params
Daniel
* check-relaxng-test-suite.py relaxng.c: more testing on the
Relax-NG front, cleaning up the regression tests failures
current state and I forgot support for "mixed":
found 373 test schemas: 280 success 93 failures
found 529 test instances: 401 success 68 failures
* tree.c include/libxml/tree.h xmlschemastypes.c: finished and
moved the Name, NCName and QName validation routine in tree.c
* uri.c: fixed handling of URI ending up with #, i.e. having
an empty fragment ID.
* result/relaxng/*: updated the results
Daniel