mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-26 21:41:34 +03:00
4506 Commits

Author SHA1 Message Date
Nick Wellnhofer
855c19efb7 Avoid reparsing in xmlParseStartTag2
The code in xmlParseStartTag2 must handle the case that the input
buffer was grown and reallocated which can invalidate pointers to
attribute values. Before, this was handled by detecting changes of
the input buffer "base" pointer and, in case of a change, jumping
back to the beginning of the function and reparsing the start tag.

The major problem of this approach is that whether an input buffer is
reallocated is nondeterministic, resulting in seemingly random test
failures. See the mailing list thread "runtest mystery bug: name2.xml
error case regression test" from 2012, for example.

If a reallocation was detected, the code also made no attempts to
continue parsing in case of errors which makes a difference in
the lax "recover" mode.

Now we store the current input buffer "base" pointer for each (not
separately allocated) attribute in the namespace URI field, which isn't
used until later. After the whole start tag was parsed, the pointers
to the attribute values are reconstructed using the offset between the
new and the old input buffer. This relies on arithmetic on dangling
pointers which is technically undefined behavior. But it seems like
the easiest and most efficient fix and a similar approach is used in
xmlParserInputGrow.

This changes the error output of several tests, typically making it
more verbose because we try harder to continue parsing in case of
errors.

(Another possible solution is to check not only the "base" pointer
but the size of the input buffer as well. But this would result in
even more reparsing.)
2017-06-01 14:31:28 +02:00
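A minimal sketch of the underlying problem in 855c19efb7, assuming a hypothetical growable buffer (`InputBuf`) and attribute record (`AttrRef`) that are not libxml2 types. Unlike the commit, which rebases saved pointers using the difference between the old and new "base", this variant stores offsets from the start, so it avoids arithmetic on dangling pointers altogether:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable input buffer; not libxml2's xmlParserInput. */
typedef struct {
    char  *base;   /* start of the allocation, may move when grown */
    size_t len;    /* bytes in use */
    size_t cap;    /* bytes allocated */
} InputBuf;

/* Hypothetical attribute record: the value is remembered as an offset
 * from base, so it stays meaningful after a reallocation. */
typedef struct {
    size_t val_off;
    size_t val_len;
} AttrRef;

/* Append n bytes, growing the buffer if needed. Returns 0 on success. */
int input_append(InputBuf *in, const char *data, size_t n) {
    if (n == 0)
        return 0;
    if (in->len + n > in->cap) {
        size_t new_cap = (in->len + n) * 2;
        char *p = realloc(in->base, new_cap);
        if (p == NULL)
            return -1;
        in->base = p;      /* raw pointers into the old block are now stale */
        in->cap = new_cap;
    }
    memcpy(in->base + in->len, data, n);
    in->len += n;
    return 0;
}

/* Rebuild a pointer to the attribute value from the current base. */
const char *attr_value(const InputBuf *in, const AttrRef *a) {
    return in->base + a->val_off;
}
```

A caller would record `val_off` while parsing and call `attr_value` only once the whole start tag has been read, after which no further growth can move the buffer.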
Nick Wellnhofer
07b7428b69 Simplify control flow in xmlParseStartTag2
Remove some goto labels and deduplicate a bit of code after handling
namespaces.

Before:

    loop {
        parseAttribute
        if (ok) {
            if (defaultNamespace) {
                handleDefaultNamespace
                if (error)
                    goto skip_default_ns;
                handleDefaultNamespace
    skip_default_ns:
                freeAttr
                nextAttr
                continue;
            }
            if (namespace) {
                handleNamespace
                if (error)
                    goto skip_ns;
                handleNamespace
    skip_ns:
                freeAttr
                nextAttr
                continue;
            }
            handleAttr
        } else {
            freeAttr
        }
        nextAttr
    }

After:

    loop {
        parseAttribute
        if (!ok)
            goto next_attr;
        if (defaultNamespace) {
            handleDefaultNamespace
            if (error)
                goto next_attr;
            handleDefaultNamespace
        } else if (namespace) {
            handleNamespace
            if (error)
                goto next_attr;
            handleNamespace
        } else {
            handleAttr
        }
    next_attr:
        freeAttr
        nextAttr
    }
2017-06-01 14:31:28 +02:00
Nick Wellnhofer
ac9a4560ee Disable LeakSanitizer when running API tests
The autogenerated API tests leak memory.
2017-06-01 14:31:28 +02:00
Nick Wellnhofer
ff34ba3e88 Avoid out-of-bound array access in API tests
The API tests combine string buffers with arbitrary length values which
makes ASan detect out-of-bound array accesses. Even without ASan, this
could lead to unwanted test failures.

Add a check for "len", "size", and "start" arguments, assuming they
apply to the nearest char pointer. Skip the test if they exceed the
buffer size. This is a somewhat naive heuristic but it seems to work
well.
2017-06-01 14:31:28 +02:00
Nick Wellnhofer
34e445674d Fix undefined behavior in xmlRegExecPushStringInternal
It's stupid, but the behavior of memcpy(NULL, NULL, 0) is undefined.
2017-06-01 14:31:27 +02:00
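The memcpy issue fixed in 34e445674d can be illustrated with a small guard. This is a sketch with a hypothetical helper name (`copy_bytes`), not the change made in xmlRegExecPushStringInternal itself:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes, tolerating NULL pointers when n is 0.
 * memcpy requires valid pointers even for a zero-length copy, so the
 * call is skipped entirely in that case. */
void *copy_bytes(void *dst, const void *src, size_t n) {
    if (n == 0)
        return dst;
    return memcpy(dst, src, n);
}
```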
Nick Wellnhofer
474967241c Avoid spurious UBSan errors in parser.c
If available, use a C99 flexible array member to avoid spurious UBSan
errors.
2017-06-01 14:31:27 +02:00
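For readers unfamiliar with the construct named in 474967241c, here is a minimal sketch of a C99 flexible array member. The `Blob` type and `blob_new` function are hypothetical and unrelated to the structures in parser.c:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with a variable-length tail. A flexible array
 * member has no declared bound, so indexing past the first element is
 * not reported the way it can be for a one-element array. */
typedef struct {
    size_t len;
    char   data[];   /* pre-C99 code often declared this as data[1] */
} Blob;

/* Allocate a Blob holding a NUL-terminated copy of len bytes from src. */
Blob *blob_new(const char *src, size_t len) {
    Blob *b = malloc(sizeof(Blob) + len + 1);
    if (b == NULL)
        return NULL;
    b->len = len;
    if (len > 0)
        memcpy(b->data, src, len);
    b->data[len] = '\0';
    return b;
}
```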
Nick Wellnhofer
f4029cd413 Check XPath exponents for overflow
Avoid undefined behavior and wrong results with huge exponents.

Found with afl-fuzz and UBSan.
2017-05-31 16:04:37 +02:00
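One way to keep an exponent accumulator from overflowing, as f4029cd413 sets out to do, is to stop accumulating once the value is already huge. The sketch below uses a hypothetical function name and a hypothetical limit; the log does not state what libxml2 actually uses:

```c
/* Hypothetical limit; the value libxml2 uses is not stated in the log. */
#define MAX_EXPONENT 100000

/* Parse decimal digits at *s as an exponent. Once the value reaches
 * MAX_EXPONENT, further digits are consumed but no longer accumulated,
 * so the int accumulator cannot overflow. Advances *s past all digits. */
int parse_exponent(const char **s) {
    int exp = 0;
    while (**s >= '0' && **s <= '9') {
        if (exp < MAX_EXPONENT)
            exp = exp * 10 + (**s - '0');
        (*s)++;
    }
    return exp;
}
```

Any result at or above `MAX_EXPONENT` can then be treated by the caller as "too large", yielding infinity or zero depending on the sign of the exponent.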
Nick Wellnhofer
a58331a6ee Check for overflow in xmlXPathIsPositionalPredicate
Avoid undefined behavior when casting from double to int.

Found with afl-fuzz and UBSan.
2017-05-31 16:04:26 +02:00
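Casting a double to int is undefined when the value does not fit, which is the problem a58331a6ee addresses. A minimal sketch of a range check, with a hypothetical helper name and assuming `int` is narrow enough that `INT_MIN` and `INT_MAX` are exactly representable as doubles:

```c
#include <limits.h>

/* Hypothetical helper: convert d to int only if the result is
 * representable. Returns 1 and stores the value in *out on success,
 * 0 otherwise. */
int double_to_int_checked(double d, int *out) {
    /* Both comparisons are false for NaN, so NaN is rejected as well. */
    if (!(d >= (double)INT_MIN && d <= (double)INT_MAX))
        return 0;
    *out = (int)d;
    return 1;
}
```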
Nick Wellnhofer
a851868a75 Parse small XPath numbers more accurately
Don't count leading zeros towards the fraction size limit. This allows
parsing numbers like

    0.0000000000000000000000000000000000000000000000000000000001

which is the only standard-conformant way to represent such numbers, as
scientific notation isn't allowed in XPath 1.0. (It is allowed in XPath
2.0 and in libxml2 as an extension, though.)

Overall accuracy is still bad, see bug 783238.
2017-05-31 15:46:29 +02:00
Nick Wellnhofer
4bebb030db Rework XPath rounding functions
Use the C library's floor and ceil functions. The old code was overly
complicated for no apparent reason and could result in undefined
behavior when handling NaNs (found with afl-fuzz and UBSan).

Fix wrong comment in xmlXPathRoundFunction. The implementation was
already following the spec and rounding half up.
2017-05-31 15:38:42 +02:00
Nick Wellnhofer
43f50f4dfc Fix white space in test output
Quote echoed variable to avoid newlines being converted to space.
2017-05-31 15:30:19 +02:00
Nick Wellnhofer
40f5852149 Fix axis traversal from attribute and namespace nodes
When traversing the "preceding" axis from an attribute node, we must
first go up to the attribute's containing element. Otherwise, text
children of other attributes could be returned. This made it possible
to hit a code path in xmlXPathNextAncestor which contained another bug:
The attribute node was initialized with the context node instead of the
current node. Normally, this code path is only hit via
xmlXPathNextAncestorOrSelf in which case the current and context node
are the same.

The combination of the two bugs could result in an infinite loop, found
with libFuzzer.

Traversing the "following" and the "preceding" axis from namespace nodes
should be handled similarly. This wasn't supported at all previously.
2017-05-31 14:57:46 +02:00
Nick Wellnhofer
a07a4e96d0 Fix spurious error message
Commit c851970 introduced a spurious error message when evaluating
XPath expressions with xmlXPathCompiledEvalToBoolean.
2017-05-27 17:07:53 +02:00
Nick Wellnhofer
aed407c14b Check for trailing characters in XPath expressions earlier
Move the check for trailing characters from xmlXPathEval to
xmlXPathEvalExpr. Otherwise, a valid portion of a syntactically invalid
expression would be evaluated before returning an error.
2017-05-27 16:04:07 +02:00
Nick Wellnhofer
c851970c6e Rework final handling of XPath results
Move cleanup of XPath stack to xmlXPathFreeParserContext. This avoids
memory leaks if valuePop fails in some error cases. Found with
libFuzzer and ASan.

Rework handling of the final XPath result object in
xmlXPathCompiledEvalInternal and xmlXPathEval to avoid useless error
messages.
2017-05-27 16:03:48 +02:00
Nick Wellnhofer
640a368c80 Make xmlXPathEvalExpression call xmlXPathEval
Both functions are supposed to do exactly the same thing.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
d6b3645f9b Fix memory leak in xmlCanonicPath
Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
cf60dbe461 Fix memory leak in xmlXPathCompareNodeSetValue
Implement TODO block to free the arguments in error case.

Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
1f131f1133 Fix memory leak in pattern error path
Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
8627e4ed20 Fix memory leak in parser error path
Triggered in mixed content ELEMENT declarations if there's an invalid
name after the first valid name:

    <!ELEMENT para (#PCDATA|a|<invalid>)*>

Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
bd1571cdc5 Fix memory leaks in XPointer error paths
Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
9d08b34716 Fix memory leak in xmlXPathNodeSetMergeAndClear
Namespace nodes must not be duplicated when merging.

Found with libFuzzer and ASan.
2017-05-27 15:59:18 +02:00
Nick Wellnhofer
95a9249a60 Fix memory leak in XPath filter optimizations
Namespace nodes must be freed when selecting the first or last element
of a node set.

Found with libFuzzer and ASan.
2017-05-27 15:59:05 +02:00
Nick Wellnhofer
d42a7063da Fix memory leaks in XPath error paths
Found with libFuzzer and ASan.
2017-05-27 14:58:19 +02:00
David Tardon
074180119f Do not leak the new CData node if adding fails
For https://bugzilla.gnome.org/show_bug.cgi?id=780918
2017-04-07 18:24:52 +02:00
Neel Mehta
90ccb58242 Prevent unwanted external entity reference
For https://bugzilla.gnome.org/show_bug.cgi?id=780691

* parser.c: add a specific check to avoid PE reference
2017-04-07 17:45:14 +02:00
Daniel Veillard
5dca9eea1b Increase buffer space for port in HTTP redirect support
For https://bugzilla.gnome.org/show_bug.cgi?id=780690

nanohttp.c: the code wrongly assumed a short int port value.
2017-04-07 17:13:28 +02:00
Doran Moppert
2304078555 Add an XML_PARSE_NOXXE flag to block loading of all entities, even local ones
For https://bugzilla.gnome.org/show_bug.cgi?id=772726

* include/libxml/parser.h: Add a new parser flag XML_PARSE_NOXXE
* elfgcchack.h, xmlIO.h, xmlIO.c: associated loading routine
* include/libxml/xmlerror.h: new error raised
* xmllint.c: adds --noxxe flag to activate the option
2017-04-07 16:55:05 +02:00
Nick Wellnhofer
e905f08123 Fix more NULL pointer derefs in xpointer.c
Found with afl-fuzz.
2016-10-12 14:00:03 +02:00
Nick Wellnhofer
229d1f93ce Avoid function/data pointer conversion in xpath.c
Fixes a `-pedantic` compiler warning.
2016-10-12 13:23:16 +02:00
Nick Wellnhofer
94613f64c0 Remove unused variables
2016-10-12 13:23:08 +02:00
Nick Wellnhofer
c2545cbb6d Fix format string warnings
Also fixes bug #768199:

https://bugzilla.gnome.org/show_bug.cgi?id=768199
2016-10-12 13:22:57 +02:00
Nick Wellnhofer
c1d1f71211 Disallow namespace nodes in XPointer ranges
Namespace nodes must be copied to avoid use-after-free errors.
But they don't necessarily have a physical representation in a
document, so simply disallow them in XPointer ranges.

Found with afl-fuzz.

Fixes CVE-2016-4658.
2016-10-12 13:12:18 +02:00
Nick Wellnhofer
3f8a91036d Disallow namespace nodes in XPointer points
2016-10-12 13:12:18 +02:00
Nick Wellnhofer
9ab01a277d Fix XPointer paths beginning with range-to
The old code would invoke the broken xmlXPtrRangeToFunction. range-to
isn't really a function but a special kind of location step. Remove
this function and always handle range-to in the XPath code.

The old xmlXPtrRangeToFunction could also be abused to trigger a
use-after-free error with the potential for remote code execution.

Found with afl-fuzz.

Fixes CVE-2016-5131.
2016-10-12 13:12:18 +02:00
Nick Wellnhofer
a005199330 Fix comparison with root node in xmlXPathCmpNodes
This change has already been made in xmlXPathCmpNodesExt but not in
xmlXPathCmpNodes.
2016-10-12 13:09:21 +02:00
Alex Henrie
3169602058 Fix attribute decoding during XML schema validation
For https://bugzilla.gnome.org/show_bug.cgi?id=766834

vctxt->parserCtxt is always NULL in xmlSchemaSAXHandleStartElementNs,
so this function can't call xmlStringLenDecodeEntities to decode the
entities.
2016-08-29 11:21:08 +02:00
Nick Wellnhofer
d8083bf779 Fix NULL pointer deref in XPointer range-to
- Check for errors after evaluating first operand.
- Add sanity check for empty stack.

Found with afl-fuzz.
2016-06-25 14:24:51 +02:00
Nick Wellnhofer
1fc55ca72b Don't print generic error messages in XPath tests
2016-06-25 14:24:51 +02:00
Chun-wei Fan
d77e5fc4bc relaxng.c, xmlschemas.c: Fix build on pre-C99 compilers
Make sure that the variables are declared at the top of the block.

https://bugzilla.gnome.org/show_bug.cgi?id=767063
2016-06-23 19:02:26 +08:00
Daniel Veillard
bdec2183f3 Release of libxml2-2.9.4
* doc/xml.html libxml.spec.in: updated for the release
* doc/*: regenerated but no API additions
v2.9.4
2016-05-23 16:04:52 +08:00
David Kilzer
502f6a6d08 More format string warnings with possible format string vulnerability
For https://bugzilla.gnome.org/show_bug.cgi?id=761029

adds a new xmlEscapeFormatString() function to escape composed format
strings
2016-05-23 15:01:08 +08:00
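The idea behind escaping a composed format string, as in 502f6a6d08, is to double every `%` in untrusted text before it becomes part of a printf-style format. The helper below is a hypothetical sketch of that idea and is not the implementation of xmlEscapeFormatString:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of msg with every
 * '%' doubled, or NULL on allocation failure. The caller frees the
 * result. */
char *escape_percent(const char *msg) {
    size_t extra = 0;
    for (const char *p = msg; *p != '\0'; p++) {
        if (*p == '%')
            extra++;
    }
    char *out = malloc(strlen(msg) + extra + 1);
    if (out == NULL)
        return NULL;
    char *w = out;
    for (const char *p = msg; *p != '\0'; p++) {
        if (*p == '%')
            *w++ = '%';   /* emit the extra '%' first */
        *w++ = *p;
    }
    *w = '\0';
    return out;
}
```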
Daniel Veillard
bdd66182ef Avoid building recursive entities
For https://bugzilla.gnome.org/show_bug.cgi?id=762100

When we detect a recursive entity we should really not
build the associated data. Moreover, if someone bypasses
libxml2 fatal errors and still tries to serialize a broken
entity, make sure we don't risk getting into a recursion.

* parser.c: xmlParserEntityCheck() don't build if an entity loop
  was found and remove the associated text content
* tree.c: xmlStringGetNodeList() avoid a potential recursion
CVE-2016-3627
2016-05-23 15:01:07 +08:00
Pranjal Jumde
0bcd05c5cd Heap-based buffer overread in htmlCurrentChar
For https://bugzilla.gnome.org/show_bug.cgi?id=758606

* parserInternals.c:
(xmlNextChar): Add a test to catch other issues on ctxt->input
corruption proactively.
For non-UTF-8 charsets, xmlNextChar() failed to check for the end
of the input buffer and would continue reading.  Fix this by
pulling out the check for the end of the input buffer into common
code, and return if we reach the end of the input buffer
prematurely.
* result/HTML/758606.html: Added.
* result/HTML/758606.html.err: Added.
* result/HTML/758606.html.sax: Added.
* result/HTML/758606_2.html: Added.
* result/HTML/758606_2.html.err: Added.
* result/HTML/758606_2.html.sax: Added.
* test/HTML/758606.html: Added test case.
* test/HTML/758606_2.html: Added test case.
CVE-2016-1833
2016-05-23 15:01:07 +08:00
David Kilzer
0090675905 Heap-based buffer-underreads due to xmlParseName
For https://bugzilla.gnome.org/show_bug.cgi?id=759573

* parser.c:
(xmlParseElementDecl): Return early on invalid input to fix
non-minimized test case (759573-2.xml).  Otherwise the parser
gets into a bad state in SKIP(3) at the end of the function.
(xmlParseConditionalSections): Halt parsing when hitting invalid
input that would otherwise cause xmlParserHandlePEReference()
to recurse unexpectedly.  This fixes the minimized test case
(759573.xml).

* result/errors/759573-2.xml: Add.
* result/errors/759573-2.xml.err: Add.
* result/errors/759573-2.xml.str: Add.
* result/errors/759573.xml: Add.
* result/errors/759573.xml.err: Add.
* result/errors/759573.xml.str: Add.
* test/errors/759573-2.xml: Add.
* test/errors/759573.xml: Add.
2016-05-23 15:01:07 +08:00
Pranjal Jumde
38eae57111 Heap use-after-free in xmlSAX2AttributeNs
For https://bugzilla.gnome.org/show_bug.cgi?id=759020

* parser.c:
(xmlParseStartTag2): Attribute strings are only valid if the
base does not change, so add another check where the base may
change.  Make sure to set 'attvalue' to NULL after freeing it.
* result/errors/759020.xml: Added.
* result/errors/759020.xml.err: Added.
* result/errors/759020.xml.str: Added.
* test/errors/759020.xml: Added test case.
CVE-2016-1835
2016-05-23 15:01:07 +08:00
Pranjal Jumde
11ed4a7a90 Heap use-after-free in htmlParsePubidLiteral and htmlParseSystemLiteral
For https://bugzilla.gnome.org/show_bug.cgi?id=760263

* HTMLparser.c: Add BASE_PTR convenience macro.
(htmlParseSystemLiteral): Store length and start position instead
of a pointer while iterating through the public identifier since
the underlying buffer may change, resulting in a stale pointer
being used.
(htmlParsePubidLiteral): Ditto.
CVE-2016-1837
2016-05-23 15:01:07 +08:00
David Kilzer
4472c3a5a5 Fix some format string warnings with possible format string vulnerability
For https://bugzilla.gnome.org/show_bug.cgi?id=761029

Decorate every method in libxml2 with the appropriate
LIBXML_ATTR_FORMAT(fmt,args) macro and add some cleanups
following the reports.
2016-05-23 15:01:07 +08:00
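A sketch of the kind of annotation 4472c3a5a5 describes. The `ATTR_FORMAT` macro and `report` function are hypothetical stand-ins written in the spirit of LIBXML_ATTR_FORMAT; the attribute itself is the GCC/Clang `format(printf, ...)` extension, and the macro expands to nothing on other compilers:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical macro: request printf-style argument checking where the
 * compiler supports it. */
#if defined(__GNUC__)
#define ATTR_FORMAT(fmt, args) __attribute__((format(printf, fmt, args)))
#else
#define ATTR_FORMAT(fmt, args)
#endif

/* Argument 3 is the format string; variadic arguments start at 4. */
int report(char *buf, size_t size, const char *fmt, ...) ATTR_FORMAT(3, 4);

/* Format a message into buf and return the length vsnprintf reports. */
int report(char *buf, size_t size, const char *fmt, ...) {
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

With the attribute in place, a call such as `report(buf, sizeof buf, "%d", "text")` would typically draw a compile-time warning instead of misbehaving at run time.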
Hugh Davenport
beca86e8c8 Detect change of encoding when parsing HTML names
From https://bugzilla.gnome.org/show_bug.cgi?id=758518

Happens when a file has a name getting parsed, but no valid encoding
set, so libxml has to guess what the encoding is. This patch detects
when the buffer location changes, and if it does, restarts the parsing
of the name.

This slightly changes the output of a couple of regression tests.
2016-05-23 15:01:07 +08:00
Daniel Veillard
b1d34de46a Fix inappropriate fetch of entities content
For https://bugzilla.gnome.org/show_bug.cgi?id=761430

libfuzzer regression testing exposed another case where the parser would
fetch content of an external entity while not in validating mode.
Plug that hole
CVE-2016-4449
2016-05-23 15:01:07 +08:00