Memory allocation errors in the following functions a often ignored.
Add TODO comments.
- xmlXPathNodeSetCreate
- xmlXPathNodeSetAdd*
- xmlXPathNodeSetMerge*
- xmlXPathNodeSetDupNs
Note that the following functions currently lack a way to propagate
memory errors:
- xmlXPathCompareNodeSets
- xmlXPathEqualNodeSets
Currently, many memory allocation errors in xpath.c aren't propagated to
the parser/evaluation context and for the most part ignored. Most
XPath objects allocated via one of the New, Wrap or Copy functions end
up being pushed on the stack, so adding a check in valuePush handles
many cases without much effort.
Also simplify the code a little and make sure to return -1 in case of
error.
Make sure that memory errors in xmlXPathCompExprAdd are propagated to
the parser context. Hitting the step limit or running out of memory
without raising an error could also lead to an out-of-bounds read.
Also fixes a memory leak in xmlXPathErrMemory.
Found by OSS-Fuzz.
Consolidate code paths evaluating XPath predicates and filters.
Don't push context node on stack when evaluating predicates. I have no
idea why this was done. It seems completely useless and trying to pop
the context node from a corrupted stack has already caused security
issues.
Filter nodesets in-place and don't create node sets with NULL gaps which
allows to simplify merging a great deal. Simply move matched nodes
backward and create a compact node set.
Merge xmlXPathCompOpEvalPositionalPredicate into
xmlXPathCompOpEvalPredicate.
Useful to avoid call stack overflows when fuzzing. Note that parsing a
parenthesized expression currently consumes more than 10 stack frames,
so this limit should be set rather low.
Optionally limit the maximum numbers of XPath operations when evaluating
an expression. Useful to avoid timeouts when fuzzing. The following
operations count towards the limit:
- XPath operations
- Location step iterations
- Union operations
Enabled by setting opLimit to a non-zero value. Note that it's the user's
responsibility to reset opCount. This allows to enforce the operation
limit across multiple reuses of an XPath context.
Check that there's exactly one return value on the stack after calling
XPath functions. Otherwise, functions that corrupt the stack without
signaling an error could lead to memory errors.
Found with libFuzzer and UBSan.
Rewrite conversion of double to int in xmlXPathSubstringFunction, adding
range checks to avoid undefined behavior. Make sure to add start and
length as floating-point numbers before converting to int. Fix a bug
when rounding negative start indices.
Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math
instead. Avoid computing the string length. xmlUTF8Strsub works as
expected if the length of the requested substring exceeds the input.
Found with libFuzzer and UBSan.
If a nodeset to be filtered is empty, it can be returned without popping
it from the stack.
Make sure to restore the context node in all error paths and never set
it to NULL.
Save and restore the context node in RANGETO operations.
On some pre-C99 compilers, the NAN and INFINITY macros don't expand to
constant expressions.
Some MSVC versions complain about floating point division by zero in
constants.
Thanks to Fabrice Manfroi for the report.
Use C99 macros NAN, INFINITY, isnan, isinf. If they're not available:
- Assume that (0.0 / 0.0) generates a NaN and !(x == x) tests for NaN.
- Use C89's HUGE_VAL for INFINITY.
Remove manual handling of NaN, infinity and negative zero in functions
xmlXPathValueFlipSign and xmlXPathDivValues.
Remove xmlXPathGetSign. All the tests for negative zero can be replaced
with a test for negative or positive zero.
Simplify xmlXPathRoundFunction.
Remove Trio dependency.
This should work on IEEE 754 compliant implementations even if the C99
macros aren't available, but will likely break some ancient platforms.
If problems arise, my plan is to port the relevant trionan.c solution
to xpath.c. Note that non-compliant implementations are impossible
to fully support, anyway, since XPath requires IEEE 754.
Make sure that all parameters and return values of hash callback
functions exactly match the callback function type. This is required
to pass clang's Control Flow Integrity checks and to allow compilation
to asm.js with Emscripten.
Fixes bug 784861.
On 64-bit Windows, `long` is 32 bits wide and can't hold a pointer.
Switch to ptrdiff_t instead which should be the same size as a pointer
on every somewhat sane platform without requiring C99 types like
intptr_t.
Fixes bug 788312.
Thanks to J. Peter Mugaas for the report and initial patch.
Fix two bugs in xmlXPathNodeValHash which could lead to errors when
comparing nodesets to strings:
- Only use contents of text nodes to compute the hash for element nodes.
Comments, PIs, and other node types don't affect the string-value and
must be ignored.
- Reset `string` to NULL for node types other than text.
Reported by Aleksei on the mailing list:
https://mail.gnome.org/archives/xml/2017-September/msg00016.html
Move the calls to xmlXPathSetFrame and xmlXPathPopFrame around in
xmlXPathCompOpEvalPositionalPredicate to make sure that the context
object on the stack is actually protected. Otherwise, memory corruption
can occur when calling sloppily coded XPath extension functions.
Fixes bug 783160.
Commit c851970 removed a redundant error message if XPath evaluation
failed. This uncovered a case where an undefined XPath variable error
wasn't reported correctly.
Thanks to Petr Pisar for the report.
Fixes bug 787941.
First set of patches for zOS
- entities.c parser.c tree.c xmlschemas.c xmlschemastypes.c xpath.c xpointer.c:
ask conversion of code to ISO Latin 1 to avoid having the compiler assume
EBCDIC codepoint for characters.
- xmlmodule.c: make sure we have support for modules
- xmlIO.c: zOS path names are special avoid dsome of the expectstions from
Unix/Windows
Don't count leading zeros towards the fraction size limit. This allows
to parse numbers like
0.0000000000000000000000000000000000000000000000000000000001
which is the only standard-conformant way to represent such numbers, as
scientific notation isn't allowed in XPath 1.0. (It is allowed in XPath
2.0 and in libxml2 as an extension, though.)
Overall accuracy is still bad, see bug 783238.
Use the C library's floor and ceil functions. The old code was overly
complicated for no apparent reason and could result in undefined
behavior when handling NaNs (found with afl-fuzz and UBSan).
Fix wrong comment in xmlXPathRoundFunction. The implementation was
already following the spec and rounding half up.
When traversing the "preceding" axis from an attribute node, we must
first go up to the attribute's containing element. Otherwise, text
children of other attributes could be returned. This made it possible
to hit a code path in xmlXPathNextAncestor which contained another bug:
The attribute node was initialized with the context node instead of the
current node. Normally, this code path is only hit via
xmlXPathNextAncestorOrSelf in which case the current and context node
are the same.
The combination of the two bugs could result in an infinite loop, found
with libFuzzer.
Traversing the "following" and the "preceding" axis from namespace nodes
should be handled similarly. This wasn't supported at all previously.