Fix accounting of recursion depth when parsing XPath expressions.
This silly bug introduced in commit 804c5297 could lead to spurious
errors when parsing larger expressions or XSLT documents.
Should fix#264.
The DBL_MAX approach could lead to errors caused by excess precision.
Switch back to the division-by-zero approach with a work-around for
MSVC and use the extern globals instead of macro expressions.
Always limit nested functions calls to 5000. This avoids call stack
overflows with deeply nested expressions.
The expression parser produces about 10 nested function calls when
parsing a subexpression in parentheses, so the effective nesting limit
is about 500 which should be more than enough.
Use a lower limit when fuzzing to account for increased memory usage
when using sanitizers.
RVTs from libxslt are document nodes which are linked using the 'next'
pointer. These pointers must never be used to navigate the document
tree. Otherwise, random content from other RVTs could be returned
when evaluating XPath expressions.
It's interesting that this seemingly long-standing bug wasn't
discovered earlier. This issue could also cause severe performance
degradation.
Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/37
Memory allocation errors in the following functions a often ignored.
Add TODO comments.
- xmlXPathNodeSetCreate
- xmlXPathNodeSetAdd*
- xmlXPathNodeSetMerge*
- xmlXPathNodeSetDupNs
Note that the following functions currently lack a way to propagate
memory errors:
- xmlXPathCompareNodeSets
- xmlXPathEqualNodeSets
Currently, many memory allocation errors in xpath.c aren't propagated to
the parser/evaluation context and for the most part ignored. Most
XPath objects allocated via one of the New, Wrap or Copy functions end
up being pushed on the stack, so adding a check in valuePush handles
many cases without much effort.
Also simplify the code a little and make sure to return -1 in case of
error.
Make sure that memory errors in xmlXPathCompExprAdd are propagated to
the parser context. Hitting the step limit or running out of memory
without raising an error could also lead to an out-of-bounds read.
Also fixes a memory leak in xmlXPathErrMemory.
Found by OSS-Fuzz.
Consolidate code paths evaluating XPath predicates and filters.
Don't push context node on stack when evaluating predicates. I have no
idea why this was done. It seems completely useless and trying to pop
the context node from a corrupted stack has already caused security
issues.
Filter nodesets in-place and don't create node sets with NULL gaps which
allows to simplify merging a great deal. Simply move matched nodes
backward and create a compact node set.
Merge xmlXPathCompOpEvalPositionalPredicate into
xmlXPathCompOpEvalPredicate.
Useful to avoid call stack overflows when fuzzing. Note that parsing a
parenthesized expression currently consumes more than 10 stack frames,
so this limit should be set rather low.
Optionally limit the maximum numbers of XPath operations when evaluating
an expression. Useful to avoid timeouts when fuzzing. The following
operations count towards the limit:
- XPath operations
- Location step iterations
- Union operations
Enabled by setting opLimit to a non-zero value. Note that it's the user's
responsibility to reset opCount. This allows to enforce the operation
limit across multiple reuses of an XPath context.
Check that there's exactly one return value on the stack after calling
XPath functions. Otherwise, functions that corrupt the stack without
signaling an error could lead to memory errors.
Found with libFuzzer and UBSan.
Rewrite conversion of double to int in xmlXPathSubstringFunction, adding
range checks to avoid undefined behavior. Make sure to add start and
length as floating-point numbers before converting to int. Fix a bug
when rounding negative start indices.
Remove unneeded calls to xmlXPathIs{Inf,NaN} and rely on IEEE math
instead. Avoid computing the string length. xmlUTF8Strsub works as
expected if the length of the requested substring exceeds the input.
Found with libFuzzer and UBSan.
If a nodeset to be filtered is empty, it can be returned without popping
it from the stack.
Make sure to restore the context node in all error paths and never set
it to NULL.
Save and restore the context node in RANGETO operations.
On some pre-C99 compilers, the NAN and INFINITY macros don't expand to
constant expressions.
Some MSVC versions complain about floating point division by zero in
constants.
Thanks to Fabrice Manfroi for the report.
Use C99 macros NAN, INFINITY, isnan, isinf. If they're not available:
- Assume that (0.0 / 0.0) generates a NaN and !(x == x) tests for NaN.
- Use C89's HUGE_VAL for INFINITY.
Remove manual handling of NaN, infinity and negative zero in functions
xmlXPathValueFlipSign and xmlXPathDivValues.
Remove xmlXPathGetSign. All the tests for negative zero can be replaced
with a test for negative or positive zero.
Simplify xmlXPathRoundFunction.
Remove Trio dependency.
This should work on IEEE 754 compliant implementations even if the C99
macros aren't available, but will likely break some ancient platforms.
If problems arise, my plan is to port the relevant trionan.c solution
to xpath.c. Note that non-compliant implementations are impossible
to fully support, anyway, since XPath requires IEEE 754.
Make sure that all parameters and return values of hash callback
functions exactly match the callback function type. This is required
to pass clang's Control Flow Integrity checks and to allow compilation
to asm.js with Emscripten.
Fixes bug 784861.