mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2026-01-29 23:42:18 +03:00
Commit Graph

5008 Commits

Author SHA1 Message Date
Mike Dalessio
a67b63d183 use new htmlParseLookupCommentEnd to find comment ends
Note that the caret in error messages generated during comment parsing
may have moved by one byte.

See guidance provided on incorrectly-closed comments here:

https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment
2020-12-16 16:12:07 +01:00
Mike Dalessio
29f5d20e84 htmlParseComment: treat --!> as if it closed the comment
See guidance provided on incorrectly-closed comments here:

https://html.spec.whatwg.org/multipage/parsing.html#parse-error-incorrectly-closed-comment
2020-12-16 16:12:07 +01:00
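The rule adopted in 29f5d20e84 can be sketched in C as follows. This is a minimal illustration, not libxml2's htmlParseLookupCommentEnd: the helper name `find_comment_end`, its signature, and the assumption that the buffer starts right after `<!--` are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, not libxml2's implementation.
 * Scans comment content (the bytes after "<!--") and returns the offset
 * of the "--" that begins the terminator, or -1 if none is found.
 * *term_len receives the terminator length: 3 for "-->", 4 for "--!>". */
long find_comment_end(const char *buf, size_t len, size_t *term_len)
{
    size_t i;

    for (i = 0; i + 2 < len; i++) {
        if (buf[i] != '-' || buf[i + 1] != '-')
            continue;
        if (buf[i + 2] == '>') {
            *term_len = 3;
            return (long) i;
        }
        /* Incorrectly closed comment: "--!>" still ends the comment. */
        if (i + 3 < len && buf[i + 2] == '!' && buf[i + 3] == '>') {
            *term_len = 4;
            return (long) i;
        }
    }
    return -1;
}
```

A caller would report the parse error for the `--!>` case and then continue after the terminator, which is what the WHATWG guidance linked above describes.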
Mike Dalessio
e28d9347bc add test coverage for incorrectly-closed comments
This establishes the baseline behavior so that subsequent commits
which modify this behavior are clear about what's being changed.
2020-12-16 16:12:07 +01:00
Nick Wellnhofer
9086988ffa Enforce maximum length of fuzz input
Remove the libfuzzer max_len option which doesn't apply to other
fuzzing engines. Enforce the maximum length directly in the fuzz
targets. For the xml target, lower the maximum when expanding entities
to avoid timeout and OOM errors.
2020-12-16 16:12:07 +01:00
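Enforcing the length inside the target, as 9086988ffa describes, might look like the sketch below. `LLVMFuzzerTestOneInput` is the standard libFuzzer entry point; `MAX_FUZZ_INPUT`, `process_input`, and `processed_calls` are hypothetical names, and the limit value is an assumption rather than the one used in the libxml2 targets.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed limit for illustration; real targets pick their own value. */
#define MAX_FUZZ_INPUT 50000

/* Counts how often the code under test actually ran (illustration only). */
int processed_calls = 0;

/* Hypothetical stand-in for the code under test. */
void process_input(const uint8_t *data, size_t size)
{
    (void) data;
    (void) size;
    processed_calls++;
}

/* Standard fuzz entry point. Rejecting oversized inputs here works with
 * any fuzzing engine, unlike an engine-specific max_len option. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if (size > MAX_FUZZ_INPUT)
        return 0;
    process_input(data, size);
    return 0;
}
```

A target that expands entities could apply a smaller limit on that code path, which is the adjustment the commit message mentions for the xml target.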
Nick Wellnhofer
1fe385304f Remove temporary members from struct _xmlXPathContext
These values are hardcoded now and the struct members, while public,
were recently introduced and never part of an official release.
2020-12-16 15:27:13 +01:00
Nick Wellnhofer
8ca3a59b2e Fix integer overflow in xmlSchemaGetParticleTotalRangeMin
The function is only used once and its return value is only checked for
zero. Disable the function like its Max counterpart and add an
implementation for the special case.

Found by OSS-Fuzz.
2020-12-15 20:14:28 +01:00
Xiaoming Ni
649d02eaa4 encoding: fix memleak in xmlRegisterCharEncodingHandler()
xmlRegisterCharEncodingHandler() returns void, so the caller cannot
tell whether registration succeeded. When nbCharEncodingHandler >=
MAX_ENCODING_HANDLERS, "handler" is not added to the "handlers" array,
so nothing owns its memory and it is never released: a memory leak.

Add "xmlFree(handler)" on the failure branch of
xmlRegisterCharEncodingHandler() to fix the leak.

Reported-by: wuqing <wuqing30@huawei.com>
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
2020-12-07 14:38:14 +01:00
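The ownership problem in 649d02eaa4 can be shown with a reduced sketch. The `Handler` type, `register_handler`, and the small `MAX_HANDLERS` limit are hypothetical stand-ins, and plain `free` replaces libxml2's allocator; only the pattern (free on the failure branch of a void-returning registration function) is taken from the commit message.

```c
#include <stdlib.h>

#define MAX_HANDLERS 4   /* small limit for illustration only */

typedef struct {
    char *name;
} Handler;               /* hypothetical, not xmlCharEncodingHandler */

Handler *handlers[MAX_HANDLERS];
int nbHandlers = 0;

/* Takes ownership of `handler`. Because the function returns void, the
 * caller cannot learn that registration failed, so the failure branch
 * has to release the handler itself. */
void register_handler(Handler *handler)
{
    if (handler == NULL)
        return;
    if (nbHandlers >= MAX_HANDLERS) {
        free(handler->name);
        free(handler);
        return;
    }
    handlers[nbHandlers++] = handler;
}
```

Without the two `free` calls, every handler passed in after the table fills up would be unreachable and leaked.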
Xiaoming Ni
cb7a572b3e xmlschemastypes.c: xmlSchemaGetFacetValueAsULong add, check "facet->val"
xmlSchemaGetFacetValueAsULong() is an external API, so the validity of
its input parameters must be strictly verified. Before accessing
"facet->val->value", we need to check whether "facet->val" is a null
pointer.

Signed-off-by: wuqing <wuqing30@huawei.com>
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
2020-12-07 14:37:55 +01:00
Markus Rickert
84b76d99f1 Update CMake config files
2020-12-07 14:37:23 +01:00
Markus Rickert
d0ccb3a6b6 Add xmlcatalog and xmllint to CMake export
2020-12-07 14:37:18 +01:00
Nick Wellnhofer
acdc2ff360 Simplify xmlexports.h
All the compiler switches essentially set the same macros. The only
exception was MSVC which omitted the "extern" keyword for exported
variables. This in turn broke clang-cl.

This commit rewrites and simplifies the whole header.

Closes #163.
2020-12-06 17:31:38 +01:00
Nick Wellnhofer
a218ff0ec0 Fix null pointer deref in xmlXPtrRangeInsideFunction
Found by OSS-Fuzz.
2020-12-06 17:26:36 +01:00
Nick Wellnhofer
94c2e415a9 Fix quadratic runtime in HTML push parser with null bytes
Null bytes in the input stream do not necessarily signal an EOF
condition. Check the stream pointers for EOF to avoid quadratic
rescanning of input data.

Note that the CUR_CHAR macro used in functions like htmlParseCharData
calls htmlCurrentChar which translates null bytes.

Found by OSS-Fuzz.
2020-12-06 16:44:11 +01:00
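The distinction drawn in 94c2e415a9 (a null byte is data, end of input is a pointer comparison) can be sketched as below. The `Input` struct and both functions are hypothetical and much smaller than libxml2's parser input; they only illustrate the check.

```c
#include <stddef.h>

/* Hypothetical input view with current and end pointers. */
typedef struct {
    const unsigned char *cur;
    const unsigned char *end;
} Input;

/* End of input is decided by the pointers, never by the byte value. */
int at_eof(const Input *in)
{
    return in->cur >= in->end;
}

/* Consumes bytes up to '<' or the real end of input and returns how many
 * were consumed. A 0x00 byte is ordinary data and does not stop the scan,
 * so the same bytes are not rescanned when more input arrives. */
size_t scan_text(Input *in)
{
    size_t n = 0;

    while (!at_eof(in) && *in->cur != '<') {
        in->cur++;
        n++;
    }
    return n;
}
```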
Markus Rickert
1c4f9a6db5 Require dependencies based on enabled CMake options
2020-11-30 12:43:48 +01:00
Michael Matz
faea2fa9b8 Avoid quadratic checking of identity-constraints
key/unique/keyref schema attributes currently use quadratic loops
to check their various constraints (that keys are unique and that
keyrefs refer to existing keys).  That becomes extremely slow if
there are many elements with keys.  This happens in the wild with
e.g. the OVAL XML descriptions of security patches.  You need the
openscap schemata, and then an example xml file:

% zypper in openscap-utils
% wget ftp://ftp.suse.com/pub/projects/security/oval/opensuse.leap.15.1.xml
% time xmllint --schema /usr/share/openscap/schemas/oval/5.5/oval-definitions-schema.xsd opensuse.leap.15.1.xml > /dev/null
opensuse.leap.15.1.xml validates

real    16m59,857s
user    16m55,787s
sys     0m1,060s

This patch makes libxml use a hash table to avoid the quadratic
behaviour.  The existing hash table only accepts strings as keys, so
we're mostly reusing the canonical representation of key values to derive
such strings (with the caveat given in a comment).  The alternative
would be to rework the hash table code to accept either numbers or free
functions as hash workers, but the code is fast enough as is.

With the patch we have this then:

% time LD_LIBRARY_PATH=./libxml2/.libs/ ./libxml2/.libs/xmllint --schema /usr/share/openscap/schemas/oval/5.5/oval-definitions-schema.xsd opensuse.leap.15.1.xml > /dev/null
opensuse.leap.15.1.xml validates

real    0m3,531s
user    0m3,427s
sys     0m0,103s

So, a ~300x speedup.  This patch survives 'make check' and 'make tests'.
2020-11-30 11:22:54 +01:00
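The idea behind faea2fa9b8, replacing pairwise comparison of key values with a lookup keyed on their canonical string form, can be sketched with a small string set. This is not libxml2's hash table: `KeySet`, `keyset_add`, `keyset_clear`, the bucket count, and the djb2 hash are all assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024   /* arbitrary size for illustration */

typedef struct Entry {
    struct Entry *next;
    const char *key;    /* borrowed; must outlive the set */
} Entry;

typedef struct {
    Entry *buckets[NBUCKETS];
} KeySet;               /* hypothetical, not libxml2's xmlHashTable */

static unsigned long hash_str(const char *s)
{
    unsigned long h = 5381;   /* djb2 string hash */

    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/* Returns 1 if the key was inserted, 0 if it was already present
 * (a uniqueness violation), -1 on allocation failure. */
int keyset_add(KeySet *set, const char *key)
{
    Entry **slot = &set->buckets[hash_str(key) % NBUCKETS];
    Entry *e;

    for (e = *slot; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return 0;
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = key;
    e->next = *slot;
    *slot = e;
    return 1;
}

/* Frees all entries and leaves the set empty. */
void keyset_clear(KeySet *set)
{
    for (size_t i = 0; i < NBUCKETS; i++) {
        Entry *e = set->buckets[i];
        while (e != NULL) {
            Entry *next = e->next;
            free(e);
            e = next;
        }
        set->buckets[i] = NULL;
    }
}
```

Each key is checked against one bucket instead of against every earlier key, so n keys cost roughly n lookups rather than n squared comparisons, provided the keys spread across buckets.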
Markus Rickert
8272db5318 Use NAMELINK_COMPONENT in CMake install
2020-11-30 11:22:54 +01:00
Markus Rickert
5c7bdbc906 Add CMake files to EXTRA_DIST
2020-11-30 11:22:53 +01:00
Markus Rickert
7a62870a3c Add missing compile definition for static builds to CMake
2020-11-30 11:08:14 +01:00
Markus Rickert
e028d29379 Add CI for CMake on Linux and MinGW
2020-11-30 11:07:46 +01:00
Frederik Seiffert
b516ed189e Fix building with ICU 68.
ICU 68 no longer defines the TRUE macro.

Closes #204.
2020-11-19 18:10:32 +01:00
Victor Stinner
ac5e99911a Convert python/libxml.c to PY_SSIZE_T_CLEAN
Define PY_SSIZE_T_CLEAN macro in python/libxml.c and cast the string
length (int len) explicitly to Py_ssize_t when passing a string to a
function call using PyObject_CallMethod() with the "s#" format.
2020-11-19 18:09:22 +01:00
Victor Stinner
f42a0524c6 Build the Python extension with PY_SSIZE_T_CLEAN
The Python extension module now uses Py_ssize_t rather than int for
string lengths. This change makes the extension compatible with
Python 3.10.

Fixes #203.
2020-11-19 18:09:22 +01:00
Nick Wellnhofer
0ace6c4d7e Add CI test for Python 3
2020-11-19 18:09:22 +01:00
Elliott Hughes
7c06d99e1f Fix xmlURIEscape memory leaks.
Found by running the fuzz/uri.c fuzzer under asan (internal Android bug
171610679).

Always free `ret` when exiting on failure. I've moved the definition of
NULLCHK down past where ret is always initialized to make it clear that
this is safe.

This patch also fixes the indentation of two of the NULLCHK call sites
to make it more obvious that NULLCHK isn't `if`-like.
2020-11-09 18:17:01 +01:00
Nick Wellnhofer
31c6ce3b63 Avoid call stack overflow with XML reader and recursive XIncludes
Don't process XIncludes in the result of another inclusion to avoid
infinite recursion resulting in a call stack overflow.

This is something the XInclude engine shouldn't allow but correct
handling of intra-document includes would require major changes.

Found by OSS-Fuzz.
2020-11-09 17:55:44 +01:00
Nick Wellnhofer
7d6837ba0e Fix caret in regexp character group
Apply Per Hedeland's patch from

    https://bugzilla.gnome.org/show_bug.cgi?id=779751

Fixes #188.
2020-10-25 20:21:43 +01:00
Nick Wellnhofer
8a85263f13 Add fuzzing dictionaries to EXTRA_DIST
Also add static seed corpus for the URI fuzzer.
2020-10-25 20:08:16 +01:00
Nick Wellnhofer
1bde104060 Add 'fuzz' subdirectory to DIST_SUBDIRS
Fixes #191.
2020-10-25 20:02:23 +01:00
Mike Dalessio
c0c26ff201 parser.c: xmlParseCharData peek behavior fixed wrt newlines
Previously, xmlParseCharData and xmlParseComment would consider 0xA to
be unhandleable when seen as the first byte of an input chunk, and
fall back to xmlParseCharDataComplex and xmlParseCommentComplex, which
have different memory and performance characteristics.

Fixes GNOME/libxml2#192
2020-10-25 20:00:59 +01:00
Nick Wellnhofer
b46016b870 Allow port numbers up to INT_MAX
Also return an error on overflow.
2020-10-17 18:03:09 +02:00
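Accepting ports up to INT_MAX while reporting overflow, as b46016b870 describes, comes down to checking before each multiply-and-add. The sketch below is a hypothetical `parse_port` helper, not the code in libxml2's URI parser.

```c
#include <limits.h>

/* Hypothetical helper. Parses a leading run of decimal digits into *port.
 * Returns 0 on success, -1 if there are no digits or the value would
 * exceed INT_MAX. */
int parse_port(const char *s, int *port)
{
    int value = 0;

    if (*s < '0' || *s > '9')
        return -1;
    for (; *s >= '0' && *s <= '9'; s++) {
        int digit = *s - '0';

        /* value * 10 + digit would overflow: report an error instead of
         * wrapping around. */
        if (value > (INT_MAX - digit) / 10)
            return -1;
        value = value * 10 + digit;
    }
    *port = value;
    return 0;
}
```

The comparison is done before the arithmetic, so no signed overflow (undefined behavior in C) ever occurs.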
Nick Wellnhofer
46837d47d5 Fix memory leaks in XPointer string-range function
Found by OSS-Fuzz.
2020-10-03 01:13:35 +02:00
Nick Wellnhofer
0b3c64d9f2 Handle dumps of corrupted documents more gracefully
Check parent pointers for NULL after the non-recursive rewrite of the
serialization code. This avoids segfaults with corrupted documents
which can apparently be seen with lxml; see issue #187.
2020-09-29 18:08:37 +02:00
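A non-recursive walk has to climb through parent pointers, which is where a corrupted tree can cause a crash. The sketch below shows the kind of NULL check 0b3c64d9f2 describes, using a hypothetical `Node` struct reduced to three links and a `count_nodes` function that stands in for serialization.

```c
#include <stddef.h>

typedef struct Node {
    struct Node *parent;
    struct Node *children;   /* first child */
    struct Node *next;       /* next sibling */
} Node;                      /* hypothetical, reduced tree node */

/* Counts the nodes in the subtree rooted at `root` without recursion. */
size_t count_nodes(const Node *root)
{
    const Node *cur = root;
    size_t n = 0;

    while (cur != NULL) {
        n++;
        if (cur->children != NULL) {
            cur = cur->children;
            continue;
        }
        /* Climb until a sibling exists or we are back at the root. */
        while (cur != root && cur->next == NULL) {
            cur = cur->parent;
            /* Corrupted tree: without this check the missing parent
             * link would be dereferenced on the next iteration. */
            if (cur == NULL)
                return n;
        }
        if (cur == root)
            break;
        cur = cur->next;
    }
    return n;
}
```

With a recursive walk the call stack remembers the way back up, so a bad parent pointer goes unnoticed; the iterative version depends on those pointers and therefore has to validate them.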
Nick Wellnhofer
847a3a1181 Fix use-after-free when XIncluding text from Reader
The XML Reader can free text nodes coming from the XInclude engine
before parsing has finished. Cache a copy of the text string, not the
included node to avoid use after free.

Found by OSS-Fuzz.
2020-09-28 12:37:51 +02:00
yanjinjq
7929f05710 Fix SEGV in xmlSAXParseFileWithData
Fixes #181.
2020-09-21 13:12:31 +02:00
Nick Wellnhofer
e6ec58ecf7 Fix null deref in XPointer expression error path
Make sure that the filter functions introduced with commit c2f4da1a
return node-sets without NULL pointers also in the error case.

Found by OSS-Fuzz.
2020-09-21 12:49:36 +02:00
Nick Wellnhofer
4e9cc18ba9 Fix variable name in win32/configure.js
Fix copy/paste error from previous commit.
2020-09-21 11:00:23 +02:00
Nick Wellnhofer
5614c07854 Fix version parsing in win32/configure.js
Adjust to configure.ac changes.

Should fix #185.
2020-09-21 10:55:45 +02:00
Nick Wellnhofer
8b88503a27 Don't call xmlXPathInit directly
Call xmlInitParser which uses a lock to avoid race conditions.

Fixes #184.
2020-09-18 19:15:27 +02:00
Nick Wellnhofer
b215c270fa Fix cleanup of attributes in XML reader
xml:id creates ID attributes even in documents without a DTD, so the
check in xmlTextReaderFreeProp must be changed to avoid use after free.

Found by OSS-Fuzz.
2020-09-13 12:19:48 +02:00
Nick Wellnhofer
f0fd1b67fc Limit size of free lists in XML reader when fuzzing
Keeping objects on a free list can hide memory errors. Only allow a
single node on free lists used by the XML reader when fuzzing. This
should hide fewer errors while still exercising the free list logic.
2020-08-26 00:27:53 +02:00
Nick Wellnhofer
ba589adc2f Fix double free in XML reader with XIncludes
An XInclude with empty fallback could lead to a double free in
xmlTextReaderRead.

Found by OSS-Fuzz.
2020-08-26 00:22:47 +02:00
Nick Wellnhofer
6f1470a5d6 Hardcode maximum XPath recursion depth
Always limit nested function calls to 5000. This avoids call stack
overflows with deeply nested expressions.

The expression parser produces about 10 nested function calls when
parsing a subexpression in parentheses, so the effective nesting limit
is about 500 which should be more than enough.

Use a lower limit when fuzzing to account for increased memory usage
when using sanitizers.
2020-08-26 00:22:25 +02:00
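A hardcoded depth limit like the one in 6f1470a5d6 is typically a counter that is incremented on entry to each recursive parsing function. The sketch below uses a toy grammar (nested parentheses around `x`) and a hypothetical `Parser` struct; only the limit of 5000 comes from the commit message. Unlike the real XPath parser, it makes a single call per nesting level.

```c
#define MAX_DEPTH 5000   /* the limit named in the commit message */

typedef struct {
    const char *cur;   /* current position in the expression */
    int depth;         /* current nesting of parse_expr calls */
    int error;         /* set to 1 on malformed input or depth overflow */
} Parser;              /* hypothetical parser state */

/* Parses expressions such as "x" or "((x))". Refuses to recurse past
 * MAX_DEPTH so that deeply nested input cannot exhaust the call stack. */
void parse_expr(Parser *p)
{
    if (p->depth >= MAX_DEPTH) {
        p->error = 1;
        return;
    }
    p->depth++;
    if (*p->cur == '(') {
        p->cur++;
        parse_expr(p);
        if (!p->error) {
            if (*p->cur == ')')
                p->cur++;
            else
                p->error = 1;
        }
    } else if (*p->cur == 'x') {
        p->cur++;
    } else {
        p->error = 1;
    }
    p->depth--;
}
```

A parser that makes about 10 nested calls per parenthesized subexpression, as the commit message says the XPath parser does, would hit the same counter after about 500 levels of parentheses.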
Nick Wellnhofer
8c3ef083ca Pass URL of main entity in XML fuzzer
2020-08-24 23:17:34 +02:00
Nick Wellnhofer
0d5f3710fb Consolidate seed corpus generation
Implement file handling in C to speed up corpus generation.
2020-08-24 21:14:55 +02:00
Nick Wellnhofer
0d9da0290c Test fuzz targets with dummy driver
Run fuzz targets with files in seed corpus during test.
2020-08-24 03:57:03 +02:00
Nick Wellnhofer
3fcf319378 Fix regression introduced with commit d88df4b
Revert the commit and use a different approach.

Found by OSS-Fuzz.
2020-08-22 00:50:42 +02:00
Nick Wellnhofer
87d20b554c Fix regression introduced with commit 74dcc10b
The code wasn't dead after all, but I can see no reason for delaying
the XPointer evaluation. This could lead to nodes included earlier
appearing in XPointer results.
2020-08-19 13:52:08 +02:00
Nick Wellnhofer
fbb7fa9a9a Fix memory leak in xmlXIncludeAddNode error paths
Found by OSS-Fuzz.
2020-08-19 13:13:48 +02:00
Nick Wellnhofer
19cae17f5a Revert "Fix quadratic runtime in xi:fallback processing"
This reverts commit 27119ec33c.

Not copying fallback children didn't fix up namespaces and could lead
to use-after-free errors.

Found by OSS-Fuzz.
2020-08-19 13:13:41 +02:00
Nick Wellnhofer
d63cfeca35 Add TODO comment in xinclude.c
Add some thoughts on the major remaining problems with the XInclude
implementation.
2020-08-17 15:42:20 +02:00