mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-26 00:37:43 +03:00

824 Commits

Author SHA1 Message Date
Nick Wellnhofer
bfd2f4300f Fix null deref in legacy SAX1 parser
Always call nameNsPush instead of namePush. The latter is unused now
and should probably be removed from the public API. I can't see how
it could be used reasonably from client code and the unprefixed name
has always polluted the global namespace.

Fixes a null pointer dereference introduced with de5b624f when parsing
in SAX1 mode.

Found by OSS-Fuzz.
2021-05-09 19:03:16 +02:00
Nick Wellnhofer
ce00c36e65 Store per-element parser state in a struct
Make the parser context's "pushTab" point to an array of structs
instead of void pointers. This avoids casting unrelated types to void
pointers, improving readability and portability, and allows for more
efficient packing. Ultimately, the struct could be extended to include
the contents of "nameTab" and "spaceTab", further simplifying the code.

Historically, "pushTab" was only used by the push parser (hence the
name), so the change to the public headers should be safe.

Also remove an unused parameter from xmlParseEndTag2.
2021-05-08 22:16:49 +02:00
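The struct-based table described in ce00c36e65 can be illustrated roughly as follows. This is a minimal sketch, not the library's definition: `struct start_tag`, its field names, and `tag_store` are hypothetical, and the three fields are assumed from the values named in 62150ed2 (prefix, URI, nsNr).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-element record; names are illustrative only. */
struct start_tag {
    const char *prefix;   /* namespace prefix of the element, may be NULL */
    const char *uri;      /* namespace URI of the element, may be NULL */
    int ns_count;         /* number of namespaces declared on the element */
};

/* Store the state for the element at nesting index idx. Each value keeps
 * its own type, so no integer has to be cast to a void pointer. */
void tag_store(struct start_tag *tab, size_t idx,
               const char *prefix, const char *uri, int ns_count) {
    assert(tab != NULL);
    tab[idx].prefix = prefix;
    tab[idx].uri = uri;
    tab[idx].ns_count = ns_count;
}
```

With a table of void pointers, the integer count would have to be converted to a pointer type on store and back on load; the struct keeps the `int` as an `int`.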
Nick Wellnhofer
de5b624f10 Fix handling of unexpected EOF in xmlParseContent
Re-add the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was
removed in commit 62150ed2.

That commit also introduced a regression for direct users of
xmlParseContent: unclosed tags weren't checked.
2021-05-08 20:47:36 +02:00
Nick Wellnhofer
3e80560d4b Fix line numbers in error messages for mismatched tags
Commit 62150ed2 introduced a small regression in the error messages for
mismatched tags. This typically only affected messages after the first
mismatch, but with custom SAX handlers all line numbers would be off.

This also fixes line numbers in the SAX push parser which were never
handled correctly.
2021-05-07 11:48:11 +02:00
Nick Wellnhofer
babe75030c Propagate error in xmlParseElementChildrenContentDeclPriv
Check return value of recursive calls to
xmlParseElementChildrenContentDeclPriv and return immediately in case
of errors. Otherwise, struct xmlElementContent could contain unexpected
null pointers, leading to a null deref when post-validating documents
which aren't well-formed and parsed in recovery mode.

Fixes #243.
2021-05-01 17:24:49 +02:00
Nick Wellnhofer
c3fd8c4295 Fix exponential behavior with recursive entities
Fix another case where only recursion depth was limited, but entities
would still be expanded over and over again.

The test case discovered by fuzzing only affected parsing in recovery
mode with XML_PARSE_RECOVER.

Found by OSS-Fuzz.
2021-03-13 17:37:09 +01:00
Mike Dalessio
afad37216b parser.c: shrink the input buffer when appropriate
Fixes GNOME/libxml2#200

Also see discussions at:
- GNOME/libxml2#192
- https://gitlab.gnome.org/nwellnhof/libxml2/-/commit/99bda1e
- https://github.com/sparklemotion/nokogiri/issues/2132
2021-02-08 17:14:35 +01:00
Nick Wellnhofer
79301d3d5e Fix timeout when handling recursive entities
Abort parsing early to avoid an almost infinite loop in certain error
cases involving recursive entities.

Found with libFuzzer.
2020-12-18 14:13:46 +01:00
Nick Wellnhofer
45da175c14 Fix memory leak in xmlParseElementMixedContentDecl
Free parsed content if malloc fails to avoid a memory leak.

Found with libFuzzer.
2020-12-18 14:11:58 +01:00
Mike Dalessio
c0c26ff201 parser.c: xmlParseCharData peek behavior fixed wrt newlines
Previously, xmlParseCharData and xmlParseComment would consider 0xA to
be unhandleable when seen as the first byte of an input chunk, and
fall back to xmlParseCharDataComplex and xmlParseCommentComplex, which
have different memory and performance characteristics.

Fixes GNOME/libxml2#192
2020-10-25 20:00:59 +01:00
yanjinjq
7929f05710 Fix SEGV in xmlSAXParseFileWithData
Fixes #181.
2020-09-21 13:12:31 +02:00
Nick Wellnhofer
99fc048d7f Don't use SAX1 if all element handlers are NULL
Running xmllint with "--sax --noout" installs a SAX2 handler with all
callbacks set to NULL. In this case or similar situations, we don't want
to switch to SAX1 parsing.
2020-08-17 01:17:39 +02:00
Nick Wellnhofer
b82fa3dd26 Fix column number accounting in xmlParse*NameAndCompare
Thanks to Frederic Vancraeyveldt for the report.
2020-08-09 15:02:01 +02:00
Nick Wellnhofer
438e595a8c Stop counting nbChars in parser context
The value was inaccurate and never used.
2020-08-09 15:01:45 +02:00
Nick Wellnhofer
956534e02e Check for custom free function in global destructor
Calling a custom deallocation function in the global destructor could
cause all kinds of unexpected problems. See for example

    https://github.com/sparklemotion/nokogiri/issues/2059

Only clean up if memory is managed with malloc/free.
2020-08-04 19:27:13 +02:00
David Kilzer
0e5c4fec15 Reset XML parser input before reporting errors
Apply changes to htmlParseChunk() in 13ba5b61 and 3f18e748 to
xmlParseChunk().
2020-07-19 14:10:33 +02:00
Martin Vidner
43a8836cde Fix rebuilding docs, by hiding __attribute__((...)) behind a macro.
When enabled via `./configure --enable-rebuild-docs`,
`make -C doc libxml2-api.xml` will invoke apibuild.py
to rebuild libxml2-api.xml from the sources.
But the code added in
9fa3200cb3 made it error out with

```
Parsing ../parser.c
Parse Error: parsing type : expecting a name
('Got token ', ('sep', '('))
('Last token: ', ('sep', '('))
('Token queue: ', [('name', 'destructor'), ('sep', ')'), ('sep', ')')])
('Line 14689 end: ', '')
```
2020-06-24 19:55:52 +02:00
Nick Wellnhofer
a28f7d8789 Never expand parameter entities in text declaration
When parsing the text declaration of external DTDs or entities, make
sure that parameter entities are not expanded. This also fixes a memory
leak in certain error cases.

The change to xmlSkipBlankChars assumes that the parser state is
maintained correctly when parsing external DTDs or parameter entities,
and might expose bugs in the code that were hidden previously.

Found by OSS-Fuzz.
2020-06-10 14:25:19 +02:00
Nick Wellnhofer
2e8cc66d8f xmlParseBalancedChunkMemory must not be called with NULL doc
There is no way to avoid memory leaks without a document to hold the
namespace list.
2020-05-30 15:43:34 +02:00
Nick Wellnhofer
a0a8059b2c Revert "Fix memory leak in xmlParseBalancedChunkMemoryRecover"
This reverts commit 5a02583c7e.

Fixes #161.
2020-05-30 15:43:34 +02:00
Samuel Thibault
9fa3200cb3 Call xmlCleanupParser on ELF destruction
Fixes #153.
2020-05-04 13:53:11 +02:00
Nick Wellnhofer
20c60886e4 Fix typos
Resolves #133.
2020-03-08 17:41:53 +01:00
Nick Wellnhofer
1a3e584a5a Merge code paths loading external entities
Merge xmlParseCtxtExternalEntity into xmlParseExternalEntityPrivate.
2020-02-11 16:55:00 +01:00
Nick Wellnhofer
f9ea1a24ed Fix copying of entities in xmlParseReference
Before, reader mode would end up in a branch that didn't handle
entities with multiple children and failed to update ent->last, so the
hack copying the "extra" reader data wouldn't trigger. Consequently,
some empty nodes in entities are correctly detected now in the test
suite. (The detection of empty nodes in entities is still buggy,
though.)
2020-02-11 16:37:52 +01:00
Kevin Puetz
c7c526d6d0 Fix memory leak when shared libxml.dll is unloaded
When multiple modules (process/plugins) all link to libxml2.dll,
they will in fact share a single loaded instance of it.
It is unsafe for any of them to call xmlCleanupParser,
as this would deinitialize the shared state and break others that might
still have ongoing use.

However, on Windows atexit is per-module (rather than process-wide), so if used
*within* libxml2 it is possible to register a cleanup when all users
are done and libxml2.dll is about to actually unload.

This allows multiple plugins to link with and share libxml2 without
a premature cleanup if one is unloaded, while still cleaning up if *all*
such callers are themselves unloaded.
2020-02-11 11:34:59 +01:00
Nick Wellnhofer
9bd7abfba4 Remove useless comparisons
Found by lgtm.com
2020-01-02 14:14:48 +01:00
Zhipeng Xie
0e1a49c890 Fix infinite loop in xmlStringLenDecodeEntities
When ctxt->instate == XML_PARSER_EOF, xmlParseStringEntityRef
returns NULL, which causes an infinite loop in xmlStringLenDecodeEntities.

Found with libFuzzer.

Signed-off-by: Zhipeng Xie <xiezhipeng1@huawei.com>
2020-01-02 13:48:29 +01:00
Nick Wellnhofer
9737ec0717 Another fix for conditional sections at end of document
The previous fix introduced an uninitialized read.
2019-10-29 16:20:32 +01:00
Nick Wellnhofer
c1035664f9 Fix for conditional sections at end of document
Parsing conditional sections would fail if the final ']]>' was at the
end of the document. Short-lived regression caused by commit c51e38cb.
2019-10-23 11:40:34 +02:00
Jared Yanovich
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
Nick Wellnhofer
c2f209c09f Disallow conditional sections in internal subset
Conditional sections are only allowed in *external* parameter entities
referenced from the internal subset.
2019-09-30 15:47:30 +02:00
Nick Wellnhofer
c51e38cb3a Make xmlParseConditionalSections non-recursive
Avoid call stack overflow in deeply nested conditional sections.

Found by OSS-Fuzz.
2019-09-30 15:47:30 +02:00
Nick Wellnhofer
62150ed2ab Make xmlParseContent and xmlParseElement non-recursive
Split xmlParseElement into subfunctions. Use nameNsPush to store prefix,
URI and nsNr on the heap, similar to the push parser.

Closes #84.
2019-09-23 17:45:50 +02:00
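The general idea of 62150ed2, replacing recursion with state kept on the heap, can be sketched as below. This is not the parser's code: brackets stand in for start and end tags, and `struct frame` and `well_nested` are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-element state, kept on the heap instead of the call stack. */
struct frame {
    char open;   /* opening bracket standing in for an element name */
};

/* Return 1 if s is well nested, 0 if it is not, -1 on allocation failure. */
int well_nested(const char *s) {
    struct frame *stack = NULL;
    size_t depth = 0, cap = 0;
    int ok = 1;

    assert(s != NULL);
    for (; *s != '\0' && ok == 1; s++) {
        char c = *s;
        if (c == '(' || c == '[' || c == '{') {
            if (depth == cap) {
                /* Grow the explicit stack; nesting depth is limited by
                 * available memory, not by the size of the call stack. */
                size_t ncap = cap ? cap * 2 : 16;
                struct frame *tmp = realloc(stack, ncap * sizeof(*tmp));
                if (tmp == NULL) {
                    ok = -1;
                    break;
                }
                stack = tmp;
                cap = ncap;
            }
            stack[depth++].open = c;
        } else if (c == ')' || c == ']' || c == '}') {
            char want = (c == ')') ? '(' : (c == ']') ? '[' : '{';
            if (depth == 0 || stack[depth - 1].open != want)
                ok = 0;          /* mismatched or unexpected end tag */
            else
                depth--;
        }
    }
    if (ok == 1 && depth != 0)
        ok = 0;                  /* unclosed element at end of input */
    free(stack);
    return ok;
}
```

The final `depth != 0` check corresponds to the kind of "unclosed tag at EOF" condition that a loop-based parser has to test explicitly once the recursion is gone.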
Nick Wellnhofer
a28bc75158 Fix integer overflow in entity recursion check
2019-09-20 13:46:58 +02:00
Nick Wellnhofer
e91cbcf639 Don't read external entities or XIncludes from stdin
The file input callbacks try to read from stdin if "-" is passed as URL.
This should never be done when loading indirect resources like external
entities or XIncludes. Unfortunately, the stdin substitution happens
deep inside the IO code, so we simply replace "-" with "./-" in specific
locations.

This issue also affects other users of the library like libxslt.
Ideally, stdin should only be substituted on explicit request. But more
intrusive changes could break existing code.

Closes #90 and #102.
2019-09-20 13:26:51 +02:00
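The substitution described in e91cbcf639 amounts to a small string rewrite. The following is a minimal sketch of that idea only; `sanitize_indirect_url` is a hypothetical helper, not a libxml2 function, and where the library applies the replacement is not shown here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a newly allocated copy of url, with the exact
 * string "-" rewritten to "./-" so that lower I/O layers see a file in the
 * current directory instead of a request for stdin. Returns NULL if the
 * allocation fails. The caller frees the result. */
char *sanitize_indirect_url(const char *url) {
    const char *src;
    char *copy;

    assert(url != NULL);
    src = (strcmp(url, "-") == 0) ? "./-" : url;
    copy = malloc(strlen(src) + 1);
    if (copy != NULL)
        strcpy(copy, src);
    return copy;
}
```

Only the exact string "-" is rewritten; names such as "--" or "a-b" pass through unchanged.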
Zhipeng Xie
5a02583c7e Fix memory leak in xmlParseBalancedChunkMemoryRecover
When doc is NULL, the namespace created in xmlTreeEnsureXMLDecl
is bound to newDoc->oldNs. In this case, setting newDoc->oldNs to
NULL and freeing newDoc will cause a memory leak.

Found with libFuzzer.

Closes #82.
2019-08-26 11:20:49 +02:00
Stephen Chenney
87125732cc Switched from unsigned long to ptrdiff_t in parser.c
Using unsigned long instead of ptrdiff_t results in non-zero
pointer deltas being stored as zero delta, giving incorrect offsets
into arrays and hence out of bounds reads.

This patch fixes the issue in all places in parser.c and adds a macro
to reduce the chances of cut-and-paste errors.

Only affects platforms where 'sizeof(long) < sizeof(size_t)' like
64-bit Windows.

See https://bugs.chromium.org/p/chromium/issues/detail?id=894933

Closes #44.
2019-07-08 13:00:12 +02:00
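A minimal sketch of the pattern behind 87125732cc, assuming a buffer that may move when it grows. `grow_and_rebase` is a hypothetical helper, not the macro added to parser.c; it only shows a pointer offset being held in a `ptrdiff_t` rather than an `unsigned long`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Grow *buf to new_size bytes and keep *cur pointing at the same offset.
 * Returns 0 on success, -1 if realloc fails (buffer and cursor unchanged). */
int grow_and_rebase(char **buf, char **cur, size_t new_size) {
    ptrdiff_t offset;   /* pointer-difference type, unlike a 32-bit long */
    char *tmp;

    assert(buf != NULL && cur != NULL && *buf != NULL && *cur >= *buf);
    offset = *cur - *buf;           /* saved before the buffer may move */
    tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;
    *buf = tmp;
    *cur = tmp + offset;            /* rebase the cursor onto the new block */
    return 0;
}
```

Putting the offset arithmetic in one helper (or, as the commit message says, one macro) means the type is chosen in a single place instead of at every call site.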
Nick Wellnhofer
01ea9c5af7 Fix another code path in xmlParseQName
Check for buffer errors in another code path missed in the previous
commit.

Found by OSS-Fuzz.
2019-07-08 11:29:40 +02:00
Nick Wellnhofer
5ccac8cecf Make sure that xmlParseQName returns NULL in error case
If there's an error growing the input buffer when recovering from
invalid QNames, make sure to return NULL. Otherwise, callers could be
confused. In xmlParseStartTag2, for example, `tlen` could become
negative.

Found by OSS-Fuzz.
2019-06-27 10:23:36 +02:00
Nick Wellnhofer
f9fce96313 Fix unsigned integer overflow
It's defined behavior but -fsanitize=unsigned-integer-overflow is
useful to discover bugs.
2019-05-20 13:38:22 +02:00
David Warring
3c0d62b419 Fix parser termination from "Double hyphen within comment" error
The patch fixes the parser not halting immediately when the error
handler attempts to stop the parser.

Instead, it kept running and continued to reference the freed buffer
in the while loop termination test.

This is only a problem if xmlStopParser is called from an error
handler. Probably caused by commit 123234f2. Fixes #58.
2019-05-14 15:55:12 +02:00
Nick Wellnhofer
b48226f78c Fix memory leaks in xmlParseStartTag2 error paths
Found by OSS-Fuzz.
2019-01-07 18:07:00 +01:00
Nick Wellnhofer
8919885ff9 Fix -Wformat-truncation warnings (GCC 8)
2019-01-06 14:24:59 +01:00
Nick Wellnhofer
123234f2cf Free input buffer in xmlHaltParser
This avoids miscalculation of available bytes.

Thanks to Yunho Kim for the report.

Closes: #26
2018-09-11 15:06:17 +02:00
Nick Wellnhofer
707ad080e6 Fix xmlParserEntityCheck
A previous commit removed the check for XML_ERR_ENTITY_LOOP which is
required to abort early in case of excessive entity recursion.
2018-01-23 16:37:54 +01:00
Nick Wellnhofer
ab362ab0ad Halt parser in case of encoding error
Should fix crbug.com/793715, although I wasn't able to reproduce the
issue.
2018-01-22 15:42:26 +01:00
Nick Wellnhofer
60dded12cb Clear entity content in case of errors
This only affects recovery mode and avoids integer overflow in
xmlStringGetNodeList and possibly other nasty surprises.

See bug 783052 and

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3874
https://bugs.chromium.org/p/chromium/issues/detail?id=796804
2018-01-22 15:23:22 +01:00
Nick Wellnhofer
132af1a0d1 Fix buffer over-read in xmlParseNCNameComplex
Calling GROW can halt the parser if the buffer grows too large. This
will set the buffer to an empty string. Return immediately in this case,
otherwise the "current" pointer is advanced leading to a buffer over-read.

Found with OSS-Fuzz. See

https://oss-fuzz.com/testcase?key=6683819592646656
https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=5031
2018-01-08 18:48:01 +01:00
Daniel Veillard
ad88b54f1a Improve handling of context input_id
For https://bugzilla.gnome.org/show_bug.cgi?id=772726
This was used in xmlsec to detect issues with accessing external entities
and prevent them, but was unreliable. Based on a patch from Aleksey Sanin.

* parser.c: make sure input_id is incremented when creating sub-entities
            for parsing or when parsing out of context
2017-12-08 09:42:31 +01:00
Nick Wellnhofer
cb5541c9f3 Fix libz and liblzma detection
If libz or liblzma are detected with pkg-config, AC_CHECK_HEADERS must
not be run because the correct CPPFLAGS aren't set. It is actually not
required to have separate checks for LIBXML_ZLIB_ENABLED and HAVE_ZLIB_H.
Only check for LIBXML_ZLIB_ENABLED and remove HAVE_ZLIB_H macro.

Fixes bug 764657, bug 787041.
2017-11-27 14:33:37 +01:00