1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-24 13:33:01 +03:00
Commit Graph

798 Commits

Author SHA1 Message Date
Alex Richardson
4b959ee168 Remove hacky heuristic from b2dc5675e9
Checking whether the context is close to the parent context by hardcoding
250 is not portable (I noticed tests were failing on Morello since the value
is 288 there due to pointers being 128 bits). Instead we should ensure
that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data
is not actually a parser context (or ideally add a separate field but that
would be an ABI break.

From what I can see in the source, the XML_VCTXT_USE_PCTXT is only set if
the userData field points to a valid context, and if this is not the case
the flag should be cleared when changing userData rather than relying on
the offset between the two. Looking at the history, I think
d7cb33cf44 fixed most of the need for this
workaround, but it looks like there are a few more locations that need
updating; This commit changes two more places to set/clear/copy the
XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore.
I've also drop two = NULL assignment in xmllint since this is not needed
after a call to memset().

There was also an uninitialized vctxt.flags (and other fields) in
`xmlShellValidate()`, which I've fixed by adding a memset() call.
2022-12-01 15:31:25 +00:00
Alex Richardson
c62c0d82cc Correctly relocate internal pointers after realloc()
Adding an offset to a deallocated pointer and assuming that it can be
dereferenced is undefined behaviour. When running libxml2 on CHERI-enabled
systems such as Arm Morello this results in the creation of an out-of-bounds
pointer that cannot be dereferenced and therefore crashes at runtime.

The effect of this UB is not just limited to architectures such as CHERI,
incorrect relocation of pointers after realloc can in fact cause
FORTIFY_SOURCE errors with recent GCC:
https://developers.redhat.com/articles/2022/09/17/gccs-new-fortification-level
2022-12-01 15:14:40 +00:00
Nick Wellnhofer
c16fd705bb xpath: Make init function private 2022-11-27 02:11:07 +01:00
Nick Wellnhofer
53ab38408d encoding: Make init function private 2022-11-27 02:11:07 +01:00
Nick Wellnhofer
05c3a458aa tests: Check that xmlInitParser doesn't allocate memory 2022-11-27 02:11:07 +01:00
Nick Wellnhofer
78c0391bc7 parser: Register atexit handler in locked section 2022-11-25 15:12:56 +01:00
Nick Wellnhofer
ed053c50cf dict: Make init/cleanup functions private 2022-11-25 15:02:04 +01:00
Nick Wellnhofer
7010d8779b threads: Rework initialization
Make init/cleanup functions private. Merge xmlOnceInit into
xmlInitThreadsInternal.
2022-11-25 15:02:04 +01:00
Nick Wellnhofer
9dbf137455 parser: Make some module init/cleanup functions private 2022-11-25 15:02:04 +01:00
Nick Wellnhofer
cecd364dd2 parser: Don't call *DefaultSAXHandlerInit from xmlInitParser
Change the default handler definitions to match the result after calling
the initialization functions.

This makes sure that no thread-local variables are accessed when calling
xmlInitParser.
2022-11-25 15:02:04 +01:00
Nick Wellnhofer
b1f9c19383 parser: Fix push parser with unterminated CDATA sections
Short-lived regression found by OSS-Fuzz.
2022-11-22 21:39:01 +01:00
Nick Wellnhofer
0e193f0d61 parser: Remove dangerous check in xmlParseCharData
If this check succeeds, xmlParseCharData could be called over and over
again without making progress, resulting in an infinite loop.

It's only important to check for XML_PARSER_EOF which is done later.

Related to #441.
2022-11-21 22:09:19 +01:00
Nick Wellnhofer
94ca36c2c4 parser: Restore parser state in xmlParseCDSect
Fixes #441.
2022-11-21 22:07:11 +01:00
Nick Wellnhofer
a8b31e68c2 parser: Fix progress check when parsing character data
Skip over zero bytes to guarantee progress. Short-lived regression.
2022-11-21 21:39:10 +01:00
Nick Wellnhofer
c63900fbc1 parser: Check terminate flag when push parsing CDATA sections
Found by OSS-Fuzz.
2022-11-21 20:39:17 +01:00
Nick Wellnhofer
a781ee3395 Revert "parser: Add overflow checks to xmlParseLookup functions"
This reverts commit bfc55d6884.

It's better to fix the root cause.
2022-11-21 20:11:14 +01:00
Nick Wellnhofer
bfc55d6884 parser: Add overflow checks to xmlParseLookup functions
Short-lived regression found by OSS-Fuzz.
2022-11-21 18:29:54 +01:00
Nick Wellnhofer
9e4a46ace6 parser: Merge misc, prolog and epilog cases in push parser 2022-11-20 22:03:08 +01:00
Nick Wellnhofer
55fb8f72ac parser: Fix push parser with 1-3 byte initial chunk
Make sure that ctxt->charset is initialized properly.
2022-11-20 21:27:59 +01:00
Nick Wellnhofer
68a6518c45 parser: Rewrite push parser boundary checks
Remove inaccurate xmlParseCheckTransition check.

Remove non-incremental xmlParseGetLasts check.

Add functions that check for several boundary constructs more
accurately, keeping track of progress in ctxt->checkIndex.

Fixes #439.
2022-11-20 21:27:08 +01:00
Nick Wellnhofer
2059df5358 buf: Deprecate static/immutable buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
4955e0c9e1 io: Don't shrink memory input buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
117bab2256 parser: Don't call xmlSHRINK from push parser
xmlSHRINK also calls xmlParserInputGrow which isn't needed in the push
parser.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
f00739c12e parser: Ignore cdata argument in xmlParseCharData
It never could be used to parse CDATA sections.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
e4f56a7213 parser: Simplify xmlParseConditionalSections 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
3582b07bd2 parser: Fix content parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, change some content
parser functions to make guaranteed progress on certain byte sequences.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
f7ad338e09 parser: Fix attribute parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, make the attribute parser
functions return a NULL name only if they don't make progress.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
f61b8a6233 parser: Fix DTD parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, change some DTD parser
functions to make guaranteed progress on certain byte sequences.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
46cd7d224e io: Remove xmlInputReadCallbackNop
In some cases, for example when using encoders, the read callback was
set to NULL, in other cases it was set to xmlInputReadCallbackNop.
xmlGROW only tested for xmlInputReadCallbackNop, resulting in errors
when parsing large encoded content from memory.

Always use a NULL callback for memory buffers to avoid ambiguities.

Fixes #262.
2022-11-20 21:12:18 +01:00
Nick Wellnhofer
a70f7d4715 parser: Fix error message in xmlParseCommentComplex
Fixes #421.
2022-11-04 14:03:31 +01:00
Nick Wellnhofer
afc7e3a7f4 malloc-fail: Fix memory leak in xmlParseReference
Found with libFuzzer, see #344.
2022-11-02 16:11:00 +01:00
Nick Wellnhofer
e129c1d1a2 malloc-fail: Fix infinite loop in xmlSkipBlankChars
Found with libFuzzer, see #344.
2022-11-02 16:02:39 +01:00
Nick Wellnhofer
865e142c41 malloc-fail: Fix memory leak in xmlCreatePushParserCtxt
Found with libFuzzer, see #344.
2022-11-02 15:57:53 +01:00
Nick Wellnhofer
ffaec75809 Fix integer overflows with XML_PARSE_HUGE
Also impose size limits when XML_PARSE_HUGE is set. Limit size of names
to XML_MAX_TEXT_LENGTH (10 million bytes) and other content to
XML_MAX_HUGE_LENGTH (1 billion bytes).

Move some the length checks to the end of the respective loop to make
them strict.

xmlParseEntityValue didn't have a length limitation at all. But without
XML_PARSE_HUGE, this should eventually trigger an error in xmlGROW.

Thanks to Maddie Stone working with Google Project Zero for the report!
2022-10-14 15:01:46 +02:00
Nick Wellnhofer
1a2d8ddc06 parser: Fix potential memory leak in xmlParseAttValueInternal
Fix memory leak in case xmlParseAttValueInternal is called with a NULL
`len` a non-NULL `alloc` argument. This static function is never called
with such arguments internally, but the misleading code should be fixed
nevertheless.

Fixes #422.
2022-10-11 13:14:37 +02:00
Nick Wellnhofer
a9669679f5 error: Don't use initGenericErrorDefaultFunc
The code in xmlInitParser did only set the error handler if it was NULL
which should never happen.
2022-09-09 13:52:48 +02:00
Nick Wellnhofer
59f2f60e3e Remove "runtime debugging"
This doesn't seem useful as configuration option.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
884e142dc5 Fix --with-schemas --without-xpath build
xmlXPathInit must be called for schemas.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
6843fc726f Remove or annotate char casts 2022-09-01 04:31:30 +02:00
Nick Wellnhofer
2cac626976 Don't use sizeof(xmlChar) or sizeof(char) 2022-09-01 03:35:19 +02:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
48f84ea8ed Remove internal macros from parserInternals.h
Replace MOVETO_ENDTAG with code that updates line and column numbers.
2022-08-25 21:31:08 +02:00
Nick Wellnhofer
58fc89e8a9 Deprecate internal parser functions 2022-08-25 21:04:57 +02:00
Nick Wellnhofer
34a050cdee Move some HTML functions to correct header file 2022-08-24 16:44:39 +02:00
Nick Wellnhofer
fd85b566f7 Mark more parser functions as deprecated
No compiler warnings generated yet.
2022-08-24 15:12:24 +02:00
Nick Wellnhofer
0e49f8826a Mark most SAX1 functions as deprecated
No compiler warnings generated yet.
2022-08-24 14:07:57 +02:00
Nick Wellnhofer
9a82b94a94 Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt
Add API functions to create a parser context with a custom SAX handler
without having to mess with ctxt->sax manually.
2022-08-24 14:07:55 +02:00
Nick Wellnhofer
5b2d07a726 Use xmlStrlen in *CtxtReadDoc
xmlStrlen handles buffers larger than INT_MAX more gracefully.
2022-08-20 17:00:50 +02:00
Nick Wellnhofer
4ad71c2d72 Fix xmlCtxtReadDoc with encoding
xmlCtxtReadDoc used to create an input stream involving
xmlNewStringInputStream. This would create a stream without an input
buffer, causing problems with encodings (see #34).

After commit aab584dc3, an error was returned even with UTF-8 encodings
which happened to work before.

Make xmlCtxtReadDoc call xmlCtxtReadMemory which doesn't suffer from
these issues. Also fix htmlCtxtReadDoc.

Fixes #397.
2022-08-20 16:34:08 +02:00