1
0
mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-24 13:33:01 +03:00
Commit Graph

975 Commits

Author SHA1 Message Date
Nick Wellnhofer
f00739c12e parser: Ignore cdata argument in xmlParseCharData
It never could be used to parse CDATA sections.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
e4f56a7213 parser: Simplify xmlParseConditionalSections 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
3582b07bd2 parser: Fix content parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, change some content
parser functions to make guaranteed progress on certain byte sequences.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
f7ad338e09 parser: Fix attribute parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, make the attribute parser
functions return a NULL name only if they don't make progress.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
f61b8a6233 parser: Fix DTD parser progress checks
This is another attempt at fixing parser progress checks. Instead of
relying on in->consumed, which could overflow, change some DTD parser
functions to make guaranteed progress on certain byte sequences.
2022-11-20 21:16:03 +01:00
Nick Wellnhofer
46cd7d224e io: Remove xmlInputReadCallbackNop
In some cases, for example when using encoders, the read callback was
set to NULL, in other cases it was set to xmlInputReadCallbackNop.
xmlGROW only tested for xmlInputReadCallbackNop, resulting in errors
when parsing large encoded content from memory.

Always use a NULL callback for memory buffers to avoid ambiguities.

Fixes #262.
2022-11-20 21:12:18 +01:00
Nick Wellnhofer
a70f7d4715 parser: Fix error message in xmlParseCommentComplex
Fixes #421.
2022-11-04 14:03:31 +01:00
Nick Wellnhofer
afc7e3a7f4 malloc-fail: Fix memory leak in xmlParseReference
Found with libFuzzer, see #344.
2022-11-02 16:11:00 +01:00
Nick Wellnhofer
e129c1d1a2 malloc-fail: Fix infinite loop in xmlSkipBlankChars
Found with libFuzzer, see #344.
2022-11-02 16:02:39 +01:00
Nick Wellnhofer
865e142c41 malloc-fail: Fix memory leak in xmlCreatePushParserCtxt
Found with libFuzzer, see #344.
2022-11-02 15:57:53 +01:00
Nick Wellnhofer
ffaec75809 Fix integer overflows with XML_PARSE_HUGE
Also impose size limits when XML_PARSE_HUGE is set. Limit size of names
to XML_MAX_TEXT_LENGTH (10 million bytes) and other content to
XML_MAX_HUGE_LENGTH (1 billion bytes).

Move some the length checks to the end of the respective loop to make
them strict.

xmlParseEntityValue didn't have a length limitation at all. But without
XML_PARSE_HUGE, this should eventually trigger an error in xmlGROW.

Thanks to Maddie Stone working with Google Project Zero for the report!
2022-10-14 15:01:46 +02:00
Nick Wellnhofer
1a2d8ddc06 parser: Fix potential memory leak in xmlParseAttValueInternal
Fix memory leak in case xmlParseAttValueInternal is called with a NULL
`len` a non-NULL `alloc` argument. This static function is never called
with such arguments internally, but the misleading code should be fixed
nevertheless.

Fixes #422.
2022-10-11 13:14:37 +02:00
Nick Wellnhofer
a9669679f5 error: Don't use initGenericErrorDefaultFunc
The code in xmlInitParser did only set the error handler if it was NULL
which should never happen.
2022-09-09 13:52:48 +02:00
Nick Wellnhofer
59f2f60e3e Remove "runtime debugging"
This doesn't seem useful as configuration option.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
884e142dc5 Fix --with-schemas --without-xpath build
xmlXPathInit must be called for schemas.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
6843fc726f Remove or annotate char casts 2022-09-01 04:31:30 +02:00
Nick Wellnhofer
2cac626976 Don't use sizeof(xmlChar) or sizeof(char) 2022-09-01 03:35:19 +02:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
48f84ea8ed Remove internal macros from parserInternals.h
Replace MOVETO_ENDTAG with code that updates line and column numbers.
2022-08-25 21:31:08 +02:00
Nick Wellnhofer
58fc89e8a9 Deprecate internal parser functions 2022-08-25 21:04:57 +02:00
Nick Wellnhofer
34a050cdee Move some HTML functions to correct header file 2022-08-24 16:44:39 +02:00
Nick Wellnhofer
fd85b566f7 Mark more parser functions as deprecated
No compiler warnings generated yet.
2022-08-24 15:12:24 +02:00
Nick Wellnhofer
0e49f8826a Mark most SAX1 functions as deprecated
No compiler warnings generated yet.
2022-08-24 14:07:57 +02:00
Nick Wellnhofer
9a82b94a94 Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt
Add API functions to create a parser context with a custom SAX handler
without having to mess with ctxt->sax manually.
2022-08-24 14:07:55 +02:00
Nick Wellnhofer
5b2d07a726 Use xmlStrlen in *CtxtReadDoc
xmlStrlen handles buffers larger than INT_MAX more gracefully.
2022-08-20 17:00:50 +02:00
Nick Wellnhofer
4ad71c2d72 Fix xmlCtxtReadDoc with encoding
xmlCtxtReadDoc used to create an input stream involving
xmlNewStringInputStream. This would create a stream without an input
buffer, causing problems with encodings (see #34).

After commit aab584dc3, an error was returned even with UTF-8 encodings
which happened to work before.

Make xmlCtxtReadDoc call xmlCtxtReadMemory which doesn't suffer from
these issues. Also fix htmlCtxtReadDoc.

Fixes #397.
2022-08-20 16:34:08 +02:00
Nick Wellnhofer
5930fe0196 Reset nsNr in xmlCtxtReset 2022-07-18 20:59:45 +02:00
Nick Wellnhofer
ca2c91f139 Fix memory leak in xmlLoadEntityContent error path
Free the input stream if pushing it fails.

Found by OSS-Fuzz.

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=43743
2022-06-28 19:33:48 +02:00
Nick Wellnhofer
ecba4cbd43 Avoid double-free if malloc fails in inputPush
It's the caller's responsibility to free the input stream if this
function fails.
2022-06-28 19:33:40 +02:00
Nick Wellnhofer
3e7b4f37aa Avoid calling xmlSetTreeDoc
Create text nodes with xmlNewDocText or set the document directly to
avoid xmlSetTreeDoc being called when the node is inserted.
2022-06-20 01:49:39 +02:00
David Kilzer
44e9118c02 Prevent integer-overflow in htmlSkipBlankChars() and xmlSkipBlankChars()
* HTMLparser.c:
(htmlSkipBlankChars):
* parser.c:
(xmlSkipBlankChars):
- Cap the return value at INT_MAX.
- The commit range that OSS-Fuzz listed for the fix didn't make
  any changes to xmlSkipBlankChars(), so it seems like this
  issue may still exist.

Found by OSS-Fuzz Issue 44803.
2022-04-11 18:09:37 +00:00
David Kilzer
21561e833a Mark more static data as const
Similar to 8f5710379, mark more static data structures with
`const` keyword.

Also fix placement of `const` in encoding.c.

Original patch by Sarah Wilkin.
2022-04-07 12:01:23 -07:00
Nick Wellnhofer
92bff86614 Fix calls to deprecated init/cleanup functions
Only use xmlInitParser/xmlCleanupParser.
2022-03-29 14:18:31 +02:00
Nick Wellnhofer
9684954429 Revert "Continue to parse entity refs in recovery mode"
This reverts commit 84823b8634 which
exposed several other, potentially serious bugs.

Fixes #356.
2022-03-22 19:11:05 +01:00
Nick Wellnhofer
7d02c7291f Fix parser progress checks
Testing the current input pointer for modification is unreliable since
the input buffer could have been freed and realloced. Check whether the
input id and the up-to-date number of bytes consumed match.
2022-03-06 02:33:01 +01:00
Nick Wellnhofer
84823b8634 Continue to parse entity refs in recovery mode
There doesn't seem to be a good reason to abort in xmlParseReference
if a well-formedness error was detected. Removing this check allows to
parse entity references after an error in recovery mode.

Fixes #270.
2022-03-06 02:26:22 +01:00
Nick Wellnhofer
d99ddd9bd5 Improve buffer allocation scheme
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
2022-03-06 02:26:22 +01:00
Nick Wellnhofer
ebb1797030 Remove unneeded #includes 2022-03-04 22:11:49 +01:00
Nick Wellnhofer
776d15d383 Don't check for standard C89 headers
Don't check for

- ctype.h
- errno.h
- float.h
- limits.h
- math.h
- signal.h
- stdarg.h
- stdlib.h
- string.h
- time.h

Stop including non-standard headers

- malloc.h
- strings.h
2022-03-02 00:43:54 +01:00
Nick Wellnhofer
89d9ef3ee8 Reset last error in xmlCleanupGlobals
Before, we tried to reset the last error in xmlCleanupParser. But if
xmlCleanupParser wasn't called from the main thread, this would reset
the thread-local error object. xmlCleanupGlobals has access to the
error object of the main thread and can reset it reliably.
2022-03-01 15:14:00 +01:00
Nick Wellnhofer
2489c1d024 Remove useless __CYGWIN__ checks
From what I can tell, some really early Cygwin versions from around
1998-2000 used to erroneously define _WIN32. This was eventually fixed,
but these days, the `defined(_WIN32) && !defined(__CYGWIN__)` idiom is
unnecessary.

Now, we only check for __CYGWIN__ in xmlexports.h when deciding whether
to use __declspec.
2022-02-28 22:58:35 +01:00
Nick Wellnhofer
c41bc10da3 Fix unused variable warnings with disabled features 2022-02-22 19:57:12 +01:00
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
9edc20c154 Fix double counting of CRLF in comments
Fixes #151.
2022-02-07 20:54:07 +01:00
Nick Wellnhofer
9653565765 Make sure to grow input buffer in xmlParseMisc
Otherwise, large amount of whitespace could lead to documents not
being parsed correctly.

Fixes #299.
2022-02-07 15:43:36 +01:00
Nick Wellnhofer
d85245f934 Fix regression with PEs in external DTD
Fix a regression introduced with commit a28f7d87. In some cases,
parameter entity references in external DTDs wouldn't be expanded.

Fixes #306.
2022-01-16 21:56:10 +01:00
Yulin Li
46c658b025 move current position before possible calling of ctxt->sax->characters. 2022-01-16 15:03:12 +01:00
David King
fe564967c9 Fix memory leak in xmlCreateIOParserCtxt
Found by Coverity.

https://bugzilla.redhat.com/show_bug.cgi?id=1938806
2022-01-16 14:14:32 +01:00
Mike Dalessio
a7b9f3ebdf fix: avoid segfault at exit when using custom memory functions
This extends the fix introduced by 956534e to Windows processes
dynamically loading libxml2.

Closes #256.
2021-05-20 13:38:54 -04:00