mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-24 13:33:01 +03:00
603 Commits

Author SHA1 Message Date
Nick Wellnhofer
9b5cce7a71 include: Remove more unnecessary includes 2023-09-21 01:50:53 +02:00
Nick Wellnhofer
11a1839ddd globals: Move remaining globals back to correct header files
This undoes a lot of damage.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
dc3382ef97 globals: Move xmlRegisterNodeDefault to tree.c
Code in globals.c must not try to access globals itself since the
accessor macros aren't defined and we would only see the main
variable.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
d39f78069d tree: Fix copying of DTDs
- Don't create multiple DTD nodes.
- Fix UAF if malloc fails.
- Skip DTD nodes if tree module is disabled.

Fixes #583.
2023-08-23 20:43:14 +02:00
Nick Wellnhofer
b8961df65d SAX: Always validate xml:ids
The behavior shouldn't depend on mostly random configuration options.
2023-05-09 03:25:24 +02:00
Nick Wellnhofer
dbc893f588 malloc-fail: Fix memory leak in xmlCopyNamespaceList
Found with libFuzzer, see #344.
2023-03-08 13:17:47 +01:00
Nick Wellnhofer
a442d16a5f malloc-fail: Fix memory leak in xmlGetNsList
Found with libFuzzer, see #344.
2023-02-27 17:18:02 +01:00
Nick Wellnhofer
bc7740b3c3 malloc-fail: Fix memory leak in xmlCopyPropList
Found with libFuzzer, see #344.
2023-02-17 17:16:52 +01:00
Nick Wellnhofer
e6401b68df tree: Fix recursion check in xmlStringGetNodeList
Use the new entity flag to check for recursion.
2023-01-17 14:01:23 +01:00
Nick Wellnhofer
481d79d44c entities: Add XML_ENT_PARSED flag
To check whether an entity was already parsed, the code previously
tested whether "checked" was non-zero or "children" was non-null. The
"children" check could be unreliable because an empty entity also
results in an empty (NULL) node list. Use a separate flag to make this
check more reliable.
2022-12-19 15:26:46 +01:00
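
The ambiguity described in commit 481d79d44c can be illustrated with a minimal sketch. The names `DemoEntity`, `DEMO_ENT_PARSED`, and the helper functions are hypothetical stand-ins, not the real libxml2 definitions, and only the "children" half of the old test is modeled.

```c
#include <stddef.h>

/* Hypothetical flag and types for illustration only. */
#define DEMO_ENT_PARSED (1 << 0)

typedef struct DemoNode DemoNode;

typedef struct {
    DemoNode *children; /* NULL if not yet parsed OR if the entity is empty */
    int flags;
} DemoEntity;

/* Old-style heuristic: cannot tell "empty" from "never parsed". */
int demoParsedByChildren(const DemoEntity *ent) {
    return ent->children != NULL;
}

/* Flag-based check: independent of the parse result. */
int demoParsedByFlag(const DemoEntity *ent) {
    return (ent->flags & DEMO_ENT_PARSED) != 0;
}

/* Record the parse result and mark the entity as parsed. */
void demoMarkParsed(DemoEntity *ent, DemoNode *content) {
    ent->children = content;
    ent->flags |= DEMO_ENT_PARSED;
}
```

For an empty entity, `demoMarkParsed(&ent, NULL)` leaves `children` as NULL, so only the flag-based check reports the entity as parsed.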
Nick Wellnhofer
2059df5358 buf: Deprecate static/immutable buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
b45927095e malloc-fail: Fix memory leak in xmlStringGetNodeList
Also make sure to return NULL on error instead of a partial node list.

Found with libFuzzer, see #344.
2022-11-02 16:22:54 +01:00
Nick Wellnhofer
dd50cfeb61 malloc-fail: Fix memory leak in xmlNewDocNodeEatName
Found with libFuzzer, see #344.
2022-11-02 15:58:31 +01:00
Nick Wellnhofer
fa361de0b7 malloc-fail: Fix memory leak in xmlNewPropInternal
Also fixes a memory leak if called with a non-element node.

Found with libFuzzer, see #344.
2022-11-02 15:57:54 +01:00
Nick Wellnhofer
a22bd982bf malloc-fail: Fix memory leak in xmlStaticCopyNodeList
Found with libFuzzer, see #344.
2022-11-02 15:57:53 +01:00
Nick Wellnhofer
2fc8d12327 xinclude: Make xmlXIncludeCopyNode non-recursive
Avoid call stack overflows.

Also switch to xmlStaticCopyNode which avoids duplicate namespace
definitions.
2022-10-23 18:52:56 +02:00
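
A non-recursive tree copy of the kind commit 2fc8d12327 refers to can be sketched as below. This is not the libxml2 implementation: `DemoNode` and all `demo*` functions are hypothetical, and the sketch assumes each node carries a parent pointer so the walk can climb back up without a call stack. On allocation failure it returns NULL rather than a partial copy.

```c
#include <stdlib.h>

typedef struct DemoNode {
    int value;
    struct DemoNode *parent, *children, *last, *next;
} DemoNode;

DemoNode *demoNewNode(int value) {
    DemoNode *n = (DemoNode *) calloc(1, sizeof(*n));
    if (n != NULL) n->value = value;
    return n;
}

/* Append child to parent's child list. */
void demoAddChild(DemoNode *parent, DemoNode *child) {
    child->parent = parent;
    if (parent->last == NULL) parent->children = child;
    else parent->last->next = child;
    parent->last = child;
}

/* Iterative free: always free a leaf, then move to its sibling or parent. */
void demoFreeTree(DemoNode *root) {
    DemoNode *cur = root;
    while (cur != NULL) {
        DemoNode *next;
        if (cur->children != NULL) { cur = cur->children; continue; }
        if (cur == root) {
            next = NULL;
        } else {
            next = (cur->next != NULL) ? cur->next : cur->parent;
            cur->parent->children = cur->next; /* unlink the freed leaf */
        }
        free(cur);
        cur = next;
    }
}

/* Iterative deep copy; stack depth is constant regardless of tree depth. */
DemoNode *demoCopyTree(const DemoNode *root) {
    const DemoNode *cur = root;
    DemoNode *copyRoot, *copy;

    if (root == NULL) return NULL;
    copyRoot = copy = demoNewNode(root->value);
    if (copyRoot == NULL) return NULL;

    for (;;) {
        DemoNode *c;
        if (cur->children != NULL) {
            /* Descend: the new node becomes a child of the current copy. */
            cur = cur->children;
            c = demoNewNode(cur->value);
            if (c == NULL) goto error;
            demoAddChild(copy, c);
            copy = c;
            continue;
        }
        /* Climb until a node with a next sibling, or the root, is reached. */
        while (cur != root && cur->next == NULL) {
            cur = cur->parent;
            copy = copy->parent;
        }
        if (cur == root) break;
        /* Move to the sibling: the new node shares the current copy's parent. */
        cur = cur->next;
        c = demoNewNode(cur->value);
        if (c == NULL) goto error;
        demoAddChild(copy->parent, c);
        copy = c;
    }
    return copyRoot;

error:
    demoFreeTree(copyRoot);
    return NULL;
}
```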
Nick Wellnhofer
59f2f60e3e Remove "runtime debugging"
This doesn't seem useful as a configuration option.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
bdcf842cdb Move xmlIsXHTML to tree.c
It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like
the other functions in xmlsave.c.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
2cac626976 Don't use sizeof(xmlChar) or sizeof(char) 2022-09-01 03:35:19 +02:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
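
The motivation in commit ad338ca737 can be shown with a small sketch. The two hypothetical functions below compute the same value, but only the second one is visible to Clang's `-fsanitize=implicit-conversion` checks, because an explicit cast tells the sanitizer that the truncation is intended.

```c
/* Explicit cast: the truncation is treated as intentional, so UBSan's
 * implicit-conversion checks stay silent even if the value does not fit. */
unsigned char demoLowByteExplicit(unsigned int v) {
    return (unsigned char) v;
}

/* Implicit conversion: same result, but a build with
 * -fsanitize=implicit-conversion can report unexpected truncation here. */
unsigned char demoLowByteImplicit(unsigned int v) {
    return v;
}
```

Both functions return the low byte of `v`. The difference only shows up in a sanitizer build, where the implicit variant can flag a call such as `demoLowByteImplicit(0x1FF)`.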
Nick Wellnhofer
d7a334f2d0 Silence -Warray-bounds warning
This is a hack, but works for now.

Fixes #389.
2022-08-26 14:43:28 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
39745c927a Improve documentation of tree manipulation API
- Discourage use of node constructors without document.
- Mention that xmlReconciliateNs is crucial when moving nodes from one
  document to another.
2022-08-02 14:38:09 +02:00
Nick Wellnhofer
3e7b4f37aa Avoid calling xmlSetTreeDoc
Create text nodes with xmlNewDocText or set the document directly to
avoid xmlSetTreeDoc being called when the node is inserted.
2022-06-20 01:49:39 +02:00
Nick Wellnhofer
823bf16156 Simplify xmlFreeNode 2022-06-20 01:49:39 +02:00
Nick Wellnhofer
a17a1f564e Don't reset nsDef when changing node content
nsDef is only used for element nodes.
2022-06-20 01:49:39 +02:00
Nick Wellnhofer
2464652537 Fix unintended fall-through in xmlNodeAddContentLen 2022-06-20 01:49:38 +02:00
David Kilzer
6ef16dee7a Reserve byte for NUL terminator and report errors consistently in xmlBuf and xmlBuffer
This is a follow-up to commit 6c283d83.

* buf.c:
(xmlBufGrowInternal):
- Call xmlBufMemoryError() when the buffer size would overflow.
- Account for NUL terminator byte when using XML_MAX_TEXT_LENGTH.
- Do not include NUL terminator byte when returning length.
(xmlBufAdd):
- Call xmlBufMemoryError() when the buffer size would overflow.

* tree.c:
(xmlBufferGrow):
- Call xmlTreeErrMemory() when the buffer size would overflow.
- Do not include NUL terminator byte when returning length.
(xmlBufferResize):
- Update error message in xmlTreeErrMemory() to be consistent
  with other similar messages.
(xmlBufferAdd):
- Call xmlTreeErrMemory() when the buffer size would overflow.
(xmlBufferAddHead):
- Add overflow checks similar to those in xmlBufferAdd().
2022-06-16 12:01:27 +00:00
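
The overflow checks listed in commit 6ef16dee7a follow a common pattern that can be sketched as below. `demoGrowSize` and its parameters are hypothetical; `maxLen` stands in for a configured upper limit, and the real functions report errors through their own handlers instead of returning 0.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the capacity needed to hold `use` existing bytes, `len` new bytes
 * and one NUL terminator. Returns 0 if the sum would overflow size_t or
 * exceed maxLen.
 */
size_t demoGrowSize(size_t use, size_t len, size_t maxLen) {
    size_t needed;

    /* use + len + 1 > SIZE_MAX, rewritten so the check itself cannot overflow */
    if (len >= SIZE_MAX - use)
        return 0;
    needed = use + len + 1;
    /* The limit applies to the total including the terminator byte. */
    if (needed > maxLen)
        return 0;
    return needed;
}
```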
David Kilzer
4ce2abf6f6 Fix missing NUL terminators in xmlBuf and xmlBuffer functions
* buf.c:
(xmlBufAddLen):
- Change check for remaining space to account for the NUL
  terminator.  When adding a length exactly equal to the number
  of unused bytes, a NUL terminator was not written.
(xmlBufResize):
- Set `buf->use` and NUL terminator when allocating a new
  buffer.
* tree.c:
(xmlBufferResize):
- Set `buf->use` and NUL terminator when allocating a new
  buffer.
(xmlBufferAddHead):
- Set NUL terminator before returning early when shifting
  contents.
2022-06-16 11:23:06 +00:00
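
The off-by-one described for xmlBufAddLen in commit 4ce2abf6f6 comes down to whether the space check reserves a byte for the terminator. A minimal sketch with a hypothetical `DemoBuf` type (an append function rather than the real API) looks like this:

```c
#include <stddef.h>
#include <string.h>

typedef struct {
    char *content;
    size_t use;  /* bytes in use, not counting the NUL terminator */
    size_t size; /* total allocated bytes; invariant: use <= size */
} DemoBuf;

/* Returns 0 on success, -1 if len bytes plus a NUL terminator do not fit. */
int demoBufAppend(DemoBuf *buf, const char *data, size_t len) {
    size_t avail = buf->size - buf->use;

    /* len + 1 bytes are required. Testing `len > avail` instead would accept
     * len == avail and leave no room for the terminator. */
    if (len >= avail)
        return -1;
    memcpy(buf->content + buf->use, data, len);
    buf->use += len;
    buf->content[buf->use] = '\0';
    return 0;
}
```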
David Kilzer
a6df42e649 Fix integer overflow in xmlBufferDump()
* tree.c:
(xmlBufferDump):
- Cap the return value to INT_MAX.
2022-06-02 11:04:27 +00:00
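
Capping a `size_t` count to `INT_MAX` before returning it as `int`, as commit a6df42e649 describes for xmlBufferDump, can be sketched with a hypothetical helper:

```c
#include <limits.h>
#include <stddef.h>

/* Clamp a byte count to the range of int instead of letting it wrap. */
int demoCapToInt(size_t written) {
    return written > (size_t) INT_MAX ? INT_MAX : (int) written;
}
```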
David Kilzer
461ef8ac77 Fix double colon typos in xmlBufferResize()
Introduced in commit 6c283d83e.
2022-05-25 14:19:10 -07:00
David Kilzer
4bc3ebf3ea Fix ownership of xmlNodePtr & xmlAttrPtr fields in xmlSetTreeDoc()
When changing `doc` on an xmlNodePtr or xmlAttrPtr, certain
fields must either be a free-standing string, or they must be
owned by `doc->dict`.

The code to make this change was simply missing, so the crash
happened when an xmlAttrPtr was being torn down after `doc`
changed from non-NULL to NULL, but the `name` field was not
copied.  This is scenario 1 below.

The xmlNodePtr->name and xmlNodePtr->content fields are also
fixed at the same time.  Note that xmlNodePtr->content is never
added to the dictionary, so NULL is used instead of `newDict` to
force a free-standing copy.

This change covers all cases of dictionary changes:
1. Owned by old dictionary -> NULL new dictionary
   - Create free-standing copy of string.
2. Owned by old dictionary -> Non-NULL new dictionary
   - Get string from new dictionary pool.
3. Not owned by old dictionary -> Non-NULL new dictionary
   - No action necessary (already a free-standing string).
4. Not owned by old dictionary -> NULL new dictionary
   - No action necessary (already a free-standing string).

* tree.c:
(_copyStringForNewDictIfNeeded): Add.
(xmlSetTreeDoc):
- Update xmlNodePtr->name, xmlNodePtr->content and
  xmlAttrPtr->name when changing the document, if needed.

Found by OSS-Fuzz Issue 45132.
2022-05-25 16:55:26 +00:00
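
The four dictionary cases in commit 4bc3ebf3ea can be modeled with a toy string pool. Everything below is a hypothetical sketch: `DemoDict` is a fixed-size stand-in for a dictionary, ownership is decided by pointer identity, and none of the names are libxml2 API.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_DICT_CAP 8

typedef struct {
    char *pool[DEMO_DICT_CAP];
    int count;
} DemoDict;

static char *demoStrdup(const char *s) {
    size_t n = strlen(s) + 1;
    char *copy = (char *) malloc(n);
    if (copy != NULL) memcpy(copy, s, n);
    return copy;
}

/* A string is owned by the dictionary if it points into its pool. */
int demoDictOwns(const DemoDict *dict, const char *s) {
    int i;
    for (i = 0; i < dict->count; i++)
        if (dict->pool[i] == s) return 1;
    return 0;
}

/* Return the pooled copy of s, adding it if needed; NULL on failure. */
const char *demoDictIntern(DemoDict *dict, const char *s) {
    char *copy;
    int i;
    for (i = 0; i < dict->count; i++)
        if (strcmp(dict->pool[i], s) == 0) return dict->pool[i];
    if (dict->count == DEMO_DICT_CAP) return NULL;
    copy = demoStrdup(s);
    if (copy != NULL) dict->pool[dict->count++] = copy;
    return copy;
}

/* Decide what a string field should point to after the document changes. */
const char *demoCopyStringForNewDict(const DemoDict *oldDict,
                                     DemoDict *newDict, const char *str) {
    /* Cases 3 and 4: not owned by the old dictionary, nothing to do. */
    if (str == NULL || oldDict == NULL || !demoDictOwns(oldDict, str))
        return str;
    /* Case 2: move ownership to the new dictionary's pool. */
    if (newDict != NULL)
        return demoDictIntern(newDict, str);
    /* Case 1: no new dictionary, create a free-standing copy. */
    return demoStrdup(str);
}
```

Passing NULL as `newDict` forces a free-standing copy, which mirrors the treatment of content strings described in the commit message.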
Nick Wellnhofer
6c283d83ec [CVE-2022-29824] Fix integer overflows in xmlBuf and xmlBuffer
In several places, the code handling string buffers didn't check for
integer overflow or used wrong types for buffer sizes. This could
result in out-of-bounds writes or other memory errors when working on
large, multi-gigabyte buffers.

Thanks to Felix Wilhelm for the report.
2022-05-02 14:11:07 +02:00
Nick Wellnhofer
d314046f89 Don't try to copy children of entity references
This would result in an error, aborting the whole copy operation.
Regressed in commit 7618a3b1.

Fixes #371.
2022-04-23 17:45:35 +02:00
Nick Wellnhofer
41afa89fc9 Fix short-lived regression in xmlStaticCopyNode
Commit 7618a3b1 didn't account for coalesced text nodes.

I think it would be better if xmlStaticCopyNode didn't try to coalesce
text nodes at all. This code path can only be triggered if some other
code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found
such behavior in xinclude.c.
2022-04-10 14:17:31 +02:00
Nick Wellnhofer
7618a3b159 Make xmlStaticCopyNode non-recursive 2022-04-02 19:17:41 +02:00
Nick Wellnhofer
d99ddd9bd5 Improve buffer allocation scheme
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
2022-03-06 02:26:22 +01:00
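
The "double-it" scheme mentioned in commit d99ddd9bd5 keeps the number of reallocations logarithmic in the final size, which is what avoids quadratic copying when data is appended in small pieces. A minimal sketch, with a hypothetical function name and an assumed initial capacity of 16:

```c
#include <stddef.h>
#include <stdint.h>

/* Pick the next capacity by doubling until `needed` bytes fit. */
size_t demoNextCapacity(size_t size, size_t needed) {
    size_t newSize = size > 0 ? size : 16; /* assumed starting capacity */

    while (newSize < needed) {
        /* Doubling would overflow: fall back to the exact size. */
        if (newSize > SIZE_MAX / 2)
            return needed;
        newSize *= 2;
    }
    return newSize;
}
```

Growing by a fixed increment instead would copy the existing content on every small append, so the total work grows with the square of the final size.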
Nick Wellnhofer
4a8c71eb7c Remove DOCBparser
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c has not been
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
2022-03-04 22:56:21 +01:00
Nick Wellnhofer
776d15d383 Don't check for standard C89 headers
Don't check for

- ctype.h
- errno.h
- float.h
- limits.h
- math.h
- signal.h
- stdarg.h
- stdlib.h
- string.h
- time.h

Stop including non-standard headers

- malloc.h
- strings.h
2022-03-02 00:43:54 +01:00
Nick Wellnhofer
c41bc10da3 Fix unused variable warnings with disabled features 2022-02-22 19:57:12 +01:00
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
57b3abd592 Fix xmlSetTreeDoc with entity references
The children member of entity reference nodes points to the entity
declaration and must never be followed when traversing a tree. In
the worst case, this could lead to an infinite loop.

It's somewhat unclear how moving entity references to other documents
should work exactly. For now we simply set the children pointer to NULL
to avoid a reference to the original document.

Fixes #42.
2022-02-07 22:18:27 +01:00
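
The rule from commit 57b3abd592, that the children pointer of an entity reference must not be followed during traversal, can be sketched as below. The node type, the integer `docId`, and `demoSetTreeDoc` are hypothetical simplifications, and the sketch is recursive for brevity.

```c
#include <stddef.h>

typedef enum { DEMO_ELEMENT, DEMO_TEXT, DEMO_ENTITY_REF } DemoType;

typedef struct DemoNode {
    DemoType type;
    int docId; /* stands in for the owning document pointer */
    struct DemoNode *children, *next;
} DemoNode;

/* Reassign a subtree to another document. */
void demoSetTreeDoc(DemoNode *node, int docId) {
    DemoNode *child;

    if (node == NULL)
        return;
    node->docId = docId;
    if (node->type == DEMO_ENTITY_REF) {
        /* children points at the entity declaration in the old document:
         * never descend into it, and drop the stale reference. */
        node->children = NULL;
        return;
    }
    for (child = node->children; child != NULL; child = child->next)
        demoSetTreeDoc(child, docId);
}
```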
Nick Wellnhofer
ea53fc18bc Properly handle nested documents in xmlFreeNode
Client code should never add document nodes as children of other nodes,
but even our own XPointer code has a bug that can produce such trees.
Make sure to really free nested documents. Also see commits 0815302d
and 0762c9b6.

Should fix #269.
2022-02-07 18:36:00 +01:00
Nick Wellnhofer
ae728bb872 Fix null pointer deref in xmlStringGetNodeList
Check for malloc failure to avoid null deref.
2022-01-16 15:05:41 +01:00
Nick Wellnhofer
e20c9c148c Fix xmlGetNodePath with invalid node types
Make xmlGetNodePath return NULL instead of invalid XPath when hitting
unsupported node types like DTD content.

Reported here:
https://mail.gnome.org/archives/xml/2021-January/msg00012.html

Original report:
https://bugs.php.net/bug.php?id=80680
2021-03-13 18:46:00 +01:00
Nick Wellnhofer
ad101bb5b5 Clarify xmlNewDocProp documentation 2021-03-02 13:43:31 +01:00
Nick Wellnhofer
a6e6498fb1 Stop checking attributes for UTF-8 validity
I can't see a reason to check attribute content for UTF-8 validity.
Other parts of the API like xmlNewText have always assumed valid UTF-8
as extra checks only slow down processing.

Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not
freeing the old encoding would cause a memory leak.

Note that this was last changed in 2008 with commit 6f8611fd which
removed unnecessary encoding/decoding steps. Setting attributes should
be even faster now.

Found by OSS-Fuzz.
2021-03-02 13:35:04 +01:00
Nick Wellnhofer
688b41a0fb Fix quadratic behavior when looking up xml:* attributes
Add a special case for the predefined XML namespace when looking up DTD
attribute defaults in xmlGetPropNodeInternal to avoid calling
xmlGetNsList.

This fixes quadratic behavior in

- xmlNodeGetBase
- xmlNodeGetLang
- xmlNodeGetSpacePreserve

Found by OSS-Fuzz.
2021-03-01 14:36:38 +01:00
Nick Wellnhofer
01411e7c5e Check for invalid redeclarations of predefined entities
Implement section "4.6 Predefined Entities" of the XML 1.0 spec and
check whether redeclarations of predefined entities match the original
definitions.

Note that some test cases declared

    <!ENTITY lt "<">

But the XML spec clearly states that this is illegal:

> If the entities lt or amp are declared, they MUST be declared as
> internal entities whose replacement text is a character reference to
> the respective character (less-than sign or ampersand) being escaped;
> the double escaping is REQUIRED for these entities so that references
> to them produce a well-formed result.

Also fixes #217 but the connection is only tangential. The integer
overflow discovered by fuzzing was more related to the fact that various
parts of the parser disagreed on whether to prefer predefined entities
over their redeclarations. The whole situation is a mess and even
depends on legacy parser options. But now that redeclarations are
validated, it shouldn't make a difference.

As noted in the added comment, this is also one of the cases where
overly defensive checks can hide interesting logic bugs from fuzzers.
2021-02-08 21:51:26 +01:00
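
The rule from section 4.6 quoted in commit 01411e7c5e can be sketched as a check on an entity's replacement text, that is, the text after character references in the declaration's literal value have been expanded. Under that reading, `<!ENTITY lt "&#38;#60;">` yields the replacement text `&#60;` and is accepted, while `<!ENTITY lt "<">` yields `<` and is rejected. The function names are hypothetical and this is not the libxml2 implementation.

```c
#include <stddef.h>
#include <string.h>

/* Parse "&#NN;" or "&#xHH;" and return the code point, or -1 if malformed. */
long demoParseCharRef(const char *s) {
    long value = 0;
    int base = 10, digits = 0;

    if (s[0] != '&' || s[1] != '#') return -1;
    s += 2;
    if (*s == 'x') { base = 16; s++; }
    for (; *s != ';'; s++) {
        int d;
        if (*s >= '0' && *s <= '9') d = *s - '0';
        else if (base == 16 && *s >= 'a' && *s <= 'f') d = *s - 'a' + 10;
        else if (base == 16 && *s >= 'A' && *s <= 'F') d = *s - 'A' + 10;
        else return -1; /* also rejects a missing ';' */
        value = value * base + d;
        if (value > 0x10FFFF) return -1;
        digits++;
    }
    if (digits == 0 || s[1] != '\0') return -1;
    return value;
}

/* Returns 1 if `text` is an acceptable replacement text for entity `name`. */
int demoIsValidPredefRedecl(const char *name, const char *text) {
    static const struct { const char *name; char ch; int literalOk; } predef[] = {
        { "lt", '<', 0 }, { "amp", '&', 0 },
        { "gt", '>', 1 }, { "apos", '\'', 1 }, { "quot", '"', 1 }
    };
    size_t i;

    for (i = 0; i < sizeof(predef) / sizeof(predef[0]); i++) {
        if (strcmp(name, predef[i].name) != 0) continue;
        /* gt, apos and quot may also be the single literal character. */
        if (predef[i].literalOk && text[0] == predef[i].ch && text[1] == '\0')
            return 1;
        /* lt and amp must be a character reference to the character. */
        return demoParseCharRef(text) == predef[i].ch;
    }
    return 1; /* not a predefined entity: nothing to check here */
}
```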