mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-26 00:37:43 +03:00
513 Commits

Author SHA1 Message Date
Nick Wellnhofer
d025cfbb4b parser: Always copy content from entity to target.
Make sure that references from IDs are updated.

Note that if there are IDs with the same value in a document, the last
one will now be returned. IDs should be unique, but maybe this should be
addressed.
2023-12-29 01:22:11 +01:00
Nick Wellnhofer
c49572e57d malloc-fail: Fix erroneous report in xmlStringGetNodeList
The parser can produce invalid attribute content in recovery mode.
Unless this is fixed, xmlStringGetNodeList should ignore such errors
silently.
2023-12-23 15:10:15 +01:00
Nick Wellnhofer
0ea47327c2 malloc-fail: Fix memory leak in xmlNodeGetBaseSafe
Short-lived regression.
2023-12-13 14:58:53 +01:00
Nick Wellnhofer
5c06f4e384 malloc-fail: Fix erroneous reports in xmlNodeListGetString
Short-lived regression.
2023-12-12 15:19:07 +01:00
Nick Wellnhofer
aca16fb3d4 tree: Report malloc failures
Fix many places where malloc failures aren't reported.

Make some API functions return an error code. Changing the return type
from void to int is technically an ABI break but should be safe on most
platforms.

- xmlNodeSetContent
- xmlNodeSetContentLen
- xmlNodeAddContent
- xmlNodeAddContentLen
- xmlNodeSetBase

Introduce new API functions that return a separate error code if a
memory allocation fails.

- xmlNodeGetAttrValue
- xmlNodeGetBaseSafe
- xmlGetNsListSafe

Introduce private functions xmlTreeEnsureXMLDecl and xmlSplitQName4.

Don't report low-level errors to the global error handler.

Fix tree

Introduce xmlGetNsListSafe

Fix tree
2023-12-11 22:13:05 +01:00
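The "separate error code" pattern described in this commit can be sketched as follows. This is a minimal illustration, not libxml2 code: `get_value_safe` and its parameters are hypothetical stand-ins. The point is that a status return plus an out-parameter lets a caller tell "no value present" apart from "allocation failed", which a bare pointer return cannot do.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical getter in the "Safe" style.
 * Returns 0 on success (*out is NULL when there is no stored value)
 * and -1 if the copy could not be allocated.
 */
static int get_value_safe(const char *stored, char **out) {
    assert(out != NULL);
    *out = NULL;
    if (stored == NULL)
        return 0;               /* absent value is not an error */

    size_t n = strlen(stored) + 1;
    char *copy = malloc(n);
    if (copy == NULL)
        return -1;              /* malloc failure is reported separately */

    memcpy(copy, stored, n);
    *out = copy;                /* caller owns and frees the copy */
    return 0;
}
```

A caller checks the return value first and only then looks at the out-parameter.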
Nick Wellnhofer
502971cc23 tree: Another fix related to #538
Should fix #639.
2023-12-01 19:44:37 +01:00
Nick Wellnhofer
8707838e69 tree: Fix #583 again
Only set doc->intSubset after successful copy to avoid dangling pointers
in error case.
2023-11-28 13:45:49 +01:00
Nick Wellnhofer
de3f70146d tree: Fix regression when copying DTDs
This reverts commit d39f78069d.

Fixes #634.
2023-11-28 13:30:56 +01:00
Nick Wellnhofer
97e99f4112 parser: Acknowledge that entities with namespaces are broken
Entities which reference out-of-scope namespaces have always been broken.
xmlParseBalancedChunkMemoryInternal tried to reuse the namespaces
currently in scope but these namespaces were ignored by the SAX handler.
Besides, there could be different namespaces in scope when expanding the
entity again. For example:

    <!DOCTYPE doc [
      <!ENTITY ent "<ns:elem/>">
    ]>
    <doc>
      <decl1 xmlns:ns="urn:ns1">
        &ent;
      </decl1>
      <decl2 xmlns:ns="urn:ns2">
        &ent;
      </decl2>
    </doc>

Add some comments outlining possible solutions to this problem.

For now, we stop copying namespaces to the temporary parser context
in xmlParseBalancedChunkMemoryInternal. This has never really worked
and the recent changes contained a partial fix which uncovered other
problems like a use-after-free with the XML Reader interface, found
by OSS-Fuzz.
2023-10-05 17:41:46 +02:00
Nick Wellnhofer
8c084ebdc7 doc: Make apibuild.py happy 2023-09-21 22:57:33 +02:00
Nick Wellnhofer
9b5cce7a71 include: Remove more unnecessary includes 2023-09-21 01:50:53 +02:00
Nick Wellnhofer
11a1839ddd globals: Move remaining globals back to correct header files
This undoes a lot of damage.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
dc3382ef97 globals: Move xmlRegisterNodeDefault to tree.c
Code in globals.c must not try to access globals itself since the
accessor macros aren't defined and we would only see the main
variable.
2023-09-20 22:06:49 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
d39f78069d tree: Fix copying of DTDs
- Don't create multiple DTD nodes.
- Fix UAF if malloc fails.
- Skip DTD nodes if tree module is disabled.

Fixes #583.
2023-08-23 20:43:14 +02:00
Nick Wellnhofer
b8961df65d SAX: Always validate xml:ids
The behavior shouldn't depend on mostly random configuration options.
2023-05-09 03:25:24 +02:00
Nick Wellnhofer
dbc893f588 malloc-fail: Fix memory leak in xmlCopyNamespaceList
Found with libFuzzer, see #344.
2023-03-08 13:17:47 +01:00
Nick Wellnhofer
a442d16a5f malloc-fail: Fix memory leak in xmlGetNsList
Found with libFuzzer, see #344.
2023-02-27 17:18:02 +01:00
Nick Wellnhofer
bc7740b3c3 malloc-fail: Fix memory leak in xmlCopyPropList
Found with libFuzzer, see #344.
2023-02-17 17:16:52 +01:00
Nick Wellnhofer
e6401b68df tree: Fix recursion check in xmlStringGetNodeList
Use the new entity flag to check for recursion.
2023-01-17 14:01:23 +01:00
Nick Wellnhofer
481d79d44c entities: Add XML_ENT_PARSED flag
To check whether an entity was already parsed, the code previously
tested whether "checked" was non-zero or "children" was non-null. The
"children" check could be unreliable because an empty entity also
results in an empty (NULL) node list. Use a separate flag to make this
check more reliable.
2022-12-19 15:26:46 +01:00
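Why a NULL child list is ambiguous can be shown with a small sketch. The struct and the `ENT_PARSED` constant below are hypothetical stand-ins, not the libxml2 definitions, and only the "children" half of the old test is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only. */
#define ENT_PARSED (1 << 0)

struct node { struct node *next; };

struct entity {
    struct node *children;  /* NULL before parsing AND for empty content */
    int flags;
};

/* Heuristic: cannot tell "parsed but empty" from "not parsed yet". */
static int parsed_by_children(const struct entity *ent) {
    assert(ent != NULL);
    return ent->children != NULL;
}

/* Dedicated flag: unambiguous regardless of the content. */
static int parsed_by_flag(const struct entity *ent) {
    assert(ent != NULL);
    return (ent->flags & ENT_PARSED) != 0;
}

/* Record a parse result; content may legitimately be NULL. */
static void mark_parsed(struct entity *ent, struct node *content) {
    assert(ent != NULL);
    ent->children = content;
    ent->flags |= ENT_PARSED;
}
```

After `mark_parsed(&ent, NULL)`, the flag-based check reports the entity as parsed while the child-list heuristic still reports it as unparsed.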
Nick Wellnhofer
2059df5358 buf: Deprecate static/immutable buffers 2022-11-20 21:16:03 +01:00
Nick Wellnhofer
b45927095e malloc-fail: Fix memory leak in xmlStringGetNodeList
Also make sure to return NULL on error instead of a partial node list.

Found with libFuzzer, see #344.
2022-11-02 16:22:54 +01:00
Nick Wellnhofer
dd50cfeb61 malloc-fail: Fix memory leak in xmlNewDocNodeEatName
Found with libFuzzer, see #344.
2022-11-02 15:58:31 +01:00
Nick Wellnhofer
fa361de0b7 malloc-fail: Fix memory leak in xmlNewPropInternal
Also fixes a memory leak if called with a non-element node.

Found with libFuzzer, see #344.
2022-11-02 15:57:54 +01:00
Nick Wellnhofer
a22bd982bf malloc-fail: Fix memory leak in xmlStaticCopyNodeList
Found with libFuzzer, see #344.
2022-11-02 15:57:53 +01:00
Nick Wellnhofer
2fc8d12327 xinclude: Make xmlXIncludeCopyNode non-recursive
Avoid call stack overflows.

Also switch to xmlStaticCopyNode which avoids duplicate namespace
definitions.
2022-10-23 18:52:56 +02:00
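A non-recursive tree copy of the kind this commit describes can be sketched with parent and sibling links instead of the call stack. This is a generic illustration under assumed names (`struct tnode`, `copy_tree`, `free_tree`); it is not the xmlXIncludeCopyNode or xmlStaticCopyNode implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical tree node with parent, child and sibling links. */
struct tnode {
    int value;
    struct tnode *parent, *children, *last, *next;
};

/* Allocate a node and append it to parent's child list (if any). */
static struct tnode *new_node(int value, struct tnode *parent) {
    struct tnode *n = calloc(1, sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->parent = parent;
    if (parent != NULL) {
        if (parent->last != NULL)
            parent->last->next = n;
        else
            parent->children = n;
        parent->last = n;
    }
    return n;
}

/* Free a whole tree without recursion. */
static void free_tree(struct tnode *root) {
    struct tnode *cur = root;
    while (cur != NULL) {
        if (cur->children != NULL) {
            struct tnode *child = cur->children;
            cur->children = NULL;   /* detach so we never descend twice */
            cur = child;
            continue;
        }
        struct tnode *next = cur->next;
        struct tnode *parent = cur->parent;
        int is_root = (cur == root);
        free(cur);
        if (is_root)
            break;
        cur = (next != NULL) ? next : parent;
    }
}

/* Deep copy using a loop; stack depth stays constant for any tree depth. */
static struct tnode *copy_tree(const struct tnode *src) {
    if (src == NULL)
        return NULL;
    struct tnode *root = new_node(src->value, NULL);
    if (root == NULL)
        return NULL;

    const struct tnode *cur = src;
    struct tnode *dst = root;       /* dst is always the copy of cur */
    for (;;) {
        if (cur->children != NULL) {
            cur = cur->children;
            dst = new_node(cur->value, dst);
        } else {
            /* Climb until a sibling exists or we are back at src. */
            while (cur != src && cur->next == NULL) {
                cur = cur->parent;
                dst = dst->parent;
            }
            if (cur == src)
                break;
            cur = cur->next;
            dst = new_node(cur->value, dst->parent);
        }
        if (dst == NULL) {          /* allocation failed: drop partial copy */
            free_tree(root);
            return NULL;
        }
    }
    return root;
}
```

Returning NULL and freeing the partial copy on allocation failure mirrors the "return NULL on error instead of a partial node list" behavior mentioned elsewhere in this log.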
Nick Wellnhofer
59f2f60e3e Remove "runtime debugging"
This doesn't seem useful as a configuration option.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
bdcf842cdb Move xmlIsXHTML to tree.c
It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like
the other functions in xmlsave.c.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
2cac626976 Don't use sizeof(xmlChar) or sizeof(char) 2022-09-01 03:35:19 +02:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
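The difference between the two forms can be sketched with a pair of functions that compute the same value. The names are hypothetical. Both narrow an `unsigned int` to `unsigned char`; the explicit cast marks the truncation as intentional, so a checker for implicit conversions has nothing to report, whereas the cast-free version leaves the truncation visible to such a checker.

```c
#include <assert.h>

/* Explicit cast: result is identical, but the narrowing is declared
 * intentional, so implicit-conversion checks will not flag it. */
static unsigned char low_byte_cast(unsigned int v) {
    return (unsigned char) v;
}

/* No cast: the same narrowing happens implicitly on return, which an
 * implicit-conversion check can report when v does not fit. */
static unsigned char low_byte_implicit(unsigned int v) {
    return v;
}

/* Where truncation is really intended, say so and make it explicit. */
static unsigned char low_byte_documented(unsigned int v) {
    unsigned int masked = v & 0xFFu;    /* truncating on purpose */
    assert(masked <= 0xFFu);
    return (unsigned char) masked;
}
```

The third function corresponds to the "document some explicit casts as truncating" part of the message: the cast stays, but only after the value has been reduced to the target range.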
Nick Wellnhofer
d7a334f2d0 Silence -Warray-bounds warning
This is a hack, but works for now.

Fixes #389.
2022-08-26 14:43:28 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
39745c927a Improve documentation of tree manipulation API
- Discourage use of node constructors without document.
- Mention that xmlReconciliateNs is crucial when moving nodes from one
  document to another.
2022-08-02 14:38:09 +02:00
Nick Wellnhofer
3e7b4f37aa Avoid calling xmlSetTreeDoc
Create text nodes with xmlNewDocText or set the document directly to
avoid xmlSetTreeDoc being called when the node is inserted.
2022-06-20 01:49:39 +02:00
Nick Wellnhofer
823bf16156 Simplify xmlFreeNode 2022-06-20 01:49:39 +02:00
Nick Wellnhofer
a17a1f564e Don't reset nsDef when changing node content
nsDef is only used for element nodes.
2022-06-20 01:49:39 +02:00
Nick Wellnhofer
2464652537 Fix unintended fall-through in xmlNodeAddContentLen 2022-06-20 01:49:38 +02:00
David Kilzer
6ef16dee7a Reserve byte for NUL terminator and report errors consistently in xmlBuf and xmlBuffer
This is a follow-up to commit 6c283d83.

* buf.c:
(xmlBufGrowInternal):
- Call xmlBufMemoryError() when the buffer size would overflow.
- Account for NUL terminator byte when using XML_MAX_TEXT_LENGTH.
- Do not include NUL terminator byte when returning length.
(xmlBufAdd):
- Call xmlBufMemoryError() when the buffer size would overflow.

* tree.c:
(xmlBufferGrow):
- Call xmlTreeErrMemory() when the buffer size would overflow.
- Do not include NUL terminator byte when returning length.
(xmlBufferResize):
- Update error message in xmlTreeErrMemory() to be consistent
  with other similar messages.
(xmlBufferAdd):
- Call xmlTreeErrMemory() when the buffer size would overflow.
(xmlBufferAddHead):
- Add overflow checks similar to those in xmlBufferAdd().
2022-06-16 12:01:27 +00:00
David Kilzer
4ce2abf6f6 Fix missing NUL terminators in xmlBuf and xmlBuffer functions
* buf.c:
(xmlBufAddLen):
- Change check for remaining space to account for the NUL
  terminator.  When adding a length exactly equal to the number
  of unused bytes, a NUL terminator was not written.
(xmlBufResize):
- Set `buf->use` and NUL terminator when allocating a new
  buffer.
* tree.c:
(xmlBufferResize):
- Set `buf->use` and NUL terminator when allocating a new
  buffer.
(xmlBufferAddHead):
- Set NUL terminator before returning early when shifting
  contents.
2022-06-16 11:23:06 +00:00
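The off-by-one described for xmlBufAddLen can be sketched with a simplified buffer. `struct sbuf` and both functions are hypothetical stand-ins, assuming a buffer that always keeps `use < size` so a terminator fits. The check must use a strict comparison: a length equal to the number of unused bytes leaves no room for the NUL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified buffer: `use` bytes of `size` are occupied. */
struct sbuf {
    char *data;
    size_t use;
    size_t size;
};

/* True if `len` more bytes AND the NUL terminator fit. */
static int fits_with_nul(const struct sbuf *b, size_t len) {
    assert(b != NULL && b->use < b->size);
    return len < b->size - b->use;      /* strict: reserve one byte */
}

/* Mark `len` already-written bytes as used and terminate the string. */
static int sbuf_add_len(struct sbuf *b, size_t len) {
    if (!fits_with_nul(b, len))
        return -1;
    b->use += len;
    b->data[b->use] = '\0';
    return 0;
}
```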
David Kilzer
a6df42e649 Fix integer overflow in xmlBufferDump()
* tree.c:
(xmlBufferDump):
- Cap the return value to INT_MAX.
2022-06-02 11:04:27 +00:00
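Capping a byte count to INT_MAX before returning it as an `int` can be sketched as below. `clamp_to_int` is a hypothetical helper, not a libxml2 function; it assumes the count is held in a `size_t`.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Convert a size_t count to int, saturating instead of overflowing. */
static int clamp_to_int(size_t n) {
    int r = (n > (size_t) INT_MAX) ? INT_MAX : (int) n;
    assert(r >= 0);     /* postcondition: never wraps to a negative value */
    return r;
}
```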
David Kilzer
461ef8ac77 Fix double colon typos in xmlBufferResize()
Introduced in commit 6c283d83e.
2022-05-25 14:19:10 -07:00
David Kilzer
4bc3ebf3ea Fix ownership of xmlNodePtr & xmlAttrPtr fields in xmlSetTreeDoc()
When changing `doc` on an xmlNodePtr or xmlAttrPtr, certain
fields must either be a free-standing string, or they must be
owned by `doc->dict`.

The code to make this change was simply missing, so the crash
happened when an xmlAttrPtr was being torn down after `doc`
changed from non-NULL to NULL, but the `name` field was not
copied.  This is scenario 1 below.

The xmlNodePtr->name and xmlNodePtr->content fields are also
fixed at the same time.  Note that xmlNodePtr->content is never
added to the dictionary, so NULL is used instead of `newDict` to
force a free-standing copy.

This change covers all cases of dictionary changes:
1. Owned by old dictionary -> NULL new dictionary
   - Create free-standing copy of string.
2. Owned by old dictionary -> Non-NULL new dictionary
   - Get string from new dictionary pool.
3. Not owned by old dictionary -> Non-NULL new dictionary
   - No action necessary (already a free-standing string).
4. Not owned by old dictionary -> NULL new dictionary
   - No action necessary (already a free-standing string).

* tree.c:
(_copyStringForNewDictIfNeeded): Add.
(xmlSetTreeDoc):
- Update xmlNodePtr->name, xmlNodePtr->content and
  xmlAttrPtr->name when changing the document, if needed.

Found by OSS-Fuzz Issue 45132.
2022-05-25 16:55:26 +00:00
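The four cases listed in this message reduce to a small decision table. The sketch below encodes it with hypothetical names; it only models the decision, not the string copying or dictionary lookup that `_copyStringForNewDictIfNeeded` performs.

```c
/* What to do with a string field when the owning document changes. */
enum str_action {
    KEEP_AS_IS,         /* cases 3 and 4: already a free-standing string */
    COPY_STANDALONE,    /* case 1: old dict owned it, no new dict */
    INTERN_IN_NEW_DICT  /* case 2: old dict owned it, new dict exists */
};

static enum str_action action_for(int owned_by_old_dict, int has_new_dict) {
    if (!owned_by_old_dict)
        return KEEP_AS_IS;
    return has_new_dict ? INTERN_IN_NEW_DICT : COPY_STANDALONE;
}
```

Per the message, a field that is never stored in a dictionary (the node content) would be handled by passing "no new dictionary", which forces the free-standing copy.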
Nick Wellnhofer
6c283d83ec [CVE-2022-29824] Fix integer overflows in xmlBuf and xmlBuffer
In several places, the code handling string buffers didn't check for
integer overflow or used wrong types for buffer sizes. This could
result in out-of-bounds writes or other memory errors when working on
large, multi-gigabyte buffers.

Thanks to Felix Wilhelm for the report.
2022-05-02 14:11:07 +02:00
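An overflow-safe size computation of the kind this fix calls for can be sketched as follows. `needed_size` is a hypothetical helper using `size_t` throughout; the actual patch is not reproduced here. The check is done by rearranging the inequality so that no intermediate sum can wrap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute use + len + 1 (one byte for the NUL terminator) into *out.
 * Returns 0 on success, -1 if the sum would exceed SIZE_MAX.
 */
static int needed_size(size_t use, size_t len, size_t *out) {
    assert(out != NULL);
    /* use + len + 1 <= SIZE_MAX  <=>  len < SIZE_MAX - use */
    if (len >= SIZE_MAX - use)
        return -1;
    *out = use + len + 1;
    return 0;
}
```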
Nick Wellnhofer
d314046f89 Don't try to copy children of entity references
This would result in an error, aborting the whole copy operation.
Regressed in commit 7618a3b1.

Fixes #371.
2022-04-23 17:45:35 +02:00
Nick Wellnhofer
41afa89fc9 Fix short-lived regression in xmlStaticCopyNode
Commit 7618a3b1 didn't account for coalesced text nodes.

I think it would be better if xmlStaticCopyNode didn't try to coalesce
text nodes at all. This code path can only be triggered if some other
code doesn't coalesce text nodes properly. In this case, OSS-Fuzz found
such behavior in xinclude.c.
2022-04-10 14:17:31 +02:00
Nick Wellnhofer
7618a3b159 Make xmlStaticCopyNode non-recursive 2022-04-02 19:17:41 +02:00
Nick Wellnhofer
d99ddd9bd5 Improve buffer allocation scheme
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
2022-03-06 02:26:22 +01:00
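Why doubling avoids quadratic behavior can be seen by counting reallocations. The sketch below compares doubling with a fixed-increment policy; the fixed-increment function is a generic contrast chosen for illustration and is not a description of the hybrid or bounded schemes named in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Reallocations needed to grow from `start` to at least `target`
 * when the capacity doubles each time. */
static size_t grows_doubling(size_t start, size_t target) {
    assert(start > 0);
    size_t n = 0;
    for (size_t cap = start; cap < target; cap *= 2)
        n++;
    return n;
}

/* Same, but adding a fixed `step` each time. */
static size_t grows_fixed(size_t start, size_t step, size_t target) {
    assert(step > 0);
    size_t n = 0;
    for (size_t cap = start; cap < target; cap += step)
        n++;
    return n;
}
```

Growing from 16 to 1024 bytes takes 6 reallocations when doubling and 63 with a 16-byte step. Since each reallocation may copy the whole buffer, the fixed step leads to copying work that grows quadratically with the final size, while doubling keeps it linear.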
Nick Wellnhofer
4a8c71eb7c Remove DOCBparser
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c has not been
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
2022-03-04 22:56:21 +01:00
Nick Wellnhofer
776d15d383 Don't check for standard C89 headers
Don't check for

- ctype.h
- errno.h
- float.h
- limits.h
- math.h
- signal.h
- stdarg.h
- stdlib.h
- string.h
- time.h

Stop including non-standard headers

- malloc.h
- strings.h
2022-03-02 00:43:54 +01:00