mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-10-24 13:33:01 +03:00
139 Commits

Author SHA1 Message Date
Nick Wellnhofer
c2b3294f60 fuzz: Abort on invalid UTF-8
The parser should never generate invalid UTF-8 these days even in
recovery mode.
2024-01-04 21:20:51 +01:00
Nick Wellnhofer
ca5965d594 save: Report more malloc failures 2024-01-02 23:43:06 +01:00
Nick Wellnhofer
0821efc8ee encoding: Check whether encoding handlers support input/output
The "HTML" encoding handler doesn't support input which could lead to a
wrong error report.
2024-01-02 19:48:23 +01:00
Nick Wellnhofer
4dcc2d743e save: Output U+FFFD replacement characters
This degrades more gracefully and helps to diagnose errors.

We stop raising errors for now, since there's no way to report malloc
failures during error handling yet.
2024-01-02 15:39:11 +01:00
Nick Wellnhofer
bc1e030664 save: Improve error handling
Handle malloc failure from xmlRaiseError.

Use xmlRaiseMemoryError.

Stop using xmlGenericError.

Remove argument from memory error handler.

Remove TODO macro.
2023-12-21 15:02:24 +01:00
Nick Wellnhofer
6c8acdecd2 save: Fix build --without-html
Fixes #646
2023-12-14 13:49:08 +01:00
Nick Wellnhofer
0d97e43993 save: Report malloc failures
Fix places where malloc failures aren't reported.

Introduce a new API function xmlSaveFinish which returns an error code.
2023-12-11 22:13:05 +01:00
Nick Wellnhofer
8c084ebdc7 doc: Make apibuild.py happy 2023-09-21 22:57:33 +02:00
Nick Wellnhofer
da274bfa55 build: Fix build when certain modules are disabled 2023-09-21 02:26:43 +02:00
Nick Wellnhofer
4e1c13ebfd debug: Remove debugging code
This is barely useful these days and only clutters the code base.
2023-09-19 17:35:09 +02:00
Nick Wellnhofer
c82701ff0b malloc-fail: Fix memory leak in xmlDocDumpFormatMemoryEnc
Found with libFuzzer, see #344.
2023-02-17 17:16:51 +01:00
Nick Wellnhofer
bdcf842cdb Move xmlIsXHTML to tree.c
It's declared in tree.h and not guarded by LIBXML_OUTPUT_ENABLED like
the other functions in xmlsave.c.
2022-09-02 18:33:35 +02:00
Nick Wellnhofer
ad338ca737 Remove explicit integer casts
Remove explicit integer casts as final operation

- in assignments
- when passing arguments
- when returning values

Remove casts

- to the same type
- from certain range-bound values

The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.

Document some explicit casts as truncating and add a few missing ones.
2022-09-01 02:33:57 +02:00
Nick Wellnhofer
0f568c0b73 Consolidate private header files
Private functions were previously declared

- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.

Consolidate all private header files in include/private.
2022-08-26 02:11:56 +02:00
Nick Wellnhofer
3e7b4f37aa Avoid calling xmlSetTreeDoc
Create text nodes with xmlNewDocText or set the document directly to
avoid xmlSetTreeDoc being called when the node is inserted.
2022-06-20 01:49:39 +02:00
Nick Wellnhofer
d99ddd9bd5 Improve buffer allocation scheme
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
2022-03-06 02:26:22 +01:00
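The "double-it" scheme mentioned in the entry above can be illustrated with a minimal sketch. This is not libxml2's buffer code: `GrowBuf`, `growbuf_append`, the initial capacity of 16 and the `grows` counter are all hypothetical names and values chosen for illustration. The point is only that doubling the capacity keeps the number of reallocations logarithmic in the final size, so repeated small appends do not degrade into quadratic copying.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; not libxml2's xmlBuf. */
typedef struct {
    char *data;
    size_t len;      /* bytes in use */
    size_t cap;      /* bytes allocated */
    unsigned grows;  /* number of reallocations, for illustration only */
} GrowBuf;

/* Append n bytes from src. Returns 0 on success, -1 on overflow or
 * allocation failure. Capacity is doubled until the data fits. */
static int growbuf_append(GrowBuf *b, const char *src, size_t n) {
    assert(b != NULL && (src != NULL || n == 0));
    if (n > SIZE_MAX - b->len)
        return -1;                       /* len + n would overflow */
    size_t need = b->len + n;
    if (need > b->cap) {
        size_t newcap = b->cap ? b->cap : 16;  /* assumed initial size */
        while (newcap < need) {
            if (newcap > SIZE_MAX / 2) { /* doubling would overflow */
                newcap = need;
                break;
            }
            newcap *= 2;
        }
        char *p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;                   /* old buffer is still valid */
        b->data = p;
        b->cap = newcap;
        b->grows++;
    }
    if (n > 0)
        memcpy(b->data + b->len, src, n);
    b->len = need;
    return 0;
}
```

Under these assumptions, appending 1000 single bytes to a zero-initialized `GrowBuf` triggers only a handful of reallocations, whereas a scheme that grows by a fixed increment would reallocate far more often.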
Nick Wellnhofer
346c3a930c Remove elfgcchack.h
The same optimization can be enabled with -fno-semantic-interposition
since GCC 5. clang has always used this option by default.
2022-02-20 21:49:04 +01:00
Nick Wellnhofer
13ad8736d2 Fix regression in xmlNodeDumpOutputInternal
Commit 85b1792e could cause additional whitespace if xmlNodeDump was
called with a non-zero starting level.
2021-05-25 11:16:13 +02:00
Nick Wellnhofer
85b1792e37 Work around lxml API abuse
Make xmlNodeDumpOutput and htmlNodeDumpFormatOutput work with corrupted
parent pointers. This used to work with the old recursive code but the
non-recursive rewrite required parent pointers to be set correctly.

Unfortunately, lxml relies on the old behavior and passes subtrees with
a corrupted structure. Fall back to a recursive function call if an
invalid parent pointer is detected.

Fixes #255.
2021-05-21 12:19:25 +02:00
Nick Wellnhofer
0b3c64d9f2 Handle dumps of corrupted documents more gracefully
Check parent pointers for NULL after the non-recursive rewrite of the
serialization code. This avoids segfaults with corrupted documents
which can apparently be seen with lxml, see issue #187.
2020-09-29 18:08:37 +02:00
Nick Wellnhofer
00a86d414b Don't add formatting newlines to XInclude nodes 2020-08-17 01:17:39 +02:00
Nick Wellnhofer
1a360c1c2e More *NodeDumpOutput fixes
When leaving nodes, restrict more operations to XML_ELEMENT_NODEs.
2020-07-29 00:39:15 +02:00
Nick Wellnhofer
7b2e517261 Fix *NodeDumpOutput functions
Only output end tag for elements. Should fix serialization of document
fragments.
2020-07-28 21:52:55 +02:00
Nick Wellnhofer
dc6f009280 Make xmlNodeDumpOutputInternal non-recursive
Fixes stack overflow with deeply nested documents.
2020-07-28 21:00:09 +02:00
Nick Wellnhofer
5330153da4 Make xhtmlNodeDumpOutput non-recursive
Fixes stack overflow with deeply nested documents.
2020-07-28 21:00:09 +02:00
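The two entries above replace recursion with iteration so that deeply nested documents cannot exhaust the call stack. A minimal sketch of that general technique follows, assuming a hypothetical `Node` type with `parent`, `children` and `next` pointers; it is not the actual xmlNodeDumpOutputInternal code. It also shows why the later entries in this log care about parent pointers: the upward walk depends on them being correct.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical minimal node; libxml2's xmlNode has many more fields. */
typedef struct Node {
    const char *name;
    struct Node *parent, *children, *next;
} Node;

/* Append child as the last child of parent and set its parent pointer. */
static void add_child(Node *parent, Node *child) {
    Node **slot = &parent->children;
    while (*slot != NULL)
        slot = &(*slot)->next;
    *slot = child;
    child->parent = parent;
}

/* Append s to the NUL-terminated string in out, truncating at cap. */
static void append(char *out, size_t cap, const char *s) {
    size_t len = strlen(out);
    size_t n = strlen(s);
    if (n > cap - 1 - len)
        n = cap - 1 - len;
    memcpy(out + len, s, n);
    out[len + n] = '\0';
}

/* Serialize the subtree at root into out without recursion. */
static void dump(const Node *root, char *out, size_t cap) {
    const Node *cur = root;
    assert(root != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    for (;;) {
        if (cur->children != NULL) {
            /* Entering an element: emit the start tag and descend. */
            append(out, cap, "<");
            append(out, cap, cur->name);
            append(out, cap, ">");
            cur = cur->children;
            continue;
        }
        /* Leaf: emit an empty-element tag. */
        append(out, cap, "<");
        append(out, cap, cur->name);
        append(out, cap, "/>");
        /* Leaving: move to the next sibling, or climb via parent
         * pointers, closing each element on the way up. */
        for (;;) {
            if (cur == root)
                return;
            if (cur->next != NULL) {
                cur = cur->next;
                break;
            }
            cur = cur->parent;
            append(out, cap, "</");
            append(out, cap, cur->name);
            append(out, cap, ">");
        }
    }
}
```

The stack depth here is constant regardless of nesting. The trade-off is that a wrong `parent` pointer sends the upward walk astray, which a recursive version would not notice.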
Nick Wellnhofer
20c60886e4 Fix typos
Resolves #133.
2020-03-08 17:41:53 +01:00
Nick Wellnhofer
c9faa29259 Fix overflow check in xmlNodeDump
Store return value of xmlBufNodeDump in a size_t before checking for
integer overflow.

Found by lgtm.com
2020-01-02 14:12:39 +01:00
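The fix described in the entry above follows a general pattern: keep a size_t result at full width until it has been range-checked, and only then narrow it to int. The sketch below illustrates that pattern under stated assumptions; `fake_dump` and `dump_checked` are hypothetical stand-ins, not the real xmlBufNodeDump or xmlNodeDump.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a routine that reports bytes written. */
static size_t fake_dump(size_t bytes) {
    return bytes;
}

/* Returns the byte count as an int, or -1 if it does not fit.
 *
 * The problematic pattern would be:
 *     int ret = fake_dump(bytes);   // narrowed before any check
 *     if (ret > INT_MAX) ...        // can never be true
 */
static int dump_checked(size_t bytes) {
    assert(sizeof(size_t) >= sizeof(int));
    size_t written = fake_dump(bytes);  /* keep the full width */
    if (written > (size_t) INT_MAX)
        return -1;                      /* would not fit in an int */
    return (int) written;               /* safe: range was checked */
}
```

With this shape, a result just above INT_MAX is reported as an error instead of being silently converted to an out-of-range int.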
Nick Wellnhofer
42942066e1 Fix memory leaks of encoding handlers in xmlsave.c
Fix leak of iconv/ICU encoding handler in xmlSaveToBuffer.

Fix leaks of iconv/ICU encoding handlers in xmlSaveTo* error paths.

Closes #127.
2019-11-11 14:04:57 +01:00
Jared Yanovich
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
Jan Pokorný
81958b6e94 Doc: do not mislead towards "infeasible" scenario wrt. xmlBufNodeDump
At least when merely public API is to be leveraged, one cannot use
xmlBufCreate function that would otherwise be a clear fit, and relying
on some invariants wrt. how some other struct fields will get
initialized along the construction/filling such parent struct and
(ab)using that instead does not appear clever, either.

Hence, instruct people what's the Right Thing for the moment, that is,
make them use xmlNodeDumpOutput instead (together with likewise public
xmlAllocOutputBuffer).

Going forward, it's questionable what to do with xmlBuf* family of
functions that are once public, since they, for any practical purpose,
cannot be used by the library clients (that's how I've run into this).

Signed-off-by: Jan Pokorný <jpokorny@redhat.com>
2019-08-25 13:23:49 +02:00
Nick Wellnhofer
96125557b6 Remove unused member doc in xmlSaveCtxt 2019-05-10 12:30:03 +02:00
Nick Wellnhofer
ee501f5449 Stop using doc->charset outside parser code
doc->charset does not specify the in-memory encoding which is always
UTF-8.
2018-10-13 16:47:01 +02:00
Nick Wellnhofer
cb5541c9f3 Fix libz and liblzma detection
If libz or liblzma are detected with pkg-config, AC_CHECK_HEADERS must
not be run because the correct CPPFLAGS aren't set. It is actually not
required to have separate checks for LIBXML_ZLIB_ENABLED and HAVE_ZLIB_H.
Only check for LIBXML_ZLIB_ENABLED and remove HAVE_ZLIB_H macro.

Fixes bug 764657, bug 787041.
2017-11-27 14:33:37 +01:00
Nick Wellnhofer
359e750482 Fix -Wmisleading-indentation warnings 2017-11-27 13:42:30 +01:00
Nick Wellnhofer
362b322934 Fix memory leak in xmlBufAttrSerializeTxtContent
The serializer sets doc->encoding to a temporary value and restores
the original value when it's done. This overwrites the encoding value
set in xmlBufAttrSerializeTxtContent, causing a memory leak.

Don't mess with doc->encoding if invalid UTF-8 is encountered.

Found with libFuzzer and ASan.
2017-06-07 19:58:20 +02:00
Daniel Veillard
c97750d11b Avoid an out of bound access when serializing malformed strings
For https://bugzilla.gnome.org/show_bug.cgi?id=766414

* xmlsave.c: xmlBufAttrSerializeTxtContent() if an attribute value
  is not UTF-8 be more careful when serializing it as we may do an
  out of bound access as a result.
2016-05-23 13:42:18 +08:00
Daniel Veillard
23922c536c When calling xmlNodeDump make sure we grow the buffer quickly
Make sure the underlying new buffer allocated uses a double-it scheme
for the time of the dump.
2013-02-11 12:01:05 +08:00
Daniel Veillard
f8e3db0445 Big space and tab cleanup
Remove all space before tabs and space and tabs at end of lines.
2012-09-11 13:26:36 +08:00
Daniel Veillard
3e62adbe39 Adding various checks on node type through the API
Specifically checking against namespace nodes before accessing node
pointers
2012-08-09 14:24:02 +08:00
Daniel Veillard
50cdab5552 New saving functions using xmlBuf and conversion
* save.h: new header providing new functions currently internal
          and xmlBuf counterparts of old xmlBuffer based ones
* xmlsave.c: convert functions to use xmlBuf as much as possible
2012-07-23 14:24:27 +08:00
Daniel Veillard
0795348aeb fix a pair of possible out of array char references
When serializing char references back to a character string
Reported by Abhishek Arya <inferno@chromium.org>
2012-01-22 17:42:35 +08:00
Adam Spragg
d2e62311cd Add xmlSaveOption XML_SAVE_WSNONSIG
non destructive indentation option using spaces within markup
constructs and hence not modifying content
* include/libxml/xmlsave.h: new option
* xmlsave.c: some refactoring and new code for the new option
* xmllint.c: adds --pretty option where option 2 uses the new formatting
2010-11-03 15:33:40 +01:00
Adam Spragg
8b877135a3 Force _xmlSaveCtxt.format to be 0 or 1
* xmlsave.c: force _xmlSaveCtxt.format to be 0 or 1 and check
  accordingly, this will allow other values of "format" to be used
  for other purposes.
2010-11-01 14:24:56 +01:00
Daniel Veillard
594e5dfb48 Chasing dead assignments reported by clang-scan
* SAX2.c dict.c error.c hash.c nanohttp.c parser.c python/libxml.c
  relaxng.c runtest.c tree.c valid.c xinclude.c xmlregexp.c xmlsave.c
  xmlschemas.c xpath.c xpointer.c: mostly removing unneded affectations,
  but this led to a few real bugs and some part not yet understood
  (relaxng/interleave)
2009-09-07 14:58:47 +02:00
Daniel Veillard
141ebfa028 Wrong block opening in htmlNodeDumpOutputInternal
* xmlsave.c: Jim Meyering ran clang on libxml2 and this is one of
  the errors found, a misplaced curly brace
2009-09-02 14:58:13 +02:00
Daniel Veillard
856d92818b new options to serialize as XML/HTML/XHTML and restore old entry point
* include/libxml/xmlsave.h xmlsave.c: new options to serialize
  as XML/HTML/XHTML and restore old entry point behaviours
Daniel

svn path=/trunk/; revision=3794
2008-09-25 14:31:40 +00:00
Daniel Veillard
da3fee406d Borland C fix from Moritz Both regenerate, workaround a problem for buffer
* trionan.c: Borland C fix from Moritz Both
* testapi.c: regenerate, workaround a problem for buffer testing
* xmlIO.c HTMLtree.c: new internal entry point to hide even better
  xmlAllocOutputBufferInternal
* tree.c: harden the code around buffer allocation schemes
* parser.c: restore the warning when namespace names are not absolute
  URIs
* runxmlconf.c: continue regression tests if we get the expected
  number of errors
* Makefile.am: run the python tests on make check
* xmlsave.c: handle the HTML documents and trees
* python/libxml.c: convert python serialization to the xmlSave APIs
  and avoid some horrible hacks
Daniel

svn path=/trunk/; revision=3790
2008-09-01 13:08:57 +00:00
Daniel Veillard
d0d2f090dc fix handling of empty CDATA nodes as reported and discussed around #514181
* xmlsave.c parser.c: fix handling of empty CDATA nodes as 
  reported and discussed around #514181 and associated patches
* test/emptycdata.xml result/emptycdata.xml* 
  result/noent/emptycdata.xml: added a specific test in the
  regression suite.
Daniel

svn path=/trunk/; revision=3701
2008-03-07 16:50:21 +00:00
Daniel Veillard
a76a81f638 fix to avoid a crash when dumping an attribute from an XHTML document,
* xmlsave.c: fix to avoid a crash when dumping an attribute from
  an XHTML document, patch contributed to fix #485298
Daniel

svn path=/trunk/; revision=3660
2007-10-10 08:28:18 +00:00
Daniel Veillard
3814a365d6 fixed problem reported on bug #460415 Daniel
* xmlsave.c: fixed problem reported on bug #460415
Daniel

svn path=/trunk/; revision=3646
2007-07-26 11:41:46 +00:00