I think this was required when some struct members like
xmlParserInputBuffer::buffer were changed from xmlBuffer to xmlBuf (20+
years ago).
Unfortunately, I missed the opportunity to align xmlBuffer with xmlBuf
before the ABI break.
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).
Fix formatting in a few corner cases that automatic conversion can't
handle.
Rearrange some DOC_DISABLE blocks.
Port most changes made to the xmlBuf code in f3807d76, except that
"size" still includes the terminating NULL byte.
Make xmlSetBufferAllocationScheme, xmlBufferAllocScheme and
xmlDefaultBufferSize no-ops.
Deprecate a few functions.
Always use what the old implementation called the "IO" allocation
scheme, allowing to move the content pointer past the initial
allocation. This is inexpensive and allows efficient shrinking.
Optimize xmlBufGrow, reusing shrunken memory as much as possible.
Simplify xmlBufAdd.
Make xmlBufBackToBuffer return an error on overflow.
Make "size" exclude the terminating NULL byte.
Always provide an initial size.
Reintroduce static buffers.
Remove xmlBufResize and several other functions.
This allows access to ctxt->wellFormed, ctxt->nsWellFormed and
ctxt->valid. It also detects several fatal non-parser errors which
really should be another error level.
We now follow a laissez-faire approach when handling malloc failures and
removed many checks whether the parser was stopped by such an error.
This means the parser input must not be truncated to avoid out-of-bounds
reads.
Short-lived regression.
Fix short-lived regression from previous commit.
It might be safer to make xmlBufSetInputBaseCur use the original buffer
even in case of errors.
Found by OSS-Fuzz.
Remove explicit integer casts as final operation
- in assignments
- when passing arguments
- when returning values
Remove casts
- to the same type
- from certain range-bound values
The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.
Document some explicit casts as truncating and add a few missing ones.
Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.
This is a follow-up to commit 6c283d83.
* buf.c:
(xmlBufGrowInternal):
- Call xmlBufMemoryError() when the buffer size would overflow.
- Account for NUL terminator byte when using XML_MAX_TEXT_LENGTH.
- Do not include NUL terminator byte when returning length.
(xmlBufAdd):
- Call xmlBufMemoryError() when the buffer size would overflow.
* tree.c:
(xmlBufferGrow):
- Call xmlTreeErrMemory() when the buffer size would overflow.
- Do not include NUL terminator byte when returning length.
(xmlBufferResize):
- Update error message in xmlTreeErrMemory() to be consistent
with other similar messages.
(xmlBufferAdd):
- Call xmlTreeErrMemory() when the buffer size would overflow.
(xmlBufferAddHead):
- Add overflow checks similar to those in xmlBufferAdd().
* buf.c:
(xmlBufAddLen):
- Change check for remaining space to account for the NUL
terminator. When adding a length exactly equal to the number
of unused bytes, a NUL terminator was not written.
(xmlBufResize):
- Set `buf->use` and NUL terminator when allocating a new
buffer.
* tree.c:
(xmlBufferResize):
- Set `buf->use` and NUL terminator when allocating a new
buffer.
(xmlBufferAddHead):
- Set NUL terminator before returning early when shifting
contents.
* buf.c:
(xmlBufAvail):
- Return the number of bytes available in the buffer, but do not
include a byte for the NUL terminator so that it is reserved.
* encoding.c:
(xmlCharEncFirstLineInput):
(xmlCharEncInput):
(xmlCharEncOutput):
* xmlIO.c:
(xmlOutputBufferWriteEscape):
- Remove code that subtracts 1 from the return value of
xmlBufAvail(). It was implemented inconsistently anyway.
In several places, the code handling string buffers didn't check for
integer overflow or used wrong types for buffer sizes. This could
result in out-of-bounds writes or other memory errors when working on
large, multi-gigabyte buffers.
Thanks to Felix Wilhelm for the report.