Fix many places where malloc failures aren't reported.
Make xmlErrMemory public. This is useful for custom external entity
loaders.
Introduce new API function xmlSwitchEncodingName.
Change the way how we store whether the the parser is stopped. This used
to be signaled by setting ctxt->instate to XML_PARSER_EOF which was
misdesigned and error-prone. Set ctxt->disableSAX to 2 instead and
introduce a macro PARSER_STOPPED. Also stop to remove parser inputs in
xmlHaltParser. This allows to remove many checks of ctxt->instate.
Introduce xmlErrParser to handle errors if a parser context is
available.
Don't create a copy of the whole input buffer. Read the data chunk by
chunk to save memory.
Historically, it was probably envisioned to read data from memory
without additional copying. This doesn't work reliably with the current
design of the XML parser which requires a terminating null byte at the
end of input buffers. This lead to xmlReadMemory interfaces, which
expect pointer and size arguments, being changed to make a
zero-terminated copy of the input buffer. Interfaces based on
xmlReadDoc, which actually expect a zero-terminated string and
would make zero-copy operation work, were then simplified to rely on
xmlReadMemoryi, resulting in an unnecessary copy.
To avoid copying (possibly gigabytes) of memory temporarily, we now
stream in-memory input just like content read from files in a
chunk-by-chunk fashion (using a somewhat outdated INPUT_CHUNK size of
250 bytes). As a side effect, we also avoid another copy of the whole
input when handling non-UTF-8 data which was made possible by some
earlier commits.
Interfaces expecting zero-terminated strings now make use of strnlen
which unfortunately isn't part of the standard C library and only
mandated since POSIX 2008.
In some cases, for example when using encoders, the read callback was
set to NULL, in other cases it was set to xmlInputReadCallbackNop.
xmlGROW only tested for xmlInputReadCallbackNop, resulting in errors
when parsing large encoded content from memory.
Always use a NULL callback for memory buffers to avoid ambiguities.
Fixes#262.
Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.