The following strings are never allocated from a dict:
- xmlParserCtxt.version
- xmlParserCtxt.encoding
- xmlParserCtxt.extSubURI
- xmlParserCtxt.extSubSystem
- xmlDoc.version
- xmlDoc.encoding
- xmlDoc.URL
- xmlDTD.ExternalID
- xmlDTD.SystemID
- xmlID.value
Also make the struct members point to non-const chars to avoid casts
when freeing.
Port most changes made to the xmlBuf code in f3807d76, except that
"size" still includes the terminating NULL byte.
Make xmlSetBufferAllocationScheme, xmlBufferAllocScheme and
xmlDefaultBufferSize no-ops.
Deprecate a few functions.
This option would allow for a smaller, but mostly useless minimal build.
But it complicates the symbol availability logic in an insane way and
requires specialized tools like our custom C parser in doc/apibuild.py.
See #717.
There are dozens of downstream projects that only include tree.h but use
declarations from parser.h. This broke after the recent cleanup of
circular dependencies.
Make tree.h include parser.h again. This is a hack but doesn't change
the include directory struture.
This commit only made it into the 2.12 branch but wasn't applied to
master, so the issue turned up in 2.13.0 again.
Should fix#734.
Fix many places where malloc failures aren't reported.
Make some API function return an error code. Changing the return type
from void to int is technically an ABI break but should be safe on most
platforms.
- xmlNodeSetContent
- xmlNodeSetContentLen
- xmlNodeAddContent
- xmlNodeAddContentLen
- xmlNodeSetBase
Introduce new API functions that return a separate error code if a
memory allocation fails.
- xmlNodeGetAttrValue
- xmlNodeGetBaseSafe
- xmlGetNsListSafe
Introduce private functions xmlTreeEnsureXMLDecl and xmlSplitQName4.
Don't report low-level errors to the global error handler.
Fix tree
Introduce xmlGetNsListSafe
Fix tree
Introduce a new API function xmlAddIDSafe that returns a separate error
code if a memory allocation fails.
Store a pointer to the ID struct in xmlAttr so attributes can be
freed without allocating memory. It's impossible to report malloc
failures in deallocation code.
Introduce XML_INPUT_HAS_ENCODING flag for xmlParserInput which is set
when xmlSwitchEncoding is called. The parser can use the flag to
reliably detect whether an encoding was already set via user override,
BOM or other auto-detection. In this case, the encoding declaration
won't be used to switch the encoding.
Before, an inscrutable mix of ctxt->charset, ctxt->input->encoding
and ctxt->input->buf->encoder was used.
Introduce private helper functions to switch encodings used by both the
XML and HTML parser:
- xmlDetectEncoding which skips over the BOM, allowing to remove the
BOM checks from other encoding functions.
- xmlSetDeclaredEncoding, replacing htmlCheckEncodingDirect, which warns
about encoding mismatches.
If users override the encoding, store the declared instead of the actual
encoding in xmlDoc. In this case, the actual encoding is known and the
raw value from the doc is more useful.
Also use the input flags to store the ISO-8859-1 fallback state.
Restrict the fallback to cases where no encoding was specified. (The
fallback is only useful in recovery mode and these days broken UTF-8 is
probably more likely than ISO-8859-1, so it might eventually be removed
completely.)
The 'charset' member of xmlParserCtxt is now unused. The 'encoding'
member of xmlParserInput is now unused.
The 'standalone' member of xmlParserInput is renamed to 'flags'.
A new parser state XML_PARSER_XML_DECL is added for the push parser.
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c was never
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
One of the operation on the reader could resolve entities
leading to the classic expansion issue. Make sure the
buffer used for xmlreader operation is bounded.
Introduce a new allocation type for the buffers for this effect.