Most string functions can assume valid UTF-8. In order to detect malloc
failures reliably, xmlUTF8Strsub should only return NULL if the start
index is out of bounds or a memory allocation failed.
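The intended contract can be sketched with a standalone function. This is not the libxml2 implementation: utf8_strsub and utf8_seq_len are hypothetical names, the input is assumed to be valid UTF-8, and a start index equal to the string length is treated as in bounds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: byte length of the UTF-8 sequence that starts
 * with lead byte c. Only meaningful for valid UTF-8. */
static size_t utf8_seq_len(unsigned char c) {
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    return 4;
}

/*
 * Hypothetical sketch: copy up to len characters starting at character
 * index start. Returns NULL only if start is out of bounds or malloc
 * fails, so any other NULL-free result can be trusted by the caller.
 */
char *utf8_strsub(const char *utf, size_t start, size_t len) {
    const char *p = utf;
    const char *end;
    char *ret;
    size_t i;

    assert(utf != NULL);

    /* Skip start characters; hitting the terminator means out of bounds. */
    for (i = 0; i < start; i++) {
        if (*p == '\0')
            return NULL;
        p += utf8_seq_len((unsigned char) *p);
    }

    /* Advance over at most len characters, stopping at the terminator. */
    end = p;
    for (i = 0; i < len && *end != '\0'; i++)
        end += utf8_seq_len((unsigned char) *end);

    ret = malloc((size_t) (end - p) + 1);
    if (ret == NULL)
        return NULL; /* the only other NULL case: allocation failure */
    memcpy(ret, p, (size_t) (end - p));
    ret[end - p] = '\0';
    return ret;
}
```

With this contract, an empty result (for example len == 0) is an empty string, not NULL, so a caller that passed an in-range start can treat NULL as a malloc failure.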
libxml2 has limited support for reading and writing compressed data
via zlib and liblzma, both of which used to be enabled by default.
This only works for files read from the file system and never worked
with memory buffers. My guess is that this feature is virtually unused.
In light of the recently discovered xz backdoor, it's a good time to
disable these features by default to reduce attack surface and prepare
for eventual removal.
If --with-legacy is passed to the Autotools build, compression will
be enabled by default as before.

Some users set href to NULL to unset a namespace without deleting it.
Also change the duplicate check in xmlNewNs which must agree with
xmlSearchNs.
Short-lived regression from f960c60d.

Commit 9e1c72da from 2001 introduced a bug where xmlAddPrevSibling and
xmlAddNextSibling would try to merge an inserted text node with only
one of its new siblings. Commit 4ccd3eb8 fixed this bug, but
unfortunately lxml and possibly other downstream code depend on text
nodes not being merged.
To avoid breaking downstream code while still having somewhat
consistent API behavior, it's probably best to make these functions
never coalesce text nodes.

Make xmlAddChild unlink the child before insertion. Originally, linked
children would most likely cause tree corruption. The first fix
disallowed linked nodes, but there are cases where insertion of such
nodes could succeed.
Don't abort if the node is already a child of parent. In this case,
the node will be moved to the end of the child list.
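The unlink-then-append behavior can be sketched on a toy tree. The Node type, node_unlink and node_add_child are hypothetical stand-ins, not libxml2 code, and the sketch leaves out text merging, document handling and the check that the child is not an ancestor of the parent.

```c
#include <assert.h>
#include <stddef.h>

/* Toy node type for illustration; far simpler than xmlNode. */
typedef struct Node {
    struct Node *parent, *children, *last, *prev, *next;
} Node;

/* Detach n from its parent and siblings, leaving its own subtree intact. */
static void node_unlink(Node *n) {
    if (n->parent != NULL) {
        if (n->parent->children == n)
            n->parent->children = n->next;
        if (n->parent->last == n)
            n->parent->last = n->prev;
        n->parent = NULL;
    }
    if (n->prev != NULL)
        n->prev->next = n->next;
    if (n->next != NULL)
        n->next->prev = n->prev;
    n->prev = n->next = NULL;
}

/* Append cur as the last child of parent, unlinking it first so that an
 * already linked node cannot corrupt the sibling lists. */
Node *node_add_child(Node *parent, Node *cur) {
    assert(parent != NULL && cur != NULL && parent != cur);

    node_unlink(cur);

    cur->parent = parent;
    cur->prev = parent->last;
    if (parent->last != NULL)
        parent->last->next = cur;
    else
        parent->children = cur;
    parent->last = cur;
    return cur;
}
```

Because the node is unlinked first, adding a node that is already a child of the same parent simply moves it to the end of the child list.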
xmlUnlinkNode also removes references to DTD nodes, which shouldn't be
done when moving nodes within a document. Introduce a new function
xmlUnlinkNodeInternal that only unlinks a node from the tree.
Remove references to DTD nodes in xmlNodeSetDoc. Note that moving
element and attribute declarations to another document will still leave
references in the source document.

Some exotic encodings like ISO646-FR don't support '#' characters, so
encoding a character reference can actually fail. Don't skip the
offending input in this case so the error will be reported on the next
call.

After the failed experiment with a static XML namespace, introduce
versions of xmlSearchNs that report malloc failures.
Optimize the no-document case by only adding the XML namespace
declaration if it wasn't found in an ancestor.
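One possible shape for such a lookup is a status return plus an out parameter, so that "not found" and "allocation failed" are distinguishable. The sketch below is an assumption about the design, not the libxml2 code: the Ns and Elem types and search_ns are hypothetical, there is no document, and the implicit xml declaration is attached to the start node only when no ancestor declares it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define XML_NS_HREF "http://www.w3.org/XML/1998/namespace"

/* Toy types for illustration only; not xmlNs or xmlNode. */
typedef struct Ns {
    const char *prefix;
    const char *href;
    struct Ns *next;
} Ns;

typedef struct Elem {
    struct Elem *parent;
    Ns *ns_def;
} Elem;

/*
 * Hypothetical lookup: returns 0 on success and -1 on malloc failure.
 * On success, *out is the matching declaration or NULL if none exists.
 */
int search_ns(Elem *node, const char *prefix, Ns **out) {
    Elem *cur;
    Ns *ns;

    assert(node != NULL && prefix != NULL && out != NULL);
    *out = NULL;

    /* Walk up the ancestors looking for a declaration of prefix. */
    for (cur = node; cur != NULL; cur = cur->parent) {
        for (ns = cur->ns_def; ns != NULL; ns = ns->next) {
            if (strcmp(ns->prefix, prefix) == 0) {
                *out = ns;
                return 0;
            }
        }
    }

    if (strcmp(prefix, "xml") != 0)
        return 0; /* not found, but not an error */

    /* The xml prefix is implicit: declare it only now that no ancestor
     * was found to carry it. This is the one place that allocates. */
    ns = malloc(sizeof(*ns));
    if (ns == NULL)
        return -1;
    ns->prefix = "xml";
    ns->href = XML_NS_HREF;
    ns->next = node->ns_def;
    node->ns_def = ns;
    *out = ns;
    return 0;
}
```

A second lookup of the xml prefix from the same node finds the declaration added by the first and allocates nothing.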