`nbchar` could overflow with larger than 2GB memory buffers which some
new APIs allow. This shouldn't affect memory safety.
Limit maximum amount of bytes passed to character callback to
XML_MAX_ITEMS (1e9).
Now that we make sure that PEs starting markup won't be popped
implicitly, it's enough to check that no new entities are on the stack
when checking boundaries.
To fully implement "VC: Standalone Document Declaration", we have to
check for normalization changes caused by non-CDATA attribute types
declared externally.
Fixes#119.
We currently only handle "Validity constraint: Proper Declaration/PE
Nesting", but we must detect "Well-formedness constraint: PE Between
Declarations" separately:
> The replacement text of a parameter entity reference in a DeclSep must
> match the production extSubsetDecl.
PEs in DeclSeps are PEs that start with a full markup declaration (or
another PE). These are handled in xmParse{Internal|External}Subset. We
set a flag on these PEs and don't close them implicitly in
xmlSkipBlankCharsPE. This will make unterminated declarations in such
PEs cause a parser error. The PEs are closed explicitly in
xmParse{Internal|External}Subset, the only location where they are
allowed to end.
Fix many params in internal functions (not really necessary but Doxygen
warns about that in XML mode).
Fix formatting in a few corner cases that automatic conversion can't
handle.
Rearrange some DOC_DISABLE blocks.
When parsing XML content with functions like xmlParseBalancedChunk or
xmlParseInNodeContext, make undeclared entities always a fatal error to
match 2.13 behavior.
This was deliberately changed in 4f329dc5, probably to make the tests
pass.
Should fix#895.
Unlike iconv or the internal converters, ICU consumes truncated multi-
byte sequences at the end of an input buffer. We currently check for a
non-empty raw input buffer to detect truncated sequences, so this fails
with ICU.
It might be possible to inspect the pivot buffer pointers, but it seems
cleaner to implement a `flush` flag for some encoding and I/O functions.
After flushing, we can check for U_TRUNCATED_CHAR_FOUND with ICU, or
detect remaining input with other converters.
Also fix detection of truncated sequences for HTML, XML content and
DTDs with iconv.
Support for RELAX NG used to be enabled together with XML Schema support
(--with-schemas). Now there's a separate option and a new feature macro
LIBXML_RELAXNG_ENABLED.