Unfortunately, it's long-standing behavior for libxml2 to print all
reported errors to stderr by default. This default behavior is now
partially disabled. If no error handler is set, only parser and
validation errors are passed to a generic error handler or printed to
stderr. Other errors are still available via xmlGetLastError and can be
captured with a structured error handler.
Introduce xmlCtxtSetErrorHandler allowing to set a structured error for
a parser context. There already was the "serror" SAX handler but this
always receives the parser context as argument.
Start to use xmlRaiseMemoryError.
Remove useless arguments from memory error functions. Rename
xmlErrMemory to xmlCtxtErrMemory.
Remove a few calls to xmlGenericError.
Remove support for runtime entity debugging.
xmlParserInputBufferCreateMem must make a copy of the buffer.
This fixes a regression from 2.11 which could cause reads from freed
memory depending on the use case.
Undeprecate xmlParserInputBufferCreateStatic which can avoid copying
the whole buffer.
Fix many places where malloc failures aren't reported.
Make xmlErrMemory public. This is useful for custom external entity
loaders.
Introduce new API function xmlSwitchEncodingName.
Change the way how we store whether the the parser is stopped. This used
to be signaled by setting ctxt->instate to XML_PARSER_EOF which was
misdesigned and error-prone. Set ctxt->disableSAX to 2 instead and
introduce a macro PARSER_STOPPED. Also stop to remove parser inputs in
xmlHaltParser. This allows to remove many checks of ctxt->instate.
Introduce xmlErrParser to handle errors if a parser context is
available.
This regressed in commit e0dd330b.
Also fixes a long-standing issue where namespaces from default
attributes weren't added if they match an existing namespace.
Fixes#643.
Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes.
Set the dictionary for newDoc in xmlParseBalancedChunkMemoryRecover.
This is a long-standing bug which was masked by
- xmlParseBalancedChunkMemoryRecover changing the document of the root
node. This is a really bad idea, resulting in a mismatch between
ctxt->myDoc and ctxt->node->doc.
- SAX2.c preferring ctxt->node->doc over ctxt->myDoc until commit
a31e1b06.
Fixes#641.
Partial revert of cb927e85 fixing CRLFs not incrementing the line
number.
This requires to rework xmlParseQNameHashed. The original implementation
prompted the change to xmlCurrentChar which really shouldn't modify the
'cur' pointer as side effect. But the NEXTL macro relies on this
behavior.
Ultimately, we should reintroduce the change to xmlCurrentChar and fix
the NEXTL macro. This will lead to single CRs incrementing the line
number as well which seems more consistent.
Fixes#628.
Entities which reference out-of-scope namespace have always been broken.
xmlParseBalancedChunkMemoryInternal tried to reuse the namespaces
currently in scope but these namespaces were ignored by the SAX handler.
Besides, there could be different namespaces in scope when expanding the
entity again. For example:
<!DOCTYPE doc [
<!ENTITY ent "<ns:elem/>">
]>
<doc>
<decl1 xmlns:ns="urn:ns1">
&ent;
</decl1>
<decl2 xmlns:ns="urn:ns2">
&ent;
</decl2>
</doc>
Add some comments outlining possible solutions to this problem.
For now, we stop copying namespaces to the temporary parser context
in xmlParseBalancedChunkMemoryInternal. This has never really worked
and the recent changes contained a partial fix which uncovered other
problems like a use-after-free with the XML Reader interface, found
by OSS-Fuzz.
Use a hash table to lookup namespaces by prefix. The hash table stores
an index into the namespace table. Auxiliary data for namespaces is
stored in a separate array along the main namespace table.
Use a hash table to verify attribute uniqueness. The hash table stores
an index into the attribute table.
Reuse hash value from the dictionary to avoid computing them twice.
See #346.