If we try to continue parsing after an error in the internal or external
subset, entity expansion accounting gets more complicated. Simply halt
the parser.
Found with libFuzzer.
As per https://peps.python.org/pep-0394/, the python binary can be one
of the following options:
- Python 2
- Python 3
- Missing entirely
All of the scripts in libxml2 use 'python', which may not exist.
As Python 2 reached EOL on 1 January 2020, it's safe to move the
scripts to use python3 explicitly.
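
A minimal sketch of the ambiguity described above, using only the standard
library. The helper name find_interpreters is hypothetical and is not part of
libxml2. It reports which of the candidate commands resolve on the current
PATH; on a system without an unversioned 'python', that entry is None.

```python
import shutil


def find_interpreters(names=("python", "python2", "python3")):
    """Map each command name to its resolved path, or None if it is absent."""
    # shutil.which() returns None when the command is not found on PATH.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_interpreters().items():
        print("{0}: {1}".format(name, path if path else "not found"))
```

A script that starts with '#!/usr/bin/env python3' avoids depending on
whichever of these the unversioned name happens to resolve to.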
The expected errors contain a relative path, but the messages from the
parser contain absolute paths. Because the tests did not actually fail
when an error occurred, the mismatch went unnoticed.
Instead of putting relative paths in the expected messages, use format()
to embed the correct absolute path.
Also use os.path.join() consistently when constructing paths to ensure
uniformly formatted paths.
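
The approach can be sketched roughly as follows. This is an illustration
only: the helper name expected_error, the directory and file names, and the
message template are assumptions, not the actual contents of reader2.py.

```python
import os


def expected_error(test_dir, filename, template):
    """Fill {0} in the template with the absolute path of the test file."""
    # os.path.join() keeps separators uniform; abspath() makes the result
    # match the absolute paths reported in parser messages.
    path = os.path.abspath(os.path.join(test_dir, filename))
    return template.format(path)


# Hypothetical template; the real expected messages live in the test script.
message = expected_error("test", "example.xml", "{0}:3: validity error\n")
```

Building every path through os.path.join() means the expected text and the
file actually opened by the test are derived from the same components.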
Added all test cases that have a non-empty error in result/valid/*.xml.err
Restructured to make it easier to extend with new test cases
Added coding cookie because there is non-ASCII in the error messages
* error.c relaxng.c include/libxml/xmlerror.h: switched Relax-NG
module to the new error reporting. Better default report, adds
the element associated if found, context and node are included
in the xmlError
* python/tests/reader2.py: the error messages changed.
* result/relaxng/*: error message changed too.
Daniel
* Makefile.am: more cleanup in make tests
* error.c valid.c parser.c include/libxml/xmlerror.h: more work
in the transition to the new error reporting strategy.
* python/tests/reader2.py result/VC/* result/valid/*:
few changes in the strings generated by the validation output
Daniel
* Makefile.am: cleanup, creating a new legacy.c module,
made sure make tests ran in reduced conditions
* SAX.c SAX2.c configure.in entities.c globals.c parser.c
parserInternals.c tree.c valid.c xlink.c xmlIO.c xmlcatalog.c
xmlmemory.c xpath.c xmlmemory.c include/libxml/xmlversion.h.in:
increased the modularization, allowing validation code and
legacy code to be configured out, added a configuration
option --with-minimum compiling only the mandatory code
which then shrinks to 200KB.
Daniel
* tree.c include/libxml/tree.h: added a new API to split a
QName without generating any memory allocation
* valid.c: fixed another problem with namespaces on element
in mixed content case
* python/tests/reader2.py: updated the testcase with
Bjorn Reese's fix to the reader for insignificant white space
* parser.c HTMLparser.c: cleanup.
Daniel
* build_glob.py global.data globals.c parser.c
include/libxml/globals.h: patch from Stéphane Bidoul for setting
up threads global defaults.
* doc/libxml2-api.xml: this extends the API with new functions
* python/tests/Makefile.am python/tests/reader2.py
python/tests/thread2.py: integrated the associated testcase and
fixed the error string used in reader2
Daniel
* globals.c libxml.h parser.c parserInternals.c tree.c xmllint.c
xmlreader.c include/libxml/parser.h: a lot of performance work
especially the speed of streaming through the reader and push
interface. Some thread related optimizations. Nearly doubled the
speed of parsing through the reader.
Daniel
* xmlreader.c: seriously changed the way data is pushed to
the underlying parser, going by blocks of 512 bytes instead of
trying to detect tag boundaries at that level. Changed the
way empty elements are detected and tagged.
* python/tests/reader.py python/tests/reader2.py
python/tests/reader3.py: small changes mostly due to context
reporting being different and DTD node being reported. Some
errors previously undetected are now caught and fixed.
* doc/xmlreader.html: flagged last section as TODO
Daniel
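
The fixed-size block feeding mentioned in the xmlreader.c entry above can be
illustrated with a small sketch. It is written in Python for brevity and is
not the C implementation; the names CHUNK_SIZE and iter_blocks are
hypothetical.

```python
CHUNK_SIZE = 512  # block size named in the entry above


def iter_blocks(data, size=CHUNK_SIZE):
    """Yield consecutive slices of at most `size` bytes from `data`."""
    # No attempt is made to align slices with tag boundaries; a push
    # parser is expected to cope with markup split across blocks.
    for start in range(0, len(data), size):
        yield data[start:start + size]


blocks = list(iter_blocks(b"<doc>" + b"x" * 1200 + b"</doc>"))
```

Each block would then be handed to the push parser in turn, which is what
removes the need for boundary detection at the reader level.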
* xmlreader.c python/tests/reader2.py: Fixing some more mess
with validation and recursive entities while using the
reader interface, it's getting a bit messy...
Daniel
* xmlreader.c python/tests/reader2.py: fixed another validity
checking problem in external parsed entities raised by Stéphane Bidoul
and added a specific regression test.
* python/tests/reader3.py: cleanup
Daniel
* xmlreader.c python/tests/reader2.py: fixed a problem with
validation within entities pointed out by Stéphane Bidoul, augmented
the tests to catch those.
Daniel
* xmlreader.c include/libxml/xmlreader.h doc/libxml2-api.xml:
extended the XmlTextReader API a bit, adding accessors for
the current doc and node, and an entity substitution mode for
the parser.
* python/libxml.py python/libxml2class.txt: related updates
* python/tests/Makefile.am python/tests/reader.py
python/tests/reader2.py python/tests/reader3.py: updated the old
tests a bit and added a new one to test the entities handling
Daniel
* python/generator.py python/libxml2class.txt
python/tests/reader.py python/tests/reader2.py: changed the
generator to provide casing for the XmlTextReader similar to
C# so that examples and documentation are more directly transposable.
Fixed the couple of tests in the suite.
Daniel
* valid.c xmlreader.c: final touch running DTD validation
on the XmlTextReader
* python/tests/Makefile.am python/tests/reader2.py: added a
specific run based on the examples from test/valid/*.xml
Daniel