Remove XML_PARSE_XINCLUDE. This is only honored by the XML Reader
interface which is now fuzzed in reader.c.
Don't validate in XInclude fuzzer. This doesn't increase coverage after
moving the Reader fuzzer.
Implement a custom mutator that takes a list of fixed-size chunks which
are mutated with a given probability. This makes sure that values like
parser options or failure position are mutated regularly even as the
fuzz data grows large. Values can also be adjusted temporarily to make
the fuzzer focus on failure injection, for example.
Thanks to David Kilzer for the idea.
Also serialize the result of push-parsing and compare whether pull and
push parser produce the same result (differential fuzzing).
We lose the ability to inject IO errors when serializing for now, but
this isn't too important.
Use variable chunk size for push parser.
Fixes#849.
XIncludes involve XPath processing which can still lead to timeouts when
fuzzing. This will probably take a while to fix. The rest of the XML
parsing code should hopefully run without timeouts now. OSS-Fuzz only
shows a single timeout test case, so separate the XInclude from the core
XML fuzzer.
Now that entity expansion issues should be fixed, we should get more
interesting timeout errors from OSS-Fuzz. Disable XInclude for now,
since it often timeouts in XPath computations. The XInclude tests should
be moved to a separate fuzz target.
Remove the libfuzzer max_len option which doesn't apply to other
fuzzing engines. Enforce the maximum length directly in the fuzz
targets. For the xml target, lower the maximum when expanding entities
to avoid timeout and OOM errors.
- XML fuzzer
Currently tests the pull parser, push parser and reader, as well as
serialization. Supports splitting fuzz data into multiple documents
for things like external DTDs or entities. The seed corpus is built
from parts of the test suite.
- Regexp fuzzer
Seed corpus was statically generated from test suite.
- URI fuzzer
Tests parsing and most other functions from uri.c.