Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.
xmlSchemaClearElemInfo can add new items to the "matcher" cache, so the
cache must be cleared after calling this function, not before. This
only seems to affect invalid XSDs.
Fixes #390.
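The ordering issue behind the matcher cache fix can be shown with a toy
model. This is a minimal sketch, not the libxml2 code: Ctx, Matcher,
clear_elem_info, clear_cache, reset_ctx and add_active are hypothetical
names. The only property taken from the description above is that
clearing the element info may push items into the cache.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the real libxml2 structures. */
typedef struct Matcher { struct Matcher *next; } Matcher;

typedef struct {
    Matcher *active;   /* matchers attached to the current element info */
    Matcher *cache;    /* recycled matchers kept for later reuse */
} Ctx;

/* Attach one freshly allocated matcher to the element info. */
static int add_active(Ctx *c) {
    Matcher *m = malloc(sizeof(*m));
    if (m == NULL)
        return -1;
    m->next = c->active;
    c->active = m;
    return 0;
}

/* Like the function described above, this may ADD items to the cache. */
static void clear_elem_info(Ctx *c) {
    while (c->active != NULL) {
        Matcher *m = c->active;
        c->active = m->next;
        m->next = c->cache;
        c->cache = m;
    }
}

/* Free everything currently held in the cache. */
static void clear_cache(Ctx *c) {
    while (c->cache != NULL) {
        Matcher *m = c->cache;
        c->cache = m->next;
        free(m);
    }
}

/* Correct order: recycle first, then empty the cache. Swapping the two
 * calls would leave the recycled matchers in the cache. */
static void reset_ctx(Ctx *c) {
    assert(c != NULL);
    clear_elem_info(c);
    clear_cache(c);
}
```

With the two calls in reset_ctx swapped, the cache would still hold the
recycled matchers after the reset.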
The function is only used once and its return value is only checked for
zero. Disable the function like its Max counterpart and add an
implementation for the special case.
Found by OSS-Fuzz.
key/unique/keyref schema attributes currently use quadratic loops
to check their various constraints (that keys are unique and that
keyrefs refer to existing keys). That becomes extremely slow if
there are many elements with keys. This happens in the wild with
e.g. the OVAL XML descriptions of security patches. You need the
openscap schemata, and then an example xml file:
% zypper in openscap-utils
% wget ftp://ftp.suse.com/pub/projects/security/oval/opensuse.leap.15.1.xml
% time xmllint --schema /usr/share/openscap/schemas/oval/5.5/oval-definitions-schema.xsd opensuse.leap.15.1.xml > /dev/null
opensuse.leap.15.1.xml validates
real 16m59,857s
user 16m55,787s
sys 0m1,060s
This patch makes libxml use a hash table to avoid the quadratic
behaviour. The existing hash table only accepts strings as keys, so
we're mostly reusing the canonical representation of key values to derive
such strings (with the caveat given in a comment). The alternative
would be to rework the hash table code to accept either numbers or free
functions as hash workers, but the code is fast enough as is.
With the patch applied, the same run gives:
% time LD_LIBRARY_PATH=./libxml2/.libs/ ./libxml2/.libs/xmllint --schema /usr/share/openscap/schemas/oval/5.5/oval-definitions-schema.xsd opensuse.leap.15.1.xml > /dev/null
opensuse.leap.15.1.xml validates
real 0m3,531s
user 0m3,427s
sys 0m0,103s
So, a ~300x speedup. This patch survives 'make check' and 'make tests'.
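The idea of replacing the quadratic scan with a string-keyed hash lookup
can be sketched as follows. This is an illustration under assumptions,
not the patch itself: KeySet, keyset_add and first_duplicate are
hypothetical names, the toy hash set is not libxml2's hash table, and the
keys are assumed to already be canonical string forms of the key values.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy string-keyed hash set; a stand-in, NOT libxml2's hash table. */
typedef struct Entry {
    const char *key;          /* borrowed pointer to a canonical key string */
    struct Entry *next;
} Entry;

typedef struct {
    Entry **buckets;
    size_t nbuckets;
} KeySet;

static size_t hash_str(const char *s) {
    size_t h = 5381;                       /* djb2 string hash */
    while (*s != '\0')
        h = h * 33 + (unsigned char) *s++;
    return h;
}

static int keyset_init(KeySet *set, size_t nbuckets) {
    assert(nbuckets > 0);
    set->buckets = calloc(nbuckets, sizeof(Entry *));
    set->nbuckets = nbuckets;
    return set->buckets != NULL ? 0 : -1;
}

static void keyset_free(KeySet *set) {
    for (size_t i = 0; i < set->nbuckets; i++) {
        Entry *e = set->buckets[i];
        while (e != NULL) {
            Entry *next = e->next;
            free(e);
            e = next;
        }
    }
    free(set->buckets);
    set->buckets = NULL;
    set->nbuckets = 0;
}

/* Returns 1 if the key was newly added, 0 if already present, -1 on OOM. */
static int keyset_add(KeySet *set, const char *key) {
    size_t idx = hash_str(key) % set->nbuckets;
    for (Entry *e = set->buckets[idx]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return 0;
    Entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->key = key;
    e->next = set->buckets[idx];
    set->buckets[idx] = e;
    return 1;
}

/* Uniqueness check in expected linear time. Returns the index of the
 * first repeated key, n if all keys are unique, or -1 on OOM. */
static long first_duplicate(const char **keys, size_t n) {
    KeySet set;
    long result = (long) n;
    if (keyset_init(&set, n > 0 ? 2 * n : 1) != 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        int rc = keyset_add(&set, keys[i]);
        if (rc <= 0) {
            result = (rc == 0) ? (long) i : -1;
            break;
        }
    }
    keyset_free(&set);
    return result;
}
```

Each key is hashed and compared only against its own bucket, so the
check no longer compares every key against every other key.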
When ctxt->schema is NULL, xmlSchemaSAXPlug->xmlSchemaPreRun
allocates a new schema for ctxt->schema and sets vctxt->xsiAssemble
to 1. Then xmlSchemaVStart->xmlSchemaPreRun resets
vctxt->xsiAssemble to 0, so the allocated schema can no longer be
freed.
Found with libFuzzer.
Signed-off-by: Zhipeng Xie <xiezhipeng1@huawei.com>
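The xsiAssemble leak above comes down to an ownership flag being
overwritten by a second initialization pass. A minimal sketch of that
pattern, assuming hypothetical names (VCtx, ownsSchema, pre_run,
post_run) and showing one possible way to avoid it, not necessarily what
the patch does:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the validation context; not libxml2's struct. */
typedef struct {
    int *schema;      /* placeholder for the schema object */
    int ownsSchema;   /* plays the role of the xsiAssemble flag */
} VCtx;

/* May run more than once per validation, like the pre-run step above. */
static int pre_run(VCtx *v) {
    assert(v != NULL);
    if (v->schema == NULL) {
        v->schema = calloc(1, sizeof(*v->schema));
        if (v->schema == NULL)
            return -1;
        v->ownsSchema = 1;
    }
    /* Deliberately no unconditional "v->ownsSchema = 0" here: resetting
     * the flag on a second call would orphan the schema allocated above. */
    return 0;
}

/* Frees the schema only if this context allocated it. */
static void post_run(VCtx *v) {
    assert(v != NULL);
    if (v->ownsSchema) {
        free(v->schema);
        v->schema = NULL;
        v->ownsSchema = 0;
    }
}
```

Calling pre_run twice leaves the flag set, so post_run still releases
the schema.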
When reusing an xmlSchemaValidCtxtPtr to validate multiple XML documents
against the same schema, there is a memory leak in xmlschemas.c in
xmlSchemaClearValidCtxt(). The vctxt->idcKeys and associated counters
are not cleaned up in xmlSchemaClearValidCtxt() as they are in
xmlSchemaFreeValidCtxt(). As a result, vctxt->idcKeys grows with each
xmlSchemaValidateDoc() call that uses the same context and that memory is
never freed. Similarly, vctxt->nbIdcKeys and vctxt->sizeIdcKeys
increment and are never reset.
Closes: #23
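The cleanup that a reusable context needs between documents can be
sketched like this. It is a toy model under assumptions: VCtx, add_key
and clear_ctx are hypothetical names, and the three fields only mirror
the roles of vctxt->idcKeys, vctxt->nbIdcKeys and vctxt->sizeIdcKeys
described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the validation context. */
typedef struct {
    int **keys;     /* plays the role of vctxt->idcKeys */
    int nbKeys;     /* plays the role of vctxt->nbIdcKeys */
    int sizeKeys;   /* plays the role of vctxt->sizeIdcKeys */
} VCtx;

/* Append one heap-allocated key, growing the array when it is full. */
static int add_key(VCtx *v, int value) {
    assert(v != NULL);
    if (v->nbKeys == v->sizeKeys) {
        int newSize = v->sizeKeys ? v->sizeKeys * 2 : 4;
        int **tmp = realloc(v->keys, (size_t) newSize * sizeof(*tmp));
        if (tmp == NULL)
            return -1;
        v->keys = tmp;
        v->sizeKeys = newSize;
    }
    int *key = malloc(sizeof(*key));
    if (key == NULL)
        return -1;
    *key = value;
    v->keys[v->nbKeys++] = key;
    return 0;
}

/* Called between documents: free every key and the array itself, and
 * reset both counters so the next run starts from an empty state. */
static void clear_ctx(VCtx *v) {
    assert(v != NULL);
    for (int i = 0; i < v->nbKeys; i++)
        free(v->keys[i]);
    free(v->keys);
    v->keys = NULL;
    v->nbKeys = 0;
    v->sizeKeys = 0;
}
```

If clear_ctx skipped the key array, every reuse of the context would
keep appending to it, which is the growth described above.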
Make sure that all parameters and return values of hash callback
functions exactly match the callback function type. This is required
to pass clang's Control Flow Integrity checks and to allow compilation
to asm.js with Emscripten.
Fixes bug 784861.
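A small sketch of what "exactly match the callback function type" means
in practice. The callback type ScanFn and all other names here are
hypothetical and only modelled on a hash scanner; they are not copied
from the libxml2 headers. Instead of casting a differently typed
function to the callback type, a wrapper with the exact signature
forwards to the typed helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: payload, user data, and the entry name. */
typedef void (*ScanFn)(void *payload, void *data, const char *name);

typedef struct {
    const char *name;
    void *payload;
} Item;

/* Calls fn once per item, always through the exact ScanFn type. */
static void scan_items(Item *items, size_t n, ScanFn fn, void *data) {
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        fn(items[i].payload, data, items[i].name);
}

/* Typed helper holding the real logic. Passing this directly with a
 * function pointer cast would call it through a mismatched type. */
static void add_value(const int *value, long *sum) {
    *sum += *value;
}

/* Wrapper whose parameters and return type match ScanFn exactly.
 * The name argument is accepted and ignored. */
static void add_value_cb(void *payload, void *data, const char *name) {
    (void) name;
    add_value((const int *) payload, (long *) data);
}
```

The casts move from the function pointer to the arguments inside the
wrapper, so the indirect call always goes through a function of the
declared type.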
First set of patches for zOS
- entities.c parser.c tree.c xmlschemas.c xmlschemastypes.c xpath.c xpointer.c:
ask for conversion of the code to ISO Latin 1 to avoid having the compiler
assume EBCDIC code points for characters.
- xmlmodule.c: make sure we have support for modules
- xmlIO.c: zOS path names are special, avoid some of the expectations from
Unix/Windows
This is used in a callback which will pass a name. The name is ignored,
but it's best to have the signature of the function match. Pointed out
by Claude Petit.
* xmlschemas.c: fix xmlSchemaAugmentImportedIDC() signature, no functional
change
- Suppress warnings in xmlmemory.c by casting to 'void *'.
- Remove unneeded cast in xmlschemas.c that caused a macro precedence
error.
- Add dummy fields to short structs in xmlschemas.c. This increases the
size of the structs, but I can't see a better solution without using
C11's _Alignof operator.
There are still a couple of cast-align warnings in encoding.c. These
are legitimate portability issues that can't be fixed without reworking
the conversion functions.
For https://bugzilla.gnome.org/show_bug.cgi?id=766834
vctxt->parserCtxt is always NULL in xmlSchemaSAXHandleStartElementNs,
so this function can't call xmlStringLenDecodeEntities to decode the
entities.
For https://bugzilla.gnome.org/show_bug.cgi?id=709171
This makes xmlSchemaSAXHandleStartElementNs pass attributes through
xmlStringDecodeEntities, similar to how xmlSchemaVDocWalk passes them
through xmlNodeListGetString.
For https://bugzilla.gnome.org/show_bug.cgi?id=734363
When using xml schema validation, structured error callbacks do not get
passed a valid column number in xmlError field "int2".
$ ./xmlsaxparse colbug5.xml colbug5.xsd
colbug5.xml:3:0: Element '{urn:colbug5}bx': This element is not
expected.
Expected is ( {urn:colbug5}b ).
The schema error is reported for line 3, column 0 (= N/A).
I'd like to have the column number of the error passed in the xmlError
structure. With this test case: line 3, column 9.
Recently I have run into the very same problem Tiberius Duluman did back on
Wed, 13 May 2009 15:56:55 +0300 ([xml] Bug in xmlSchemaValidateOneElement
function). I can now prove that his problem is a valid one. I checked
the latest available version of xmlschemas.c (2.9.0) and the problem is still
there!
I think I have found a solution to the problem which I'd like to check with you:
My quick solution to the problem is to replace line 27849 in xmlschemas.c
(v2.9.0) in function xmlSchemaVDocWalk
valRoot = xmlDocGetRootElement(vctxt->doc);
with this one:
valRoot = vctxt->validationRoot ? vctxt->validationRoot : xmlDocGetRootElement(vctxt->doc);
Currently I'm using version 2.7.8 on Windows and this change seems to solve
the problem.
Based on findings and an initial patch by Thomas Gamper
<icicle@cg.tuwien.ac.at>
There is no point doing a regexp validation of further
content if there actually is no further content because the
element is nilled.