On Fri, May 11, 2012 at 9:10 AM, Daniel Veillard <veillard@redhat.com> wrote:
> Hi Conrad,
>
> that's interesting ! I was initially afraid of a sudden explosion of
> memory allocations for building a tree since by default buffers tend to
> "waste" memory by using doubling allocations, but that's not the case.
> xmllint --noout doc/libxml2-api.xml
> when compiled with memory debug produce
>
> paphio:~/XML -> cat .memdump
> MEMORY ALLOCATED : 0, MAX was 12756699
>
> and without your patch 12755657, i.e. the increase is minimal.
Heh, I thought that too. Actually you're looking at the result with XML_ALLOC_EXACT! This
is because EXACT adds 10bytes "spare" on each alloc, and that interestingly wastes about the
same amount of space as XML_ALLOC_DOUBLEIT on this example (see below).
So it turns out that the default realloc() on my system actually handles this case really
well — and I guess that all the time in xmlRealloc() was actually in xmlStrlen, not the
underlying realloc() after all (sorry for misleading you). If you replace the realloc()
with a bad one (like valgrind's), then the performance degrades severely.
This patch implements a HYBRID allocator which has the behaviour you describe (it's
like EXACT to start with, though without the spare 10 bytes; and switches to DOUBLEIT
after 4kb) — that gets the memory back down to 12755657, with no noticeable impact on the
performance of the synthetic pathological example under valgrind.
In summary:
max_memory on ./xmllint --noout doc/libxml2-api.xml,
valgrind time on https://gist.github.com/2656940
max_memory valgrind time
before | 12755657 | 29:18.2
EXACT | 12756699 | 2:58.6 <-- this is the state after the first patch.
DOUBLEIT | 12756727 | 0:02.7
HYBRID | 12755754 | 0:02.7 <-- this is the state with both patches.
>
> There is also the cost of creating the buffers all the time.
> I need to read the code and check but I may be interested in an hybrid
> approach where we switch to buffer only when the text node starts to
> become too big (4k would remove nearly all usuall types of "document"
> usage, i.e. not blocks of data)
I tried to avoid too much buffer creation by introducing the xmlBufferDetach function,
which allows re-using one buffer to construct many strings. It's maybe a bit of a "hack"
in API terms though I thought the gains would be worth it.
Conrad
------8<------
To keep memory usage tight in normal conditions it's desirable to only
allocate as much space as is needed. Unfortunately this can lead to
problems when constructing a long string out of small chunks, because
every chunk you add will need to resize the buffer.
To fix this XML_ALLOC_HYBRID will switch (when the buffer is 4kb big)
from using exact allocations to doubling buffer size every time it is
full. This limits the number of buffer resizes to O(log n) (down from
O(n)), and thus greatly increases the performance of constructing very
large strings in this manner.
Hi Veillard and all,
Firstly, thanks for libxml: it's awesome!
I noticed recently that libxml was taking a surprisingly long time to perform some
operations (many minutes instead of milliseconds), and so I did some digging. It turns out
that the problem was caused by the realloc()ing done in xmlNodeAddContentLen() which can
be called many (many) times when assigning some content into a node.
For background, I'm dealing with XML that contains emails, these can have large
attachments (~6MB) which are base-64 encoded, line-wrapped at 78 chars, and each line ends
with . This means that xmlNodeAddContentLen() is being called about 200,000 times,
and so there are 200,000 reallocs of a 6MB string, which takes a while... (I put a synthetic
example of this at https://gist.github.com/2656940)
The attached patch works around that problem by using the existing buffer API to merge the
strings together before even creating the text node, this keeps the number of realloc()s
at a managable level.
I'd love feedback on the patch, and am happy to fix problems with it, or explore other
solutions if you think that this is barking up the wrong tree :).
Thanks,
Conrad
P.S. Should I create a bug for this too?
------8<------
Before this change xmlStringGetNodeList would perform a realloc() of the
entire new content for every XML entity in the assigned text in order to
merge together adjacent text nodes. This had the effect of making
xmlSetNodeContent O(n^2), which led to unexpectedly bad performance on
inputs that contained a large number of XML entities.
After this change the memory management is done by the buffer API,
avoiding the need to continually re-measure and realloc() the string.
For my test data (6MB of 80 character lines, each ending with )
this takes the time to xmlSetNodeContent from about 500 seconds to
around 50ms. I have not profiled smaller cases, though I tried to
minimize the performance impact of my change by avoiding unnecessary
string copying.
Signed-off-by: Conrad Irwin <conrad.irwin@gmail.com>
For https://bugzilla.gnome.org/show_bug.cgi?id=630682
The python tests were reporting errors, some of it was due to
a small change in case encoding, but the main one was about
htmlSetMetaEncoding(doc, NULL) being broken by not removing
the associated meta tag anymore
Usually 'xml' namespace for XML-1.0 declaration does not need
to be carried but Mike Hommey raised the problem that the SVG
XSD file fails to parse due to a mishandling.
- SAX2.c: failure to create a namespace should not be interpreted
as a memory allocation error
- tree.c: document better xmlNewNs behaviour, and fix it in the
case the 'xml' prefix is being used.
* SAX2.c dict.c error.c hash.c nanohttp.c parser.c python/libxml.c
relaxng.c runtest.c tree.c valid.c xinclude.c xmlregexp.c xmlsave.c
xmlschemas.c xpath.c xpointer.c: mostly removing unneded affectations,
but this led to a few real bugs and some part not yet understood
(relaxng/interleave)
* libxml2.syms: the symbols with history, going back to 2.4.30
* Makefile.am configure.in: linking flags detection and use
* parser.c tree.c valid.c xpointer.c: various cleanup of functions
which could be made static or simply discarded, not that many
* tree.c: add a missing check in xmlAddSibling, patch by Kris Breuker
* xmlIO.c: avoid xmlAllocOutputBuffer using XML_BUFFER_EXACT which
leads to performances problems especially on Windows.
daniel
svn path=/trunk/; revision=3820
* tree.c: set doc on last child tree in xmlAddChildList for
bug #546772. Fix problem adding an attribute via with xmlAddChild
reported by Kris Breuker.
svn path=/trunk/; revision=3806
* include/libxml/tree.h tree.c python/generator.py: adds
element traversal support
* valid.c: avoid a warning
* doc/*: regenerated
daniel
svn path=/trunk/; revision=3804
* SAX2.c parser.c: fix for CVE-2008-4226, a memory overflow
when building gigantic text nodes, and a bit of cleanup
to better handled out of memory problem in that code.
* tree.c: fix for CVE-2008-4225, lack of testing leads to
a busy loop test assuming one have enough core memory.
Daniel
svn path=/trunk/; revision=3803
* trionan.c: Borland C fix from Moritz Both
* testapi.c: regenerate, workaround a problem for buffer testing
* xmlIO.c HTMLtree.c: new internal entry point to hide even better
xmlAllocOutputBufferInternal
* tree.c: harden the code around buffer allocation schemes
* parser.c: restore the warning when namespace names are not absolute
URIs
* runxmlconf.c: continue regression tests if we get the expected
number of errors
* Makefile.am: run the python tests on make check
* xmlsave.c: handle the HTML documents and trees
* python/libxml.c: convert python serialization to the xmlSave APIs
and avoid some horrible hacks
Daniel
svn path=/trunk/; revision=3790
* include/libxml/tree.h tree.c: make a new kind of buffer where
shrinking and adding in head can avoid reallocation or full
buffer memmoves
* encoding.c xmlIO.c: use the new kind of buffers for output
buffers
Daniel
svn path=/trunk/; revision=3787
* threads.c: fix a small initialization problem raised by Ashwin
* testapi.c gentest.py: increase testing especially for document
with an internal subset, and entities
* tree.c: fix a deallocation issue when unlinking entities from
a document.
* valid.c: fix a missing entry point test not found previously.
* doc/*: regenerated the APIs, docs etc.
daniel
svn path=/trunk/; revision=3778
* xmlreader.c: applied patch from Aswin to fix tree skipping
* include/libxml/entities.h entities.c: fixed a comment and
added a new xmlNewEntity() entry point
* runtest.c: be less verbose
* tree.c: space and tabs cleanups
daniel
svn path=/trunk/; revision=3774
* runxmlconf.c: more progresses against the official regression tests
* runsuite.c: small cleanup for non-leak reports
* include/libxml/tree.h: parsing flags and other properties are
now added to the document node, this is generally useful and
allow to make Name and NmToken validations based on the parser
flags, more specifically the 5th edition of XML or not
* HTMLparser.c tree.c: small side effects for the previous changes
* parser.c SAX2.c valid.c: the bulk of teh changes are here,
the parser and validation behaviour can be affected, parsing
flags need to be copied, lot of changes. Also fixing various
validation problems in the regression tests.
Daniel
svn path=/trunk/; revision=3762
* tree.c: fix a bug introduced when fixing #438208 and reported by
Ashwin
* python/generator.py: fix an infinite loop bug
Daniel
svn path=/trunk/; revision=3733
* tree.c: fix some problems with the *EatName functions when
running out of memory raised by Eric Schrock , should fix#438208
Daniel
svn path=/trunk/; revision=3729
* encoding.c: Fixed typo in xmlCharEncFirstLine pointed out
by Mark Rowe (bug #440159)
* include/libxml/xmlversion.h.in: Added check for definition of
_POSIX_C_SOURCE to avoid warnings on Apple OS/X (patch from
Wendy Doyle and Mark Rowe, bug #346675)
* schematron.c, testapi.c, tree.c, xmlIO.c, xmlsave.c: minor
changes to fix compilation warnings - no change to logic.
svn path=/trunk/; revision=3618
* tree.c: applied documentation patches from Markus Keim
* xmlregexp.c: fixed one bug and added a couple of optimisations
while working on bug #362989
Daniel
* xmllint.c: added --html --memory to test htmlReadMemory to
test #321632
* HTMLparser.c: added various initialization calls which may help
#321632 but not conclusive
* testapi.c tree.c include/libxml/tree.h: fixed compilation with
--with-minimum --with-sax1 and --with-minimum --with-schemas
fixing #326442
Daniel
* uri.c include/libxml/uri.h: add a new function xmlPathToUri()
to provide a clean conversion when setting up a base
* SAX2.c tree.c: use said function when setting up doc->URL
or using the xmlSetBase function. Should fix#346261
Daniel
* tree.c: xmlTextConcat works with comments and PI nodes (bug #355962).
* parser.c: fix resulting tree corruption when using XML namespace
with existing doc in xmlParseBalancedChunkMemoryRecover.
* tree.c include/libxml/tree.h: Fixed a bug in
xmlDOMWrapAdoptNode(); the tree traversal stopped if the
very first given node had an attribute node :-( This was due
to a missed check in the traversal mechanism.
Expanded the xmlDOMWrapCtxt: it now holds the namespace map
used in xmlDOMWrapAdoptNode() and xmlDOMWrapCloneNode() for
reusal; so the map-items don't need to be created for every
cloning/adoption. Added a callback function to it for
retrieval of xmlNsPtr to be set on node->ns; this is needed
for my custom handling of ns-references in my DOM wrapper.
Substituted code which created the XML namespace decl on
the doc for a call to xmlTreeEnsureXMLDecl(). Removed
those nastly "warnigns" from the docs of the clone/adopt
functions; they work fine on my side.
* tree.c: Fixed xmlGetNodePath() to generate the node test "*"
for elements in the default namespace, rather than generating
an unprefixed named node test and loosing the namespace
information.
* runtest.c schematron.c testAutomata.c tree.c valid.c xinclude.c
xmlcatalog.c xmlreader.c xmlregexp.c xpath.c: end of first
pass on coverity reports.
Daniel
* tree.c: Simplified usage of the internal xmlNsMap. Added a
"strict" lookup for namespaces based on a prefix. Fixed a
namespace processing issue in the clone-node function, which
occured if a @ctxt argument was given.
* tree.c: Bundled lookup of attr-nodes and retrieving their
values into the functions xmlGetPropNodeInternal() and
xmlGetPropNodeValueInternal(). Changed relevant code
to use those functions.
* tree.c: Fix the add sibling functions when passing attributes.
Modify testing for ID in xmlSetProp.
No longer remove IDness when unlinking or replacing an attribute.
* tree.c: Added an initial version of xmlDOMWrapCloneNode() to
the API. It will be used to reflect DOM's Node.cloneNode and
Document.importNode methods.
The pros: 1) non-recursive, 2) optimized ns-lookup
(mostly pointer comparison), 3) user defined ns-lookup,
4) save ns-processing. The function is in an unfinished
and experimental state and should be only used to test it.
* tree.c: Enhanced xmlDOMWrapReconcileNamespaces() to remove
redundant ns-decls if the option XML_DOM_RECONNS_REMOVEREDUND
was given. Note that I haven't moved this option to the
header file yet; so just call this function with an @option
of 1 to test the behaviour.