This option passes tokenizer output directly to the SAX callbacks,
making it possible to test the tokenizer against the html5lib test
suite.
This will produce unbalanced calls to the startElement and endElement
callbacks, but it's the only way to support a SAX like interface for
HTML5. It can be used for filtering or rewriting HTML5, for example.
A HTML5 tree builder could then be implemented on top of the SAX
callbacks.
Provide a new set of functions to create xmlParserInputs. These can be
used for the document entity or from external entity loaders.
- Don't require xmlParserInputBuffer.
- All functions take a base URI.
- All functions take an encoding as string.
- xmlNewInputURL also takes a public ID.
- xmlNewInputMemory takes a size_t.
- Optimization hints for memory buffers.
Improve documentation.
Only call xmlInitParser before allocating a new parser context.
Call xmlCtxtUseOptions as early as possible.
Setting these deprecated globals hasn't had an effect for a long time.
Make them constants. This reduces the size of per-thread storage from
~700 to ~250 bytes.
For both XML and HTML, the document can provide an encoding
either in XMLDecl in XML, or as a meta element in HTML head.
This adds options to ignore those encodings if the encoding
is known in advace for example if the content had been converted
before being passed to the parser.
* parser.c include/libxml/parser.h: add XML_PARSE_IGNORE_ENC option
for XML parsing
* include/libxml/HTMLparser.h HTMLparser.c: adds the
HTML_PARSE_IGNORE_ENC for HTML parsing
* HTMLtree.c: fix the handling of saving when an unknown encoding is
defined in meta document header
* xmllint.c: add a --noenc option to activate the new parser options
- include/libxml/HTMLparser.h: defines the new HTML parser option
HTML_PARSE_NODEFDTD
- HTMLparser.c: if option is set don't add a default DTD
- xmllint.c: add the corresponding --nodefdtd option in xmllint
xmlParseInNodeContext notices that the enclosing document is
an HTML document, so invoke the HTML parser for that fragment, and
the HTML parser finding a "<p>hello world!</p>" document automatically
augment it with defaulted <html> and <body>. This defaulting should
be turned off in the HTML parser for this to work, but there is no
such HTML parser option. There is an htmlOmittedDefaultValue global
variable that you could use, but really we should not rely on global
variable for processing options anymore, best is to add an
HTML_PARSE_NOIMPLIED.
* include/libxml/HTMLparser.h: add the HTML_PARSE_NOIMPLIED parser flag
* HTMLparser.c: do add implied element if HTML_PARSE_NOIMPLIED is set
* parser.c: add HTML_PARSE_NOIMPLIED to options for xmlParseInNodeContext
on HTML documents
* HTMLparser.c parser.c SAX2.c debugXML.c tree.c valid.c xmlreader.c
xmllint.c include/libxml/HTMLparser.h include/libxml/parser.h:
added a parser XML_PARSE_COMPACT option to allocate small
text nodes (less than 8 bytes on 32bits, less than 16bytes on 64bits)
directly within the node, various changes to cope with this.
* result/XPath/tests/* result/XPath/xptr/* result/xmlid/*: this
slightly change the output
Daniel
* elfgcchack.h doc/elfgcchack.xsl libxml.h: hack based on Arjan van de
Ven suggestion to reduce ELF footprint and generated code. Based on
aliasing of libraries function to generate direct call instead of
indirect ones
* doc/libxml2-api.xml doc/Makefile.am doc/apibuild.py: added automatic
generation of elfgcchack.h based on the API description, extended
the API description to show the conditionals configuration flags
required for symbols.
* nanohttp.c parser.c xmlsave.c include/libxml/*.h: lot of cleanup
* doc/*: regenerated the docs.
Daniel
* include/libxml/*.h include/libxml/*.h.in: modified the file
header to add more informations, painful...
* genChRanges.py genUnicode.py: updated to generate said changes
in headers
* doc/apibuild.py: extract headers, add them to libxml2-api.xml
* *.html *.xsl *.xml: updated the stylesheets to flag geprecated
APIs modules. Updated the stylesheets, some cleanups, regenerated
* doc/html/*.html: regenerated added back book1 and libxml-lib.html
Daniel
* HTMLparser.c Makefile.am configure.in legacy.c parser.c
parserInternals.c testHTML.c xmllint.c include/libxml/HTMLparser.h
include/libxml/parser.h include/libxml/parserInternals.h
include/libxml/xmlversion.h.in: added a new configure
option --with-push, some cleanups, chased code size anomalies.
Now a library configured --with-minimum is around 150KB,
sounds good enough.
Daniel
* HTMLparser.c testHTML.c xmllint.c include/libxml/HTMLparser.h:
added the same htmlRead APIs than their XML counterparts
* include/libxml/parser.h: new parser options, not yet implemented,
added an options field to the context.
* tree.c: patch from Shaun McCance to fix bug #123238 when ]]>
is found within a cdata section.
* result/noent/cdata2 result/cdata2 result/cdata2.rdr
result/cdata2.sax test/cdata2: add one more cdata test
Daniel
* dict.c include/libxml/dict.h Makefile.am include/libxml/Makefile.am:
new dictionary module to keep a single instance of the names used
by the parser
* DOCBparser.c HTMLparser.c parser.c parserInternals.c valid.c:
switched all parsers to use the dictionary internally
* include/libxml/HTMLparser.h include/libxml/parser.h
include/libxml/parserInternals.h include/libxml/valid.h:
Some of the interfaces changed as a result to receive or return
"const xmlChar *" instead of "xmlChar *", this is either
insignificant from an user point of view or when the returning
value changed, those function are really parser internal methods
that no user code should really change
* doc/libxml2-api.xml doc/html/*: the API interface changed and
the docs were regenerated
Daniel
* encoding.c, threads.c, include/libxml/HTMLparser.h,
doc/libxml2-api.xml: Minor changes to comments, etc. for
improving documentation generation
* doc/Makefile.am: further adjustment to auto-generation of
win32/libxml2.def.src
* HTMLparser.c include/libxml/HTMLparser.h: applied HTML
improvements from Nick Kew, allowing to do more checking
to HTML elements and attributes.
Daniel
* HTMLparser.c win32/libxml2.def.src win32/dsp/libxml2.def.src
include/libxml/HTMLparser.h: fixing #79334 making htmlParseDocument
a public entry point.
* doc/*: rebuilt the API and docs
Daniel
* SAX.c: cleanup patch from Anthony Jones
* doc/Makefile.am: fix the headers to avoid in make scan
* parserInternals.c xpath.c include/libxml/*.h: cleanup of the
includes, * vs Ptr and general cleanup
* parsedecl.py: first version of a script to extract the
module interfaces, the goal will be to provide .decl or XML
specification of the interfaces to build wrappers.
Daniel
* include/libxml/parserInternals.h include/libxml/HTMLparser.h
xmlIO.c tree.c parserInternals.c entities.c encoding.c
HTMLparser.c: cleanup of global variables, marking some
const or private.
Daniel
* AUTHORS: added William and Bjorn
* include/libxml/*.h *.c README doc/*.html etc.: changed old email to
daniel@veillard.com hopefully I won't have to do this again
* doc/Makefile.am doc/html/*.html: cleanup makefile, checked that
docs can be rebuilt cleanly now
* include/libxml/xml*version.h*: removed include/libxml/xmlversion.h
from CVs it's generated, added include/libxml/xmlwin32version.h
also generated but which should change far less frequently.
* catalog.c nanoftp.c: made sure to include libxml.h not
libxml/xmlversion.h directly
* include/libxml/*.h: include xmlwin32version.h instead of xmlversion.h
when compiling on WIN32 and MSC
Daniel
of element and use it to avoid outputting formatting spaces at
the wrong place. Implemented the format parameter for HTML save.
- result/HTML/doc2.htm result/HTML/doc3.htm result/HTML/fp40.htm
result/HTML/script.html result/HTML/test2.html result/HTML/test3.html
result/HTML/wired.html: of course this impact the result of a
number of HTML tests
Daniel
Thu Feb 23 02:03:56 CET 2001 Tomasz Koczko <kloczek@pld.org.pl>
* *.c *.h libxml files: moved to libxml directory - this allow
simplify automake/autoconf. Now isn't neccessary hack on
am/ac level for make and remove libxml symlink (modified for this
also configure.in and main Makefile.am). Now automake abilities
are used in best way (like in many other projects with libraries).
* include/win32config.h: moved to libxml directory (now include
directory isn't neccessary).
* Makefile.am, examples/Makefile.am, libxml/Makefile.am:
added empty DEFS and in INCLUDES rest only -I$(top_builddir) -
this allow minimize parameters count passed to libtool script
(now compilation is also slyghtly more quiet).
* configure.in: simplifies libzdetestion - prepare separated
variables for keep libz name and path to libz header files isn't
realy neccessary (if someone have libz installed in non standard
prefix path to header files ald library can be passed as:
$ CFALGS="-I</libz.h/path>" LDFLAGS="-L</libz/path>" ./configure
* autogen.sh: check now for libxml/entities.h.
After above building libxml pass correctly and also pass
"make install DESTDIR=</install/prefix>" from tar ball generated by
"make dist". Seems ac/am reorganization is finished. This changes
not touches any other things on *.{c,h} files level.
- xpath.c result/XPath/tests/chaptersprefol: bugfixes on order and
on predicate
- HTMLparser.[ch] HTMLtree.c result/HTML/doc3.htm.err
result/HTML/doc3.htm.sax result/HTML/wired.html: sometimes one
really want to have tags closed on output even if we accept
unclosed ones on input
Daniel
- HTMLparser.[ch]: added a way to avoid adding automatically
omitted tags. htmlHandleOmittedElem() allows to change the
default handling.
- tree.[ch] xmllint.c: added xmlDocDumpFormatMemory() and
xmlDocDumpFormatMemoryEnc(), uses memory functions for output
of xmllint too when using --memory flag, added a memory test
suite at the Makefile level.
- xpathInternals.h xpath.[ch] xpointer.c: fixed problems
with namespace use when encountering QNames in XPath evalation,
added xmlns() scheme in XPointer.
- nanoftp.c : incorporated a fix
- parser.c xmlIO.c: fixed problems raised with encoding when using
the memory I/O
- parserInternals.c: closed bug 25934 reported by
torsten.landschoff@innominate.de
- TODO: updated
Daniel