mirror of https://gitlab.gnome.org/GNOME/libxml2.git synced 2025-07-30 22:43:14 +03:00

Finished reintegrating the docs and unifying the look, may still
need a couple of pointers but looks fine now. valid.html is now
merged in xmldtd.html. Overall cleanup, Daniel
Daniel Veillard
2001-10-25 10:53:28 +00:00
parent 594cf0b2f2
commit b8cfbd1268
29 changed files with 2966 additions and 1310 deletions


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>DOM Principles</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -1,96 +1,144 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Libxml Frequently Asked Questions</title>
<meta name="GENERATOR" content="amaya V5.0">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>FAQ</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Libxml Frequently Asked Questions</h1>
<p>Location: <a
href="http://xmlsoft.org/FAQ.html">http://xmlsoft.org/FAQ.html</a></p>
<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
<p>Mailing-list archive: <a
href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<p>Version: $Revision$</p>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>FAQ</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Content:</p>
<ul>
<li><a href="#Licence">Licence(s)</a></li>
<li><a href="#Installati">Installation</a></li>
<li><a href="#Compilatio">Compilation</a></li>
<li><a href="#Developer">Developer corner</a></li>
<li><a href="FAQ.html#Licence">Licence(s)</a></li>
<li><a href="FAQ.html#Installati">Installation</a></li>
<li><a href="FAQ.html#Compilatio">Compilation</a></li>
<li><a href="FAQ.html#Developer">Developer corner</a></li>
</ul>
<h2><a name="Licence">Licence</a>(s)</h2>
<h3>
<a name="Licence">Licence</a>(s)</h3>
<ol>
<li><em>Licensing Terms for libxml</em>
<li>
<em>Licensing Terms for libxml</em>
<p>libxml is released under 2 (compatible) licences:</p>
<ul>
<li>the <a href="http://www.gnu.org/copyleft/lgpl.html">LGPL</a>: GNU
Library General Public License</li>
<li>the <a
href="http://www.w3.org/Consortium/Legal/copyright-software-19980720.html">W3C
<li>the <a href="http://www.w3.org/Consortium/Legal/copyright-software-19980720.html">W3C
IPR</a>: very similar to the XWindow licence</li>
</ul>
</li>
<li><em>Can I embed libxml in a proprietary application ?</em>
<li>
<em>Can I embed libxml in a proprietary application ?</em>
<p>Yes. The W3C IPR allows you to also keep proprietary the changes you
made to libxml, but it would be graceful to provide back bugfixes and
improvements as patches for possible incorporation in the main
development tree</p>
</li>
</ol>
<h2><a name="Installati">Installation</a></h2>
<h3><a name="Installati">Installation</a></h3>
<ol>
<li>Unless you are forced to because your application links with a Gnome
library requiring it, <strong><span style="background-color: #FF0000">Do
Not Use libxml1</span></strong>, use libxml2</li>
<li><em>Where can I get libxml</em>
<li>
<em>Where can I get libxml</em>
?
<p>The original distribution comes from <a
href="ftp://rpmfind.net/pub/libxml/">rpmfind.net</a> or <a
href="ftp://ftp.gnome.org/pub/GNOME/stable/sources/libxml/">gnome.org</a></p>
<p>The original distribution comes from <a href="ftp://rpmfind.net/pub/libxml/">rpmfind.net</a> or <a href="ftp://ftp.gnome.org/pub/GNOME/stable/sources/libxml/">gnome.org</a>
</p>
<p>Most linux and Bsd distribution includes libxml, this is probably the
safer way for end-users</p>
<p>David Doolin provides precompiled Windows versions at <a
href="http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/ ">http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/</a></p>
<p>David Doolin provides precompiled Windows versions at <a href="http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/ ">http://www.ce.berkeley.edu/~doolin/code/libxmlwin32/</a>
</p>
</li>
<li><em>I see libxml and libxml2 releases, which one should I install ?</em>
<li>
<em>I see libxml and libxml2 releases, which one should I install ?</em>
<ul>
<li>If you are not concerned by any existing backward compatibility
with existing application, install libxml2 only</li>
<li>If you are not doing development, you can safely install both.
usually the packages <a
href="http://rpmfind.net/linux/RPM/libxml.html">libxml</a> and <a
href="http://rpmfind.net/linux/RPM/libxml2.html">libxml2</a> are
usually the packages <a href="http://rpmfind.net/linux/RPM/libxml.html">libxml</a> and <a href="http://rpmfind.net/linux/RPM/libxml2.html">libxml2</a> are
compatible (this is not the case for development packages)</li>
<li>If you are a developer and your system provides separate packaging
for shared libraries and the development components, it is possible
to install libxml and libxml2, and also <a
href="http://rpmfind.net/linux/RPM/libxml-devel.html">libxml-devel</a>
and <a
href="http://rpmfind.net/linux/RPM/libxml2-devel.html">libxml2-devel</a>
to install libxml and libxml2, and also <a href="http://rpmfind.net/linux/RPM/libxml-devel.html">libxml-devel</a>
and <a href="http://rpmfind.net/linux/RPM/libxml2-devel.html">libxml2-devel</a>
too for libxml2 &gt;= 2.3.0</li>
<li>If you are developing a new application, please develop against
libxml2(-devel)</li>
</ul>
</li>
<li><em>I can't install the libxml package it conflicts with libxml0</em>
<li>
<em>I can't install the libxml package it conflicts with libxml0</em>
<p>You probably have an old libxml0 package used to provide the shared
library for libxml.so.0, you can probably safely remove it. Anyway the
libxml packages provided on <a
href="ftp://rpmfind.net/pub/libxml/">rpmfind.net</a> provides
libxml packages provided on <a href="ftp://rpmfind.net/pub/libxml/">rpmfind.net</a> provides
libxml.so.0</p>
</li>
<li><em>I can't install the libxml(2) RPM package due to failed
<li>
<em>I can't install the libxml(2) RPM package due to failed
dependancies</em>
<p>The most generic solution is to refetch the latest src.rpm , and
rebuild it locally with</p>
@ -101,11 +149,11 @@ href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
applications with libxml(2)) that you can install locally.</p>
</li>
</ol>
<h2><a name="Compilatio">Compilation</a></h2>
<h3><a name="Compilatio">Compilation</a></h3>
<ol>
<li><em>What is the process to compile libxml ?</em>
<p>As most UNIX libraries libxml follows the "standard":</p>
<li>
<em>What is the process to compile libxml ?</em>
<p>As most UNIX libraries libxml follows the &quot;standard&quot;:</p>
<p><code>gunzip -c xxx.tar.gz | tar xvf -</code></p>
<p><code>cd libxml-xxxx</code></p>
<p><code>./configure --help</code></p>
@ -116,54 +164,58 @@ href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<p>At that point you may have to rerun ldconfig or similar utility to
update your list of installed shared libs.</p>
</li>
<li><em>What other libraries are needed to compile/install libxml ?</em>
<li>
<em>What other libraries are needed to compile/install libxml ?</em>
<p>Libxml does not requires any other library, the normal C ANSI API
should be sufficient (please report any violation to this rule you may
find).</p>
<p>However if found at configuration time libxml will detect and use the
following libs:</p>
<ul>
<li><a href="http://www.info-zip.org/pub/infozip/zlib/">libz</a>
<li>
<a href="http://www.info-zip.org/pub/infozip/zlib/">libz</a>
: a highly portable and available widely compression library</li>
<li>iconv: a powerful character encoding conversion library. It's
included by default on recent glibc libraries, so it doesn't need to
be installed specifically on linux. It seems it's now <a
href="http://www.opennc.org/onlinepubs/7908799/xsh/iconv.html">part
of the official UNIX</a> specification. Here is one <a
href="http://clisp.cons.org/~haible/packages-libiconv.html">implementation
of the library</a> which source can be found <a
href="ftp://ftp.ilog.fr/pub/Users/haible/gnu/">here</a>.</li>
be installed specifically on linux. It seems it's now <a href="http://www.opennc.org/onlinepubs/7908799/xsh/iconv.html">part
of the official UNIX</a> specification. Here is one <a href="http://clisp.cons.org/~haible/packages-libiconv.html">implementation
of the library</a> which source can be found <a href="ftp://ftp.ilog.fr/pub/Users/haible/gnu/">here</a>.</li>
</ul>
</li>
<li><em>libxml does not compile with HP-UX's optional ANSI-C compiler</em>
<p>this is due to macro limitations. Try to add " -Wp,-H16800 -Ae" to the
<li>
<em>libxml does not compile with HP-UX's optional ANSI-C compiler</em>
<p>this is due to macro limitations. Try to add &quot; -Wp,-H16800 -Ae&quot; to the
CFLAGS</p>
<p>you can also install and use gcc instead or use a precompiled version
of libxml, both available from the <a
href="http://hpux.cae.wisc.edu/hppd/auto/summary_all.html">HP-UX Porting
and Archive Centre</a></p>
of libxml, both available from the <a href="http://hpux.cae.wisc.edu/hppd/auto/summary_all.html">HP-UX Porting
and Archive Centre</a>
</p>
</li>
<li><em>make check fails on some platforms</em>
<li>
<em>make check fails on some platforms</em>
<p>Sometime the regression tests results don't completely match the value
produced by the parser, and the makefile uses diff to print the delta. On
some platforms the diff return breaks the compilation process, if the
diff is small this is probably not a serious problem</p>
</li>
<li><em>I use the CVS version and there is no configure script</em>
<li>
<em>I use the CVS version and there is no configure script</em>
<p>The configure (and other Makefiles) are generated. Use the autogen.sh
script to regenerate the configure and Makefiles, like:</p>
<p><code>./autogen.sh --prefix=/usr --disable-shared</code></p>
</li>
<li><em>I have troubles when running make tests with gcc-3.0</em>
<li>
<em>I have troubles when running make tests with gcc-3.0</em>
<p>It seems the initial release of gcc-3.0 has a problem with the
optimizer which miscompiles the URI module. Please use another
compiler</p>
</li>
</ol>
<h2><a name="Developer">Developer</a> corner</h2>
<h3>
<a name="Developer">Developer</a> corner</h3>
<ol>
<li><em>xmlDocDump() generates output on one line</em>
<li>
<em>xmlDocDump() generates output on one line</em>
<p>libxml will not <strong>invent</strong> spaces in the content of a
document since <strong>all spaces in the content of a document are
significant</strong>. If you build a tree from the API and want
@ -174,84 +226,85 @@ href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
content <strong>modifying the content of your document in the
process</strong>. The result may not be what you expect. There is
<strong>NO</strong> way to guarantee that such a modification won't
impact other part of the content of your document. See <a
href="http://xmlsoft.org/html/libxml-parser.html#XMLKEEPBLANKSDEFAULT">xmlKeepBlanksDefault
()</a> and <a
href="http://xmlsoft.org/html/libxml-tree.html#XMLSAVEFORMATFILE">xmlSaveFormatFile
()</a></li>
impact other part of the content of your document. See <a href="http://xmlsoft.org/html/libxml-parser.html#XMLKEEPBLANKSDEFAULT">xmlKeepBlanksDefault
()</a> and <a href="http://xmlsoft.org/html/libxml-tree.html#XMLSAVEFORMATFILE">xmlSaveFormatFile
()</a>
</li>
</ol>
</li>
<li>Extra nodes in the document:
<p><em>For a XML file as below:</em></p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;PLAN xmlns="http://www.argus.ca/autotest/1.0/"&gt;
&lt;NODE CommFlag="0"/&gt;
&lt;NODE CommFlag="1"/&gt;
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;PLAN xmlns=&quot;http://www.argus.ca/autotest/1.0/&quot;&gt;
&lt;NODE CommFlag=&quot;0&quot;/&gt;
&lt;NODE CommFlag=&quot;1&quot;/&gt;
&lt;/PLAN&gt;</pre>
<p><em>after parsing it with the function
pxmlDoc=xmlParseFile(...);</em></p>
<p><em>I want to the get the content of the first node (node with the
CommFlag="0")</em></p>
CommFlag=&quot;0&quot;)</em></p>
<p><em>so I did it as following;</em></p>
<pre>xmlNodePtr pode;
pnode=pxmlDoc-&gt;children-&gt;children;</pre>
<p><em>but it does not work. If I change it to</em></p>
<pre>pnode=pxmlDoc-&gt;children-&gt;children-&gt;next;</pre>
<p><em>then it works. Can someone explain it to me.</em></p>
<p></p>
<p>
<p>In XML all characters in the content of the document are significant
<strong>including blanks and formatting line breaks</strong>.</p>
<p>The extra nodes you are wondering about are just that, text nodes with
the formatting spaces wich are part of the document but that people tend
to forget. There is a function <a
href="http://xmlsoft.org/html/libxml-parser.html">xmlKeepBlanksDefault
to forget. There is a function <a href="http://xmlsoft.org/html/libxml-parser.html">xmlKeepBlanksDefault
()</a> to remove those at parse time, but that's an heuristic, and its
use should be limited to case where you are sure there is no
mixed-content in the document.</p>
</li>
<li><em>I get compilation errors of existing code like when accessing
<li>
<em>I get compilation errors of existing code like when accessing
<strong>root</strong> or <strong>childs fields</strong> of nodes</em>
<p>You are compiling code developed for libxml version 1 and using a
libxml2 development environment. Either switch back to libxml v1 devel or
even better fix the code to compile with libxml2 (or both) by <a
href="upgrade.html">following the instructions</a>.</p>
even better fix the code to compile with libxml2 (or both) by <a href="upgrade.html">following the instructions</a>.</p>
</li>
<li><em>I get compilation errors about non existing
<li>
<em>I get compilation errors about non existing
<strong>xmlRootNode</strong> or <strong>xmlChildrenNode</strong>
fields</em>
<p>The source code you are using has been <a
href="upgrade.html">upgraded</a> to be able to compile with both libxml
<p>The source code you are using has been <a href="upgrade.html">upgraded</a> to be able to compile with both libxml
and libxml2, but you need to install a more recent version:
libxml(-devel) &gt;= 1.8.8 or libxml2(-devel) &gt;= 2.1.0</p>
</li>
<li><em>XPath implementation looks seriously broken</em>
<li>
<em>XPath implementation looks seriously broken</em>
<p>XPath implementation prior to 2.3.0 was really incomplete, upgrade to
a recent version, the implementation and debug of libxslt generated fixes
for most obvious problems.</p>
</li>
<li><em>The example provided in the web page does not compile</em>
<li>
<em>The example provided in the web page does not compile</em>
<p>It's hard to maintain the documentation in sync with the code
&lt;grin/&gt; ...</p>
<p>Check the previous points 1/ and 2/ raised before, and send
patches.</p>
</li>
<li><em>Where can I get more examples and informations than in the web
<li>
<em>Where can I get more examples and informations than in the web
page</em>
<p>Ideally a libxml book would be nice. I have no such plan ... But you
can:</p>
<ul>
<li>check more deeply the <a href="html/libxml-lib.html">existing
generated doc</a></li>
generated doc</a>
</li>
<li>looks for examples of use for libxml function using the Gnome code
for example the following will query the full Gnome CVs base for the
use of the <strong>xmlAddChild()</strong> function:
<p><a
href="http://cvs.gnome.org/lxr/search?string=xmlAddChild">http://cvs.gnome.org/lxr/search?string=xmlAddChild</a></p>
<p><a href="http://cvs.gnome.org/lxr/search?string=xmlAddChild">http://cvs.gnome.org/lxr/search?string=xmlAddChild</a></p>
<p>This may be slow, a large hardware donation to the gnome project
could cure this :-)</p>
</li>
<li><a
href="http://cvs.gnome.org/bonsai/rview.cgi?cvsroot=/cvs/gnome&amp;dir=gnome-xml">Browse
<li>
<a href="http://cvs.gnome.org/bonsai/rview.cgi?cvsroot=/cvs/gnome&amp;dir=gnome-xml">Browse
the libxml source</a>
, I try to write code as clean and documented as possible, so
looking at it may be helpful</li>
@ -263,21 +316,20 @@ pnode=pxmlDoc-&gt;children-&gt;children;</pre>
C++.</p>
<p>There is however a C++ wrapper provided by Ari Johnson
&lt;ari@btigate.com&gt; which may fullfill your needs:</p>
<p>Website: <a
href="http://lusis.org/~ari/xml++/">http://lusis.org/~ari/xml++/</a></p>
<p>Download: <a
href="http://lusis.org/~ari/xml++/libxml++.tar.gz">http://lusis.org/~ari/xml++/libxml++.tar.gz</a></p>
<p>Website: <a href="http://lusis.org/~ari/xml++/">http://lusis.org/~ari/xml++/</a>
</p>
<p>Download: <a href="http://lusis.org/~ari/xml++/libxml++.tar.gz">http://lusis.org/~ari/xml++/libxml++.tar.gz</a>
</p>
</li>
<li>How to validate a document a posteriori ?
<p>It is possible to validate documents which had not been validated at
initial parsing time or documents who have been built from scratch using
the API. Use the <a
href="http://xmlsoft.org/html/libxml-valid.html#XMLVALIDATEDTD">xmlValidateDtd()</a>
the API. Use the <a href="http://xmlsoft.org/html/libxml-valid.html#XMLVALIDATEDTD">xmlValidateDtd()</a>
function. It is also possible to simply add a Dtd to an existing
document:</p>
<pre>xmlDocPtr doc; /* your existing document */
xmlDtdPtr dtd = xmlParseDTD(NULL, filename_of_dtd); /* parse the DTD */
dtd-&gt;name = xmlStrDup((xmlChar*)"root_name"); /* use the given root */
dtd-&gt;name = xmlStrDup((xmlChar*)&quot;root_name&quot;); /* use the given root */
doc-&gt;intSubset = dtd;
if (doc-&gt;children == NULL) xmlAddChild((xmlNodePtr)doc, (xmlNodePtr)dtd);
@ -286,9 +338,9 @@ pnode=pxmlDoc-&gt;children-&gt;children;</pre>
</li>
<li>etc ...</li>
</ol>
<p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id$</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>


@ -15,7 +15,7 @@ TARGET_DIR=$(HTML_DIR)/$(DOC_MODULE)/html
PAGES= architecture.html bugs.html contribs.html docs.html DOM.html \
downloads.html entities.html example.html help.html index.html \
interface.html intro.html library.html namespaces.html news.html \
tree.html valid.html XML.html XSLT.html
tree.html xmldtd.html XML.html XSLT.html
man_MANS = xmlcatalog.1


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>XML</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>XSLT</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -8,8 +8,9 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>An overview of libxml architecture</title>
<title>libxml architecture</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
@ -18,7 +19,7 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>An overview of libxml architecture</h2>
<h2>libxml architecture</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Reporting bugs and getting help</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -1,28 +1,78 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Libxml Catalog support</title>
<meta name="GENERATOR" content="amaya V5.0">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Catalog support</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Libxml Catalog support</h1>
<p>Location: <a
href="http://xmlsoft.org/catalog.html">http://xmlsoft.org/catalog.html</a></p>
<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
<p>Mailing-list archive: <a
href="http://mail.gnome.org/archives/xml/">http://mail.gnome.org/archives/xml/</a></p>
<p>Version: $Revision: 1.4 $</p>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Catalog support</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Contents:</p>
<ol>
<li><a href="#General">General overview</a></li>
<li><a href="#General2">General overview</a></li>
<li><a href="#definition">The definition</a></li>
<li><a href="#Simple">Using catalogs</a></li>
<li><a href="#Some">Some examples</a></li>
@ -33,31 +83,28 @@ href="http://mail.gnome.org/archives/xml/">http://mail.gnome.org/archives/xml/</
API</a></li>
<li><a href="#Other">Other resources</a></li>
</ol>
<h2><a name="General">General overview</a></h2>
<p>What is a catalog? Basically it's a lookup mechanism used when
an entity (a file or a remote resource) references another entity. The catalog
lookup is inserted between the moment the reference is recognized by the
software (XML parser, stylesheet processing, or even images referenced for
inclusion in a rendering) and the time where loading that resource is
actually started.</p>
<h3><a name="General2">General overview</a></h3>
<p>What is a catalog? Basically it's a lookup mechanism used when an entity
(a file or a remote resource) references another entity. The catalog lookup
is inserted between the moment the reference is recognized by the software
(XML parser, stylesheet processing, or even images referenced for inclusion
in a rendering) and the time where loading that resource is actually
started.</p>
<p>It is basically used for 3 things:</p>
<ul>
<li>mapping from "logical" names, the public identifiers, to a more
<li>mapping from &quot;logical&quot; names, the public identifiers, to a more
concrete name usable for download (a URI). For example it can associate
the logical name
<p>"-//OASIS//DTD DocBook XML V4.1.2//EN"</p>
<p>&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;</p>
<p>of the DocBook 4.1.2 XML DTD with the actual URL where it can be
downloaded</p>
<p>http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd</p>
</li>
<li>remapping from a given URL to another one, like an HTTP indirection
saying that
<p>"http://www.oasis-open.org/committes/tr.xsl"</p>
<p>&quot;http://www.oasis-open.org/committes/tr.xsl&quot;</p>
<p>should really be looked at</p>
<p>"http://www.oasis-open.org/committes/entity/stylesheets/base/tr.xsl"</p>
<p>&quot;http://www.oasis-open.org/committes/entity/stylesheets/base/tr.xsl&quot;</p>
</li>
<li>providing a local cache mechanism allowing to load the entities
associated to public identifiers or remote resources, this is a really
@ -65,73 +112,63 @@ actually started.</p>
allows avoiding the hazards and delays associated with fetching remote
resources.</li>
</ul>
<h2><a name="definition">The definitions</a></h2>
<h3><a name="definition">The definitions</a></h3>
<p>Libxml, as of 2.4.3, implements 2 kinds of catalogs:</p>
<ul>
<li>the older SGML catalogs, the official spec is SGML Open Technical
Resolution TR9401:1997, but is better understood by reading <a
href="http://www.jclark.com/sp/catalog.htm">the SP Catalog page</a> from
Resolution TR9401:1997, but is better understood by reading <a href="http://www.jclark.com/sp/catalog.htm">the SP Catalog page</a> from
James Clark. This is relatively old and not the preferred mode of
operation of libxml.</li>
<li><a href="http://www.oasis-open.org/committees/entity/spec.html">XML
<li>
<a href="http://www.oasis-open.org/committees/entity/spec.html">XML
Catalogs</a>
is far more flexible, more recent, uses an XML syntax and should scale
much better. This is the default option of libxml.</li>
</ul>
<p></p>
<h2><a name="Simple">Using catalog</a></h2>
<p>
<h3><a name="Simple">Using catalog</a></h3>
<p>In a normal environment libxml will by default check for the presence of a
catalog in /etc/xml/catalog, and assuming it has been correctly populated,
the processing is completely transparent to the document user. To take a
concrete example, suppose you are authoring a DocBook document that
starts with the following DOCTYPE definition:</p>
<pre>&lt;?xml version='1.0'?&gt;
&lt;!DOCTYPE book PUBLIC "-//Norman Walsh//DTD DocBk XML V3.1.4//EN"
"http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd"&gt;</pre>
&lt;!DOCTYPE book PUBLIC &quot;-//Norman Walsh//DTD DocBk XML V3.1.4//EN&quot;
&quot;http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd&quot;&gt;</pre>
<p>When validating the document with libxml, the catalog will be
automatically consulted to lookup the public identifier "-//Norman Walsh//DTD
DocBk XML V3.1.4//EN" and the system identifier
"http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd", and if these entities have
automatically consulted to lookup the public identifier &quot;-//Norman Walsh//DTD
DocBk XML V3.1.4//EN&quot; and the system identifier
&quot;http://nwalsh.com/docbook/xml/3.1.4/db3xml.dtd&quot;, and if these entities have
been installed on your system and the catalogs actually point to them, libxml
will fetch them from the local disk.</p>
<p style="font-size: 10pt"><strong>Note</strong>: Really don't use this
<p style="font-size: 10pt">
<strong>Note</strong>: Really don't use this
DOCTYPE in real documents: it is a really old version, but it is fine as an example.</p>
<p>Libxml will check the catalog each time that it is requested to load an
entity; this includes DTDs, external parsed entities, stylesheets, etc. If
your system is correctly configured, the whole authoring and processing phase
should use only local files, while your document stays portable because it
uses the canonical public and system IDs referencing the remote document.</p>
<h2><a name="Some">Some examples:</a></h2>
<h3><a name="Some">Some examples:</a></h3>
<p>Here are a couple of fragments from XML Catalogs used in libxml's early
regression tests in <code>test/catalogs</code>:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
"http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
&lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
<pre>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;!DOCTYPE catalog PUBLIC
&quot;-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN&quot;
&quot;http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd&quot;&gt;
&lt;catalog xmlns=&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;&gt;
&lt;public publicId=&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
uri=&quot;http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd&quot;/&gt;
...</pre>
<p>This is the beginning of a catalog for DocBook 4.1.2, XML Catalogs are
written in XML, there is a specific namespace for catalog elements
"urn:oasis:names:tc:entity:xmlns:xml:catalog". The first entry in this
&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;. The first entry in this
catalog is a <code>public</code> mapping, which associates a Public
Identifier with a URI.</p>
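<p>As a rough illustration of what a <code>public</code> entry does, here is a minimal Python sketch using only the standard library. It is not libxml's own implementation: the function name <code>resolve_public</code> and the exact-string comparison are assumptions made for the example, and a real resolver would also normalize identifiers and honour the other entry types.</p>

```python
import xml.etree.ElementTree as ET

# Namespace of XML Catalog elements, as quoted in the text.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Illustrative copy of the DocBook 4.1.2 catalog fragment.
CATALOG = """<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
          uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/>
</catalog>
"""


def resolve_public(catalog_text, public_id):
    """Return the uri of the first <public> entry matching public_id, else None."""
    root = ET.fromstring(catalog_text)
    for entry in root.iter("{%s}public" % CATALOG_NS):
        if entry.get("publicId") == public_id:
            return entry.get("uri")
    return None


print(resolve_public(CATALOG, "-//OASIS//DTD DocBook XML V4.1.2//EN"))
```

<p>An identifier with no matching entry yields <code>None</code> in this sketch, which corresponds to falling back to the normal loading of the resource.</p>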
<pre>...
&lt;rewriteSystem systemIdStartString="http://www.oasis-open.org/docbook/"
rewritePrefix="file:///usr/share/xml/docbook/"/&gt;
&lt;rewriteSystem systemIdStartString=&quot;http://www.oasis-open.org/docbook/&quot;
rewritePrefix=&quot;file:///usr/share/xml/docbook/&quot;/&gt;
...</pre>
<p>A <code>rewriteSystem</code> is a very powerful instruction: it says that
any URI starting with a given prefix should be looked up at another URI
constructed by replacing the prefix with a new one. In effect this acts like
@ -139,80 +176,72 @@ a cache system for a full area of the Web. In practice it is extremely useful
with a file prefix if you have installed a copy of those resources on your
local system.</p>
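<p>The prefix replacement can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not libxml code: <code>rewrite_system</code> is a hypothetical helper, rules are given as plain (start string, rewrite prefix) pairs, and picking the longest matching start string follows our reading of the XML Catalogs specification.</p>

```python
def rewrite_system(system_id, rules):
    """Rewrite system_id with the rule whose start string is the longest match.

    rules is a list of (start_string, rewrite_prefix) pairs.
    Returns None when no rule applies.
    """
    best = None
    for start, prefix in rules:
        if system_id.startswith(start):
            if best is None or len(start) > len(best[0]):
                best = (start, prefix)
    if best is None:
        return None
    # Replace the matched prefix, keep the rest of the identifier untouched.
    return best[1] + system_id[len(best[0]):]


# The rewriteSystem entry from the fragment in the text.
RULES = [
    ("http://www.oasis-open.org/docbook/", "file:///usr/share/xml/docbook/"),
]

print(rewrite_system(
    "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd", RULES))
```

<p>With the entry shown, the remote DocBook DTD URL maps to a path under <code>file:///usr/share/xml/docbook/</code>, and any URL outside that prefix is left unresolved by this rule.</p>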
<pre>...
&lt;delegatePublic publicIdStartString="-//OASIS//DTD XML Catalog //"
catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegatePublic publicIdStartString="-//OASIS//ENTITIES DocBook XML"
catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegatePublic publicIdStartString="-//OASIS//DTD DocBook XML"
catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegateSystem systemIdStartString="http://www.oasis-open.org/docbook/"
catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegateURI uriStartString="http://www.oasis-open.org/docbook/"
catalog="file:///usr/share/xml/docbook.xml"/&gt;
&lt;delegatePublic publicIdStartString=&quot;-//OASIS//DTD XML Catalog //&quot;
catalog=&quot;file:///usr/share/xml/docbook.xml&quot;/&gt;
&lt;delegatePublic publicIdStartString=&quot;-//OASIS//ENTITIES DocBook XML&quot;
catalog=&quot;file:///usr/share/xml/docbook.xml&quot;/&gt;
&lt;delegatePublic publicIdStartString=&quot;-//OASIS//DTD DocBook XML&quot;
catalog=&quot;file:///usr/share/xml/docbook.xml&quot;/&gt;
&lt;delegateSystem systemIdStartString=&quot;http://www.oasis-open.org/docbook/&quot;
catalog=&quot;file:///usr/share/xml/docbook.xml&quot;/&gt;
&lt;delegateURI uriStartString=&quot;http://www.oasis-open.org/docbook/&quot;
catalog=&quot;file:///usr/share/xml/docbook.xml&quot;/&gt;
...</pre>
<p>Delegation is the core feature which allows building a tree of catalogs,
easier to maintain than a single catalog, based on Public Identifier, System
Identifier or URI prefixes it instructs the catalog software to look up entries
in another resource. This feature allow to build hierarchies of catalogs, the
set of entries presented should be sufficient to redirect the resolution of
all DocBook references to the specific catalog in
Identifier or URI prefixes it instructs the catalog software to look up
entries in another resource. This feature allow to build hierarchies of
catalogs, the set of entries presented should be sufficient to redirect the
resolution of all DocBook references to the specific catalog in
<code>/usr/share/xml/docbook.xml</code> this one in turn could delegate all
references for DocBook 4.2.1 to a specific catalog installed at the same time
as the DocBook resources on the local machine.</p>
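<p>To make the delegation step more concrete, the Python sketch below selects which catalogs would be consulted for a given Public Identifier. It is a simplified model and not libxml's implementation: <code>delegated_catalogs</code> is a hypothetical helper, the last entry of the list is invented purely to show the ordering, and the "longest prefix first" order is taken from our reading of the XML Catalogs specification.</p>

```python
def delegated_catalogs(public_id, delegates):
    """Return the catalogs to consult for public_id, longest matching prefix first.

    delegates is a list of (public_id_start_string, catalog_uri) pairs.
    """
    matches = [(start, catalog) for start, catalog in delegates
               if public_id.startswith(start)]
    # Most specific (longest) prefix is tried first.
    matches.sort(key=lambda m: len(m[0]), reverse=True)
    ordered = []
    for _, catalog in matches:
        if catalog not in ordered:  # list each catalog only once
            ordered.append(catalog)
    return ordered


DELEGATES = [
    ("-//OASIS//DTD XML Catalog //", "file:///usr/share/xml/docbook.xml"),
    ("-//OASIS//ENTITIES DocBook XML", "file:///usr/share/xml/docbook.xml"),
    ("-//OASIS//DTD DocBook XML", "file:///usr/share/xml/docbook.xml"),
    # Hypothetical, more specific entry added only to show the ordering.
    ("-//OASIS//DTD DocBook XML V4.1.2", "file:///tmp/example/docbook-4.1.2.xml"),
]

print(delegated_catalogs("-//OASIS//DTD DocBook XML V4.1.2//EN", DELEGATES))
```

<p>Identifiers matching no start string return an empty list, meaning that no delegation takes place for them.</p>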
<h2><a name="reference">How to tune catalog usage:</a></h2>
<h3><a name="reference">How to tune catalog usage:</a></h3>
<p>The user can change the default catalog behaviour by redirecting queries
to their own set of catalogs. This can be done by setting the
<code>XML_CATALOG_FILES</code> environment variable to a list of catalogs; an
empty one should deactivate loading the default
<code>/etc/xml/catalog</code> default catalog.</p>
<p>@@More options are likely to be provided in the future@@</p>
<h2><a name="validate">How to debug catalog processing:</a></h2>
empty one should deactivate loading the default <code>/etc/xml/catalog</code>
default catalog</p>
<h3><a name="validate">How to debug catalog processing:</a></h3>
<p>Setting up the <code>XML_DEBUG_CATALOG</code> environment variable will
make libxml output debugging information for each catalog operation, for
example:</p>
<pre>orchis:~/XML -&gt; xmllint --memory --noout test/ent2
warning: failed to load external entity "title.xml"
warning: failed to load external entity &quot;title.xml&quot;
orchis:~/XML -&gt; export XML_DEBUG_CATALOG=
orchis:~/XML -&gt; xmllint --memory --noout test/ent2
Failed to parse catalog /etc/xml/catalog
Failed to parse catalog /etc/xml/catalog
warning: failed to load external entity "title.xml"
warning: failed to load external entity &quot;title.xml&quot;
Catalogs cleanup
orchis:~/XML -&gt; </pre>
<p>The test/ent2 file references an entity; running the parser from memory makes
the base URI unavailable and the "title.xml" entity cannot be loaded.
the base URI unavailable and the &quot;title.xml&quot; entity cannot be loaded.
Setting up the debug environment variable shows that an attempt is
made to load <code>/etc/xml/catalog</code>, but since it's not present the
resolution fails.</p>
<p>But the most advanced way to debug XML catalog processing is to use the
<strong>xmlcatalog</strong> command shipped with libxml2, which can load
catalogs and make resolution queries to see what is going on. This is also
used for the regression tests:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog test/catalogs/docbook.xml "-//OASIS//DTD DocBook XML V4.1.2//EN"
<pre>orchis:~/XML -&gt; ./xmlcatalog test/catalogs/docbook.xml \
&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
orchis:~/XML -&gt; </pre>
<p>For debugging what is going on, adding one -v flag increases the verbosity
level to indicate the processing done (adding a second flag also indicates
what elements are recognized at parsing):</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog -v test/catalogs/docbook.xml "-//OASIS//DTD DocBook XML V4.1.2//EN"
<pre>orchis:~/XML -&gt; ./xmlcatalog -v test/catalogs/docbook.xml \
&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
Parsing catalog test/catalogs/docbook.xml's content
Found public match -//OASIS//DTD DocBook XML V4.1.2//EN
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
Catalogs cleanup
orchis:~/XML -&gt; </pre>
<p>A shell interface is also available to debug and process multiple queries
(and for regression tests):</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog -shell test/catalogs/docbook.xml "-//OASIS//DTD DocBook XML V4.1.2//EN"
<pre>orchis:~/XML -&gt; ./xmlcatalog -shell test/catalogs/docbook.xml \
&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
&gt; help
Commands available:
public PublicID: make a PUBLIC identifier lookup
@ -224,75 +253,68 @@ dump: print the current catalog state
debug: increase the verbosity level
quiet: decrease the verbosity level
exit: quit the shell
&gt; public "-//OASIS//DTD DocBook XML V4.1.2//EN"
&gt; public &quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd
&gt; quit
orchis:~/XML -&gt; </pre>
<p>This should be sufficient for most debugging purposes; this was actually
used heavily to debug the XML Catalog implementation itself.</p>
<h2><a name="Declaring">How to create and maintain</a> catalogs:</h2>
<h3>
<a name="Declaring">How to create and maintain</a> catalogs:</h3>
<p>Basically XML Catalogs are XML files, so you can either use XML tools to
manage them or use <strong>xmlcatalog</strong> for this. The basic step is
to create a catalog; the -create option provides this facility:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --create tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
"http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;!DOCTYPE catalog PUBLIC &quot;-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN&quot;
&quot;http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd&quot;&gt;
&lt;catalog xmlns=&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;/&gt;
orchis:~/XML -&gt; </pre>
<p>By default xmlcatalog does not overwrite the original catalog and saves the
result on the standard output; this can be overridden using the -noout
option. The <code>-add</code> command adds entries to the
catalog:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --noout --create --add "public" "-//OASIS//DTD DocBook XML V4.1.2//EN" http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd tst.xml
<pre>orchis:~/XML -&gt; ./xmlcatalog --noout --create --add &quot;public&quot; \
&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot; \
http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd tst.xml
orchis:~/XML -&gt; cat tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
&lt;public publicId="-//OASIS//DTD DocBook XML V4.1.2//EN"
uri="http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd"/&gt;
&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;!DOCTYPE catalog PUBLIC &quot;-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN&quot;
&quot;http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd&quot;&gt;
&lt;catalog xmlns=&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;&gt;
&lt;public publicId=&quot;-//OASIS//DTD DocBook XML V4.1.2//EN&quot;
uri=&quot;http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd&quot;/&gt;
&lt;/catalog&gt;
orchis:~/XML -&gt; </pre>
<p>The <code>-add</code> option always takes 3 parameters, even though some of
the XML Catalog constructs (like nextCatalog) have only a single
argument; in that case just pass an empty third string, it will be ignored.</p>
<p>Similarly the <code>-del</code> option removes matching entries from the
catalog:</p>
<pre>orchis:~/XML -&gt; ./xmlcatalog --del "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" tst.xml
&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"/&gt;
<pre>orchis:~/XML -&gt; ./xmlcatalog --del \
&quot;http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd&quot; tst.xml
&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;!DOCTYPE catalog PUBLIC &quot;-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN&quot;
&quot;http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd&quot;&gt;
&lt;catalog xmlns=&quot;urn:oasis:names:tc:entity:xmlns:xml:catalog&quot;/&gt;
orchis:~/XML -&gt; </pre>
<p>The catalog is now empty. Note that the matching of <code>-del</code> is
exact and would have worked in a similar fashion with the Public ID
string.</p>
<p>This is rudimentary but should be sufficient to manage a not too complex
catalog tree of resources.</p>
<h2><a name="implemento">The implementor corner quick review of the
API:</a></h2>
<p>First, and like for every other module of libxml, there is an automatically
generated <a href="html/libxml-catalog.html">API page for catalog
support</a>.</p>
<h3><a name="implemento">The implementor corner quick review of the
API:</a></h3>
<p>First, and like for every other module of libxml, there is an
automatically generated <a href="html/libxml-catalog.html">API page for
catalog support</a>.</p>
<p>The header for the catalog interfaces should be included as:</p>
<pre>#include &lt;libxml/catalog.h&gt;</pre>
<p>The API is deliberately kept very simple. First it is not obvious that
applications really need access to it since it is the default behaviour of
libxml (Note: it is possible to completely override libxml default catalog by
using <a href="html/libxml-parser.html">xmlSetExternalEntityLoader</a> to
plug an application specific resolver).</p>
<p>Basically libxml supports 2 catalog lists:</p>
<ul>
<li>the default one, global and shared by the whole application</li>
@ -301,74 +323,53 @@ plug an application specific resolver).</p>
associated to the parser context and destroyed when the parsing context
is destroyed.</li>
</ul>
<p>The document one will be used first if it exists.</p>
<h3>Initialization routines:</h3>
<h4>Initialization routines:</h4>
<p>xmlInitializeCatalog(), xmlLoadCatalog() and xmlLoadCatalogs() should be
used at startup to initialize the catalog, if the catalog should be
initialized with specific values xmlLoadCatalog() or xmlLoadCatalogs()
should be called before xmlInitializeCatalog() which would otherwise do a
default initialization first.</p>
<p>The xmlCatalogAddLocal() call is used by the parser to grow the document's
own catalog list if needed.</p>
<h3>Preferences setup:</h3>
<h4>Preferences setup:</h4>
<p>The XML Catalog spec requires the possibility to select default
preferences between public and system delegation;
xmlCatalogSetDefaultPrefer() allows this. xmlCatalogSetDefaults() and
xmlCatalogGetDefaults() control whether XML Catalogs resolution should
be forbidden, allowed for global catalog, for document catalog or both; the
default is to allow both.</p>
<p>And of course xmlCatalogSetDebug() allows to generate debug messages
(through the xmlGenericError() mechanism).</p>
<h3>Querying routines:</h3>
<h4>Querying routines:</h4>
<p>xmlCatalogResolve(), xmlCatalogResolveSystem(), xmlCatalogResolvePublic()
and xmlCatalogResolveURI() are relatively explicit if you read the XML
Catalog specification: they correspond to the section 7 algorithms. They should
also work, with simplified semantics, if you have loaded an SGML catalog.</p>
<p>xmlCatalogLocalResolve() and xmlCatalogLocalResolveURI() are the same but
operate on the document catalog list.</p>
<h3>Cleanup and Miscellaneous:</h3>
<h4>Cleanup and Miscellaneous:</h4>
<p>xmlCatalogCleanup() frees up the global catalog, xmlCatalogFreeLocal() is
the per-document equivalent.</p>
<p>xmlCatalogAdd() and xmlCatalogRemove() are used to dynamically modify the
first catalog in the global list, and xmlCatalogDump() dumps a
catalog state. Those routines are primarily designed for xmlcatalog; I'm not
sure that exposing more complex interfaces (like navigation ones) would be
really useful.</p>
<p>The xmlParseCatalogFile() is a function used to load XML Catalog files,
it's similar to xmlParseFile() except it bypasses all catalog lookups, it's
provided because this functionality may be useful for client tools.</p>
<h3>threaded environments:</h3>
<h4>threaded environments:</h4>
<p>Since the catalog tree is built progressively, some care has been taken to
try to avoid troubles in multithreaded environments but without a
test-and-set routine accessible from C this can't be fully guaranteed, so the
best is to use xmlGetExternalEntityLoader and set the entity loader routines
to one of your code doing the synchronization.</p>
<p></p>
<h2><a name="Other">Other resources</a></h2>
try to avoid troubles in multithreaded environments. The code is now thread
safe assuming that the libxml library has been compiled with threads
support.</p>
<p>
<h3><a name="Other">Other resources</a></h3>
<p>The XML Catalog specification is relatively recent so there isn't much
literature to point at:</p>
<ul>
<li>You can find a good rant from Norm Walsh about <a
href="http://www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html">the
<li>You can find a good rant from Norm Walsh about <a href="http://www.arbortext.com/Think_Tank/XML_Resources/Issue_Three/issue_three.html">the
need for catalogs</a>, it provides a lot of context information even if
I don't agree with everything presented.</li>
<li>An <a href="http://home.ccil.org/~cowan/XML/XCatalog.html">old XML
@ -376,23 +377,21 @@ literature to point at:</p>
<li>The <a href="http://www.rddl.org/">Resource Directory Description
Language</a> (RDDL) another catalog system but more oriented toward
providing metadata for XML namespaces.</li>
<li>the page from the OASIS Technical <a
href="http://www.oasis-open.org/committees/entity/">Committee on Entity
<li>the page from the OASIS Technical <a href="http://www.oasis-open.org/committees/entity/">Committee on Entity
Resolution</a> which maintains XML Catalog, you will find pointers to the
specification update, some background and pointers to other tools
providing XML Catalog support</li>
<li>I have uploaded <a href="ftp://xmlsoft.org/test/dbk412catalog.tar.gz">a
small tarball</a> containing XML Catalogs for DocBook 4.1.2 which seems to
work fine for me</li>
<li>The <a href="http://www.xmlsoft.org/xmlcatalog_man.html">xmlcatalog manual page</a></li>
<li>The <a href="http://www.xmlsoft.org/xmlcatalog_man.html">xmlcatalog
manual page</a>
</li>
</ul>
<p>If you have suggestions for corrections or additions, simply contact
me:</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id: catalog.html,v 1.4 2001/08/24 12:14:55 veillard Exp $</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Contributions</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
@ -70,8 +71,12 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<ul>
<li>Bjorn Reese, William Brack and Thomas Broyer have provided a number of
patches, Gary Pennington worked on the validation API, threading support
and Solaris port.</li>
<li>John Fleck helps maintain the documentation and man pages.</li>
<li>
<a href="mailto:ari@lusis.org">Ari Johnson</a>
<p><a href="mailto:ari@lusis.org">Ari Johnson</a></p>
provides a C++ wrapper for libxml:
<p>Website: <a href="http://lusis.org/~ari/xml++/">http://lusis.org/~ari/xml++/</a>
</p>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Documentation</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Downloads</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -1,51 +1,101 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Libxml Internationalization support</title>
<meta name="GENERATOR" content="amaya V3.2">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Encodings support</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Libxml Internationalization support</h1>
<p>Location: <a
href="http://xmlsoft.org/encoding.html">http://xmlsoft.org/encoding.html</a></p>
<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
<p>Mailing-list archive: <a
href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<p>Version: $Revision$</p>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Encodings support</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Contents:</p>
<ol>
<li><a href="#What">What does internationalization support mean ?</a></li>
<li><a href="#internal">The internal encoding, how and why</a></li>
<li><a href="#implemente">How is it implemented ?</a></li>
<li><a href="#Default">Default supported encodings</a></li>
<li><a href="#extend">How to extend the existing support</a></li>
<li><a href="encoding.html#What">What does internationalization support
mean ?</a></li>
<li><a href="encoding.html#internal">The internal encoding, how and
why</a></li>
<li><a href="encoding.html#implemente">How is it implemented ?</a></li>
<li><a href="encoding.html#Default">Default supported encodings</a></li>
<li><a href="encoding.html#extend">How to extend the existing
support</a></li>
</ol>
<h2><a name="What">What does internationalization support mean ?</a></h2>
<h3><a name="What">What does internationalization support mean ?</a></h3>
<p>XML was designed from the start to support any character set
by using Unicode. Any conformant XML parser has to support the UTF-8 and
UTF-16 default encodings, which can both express the full Unicode range. UTF-8
is a variable-length encoding whose main strengths are that it reuses the same
encoding for ASCII and saves space for Western languages, but it is a bit
more complex to handle in practice. UTF-16 uses 2 bytes per character (and
sometimes combines two pairs), it makes implementation easier, but looks a bit
overkill for Western languages encoding. Moreover the XML specification allows
document to be encoded in other encodings at the condition that they are
clearly labelled as such. For example the following is a wellformed XML
sometimes combines two such units into a surrogate pair); it makes implementation easier, but looks a
bit like overkill for Western languages. Moreover, the XML specification
allows documents to be encoded in other encodings on the condition that they
are clearly labelled as such. For example, the following is a well-formed XML
document encoded in ISO-8859-1 and using the accented letters that we French
like, for both markup and content:</p>
<pre>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
<pre>&lt;?xml version=&quot;1.0&quot; encoding=&quot;ISO-8859-1&quot;?&gt;
&lt;très&gt;là&lt;/très&gt;</pre>
<p>Having internationalization support in libxml means the following:</p>
<ul>
<li>the document is properly parsed</li>
@ -55,35 +105,31 @@ likes for both markup and content:</p>
<li>it can also be saved in another encoding supported by libxml (for
example straight UTF8 or even an ASCII form)</li>
</ul>
<p>Another very important point is that the whole libxml API, with the
exception of a few routines to read with a specific encoding or save to a
specific encoding, is completely agnostic about the original encoding of the
document.</p>
<p>It should also be noted that the HTML parser embedded in libxml now obeys
the same rules; the following document will be (as of 2.2.2) handled in
an internationalized fashion by libxml too:</p>
<pre>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd"&gt;
&lt;html lang="fr"&gt;
<pre>&lt;!DOCTYPE HTML PUBLIC &quot;-//W3C//DTD HTML 4.0 Transitional//EN&quot;
&quot;http://www.w3.org/TR/REC-html40/loose.dtd&quot;&gt;
&lt;html lang=&quot;fr&quot;&gt;
&lt;head&gt;
&lt;META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"&gt;
&lt;META HTTP-EQUIV=&quot;Content-Type&quot; CONTENT=&quot;text/html; charset=ISO-8859-1&quot;&gt;
&lt;/head&gt;
&lt;body&gt;
&lt;p&gt;W3C crée des standards pour le Web.&lt;/body&gt;
&lt;/html&gt;</pre>
<h2><a name="internal">The internal encoding, how and why</a></h2>
<h3><a name="internal">The internal encoding, how and why</a></h3>
<p>One of the core decisions was to force all documents to be converted to a
default internal encoding, and that encoding to be UTF-8. Here is the
rationale for those choices:</p>
<ul>
<li>keeping the native encoding in the internal form would force the libxml
users (or the code associated) to be fully aware of the encoding of the
original document, for examples when adding a text node to a document, the
content would have to be provided in the document encoding, i.e. the
original document. For example, when adding a text node to a document,
the content would have to be provided in the document encoding, i.e. the
client code would have to check it beforehand, make sure it conforms
to the encoding, etc. This is very hard in practice, though in some specific
cases it may make sense.</li>
@ -111,23 +157,20 @@ rationale for those choices:</p>
<li>UTF-8 is being used as the de-facto internal encoding standard for
related code like the <a href="http://www.pango.org/">pango</a>
upcoming Gnome text widget, and a lot of Unix code (yep another place
where Unix programmer base takes a different approach from Microsoft -
they are using UTF-16)</li>
where Unix programmer base takes a different approach from Microsoft
- they are using UTF-16)</li>
</ul>
</li>
</ul>
<p>What does this mean in practice for the libxml user:</p>
<ul>
<li>xmlChar, the libxml data type is a byte, those bytes must be assembled
as UTF-8 valid strings. The proper way to terminate an xmlChar * string is
simply to append 0 byte, as usual.</li>
as UTF-8 valid strings. The proper way to terminate an xmlChar * string
is simply to append a 0 byte, as usual.</li>
<li>One just needs to make sure that when using chars outside the ASCII set,
the values have been properly converted to UTF-8</li>
</ul>
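<p>As an illustration of the last point, here is a minimal sketch of converting
an ISO-8859-1 string to UTF-8 before handing it to the library. The helper name
<code>latin1_to_utf8</code> is hypothetical and is not part of libxml; it only
assumes that every ISO-8859-1 byte maps to the Unicode code point of the same
value.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a libxml function): convert a 0-terminated
 * ISO-8859-1 string to UTF-8. Returns the number of bytes written,
 * excluding the terminating 0, or -1 if the output buffer is too small. */
int latin1_to_utf8(const unsigned char *in, unsigned char *out, size_t outlen)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    if (outlen == 0)
        return -1;
    for (; *in != 0; in++) {
        if (*in < 0x80) {
            /* ASCII range: the byte is identical in UTF-8 */
            if (o + 1 >= outlen)
                return -1;
            out[o++] = *in;
        } else {
            /* 0x80-0xFF: two bytes of the form 110xxxxx 10xxxxxx */
            if (o + 2 >= outlen)
                return -1;
            out[o++] = (unsigned char)(0xC0 | (*in >> 6));
            out[o++] = (unsigned char)(0x80 | (*in & 0x3F));
        }
    }
    out[o] = 0; /* terminate with a 0 byte, as for any xmlChar * string */
    return (int)o;
}
```

<p>The resulting buffer can then be cast to <code>xmlChar *</code>; the sketch
makes no attempt to handle encodings other than ISO-8859-1.</p>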
<h2><a name="implemente">How is it implemented ?</a></h2>
<h3><a name="implemente">How is it implemented ?</a></h3>
<p>Let's describe how all this works within libxml. Basically, the I18N
(internationalization) support gets triggered only during I/O operations, i.e.
when reading a document or saving one. Let's look first at the reading
@ -137,8 +180,8 @@ sequence:</p>
simple heuristic makes it possible to distinguish UTF-16 and UCS-4 from those
encodings where the ASCII range (0-0x7F) maps to ASCII</li>
<li>the xml declaration if available is parsed, including the encoding
declaration. At that point, if the autodetected encoding is different from
the one declared a call to xmlSwitchEncoding() is issued.</li>
declaration. At that point, if the autodetected encoding is different
from the one declared a call to xmlSwitchEncoding() is issued.</li>
<li>If there is no encoding declaration, then the input has to be in either
UTF-8 or UTF-16, if it is not then at some point when processing the
input, the converter/checker of UTF-8 form will raise an encoding error.
@ -158,20 +201,19 @@ err.xml:1: error: Bytes: 0xE8 0x73 0x3E 0x6C
will report an error and stop processing:
<pre>~/XML -&gt; ./xmllint err2.xml
err2.xml:1: error: Unsupported encoding UnsupportedEnc
&lt;?xml version="1.0" encoding="UnsupportedEnc"?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UnsupportedEnc&quot;?&gt;
^</pre>
</li>
<li>From that point the encoder processes the input progressively (it is
plugged as a front-end to the I/O module) for that entity. It captures and
convert on-the-fly the document to be parsed to UTF-8. The parser itself
just does UTF-8 checking of this input and process it transparently. The
only difference is that the encoding information has been added to the
parsing context (more precisely to the input corresponding to this
entity).</li>
<li>The result (when using DOM) is an internal form completely in UTF-8 with
just an encoding information on the document node.</li>
plugged as a front-end to the I/O module) for that entity. It captures
and converts on-the-fly the document to be parsed to UTF-8. The parser
itself just does UTF-8 checking of this input and processes it
transparently. The only difference is that the encoding information has
been added to the parsing context (more precisely to the input
corresponding to this entity).</li>
<li>The result (when using DOM) is an internal form completely in UTF-8
with just the encoding information on the document node.</li>
</ol>
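<p>The autodetection step from the first bytes of the input can be sketched as
follows. This is only an illustration in the spirit of the heuristic given in
Appendix F of the XML specification: the <code>enc_guess</code> type and the
<code>guess_encoding</code> function are hypothetical names, not the libxml
implementation, and only a few byte patterns are covered.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative result codes; libxml has its own enumeration. */
typedef enum {
    ENC_NONE,    /* nothing recognised, rely on the declaration or UTF-8 */
    ENC_UTF8,
    ENC_UTF16LE,
    ENC_UTF16BE,
    ENC_UCS4LE,
    ENC_UCS4BE
} enc_guess;

/* Hypothetical helper: guess the encoding family from the first bytes. */
enc_guess guess_encoding(const unsigned char *in, size_t len)
{
    assert(in != NULL);
    if (len >= 4) {
        /* '<' stored on four bytes: UCS-4 */
        if (in[0] == 0x00 && in[1] == 0x00 && in[2] == 0x00 && in[3] == 0x3C)
            return ENC_UCS4BE;
        if (in[0] == 0x3C && in[1] == 0x00 && in[2] == 0x00 && in[3] == 0x00)
            return ENC_UCS4LE;
        /* "<?" stored on two bytes each: UTF-16 without byte order mark */
        if (in[0] == 0x00 && in[1] == 0x3C && in[2] == 0x00 && in[3] == 0x3F)
            return ENC_UTF16BE;
        if (in[0] == 0x3C && in[1] == 0x00 && in[2] == 0x3F && in[3] == 0x00)
            return ENC_UTF16LE;
        /* "<?xm": ASCII-compatible, treat as UTF-8 until the
         * encoding declaration says otherwise */
        if (in[0] == 0x3C && in[1] == 0x3F && in[2] == 0x78 && in[3] == 0x6D)
            return ENC_UTF8;
    }
    /* UTF-8 byte order mark */
    if (len >= 3 && in[0] == 0xEF && in[1] == 0xBB && in[2] == 0xBF)
        return ENC_UTF8;
    /* UTF-16 byte order marks */
    if (len >= 2) {
        if (in[0] == 0xFE && in[1] == 0xFF)
            return ENC_UTF16BE;
        if (in[0] == 0xFF && in[1] == 0xFE)
            return ENC_UTF16LE;
    }
    return ENC_NONE;
}
```

<p>A real parser would then compare this guess with the encoding declaration,
which is the point where the text above mentions the call to
xmlSwitchEncoding().</p>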
<p>OK, then what happens when saving the document (assuming you
collected/built an xmlDoc DOM-like structure)? It depends on the function
called: xmlSaveFile() will just try to save in the original encoding, while
@ -198,34 +240,30 @@ encoding:</p>
point libxml will decode the offending character, remove it from the
buffer and replace it with the associated charRef encoding &amp;#123; and
resume the conversion. This guarantees that any document will be saved
without losses (except for markup names where this is not legal, this is a
problem in the current version, in pactice avoid using non-ascci
characters for tags or attributes names @@). A special "ascii" encoding
without losses (except for markup names where this is not legal; this is
a problem in the current version, in practice avoid using non-ASCII
characters for tag or attribute names @@). A special &quot;ascii&quot; encoding
name is used to save documents to a pure ASCII form and can be used when
portability is really crucial</li>
</ol>
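<p>The fallback to character references described above can be sketched as
follows for the pure ASCII case. The function name
<code>utf8_to_ascii_charrefs</code> is hypothetical and the code is not taken
from libxml; it assumes the input is text content (not markup names) and does
not reject overlong UTF-8 sequences.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not a libxml function): copy a 0-terminated UTF-8
 * string to out, replacing every non-ASCII character with a decimal
 * character reference. Returns the number of bytes written, excluding
 * the terminating 0, or -1 on malformed input or a too small buffer. */
int utf8_to_ascii_charrefs(const unsigned char *in, char *out, size_t outlen)
{
    size_t o = 0;

    assert(in != NULL && out != NULL);
    while (*in != 0) {
        unsigned long cp;
        int extra, n;

        /* Decode the lead byte to find the sequence length */
        if (*in < 0x80) {
            cp = *in; extra = 0;
        } else if ((*in & 0xE0) == 0xC0) {
            cp = *in & 0x1F; extra = 1;
        } else if ((*in & 0xF0) == 0xE0) {
            cp = *in & 0x0F; extra = 2;
        } else if ((*in & 0xF8) == 0xF0) {
            cp = *in & 0x07; extra = 3;
        } else {
            return -1;
        }
        in++;
        /* Accumulate continuation bytes; a premature 0 byte fails here */
        while (extra-- > 0) {
            if ((*in & 0xC0) != 0x80)
                return -1;
            cp = (cp << 6) | (*in & 0x3F);
            in++;
        }
        if (o >= outlen)
            return -1;
        if (cp < 0x80)
            n = snprintf(out + o, outlen - o, "%c", (int)cp);
        else
            n = snprintf(out + o, outlen - o, "&#%lu;", cp);
        if (n < 0 || (size_t)n >= outlen - o)
            return -1; /* output was truncated */
        o += (size_t)n;
    }
    if (o >= outlen)
        return -1;
    out[o] = 0;
    return (int)o;
}
```

<p>Because every character reference is plain ASCII, the output of such a
routine survives any ASCII-compatible transport, which is the portability
argument made above.</p>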
<p>Here are a few examples based on the same test document:</p>
<pre>~/XML -&gt; ./xmllint isolat1
&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;ISO-8859-1&quot;?&gt;
&lt;très&gt;là&lt;/très&gt;
~/XML -&gt; ./xmllint --encode UTF-8 isolat1
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;très&gt;là&lt;/très&gt;
~/XML -&gt; </pre>
<p>The same processing is applied (and reuses most of the code) for HTML I18N
processing. Looking up and modifying the content encoding is a bit more
difficult since it is located in a &lt;meta&gt; tag under the &lt;head&gt;, so
a couple of functions htmlGetMetaEncoding() and htmlSetMetaEncoding() have
difficult since it is located in a &lt;meta&gt; tag under the &lt;head&gt;,
so a couple of functions htmlGetMetaEncoding() and htmlSetMetaEncoding() have
been provided. The parser also attempts to switch encoding on the fly when
detecting such a tag on input. Except for that the processing is the same (and
again reuses the same code).</p>
<h2><a name="Default">Default supported encodings</a></h2>
<p>libxml has a set of default converters for the following encodings (located
in encoding.c):</p>
detecting such a tag on input. Except for that the processing is the same
(and again reuses the same code).</p>
<h3><a name="Default">Default supported encodings</a></h3>
<p>libxml has a set of default converters for the following encodings
(located in encoding.c):</p>
<ol>
<li>UTF-8 is supported by default (null handlers)</li>
<li>UTF-16, both little and big endian</li>
@ -234,30 +272,25 @@ in encoding.c):</p>
<li>HTML, a specific handler for the conversion of UTF-8 to ASCII with HTML
predefined entities like &amp;copy; for the Copyright sign.</li>
</ol>
<p>Moreover, when compiled on a Unix platform with iconv support, the full set
of encodings supported by iconv can be instantly be used by libxml. On a linux
machine with glibc-2.1 the list of supported encodings and aliases fill 3 full
pages, and include UCS-4, the full set of ISO-Latin encodings, and the various
Japanese ones.</p>
<h3>Encoding aliases</h3>
<p>From 2.2.3, libxml has support to register encoding names aliases. The goal
is to be able to parse document whose encoding is supported but where the name
differs (for example from the default set of names accepted by iconv). The
following functions allow to register and handle new aliases for existing
encodings. Once registered libxml will automatically lookup the aliases when
handling a document:</p>
of encodings supported by iconv can instantly be used by libxml. On a
Linux machine with glibc-2.1 the list of supported encodings and aliases fills
3 full pages, and includes UCS-4, the full set of ISO-Latin encodings, and the
various Japanese ones.</p>
<h4>Encoding aliases</h4>
<p>From 2.2.3, libxml supports registering encoding name aliases. The
goal is to be able to parse documents whose encoding is supported but where
the name differs (for example from the default set of names accepted by
iconv). The following functions allow registering and handling new aliases for
existing encodings. Once registered, libxml will automatically look up the
aliases when handling a document:</p>
<ul>
<li>int xmlAddEncodingAlias(const char *name, const char *alias);</li>
<li>int xmlDelEncodingAlias(const char *alias);</li>
<li>const char * xmlGetEncodingAlias(const char *alias);</li>
<li>void xmlCleanupEncodingAliases(void);</li>
</ul>
<h2><a name="extend">How to extend the existing support</a></h2>
<h3><a name="extend">How to extend the existing support</a></h3>
<p>Well, adding support for a new encoding, or overriding one of the encoders
(assuming it is buggy), should not be hard: just write input and output
conversion routines to/from UTF-8, and register them using
@ -266,23 +299,21 @@ called automatically if the parser(s) encounter such an encoding name
(register it uppercase, this will help). The encoders, their arguments and
expected return values are described in the encoding.h
header.</p>
<p>A quick note on the topic of subverting the parser to use a different
internal encoding than UTF-8, in some case people will absolutely want to keep
the internal encoding different, I think it's still possible (but the encoding
must be compliant with ASCII on the same subrange) though I didn't tried it.
The key is to override the default conversion routines (by registering null
encoders/decoders for your charsets), and bypass the UTF-8 checking of the
parser by setting the parser context charset (ctxt-&gt;charset) to something
different than XML_CHAR_ENCODING_UTF8, but there is no guarantee taht this
will work. You may also have some troubles saving back.</p>
internal encoding than UTF-8: in some cases people will absolutely want to
keep the internal encoding different. I think it's still possible (but the
encoding must be compatible with ASCII on the same subrange), though I haven't
tried it. The key is to override the default conversion routines (by
registering null encoders/decoders for your charsets), and bypass the UTF-8
checking of the parser by setting the parser context charset
(ctxt-&gt;charset) to something different than XML_CHAR_ENCODING_UTF8, but
there is no guarantee that this will work. You may also have some trouble
saving back.</p>
<p>Basically, proper I18N support is important; it requires at least
libxml-2.0.0, but a lot of features and corrections are really available only
starting with 2.2.</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id$</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Entities or no entities</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>


@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>A real example</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>How to help</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>The XML C library for Gnome</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
@ -84,21 +85,13 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation</a></li>
<li><a href="xmldtd.html">Validation</a></li>
<li><a href="#Principles">DOM principles</a></li>
<li><a href="#real">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
</ul>
<p>Separate documents:</p>
<ul>
<li><a href="upgrade.html">upgrade instructions for migrating to
libxml2</a></li>
<li><a href="encoding.html">libxml Internationalization support</a></li>
<li><a href="xmlio.html">libxml Input/Output interfaces</a></li>
<li><a href="xmlmem.html">libxml Memory interfaces</a></li>
<li><a href="catalog.html">libxml Catalog support</a></li>
<li><a href="xmldtd.html">a short introduction about DTDs and
libxml</a></li>
<li><a href="http://xmlsoft.org/XSLT/">the libxslt page</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">the gdome2 page: a
standard DOM interface for libxml2</a></li>

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>The SAX interface</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Introduction</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -8,8 +8,9 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>The XML library interfaces</title>
<title>The parser interfaces</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
@ -18,7 +19,7 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>The XML library interfaces</h2>
<h2>The parser interfaces</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Namespaces</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>News</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -1,9 +1,104 @@
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="html" version="4.01" encoding="ISO-8859-1"/>
<!--
- returns the filename associated to an ID in the original file
-->
<xsl:template name="filename">
<xsl:param name="name" select="string(@href)"/>
<xsl:choose>
<xsl:when test="$name = '#Introducti'">
<xsl:text>intro.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Documentat'">
<xsl:text>docs.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Reporting'">
<xsl:text>bugs.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#help'">
<xsl:text>help.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Help'">
<xsl:text>help.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Downloads'">
<xsl:text>downloads.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#News'">
<xsl:text>news.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Contributi'">
<xsl:text>contribs.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#xsltproc'">
<xsl:text>xsltproc2.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#API'">
<xsl:text>API.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#XSLT'">
<xsl:text>XSLT.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#XML'">
<xsl:text>XML.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Validation'">
<xsl:text>xmldtd.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#tree'">
<xsl:text>tree.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#library'">
<xsl:text>library.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#interface'">
<xsl:text>interface.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Example'">
<xsl:text>example.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Entities'">
<xsl:text>entities.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#architecture'">
<xsl:text>architecture.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Namespaces'">
<xsl:text>namespaces.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#DOM'">
<xsl:text>DOM.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Catalog'">
<xsl:text>catalog.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Upgrading'">
<xsl:text>upgrade.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Encodings'">
<xsl:text>encoding.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#IO'">
<xsl:text>xmlio.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Memory'">
<xsl:text>xmlmem.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#FAQ'">
<xsl:text>FAQ.html</xsl:text>
</xsl:when>
<xsl:when test="$name = ''">
<xsl:text>unknown.html</xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$name"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!--
- The global title
-->
<xsl:output method="html" version="4.01" encoding="ISO-8859-1"/>
<xsl:variable name="globaltitle" select="string(/html/body/h1[1])"/>
<!--
- The table of content
@ -11,7 +106,6 @@
<xsl:variable name="toc">
<ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<xsl:for-each select="/html/body/h2">
<xsl:variable name="filename">
<xsl:call-template name="filename">
@ -27,11 +121,6 @@
</xsl:element>
</li>
</xsl:for-each>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li><a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a></li>
</ul>
</xsl:variable>
@ -98,6 +187,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
<xsl:text disable-output-escaping="yes">--&gt;</xsl:text></style>
</xsl:template>
<!--
@ -135,83 +225,6 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
</tr>
</table>
</xsl:template>
<!--
- returns the filename associated to an ID in the original file
-->
<xsl:template name="filename">
<xsl:param name="name" select="string(@href)"/>
<xsl:choose>
<xsl:when test="$name = '#Introducti'">
<xsl:text>intro.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Documentat'">
<xsl:text>docs.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Reporting'">
<xsl:text>bugs.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#help'">
<xsl:text>help.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Help'">
<xsl:text>help.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Downloads'">
<xsl:text>downloads.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#News'">
<xsl:text>news.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Contributi'">
<xsl:text>contribs.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#xsltproc'">
<xsl:text>xsltproc2.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#API'">
<xsl:text>API.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#XSLT'">
<xsl:text>XSLT.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#XML'">
<xsl:text>XML.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Validation'">
<xsl:text>valid.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#tree'">
<xsl:text>tree.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#library'">
<xsl:text>library.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#interface'">
<xsl:text>interface.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Example'">
<xsl:text>example.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Entities'">
<xsl:text>entities.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#architecture'">
<xsl:text>architecture.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#Namespaces'">
<xsl:text>namespaces.html</xsl:text>
</xsl:when>
<xsl:when test="$name = '#DOM'">
<xsl:text>DOM.html</xsl:text>
</xsl:when>
<xsl:when test="$name = ''">
<xsl:text>unknown.html</xsl:text>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$name"/>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!--
- Handling of nodes in the body before the first H2, table of content
- Everything is just copied over, except href which may get rewritten

@ -8,6 +8,7 @@ BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; ma
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>The tree output</title>
</head>
@ -27,8 +28,8 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
@ -36,21 +37,21 @@ H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>

@ -1,23 +1,82 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Upgrading libxml client code from 1.x to 2.x</title>
<meta name="GENERATOR" content="amaya V5.0">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Upgrading 1.x code</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Upgrading libxml client code from 1.x to 2.x</h1>
<h2>Incompatible changes:</h2>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Upgrading 1.x code</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Incompatible changes:</p>
<p>Version 2 of libxml is the first version introducing serious backward
incompatible changes. The main goals were:</p>
<ul>
<li>a general cleanup. A number of mistakes inherited from the very early
versions couldn't be changed due to compatibility constraints, for example
the "childs" element in the nodes.</li>
the &quot;childs&quot; element in the nodes.</li>
<li>Uniformization of the various nodes, at least for their header and link
parts (doc, parent, children, prev, next), the goal is a simpler
programming model and simplifying the task of the DOM implementors.</li>
@ -28,9 +87,7 @@ incompatible changes. The main goals were:</p>
containing blank text may populate the DOM tree which were not present
before.</li>
</ul>
<h2>How to fix libxml-1.x code:</h2>
<h3>How to fix libxml-1.x code:</h3>
<p>So client code of libxml designed to run with version 1.x may have to be
changed to compile against version 2.x of libxml. Here is a list of changes
that I have collected, they may not be sufficient, so in case you find other
@ -42,7 +99,7 @@ mail</a>:</p>
select the right parameters libxml2</li>
<li>Node <strong>childs</strong> field has been renamed
<strong>children</strong> so s/childs/children/g should be applied
(probability of having "childs" anywhere else is close to 0)</li>
(probability of having &quot;childs&quot; anywhere else is close to 0)</li>
<li>The document no longer has a <strong>root</strong> element; it has
been replaced by <strong>children</strong>, and usually you will get a
list of elements here. For example a Dtd element for the internal subset
@ -84,9 +141,7 @@ mail</a>:</p>
<li>xmlDetectCharEncoding takes an extra argument indicating the length in
bytes of the head of the document available for character detection.</li>
</ol>
<h2>Ensuring both libxml-1.x and libxml-2.x compatibility</h2>
<h3>Ensuring both libxml-1.x and libxml-2.x compatibility</h3>
<p>Two new versions, libxml 1.8.11 and libxml2 2.3.4, have been released
to allow a smooth upgrade of existing libxml v1 code while retaining
compatibility. They offer the following:</p>
@ -95,19 +150,21 @@ compatibility. They offers the following:</p>
<strong>#include&lt;libxml/...&gt;</strong> in both cases.</li>
<li>similar identifiers defined via macros for the child and root fields:
respectively <strong>xmlChildrenNode</strong> and
<strong>xmlRootNode</strong></li>
<strong>xmlRootNode</strong>
</li>
<li>a new macro <strong>LIBXML_TEST_VERSION</strong> which should be
inserted once in the client code</li>
</ol>
<p>So the roadmap to upgrade your existing libxml applications is the
following:</p>
<ol>
<li>install the libxml-1.8.8 (and libxml-devel-1.8.8) packages</li>
<li>find all occurrences where the xmlDoc <strong>root</strong> field is
used and change it to <strong>xmlRootNode</strong></li>
used and change it to <strong>xmlRootNode</strong>
</li>
<li>similarly find all occurrences where the xmlNode <strong>childs</strong>
field is used and change it to <strong>xmlChildrenNode</strong></li>
field is used and change it to <strong>xmlChildrenNode</strong>
</li>
<li>add a <strong>LIBXML_TEST_VERSION</strong> macro somewhere in your
<strong>main()</strong> or in the library init entry point</li>
<li>Recompile, check compatibility, it should still work</li>
@ -124,17 +181,14 @@ following:</p>
code before calling the parser (next to
<strong>LIBXML_TEST_VERSION</strong> is a fine place).</li>
</ol>
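<p>The idea behind the compatibility macros can be sketched as follows. This
is only an illustration and does not use the libxml headers:
<code>DemoNode</code>, <code>DEMO_V1_LAYOUT</code> and
<code>count_children()</code> are made-up names, and the macro definition
shown here is an assumption about the general shape of what the real headers
provide. The point is that client code which only goes through
<strong>xmlChildrenNode</strong> compiles unchanged whichever field name the
node structure uses.</p>

```c
#include <stddef.h>

/* Stand-in for the library's node type; this is NOT the real xmlNode.
 * Compile with -DDEMO_V1_LAYOUT to mimic the 1.x field name. */
typedef struct DemoNode DemoNode;
struct DemoNode {
    const char *name;
#ifdef DEMO_V1_LAYOUT
    DemoNode *childs;       /* 1.x spelling of the first-child link */
#else
    DemoNode *children;     /* 2.x spelling of the first-child link */
#endif
    DemoNode *next;         /* next sibling */
};

/* Assumed shape of the compatibility macro: it expands to whichever
 * field name the installed headers actually use. */
#ifdef DEMO_V1_LAYOUT
#define xmlChildrenNode childs
#else
#define xmlChildrenNode children
#endif

/* Client code written once against the macro, never against the
 * raw field name. Returns the number of direct children of node. */
int count_children(const DemoNode *node)
{
    int n = 0;
    const DemoNode *cur;

    for (cur = node->xmlChildrenNode; cur != NULL; cur = cur->next)
        n++;
    return n;
}
```

<p>Building the same file with and without <code>-DDEMO_V1_LAYOUT</code>
should give the same result from <code>count_children()</code>, which is the
property the roadmap above relies on.</p>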
<p>Following those steps should work. It worked for some of my own code.</p>
<p>Let me put some emphasis on the fact that there are far more changes from
libxml 1.x to 2.x than the ones you may have to patch for. The overall code
has been considerably improved and the conformance to the XML specification
has been drastically improved. Don't take those changes as an excuse to not
upgrade, it may cost a lot in the long term ...</p>
has been considerably cleaned up and the conformance to the XML specification
has been drastically improved too. Don't take those changes as an excuse to
not upgrade, it may cost a lot in the long term ...</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id: upgrade.html,v 1.9 2001/06/24 12:13:21 veillard Exp $</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>


@ -1,108 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
--></style>
<title>Validation, or are you afraid of DTDs ?</title>
</head>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Validation, or are you afraid of DTDs ?</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">An overview of libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="library.html">The XML library interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="valid.html">Validation, or are you afraid of DTDs ?</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="catalog.html">Catalogs support</a></li>
<li><a href="xmlio.html">I/O interfaces</a></li>
<li><a href="xmlmem.html">Memory interfaces</a></li>
<li><a href="xmldtd.html">DTD support</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Well what is validation and what is a DTD ?</p>
<p>Validation is the process of checking a document against a set of
construction rules; a <strong>DTD</strong> (Document Type Definition) is such
a set of rules.</p>
<p>The validation process and building DTDs are the two most difficult parts
of the XML life cycle. Briefly, a DTD defines all the possible elements to be
found within your document and the formal shape of your document tree
(by defining the allowed content of an element, either text, a regular
expression for the allowed list of children, or mixed content i.e. both text
and children). The DTD also defines the allowed attributes for all elements
and the types of the attributes. For more detailed information, I suggest
that you read the related parts of the XML specification, the examples found
under gnome-xml/test/valid/dtd and any of the large number of books available
on XML. The dia example in gnome-xml/test/valid should be both simple and
complete enough to allow you to build your own.</p>
<p>A word of warning, building a good DTD which will fit the needs of your
application in the long-term is far from trivial; however, the extra level of
quality it can ensure is well worth the price for some sets of applications
or if you already have a DTD defined for your application field.</p>
<p>The validation is not completely finished but in a (very IMHO) usable
state. Until a real validation interface is defined the way to do it is to
define and set the <strong>xmlDoValidityCheckingDefaultValue</strong>
external variable to 1, this will of course be changed at some point:</p>
<p>extern int xmlDoValidityCheckingDefaultValue;</p>
<p>...</p>
<p>xmlDoValidityCheckingDefaultValue = 1;</p>
<p>
<p>To handle external entities, use the function
<strong>xmlSetExternalEntityLoader</strong>(xmlExternalEntityLoader f); to
link in your HTTP/FTP/Entities database library to the standard libxml
core.</p>
<p>@@interfaces@@</p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>

File diff suppressed because it is too large


@ -1,30 +1,83 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Libxml Input/Output handling</title>
<meta name="GENERATOR" content="amaya V4.1">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Validation &amp; DTDs</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Libxml DTD support</h1>
<p>Location: <a
href="http://xmlsoft.org/xmlio.html">http://xmlsoft.org/xmldtd.html</a></p>
<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
<p>Mailing-list archive: <a
href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<p>Version: $Revision$</p>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Validation &amp; DTDs</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Contents:</p>
<ol>
<li><a href="#General">General overview</a></li>
<li><a href="#General5">General overview</a></li>
<li><a href="#definition">The definition</a></li>
<li><a href="#Simple">Simple rules</a>
<ol>
<li><a href="#reference">How to reference a DTD from a document</a></li>
<li>
<a href="#Simple">Simple rules</a><ol>
<li><a href="#reference">How to reference a DTD from a
document</a></li>
<li><a href="#Declaring">Declaring elements</a></li>
<li><a href="#Declaring1">Declaring attributes</a></li>
</ol>
@ -33,18 +86,23 @@ href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<li><a href="#validate">How to validate</a></li>
<li><a href="#Other">Other resources</a></li>
</ol>
<h2><a name="General">General overview</a></h2>
<h3><a name="General5">General overview</a></h3>
<p>Well what is validation and what is a DTD ?</p>
<p>DTD is the acronym for Document Type Definition. This is a description of
the content for a family of XML files. This is part of the XML 1.0
specification, and allows one to describe and check that a given document
instance conforms to a set of rules detailing its structure and content.</p>
<h2><a name="definition">The definition</a></h2>
<p>The <a href="http://www.w3.org/TR/REC-xml">W3C XML Recommendation</a> (<a
href="http://www.xml.com/axml/axml.html">Tim Bray's annotated version of
<p>Validation is the process of checking a document against a DTD (more
generally against a set of construction rules).</p>
<p>The validation process and building DTDs are the two most difficult parts
of the XML life cycle. Briefly, a DTD defines all the possible elements to be
found within your document and the formal shape of your document tree
(by defining the allowed content of an element, either text, a regular
expression for the allowed list of children, or mixed content i.e. both text
and children). The DTD also defines the allowed attributes for all elements
and the types of the attributes.</p>
<h3><a name="definition1">The definition</a></h3>
<p>The <a href="http://www.w3.org/TR/REC-xml">W3C XML Recommendation</a> (<a href="http://www.xml.com/axml/axml.html">Tim Bray's annotated version of
Rev1</a>):</p>
<ul>
<li><a href="http://www.w3.org/TR/REC-xml#elemdecls">Declaring
@ -52,31 +110,24 @@ Rev1</a>):</p>
<li><a href="http://www.w3.org/TR/REC-xml#attdecls">Declaring
attributes</a></li>
</ul>
<p>(unfortunately) all this is inherited from the SGML world, the syntax is
ancient...</p>
<h2><a name="Simple">Simple rules</a></h2>
<h3><a name="Simple1">Simple rules</a></h3>
<p>DTDs can be written in multiple ways; the rules to build them can be
radically different depending on whether you need something fixed or
something which can evolve over time. Really complex DTDs like the DocBook
ones are flexible but much harder to design. I will just focus on DTDs for
formats with a fixed simple structure. It is just a set of basic rules, and
definitely not exhaustive nor usable for complex DTD design.</p>
<h3><a name="reference">How to reference a DTD from a document</a>:</h3>
<h4>
<a name="reference1">How to reference a DTD from a document</a>:</h4>
<p>Assuming the top element of the document is <code>spec</code> and the DTD
is placed in the file <code>mydtd</code> in the subdirectory <code>dtds</code>
of the directory from which the document was loaded:</p>
<p><code>&lt;!DOCTYPE spec SYSTEM "dtds/mydtd"&gt;</code></p>
is placed in the file <code>mydtd</code> in the subdirectory
<code>dtds</code> of the directory from which the document was loaded:</p>
<p><code>&lt;!DOCTYPE spec SYSTEM &quot;dtds/mydtd&quot;&gt;</code></p>
<p>Notes:</p>
<ul>
<li>the system string is actually an URI-Reference (as defined in <a
href="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396</a>) so you can use a
<li>the system string is actually an URI-Reference (as defined in <a href="http://www.ietf.org/rfc/rfc2396.txt">RFC 2396</a>) so you can use a
full URL string indicating the location of your DTD on the Web, this is a
really good thing to do if you want others to validate your document</li>
<li>it is also possible to associate a <code>PUBLIC</code> identifier (a
@ -87,119 +138,92 @@ of the directory from where the document were loaded:</p>
told to the parser/validator as the first element of the
<code>DOCTYPE</code> declaration.</li>
</ul>
<h3><a name="Declaring">Declaring elements</a>:</h3>
<h4>
<a name="Declaring2">Declaring elements</a>:</h4>
<p>The following declares an element <code>spec</code>:</p>
<p><code>&lt;!ELEMENT spec (front, body, back?)&gt;</code></p>
<p>it also expresses that the spec element contains one <code>front</code>,
one <code>body</code> and one optional <code>back</code> child elements in
this order. The declaration of one element of the structure and its content
are done in a single declaration. Similarly the following declares
one <code>body</code> and one optional <code>back</code> child elements
in this order. The declaration of one element of the structure and its
content are done in a single declaration. Similarly the following declares
<code>div1</code> elements:</p>
<p><code>&lt;!ELEMENT div1 (head, (p | list | note)*, div2*)&gt;</code></p>
<p>means div1 contains one <code>head</code> then a series of optional
<code>p</code>, <code>list</code>s and <code>note</code>s and then an optional
series of <code>div2</code>s. And last but not least an element can contain text:</p>
<code>p</code>, <code>list</code>s and <code>note</code>s and then an
optional series of <code>div2</code>s. And last but not least an element can
contain text:</p>
<p><code>&lt;!ELEMENT b (#PCDATA)&gt;</code></p>
<p><code>b</code> contains text. An element can also be of mixed content (text and elements
<p>
<code>b</code> contains text. An element can also be of mixed content (text and elements
in no particular order):</p>
<p><code>&lt;!ELEMENT p (#PCDATA|a|ul|b|i|em)*&gt;</code></p>
<p><code>p </code>can contain text or <code>a</code>, <code>ul</code>,
<p>
<code>p </code>can contain text or <code>a</code>, <code>ul</code>,
<code>b</code>, <code>i </code>or <code>em</code> elements in no particular
order.</p>
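<p>Since the content of an element is described by a regular expression over
the names of its children, the two declarations above can be tried out with
any regular expression engine. The following is a minimal sketch assuming a
POSIX system (<code>regex.h</code>); it is not how libxml implements
validation, and <code>children_match()</code>, <code>SPEC_MODEL</code> and
<code>DIV1_MODEL</code> are names made up for the illustration. Each child
name is written followed by one space, so <code>(front, body, back?)</code>
becomes the pattern <code>^(front )(body )(back )?$</code>.</p>

```c
#include <regex.h>
#include <stddef.h>

/* Content models from the text, rewritten by hand as POSIX extended
 * regular expressions over space-terminated child names. */
#define SPEC_MODEL "^(front )(body )(back )?$"           /* (front, body, back?) */
#define DIV1_MODEL "^(head )(p |list |note )*(div2 )*$"  /* (head, (p | list | note)*, div2*) */

/* Returns 1 if the sequence of children matches the model, 0 if it does
 * not, and -1 if the pattern itself could not be compiled.
 * children is the list of child element names, each followed by a space,
 * for example "front body back ". */
int children_match(const char *model_regex, const char *children)
{
    regex_t re;
    int rc;

    if (regcomp(&re, model_regex, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, children, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}
```

<p>For instance <code>children_match(SPEC_MODEL, "front body ")</code> accepts
a <code>spec</code> without <code>back</code>, while the children
<code>"body front "</code> are rejected because the order is wrong.</p>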
<h3><a name="Declaring1">Declaring attributes</a>:</h3>
<h4>
<a name="Declaring1">Declaring attributes</a>:</h4>
<p>again the attributes declaration includes their content definition:</p>
<p><code>&lt;!ATTLIST termdef name CDATA #IMPLIED&gt;</code></p>
<p>means that the element <code>termdef</code> can have a <code>name</code>
attribute containing text (<code>CDATA</code>) and which is optional
(<code>#IMPLIED</code>). The attribute value can also be defined within a
set:</p>
<p><code>&lt;!ATTLIST list type (bullets|ordered|glossary)
"ordered"&gt;</code></p>
&quot;ordered&quot;&gt;</code></p>
<p>means the <code>list</code> element has a <code>type</code> attribute with 3
allowed values "bullets", "ordered" or "glossary" and which defaults to
"ordered" if the attribute is not explicitly specified.</p>
allowed values &quot;bullets&quot;, &quot;ordered&quot; or &quot;glossary&quot; and which defaults to
&quot;ordered&quot; if the attribute is not explicitly specified.</p>
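<p>The effect of such a declaration on the value an application ends up with
can be sketched in a few lines of C. This is an illustration only:
<code>resolve_enum_attr()</code> and <code>list_type_values</code> are
hypothetical names, not part of the libxml API, and a real validator reports
errors in a richer way than returning <code>NULL</code>.</p>

```c
#include <stddef.h>
#include <string.h>

/* Mirrors: <!ATTLIST list type (bullets|ordered|glossary) "ordered"> */
const char *const list_type_values[] = { "bullets", "ordered", "glossary" };

/* Hypothetical helper, not a libxml function.
 * given is the attribute value found in the document, or NULL when the
 * attribute is absent. Returns the effective value, or NULL when the
 * given value is not one of the allowed ones (a validity error). */
const char *resolve_enum_attr(const char *given,
                              const char *const allowed[], size_t n_allowed,
                              const char *default_value)
{
    size_t i;

    if (given == NULL)
        return default_value;       /* attribute absent: the default applies */
    for (i = 0; i < n_allowed; i++)
        if (strcmp(given, allowed[i]) == 0)
            return given;           /* explicit value taken from the set */
    return NULL;                    /* value outside the declared set */
}
```

<p>A <code>list</code> element without a <code>type</code> attribute then
resolves to "ordered", and a value such as "numbered" is rejected.</p>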
<p>The content type of an attribute can be text (<code>CDATA</code>),
anchor/reference/references
(<code>ID</code>/<code>IDREF</code>/<code>IDREFS</code>), entity(ies)
(<code>ENTITY</code>/<code>ENTITIES</code>) or name(s)
(<code>NMTOKEN</code>/<code>NMTOKENS</code>). The following defines that a
<code>chapter</code> element can have an optional <code>id</code> attribute of
type <code>ID</code>, usable for reference from attribute of type IDREF:</p>
<code>chapter</code> element can have an optional <code>id</code> attribute
of type <code>ID</code>, usable for reference from attribute of type
IDREF:</p>
<p><code>&lt;!ATTLIST chapter id ID #IMPLIED&gt;</code></p>
<p>The last value of an attribute definition can be <code>#REQUIRED
</code>meaning that the attribute has to be given, <code>#IMPLIED</code>
meaning that it is optional, or the default value (possibly prefixed by
<code>#FIXED</code> if it is the only value allowed).</p>
<p>Notes:</p>
<ul>
<li>usually the attributes pertaining to a given element are declared in a
<ul><li>usually the attributes pertaining to a given element are declared in a
single expression, but it is just a convention adopted by a lot of DTD
writers:
<pre>&lt;!ATTLIST termdef
id ID #REQUIRED
name CDATA #IMPLIED&gt;</pre>
<p>The previous construct defines both <code>id</code> and
<code>name</code> attributes for the element <code>termdef</code></p>
</li>
</ul>
<h2><a name="Some">Some examples</a></h2>
<code>name</code> attributes for the element <code>termdef</code>
</p>
</li></ul>
<h3><a name="Some1">Some examples</a></h3>
<p>The directory <code>test/valid/dtds/</code> in the libxml distribution
contains some complex DTD examples. The <code>test/valid/dia.xml</code>
example shows an XML file where the simple DTD is directly included within the
document.</p>
<h2><a name="validate">How to validate</a></h2>
example shows an XML file where the simple DTD is directly included within
the document.</p>
<h3><a name="validate1">How to validate</a></h3>
<p>The simplest way is to use the xmllint program that comes with libxml. The
<code>--valid</code> option turns on validation of the files given as input,
for example the following validates a copy of the first revision of the XML
1.0 specification:</p>
<p><code>xmllint --valid --noout test/valid/REC-xml-19980210.xml</code></p>
<p>The <code>--noout</code> option is used to not output the resulting tree.</p>
<p>The <code>--dtdvalid dtd</code> option validates the document(s) against
a given DTD.</p>
<p>Libxml exports an API to handle DTDs and validation, check the <a
href="http://xmlsoft.org/html/libxml-valid.html">associated
<p>Libxml exports an API to handle DTDs and validation, check the <a href="http://xmlsoft.org/html/libxml-valid.html">associated
description</a>.</p>
<h2><a name="Other">Other resources</a></h2>
<h3><a name="Other1">Other resources</a></h3>
<p>DTDs are as old as SGML. So there may be a number of examples on-line, I
will just list one for now, other pointers welcome:</p>
<ul>
<li><a href="http://www.xml101.com:8081/dtd/">XML-101 DTD</a></li>
</ul>
<p></p>
<ul><li><a href="http://www.xml101.com:8081/dtd/">XML-101 DTD</a></li></ul>
<p>I suggest looking at the examples found under test/valid/dtd and any of
the large number of books available on XML. The dia example in test/valid
should be both simple and complete enough to allow you to build your own.</p>
<p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id$</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>


@ -1,53 +1,98 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Libxml Input/Output handling</title>
<meta name="GENERATOR" content="amaya V3.2.1">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>I/O Interfaces</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Libxml Input/Output handling</h1>
<p>Location: <a
href="http://xmlsoft.org/xmlio.html">http://xmlsoft.org/xmlio.html</a></p>
<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
<p>Mailing-list archive: <a
href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<p>Version: $Revision: 1.4 $</p>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>I/O Interfaces</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Contents:</p>
<ol>
<li><a href="#General">General overview</a></li>
<li><a href="#General1">General overview</a></li>
<li><a href="#basic">The basic buffer type</a></li>
<li><a href="#Input">Input I/O handlers</a></li>
<li><a href="#Output">Output I/O handlers</a></li>
<li><a href="#entities">The entities loader</a></li>
<li><a href="#Example">Example of customized I/O</a></li>
<li><a href="#Example2">Example of customized I/O</a></li>
</ol>
<h2><a name="General">General overview</a></h2>
<p>The module <code><a
href="http://xmlsoft.org/html/libxml-xmlio.html">xmlIO.h</a></code>
provides the interfaces to the libxml I/O system. This consists of 4 main
parts:</p>
<h3><a name="General1">General overview</a></h3>
<p>The module <code><a href="http://xmlsoft.org/html/libxml-xmlio.html">xmlIO.h</a></code> provides
the interfaces to the libxml I/O system. This consists of 4 main parts:</p>
<ul>
<li>Entities loader: this is a routine which tries to fetch the entities
(files) based on their PUBLIC and SYSTEM identifiers. The default loader
doesn't look at the public identifier since libxml does not maintain a
catalog. You can redefine your own entity loader by using
<code>xmlGetExternalEntityLoader()</code> and
<code>xmlSetExternalEntityLoader()</code>. <a href="#entities">Check the
example</a>.</li>
<code>xmlSetExternalEntityLoader()</code>. <a href="#entities">Check the example</a>.</li>
<li>Input I/O buffers, which are a convenience structure used by the parser(s)
input layer to handle fetching the information to feed the parser. This
provides buffering and is also a placeholder where the encoding converters
to UTF8 are piggy-backed.</li>
provides buffering and is also a placeholder where the encoding
converters to UTF8 are piggy-backed.</li>
<li>Output I/O buffers are similar to the Input ones and fulfill a similar
task, but when generating a serialization from a tree.</li>
<li>A mechanism to register sets of I/O callbacks and associate them with
@ -56,15 +101,14 @@ parts:</p>
handlers for certain names.</p>
</li>
</ul>
<p>The general mechanism used when loading http://rpmfind.net/xml.html for
example in the HTML parser is the following:</p>
<ol>
<li>The default entity loader calls <code>xmlNewInputFromFile()</code> with
the parsing context and the URI string.</li>
<li>the URI string is checked against the existing registered handlers using
their match() callback function; if the HTTP module was compiled in, it is
registered and its match() function will succeed</li>
<li>the URI string is checked against the existing registered handlers
using their match() callback function; if the HTTP module was compiled
in, it is registered and its match() function will succeed</li>
<li>the open() function of the handler is called and if successful will
return an I/O Input buffer</li>
<li>the parser will then start reading from this buffer and progressively
@ -77,48 +121,37 @@ example in the HTML parser is the following:</p>
called once and the Input buffer and associated resources are
deallocated.</li>
</ol>
<p>The user defined callbacks are checked first to allow overriding of the
default libxml I/O routines.</p>
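<p>The lookup order described above can be sketched in plain C. This is only an illustration under stated assumptions: <code>io_handler</code>, <code>register_handler()</code>, <code>find_handler()</code> and the three match functions are hypothetical names, not the libxml API, and the sketch assumes that handlers registered later are simply scanned first.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical handler record: a name plus a match() callback. */
typedef int (*io_match_fn)(const char *uri);

typedef struct {
    const char *name;
    io_match_fn match;
} io_handler;

#define MAX_HANDLERS 8
static io_handler handlers[MAX_HANDLERS];
static int nb_handlers = 0;

/* Register a handler; returns its index, or -1 if the table is full. */
static int register_handler(const char *name, io_match_fn match) {
    assert(name != NULL && match != NULL);
    if (nb_handlers >= MAX_HANDLERS)
        return -1;
    handlers[nb_handlers].name = name;
    handlers[nb_handlers].match = match;
    return nb_handlers++;
}

/* Scan from the most recently registered handler down to the first one,
 * so handlers added later (user defined) override the defaults. */
static const char *find_handler(const char *uri) {
    int i;
    for (i = nb_handlers - 1; i >= 0; i--) {
        if (handlers[i].match(uri))
            return handlers[i].name;
    }
    return NULL;
}

/* Default handlers: plain files accept anything, HTTP checks the scheme. */
static int match_file(const char *uri) { (void) uri; return 1; }
static int match_http(const char *uri) {
    return strncmp(uri, "http://", 7) == 0;
}

/* A user-defined handler taking over one specific site. */
static int match_user(const char *uri) {
    return strncmp(uri, "http://rpmfind.net/", 19) == 0;
}
```

<p>Registering <code>match_file</code> and <code>match_http</code> first and <code>match_user</code> last makes the user handler win for http://rpmfind.net/xml.html while other HTTP URIs still reach the default HTTP handler.</p>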
<h2><a name="basic">The basic buffer type</a></h2>
<h3><a name="basic">The basic buffer type</a></h3>
<p>All the buffer manipulation handling is done using the
<code>xmlBuffer</code> type defined in <code><a
href="http://xmlsoft.org/html/libxml-tree.html">tree.h</a></code>, which is
a resizable memory buffer. The buffer allocation strategy can be selected to
be either best-fit or use an exponential doubling one (CPU vs. memory use
<code>xmlBuffer</code> type defined in <code><a href="http://xmlsoft.org/html/libxml-tree.html">tree.h</a></code>, which is a
resizable memory buffer. The buffer allocation strategy can be selected to be
either best-fit or use an exponential doubling one (CPU vs. memory use
tradeoff). The values are <code>XML_BUFFER_ALLOC_EXACT</code> and
<code>XML_BUFFER_ALLOC_DOUBLEIT</code>, and can be set individually or on a
system wide basis using <code>xmlBufferSetAllocationScheme()</code>. A number
of functions, with names starting with the <code>xmlBuffer...</code> prefix,
allow manipulating buffers.</p>
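<p>The CPU vs. memory tradeoff between the two strategies can be made concrete with a small stand-alone model. This is a sketch, not libxml code: <code>alloc_scheme</code> and <code>count_reallocs()</code> are hypothetical names, and the real buffer may grow with extra slack that this model ignores.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two allocation schemes. */
typedef enum { ALLOC_EXACT, ALLOC_DOUBLEIT } alloc_scheme;

/* Count how many times the buffer must be resized when `total` bytes
 * are appended `chunk` bytes at a time. */
static size_t count_reallocs(alloc_scheme scheme, size_t total, size_t chunk) {
    size_t capacity = 0, used = 0, reallocs = 0;

    assert(chunk > 0);
    while (used < total) {
        size_t needed = used + chunk;
        if (needed > capacity) {
            if (scheme == ALLOC_EXACT) {
                /* best-fit: grow to exactly what is needed */
                capacity = needed;
            } else {
                /* doubling: trade unused memory for fewer resizes */
                if (capacity == 0)
                    capacity = chunk;
                while (capacity < needed)
                    capacity *= 2;
            }
            reallocs++;
        }
        used = needed;
    }
    return reallocs;
}
```

<p>Appending 1024 bytes in 16-byte chunks resizes on every append with the exact strategy, but only a handful of times with doubling, at the cost of capacity that may stay unused.</p>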
<h2><a name="Input">Input I/O handlers</a></h2>
<h3><a name="Input">Input I/O handlers</a></h3>
<p>An Input I/O handler is a simple structure
<code>xmlParserInputBuffer</code> containing a context associated to the
resource (file descriptor, or pointer to a protocol handler), the read() and
close() callbacks to use and an xmlBuffer. An extra xmlBuffer and a charset
encoding handler are also present to support charset conversion when
needed.</p>
<h2><a name="Output">Output I/O handlers</a></h2>
<h3><a name="Output">Output I/O handlers</a></h3>
<p>An Output handler <code>xmlOutputBuffer</code> is completely similar to an
Input one except the callbacks are write() and close().</p>
<h2><a name="entities">The entities loader</a></h2>
<h3><a name="entities">The entities loader</a></h3>
<p>The entity loader resolves requests for new entities and creates inputs for
the parser. Creating an input from a filename or a URI string is done through
the xmlNewInputFromFile() routine. The default entity loader does not handle
the PUBLIC identifier associated with an entity (if any). So it just calls
xmlNewInputFromFile() with the SYSTEM identifier (which is mandatory in
the parser. Creating an input from a filename or a URI string is done
through the xmlNewInputFromFile() routine. The default entity loader does not
handle the PUBLIC identifier associated with an entity (if any). So it just
calls xmlNewInputFromFile() with the SYSTEM identifier (which is mandatory in
XML).</p>
<p>If you want to hook up a catalog mechanism then you simply need to override
the default entity loader, here is an example:</p>
<p>If you want to hook up a catalog mechanism then you simply need to
override the default entity loader, here is an example:</p>
<pre>#include &lt;libxml/xmlIO.h&gt;
xmlExternalEntityLoader defaultLoader = NULL;
@ -149,13 +182,10 @@ int main(..) {
...
}</pre>
<h2><a name="Example">Example of customized I/O</a></h2>
<h3><a name="Example2">Example of customized I/O</a></h3>
<p>This example comes from <a href="http://xmlsoft.org/messages/0708.html">a
real use case</a>: xmlDocDump() closes the FILE * passed by the application
and this was a problem. The <a
href="http://xmlsoft.org/messages/0711.html">solution</a> was to redefine a
and this was a problem. The <a href="http://xmlsoft.org/messages/0711.html">solution</a> was to redefine a
new output handler with the closing call deactivated:</p>
<ol>
<li>First define a new I/O output allocator where the output doesn't close the
@ -177,6 +207,7 @@ xmlOutputBufferCreateOwn(FILE *file, xmlCharEncodingHandlerPtr encoder) {
    return(ret); <br>
} </pre>
</li>
<li>And then use it to save the document:
@ -193,9 +224,8 @@ res = xmlSaveFileTo(output, doc, NULL);
</pre>
</li>
</ol>
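<p>The idea behind the two steps above, an output whose close callback is deactivated so the application keeps ownership of its FILE *, can be modelled without libxml. The names <code>out_buffer</code>, <code>out_create_own()</code> and <code>out_finish()</code> are hypothetical and only mimic the shape of the real output buffer; this is a sketch, not the library's implementation.</p>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callbacks mirroring the write()/close() pair. */
typedef int (*out_write_fn)(void *context, const char *buf, int len);
typedef int (*out_close_fn)(void *context);

typedef struct {
    void *context;        /* here: the application's FILE * */
    out_write_fn write;
    out_close_fn close;   /* NULL means "do not close the resource" */
} out_buffer;

static int file_write(void *context, const char *buf, int len) {
    return (int) fwrite(buf, 1, (size_t) len, (FILE *) context);
}

/* Build an output that writes to `file` but never closes it. */
static out_buffer out_create_own(FILE *file) {
    out_buffer out;

    assert(file != NULL);
    out.context = file;
    out.write = file_write;
    out.close = NULL;     /* closing call deactivated */
    return out;
}

/* Write the data, then call close() only if one was provided.
 * Returns the number of bytes written. */
static int out_finish(out_buffer *out, const char *data) {
    int written = out->write(out->context, data, (int) strlen(data));

    if (out->close != NULL)
        out->close(out->context);
    return written;
}
```

<p>After <code>out_finish()</code> returns, the FILE * is still open and the caller can keep writing to it or close it when it sees fit.</p>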
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id: xmlio.html,v 1.4 2001/01/29 08:22:12 veillard Exp $</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>


@ -1,38 +1,86 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<head>
<title>Libxml memory management</title>
<meta name="GENERATOR" content="amaya V3.2">
<meta http-equiv="Content-Type" content="text/html">
<meta content="text/html; charset=ISO-8859-1" http-equiv="Content-Type">
<style type="text/css"><!--
TD {font-size: 10pt; font-family: Verdana,Arial,Helvetica}
BODY {font-size: 10pt; font-family: Verdana,Arial,Helvetica; margin-top: 5pt; margin-left: 0pt; margin-right: 0pt}
H1 {font-size: 16pt; font-family: Verdana,Arial,Helvetica}
H2 {font-size: 14pt; font-family: Verdana,Arial,Helvetica}
H3 {font-size: 12pt; font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }
--></style>
<title>Memory Management</title>
</head>
<body bgcolor="#ffffff">
<h1 align="center">Libxml memory management</h1>
<p>Location: <a
href="http://xmlsoft.org/xmlmem.html">http://xmlsoft.org/xmlmem.html</a></p>
<p>Libxml home page: <a href="http://xmlsoft.org/">http://xmlsoft.org/</a></p>
<p>Mailing-list archive: <a
href="http://xmlsoft.org/messages/">http://xmlsoft.org/messages/</a></p>
<p>Version: $Revision$</p>
<body bgcolor="#8b7765" text="#000000" link="#000000" vlink="#000000">
<table border="0" width="100%" cellpadding="5" cellspacing="0" align="center"><tr>
<td width="180">
<a href="http://www.gnome.org/"><img src="smallfootonly.gif" alt="Gnome Logo"></a><a href="http://www.w3.org/Status"><img src="w3c.png" alt="W3C Logo"></a><a href="http://www.redhat.com/"><img src="redhat.gif" alt="Red Hat Logo"></a>
</td>
<td><table border="0" width="90%" cellpadding="2" cellspacing="0" align="center" bgcolor="#000000"><tr><td><table width="100%" border="0" cellspacing="1" cellpadding="3" bgcolor="#fffacd"><tr><td align="center">
<h1>The XML C library for Gnome</h1>
<h2>Memory Management</h2>
</td></tr></table></td></tr></table></td>
</tr></table>
<table border="0" cellpadding="4" cellspacing="0" width="100%" align="center"><tr><td bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="2" width="100%"><tr>
<td valign="top" width="200" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Main Menu</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="index.html">Home</a></li>
<li><a href="intro.html">Introduction</a></li>
<li><a href="FAQ.html">FAQ</a></li>
<li><a href="docs.html">Documentation</a></li>
<li><a href="bugs.html">Reporting bugs and getting help</a></li>
<li><a href="help.html">How to help</a></li>
<li><a href="downloads.html">Downloads</a></li>
<li><a href="news.html">News</a></li>
<li><a href="XML.html">XML</a></li>
<li><a href="XSLT.html">XSLT</a></li>
<li><a href="architecture.html">libxml architecture</a></li>
<li><a href="tree.html">The tree output</a></li>
<li><a href="interface.html">The SAX interface</a></li>
<li><a href="xmldtd.html">Validation &amp; DTDs</a></li>
<li><a href="xmlmem.html">Memory Management</a></li>
<li><a href="encoding.html">Encodings support</a></li>
<li><a href="xmlio.html">I/O Interfaces</a></li>
<li><a href="catalog.html">Catalog support</a></li>
<li><a href="library.html">The parser interfaces</a></li>
<li><a href="entities.html">Entities or no entities</a></li>
<li><a href="namespaces.html">Namespaces</a></li>
<li><a href="upgrade.html">Upgrading 1.x code</a></li>
<li><a href="DOM.html">DOM Principles</a></li>
<li><a href="example.html">A real example</a></li>
<li><a href="contribs.html">Contributions</a></li>
<li>
<a href="xml.html">flat page</a>, <a href="site.xsl">stylesheet</a>
</li>
</ul></td></tr>
</table>
<table width="100%" border="0" cellspacing="1" cellpadding="3">
<tr><td colspan="1" bgcolor="#eecfa1" align="center"><center><b>Related links</b></center></td></tr>
<tr><td bgcolor="#fffacd"><ul style="margin-left: -2pt">
<li><a href="http://mail.gnome.org/archives/xml/">Mail archive</a></li>
<li><a href="http://xmlsoft.org/XSLT/">XSLT libxslt</a></li>
<li><a href="http://www.cs.unibo.it/~casarini/gdome2/">DOM gdome2</a></li>
<li><a href="ftp://xmlsoft.org/">FTP</a></li>
<li><a href="http://www.fh-frankfurt.de/~igor/projects/libxml/">Windows binaries</a></li>
<li><a href="http://pages.eidosnet.co.uk/~garypen/libxml/">Solaris binaries</a></li>
</ul></td></tr>
</table>
</td></tr></table></td>
<td valign="top" bgcolor="#8b7765"><table border="0" cellspacing="0" cellpadding="1" width="100%"><tr><td><table border="0" cellspacing="0" cellpadding="1" width="100%" bgcolor="#000000"><tr><td><table border="0" cellpadding="3" cellspacing="1" width="100%"><tr><td bgcolor="#fffacd">
<p>Table of Contents:</p>
<ol>
<li><a href="#General">General overview</a></li>
<li><a href="#setting">Setting libxml set of memory routines</a></li>
<li><a href="#General3">General overview</a></li>
<li><a href="#setting">Setting libxml set of memory
routines</a></li>
<li><a href="#cleanup">Cleaning up after parsing</a></li>
<li><a href="#Debugging">Debugging routines</a></li>
<li><a href="#General">General memory requirements</a></li>
<li><a href="#General4">General memory requirements</a></li>
</ol>
<h2><a name="General">General overview</a></h2>
<p>The module <code><a
href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlmemory.h</a></code>
<h3><a name="General3">General overview</a></h3>
<p>The module <code><a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlmemory.h</a></code>
provides the interfaces to the libxml memory system:</p>
<ul>
<li>libxml does not use the libc memory allocator directly but xmlFree(),
@ -41,73 +89,67 @@ provides the interfaces to the libxml memory system:</p>
default the libc ones i.e. free(), malloc() and realloc()</li>
<li>the xmlmemory.c module includes a set of debugging routines</li>
</ul>
<h2><a name="setting">Setting libxml set of memory routines</a></h2>
<h3><a name="setting">Setting libxml set of memory routines</a></h3>
<p>It is sometimes useful to not use the default memory allocator, either for
debugging, analysis or to implement a specific behaviour on memory management
(like on embedded systems). Two function calls are available to do so:</p>
<ul>
<li><a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemGet
()</a> which returns the current set of functions in use by the parser</li>
<li><a
href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemSetup()</a>
<li>
<a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemGet ()</a>
which returns the current set of functions in use by the parser</li>
<li>
<a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemSetup()</a>
which allows setting up a new set of memory allocation functions</li>
</ul>
<p>Of course a call to xmlMemSetup() should probably be done before calling
any other libxml routines (unless you are sure your allocation routines are
compatible).</p>
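<p>How a replaceable set of memory routines works can be sketched with function pointers. The names below (<code>mem_setup()</code>, <code>cur_malloc</code>, <code>counting_malloc()</code> and so on) are hypothetical and only model the mechanism; the real xmlMemSetup() also takes a strdup-style routine, which this sketch leaves out.</p>

```c
#include <assert.h>
#include <stdlib.h>

typedef void  (*free_fn)(void *);
typedef void *(*malloc_fn)(size_t);
typedef void *(*realloc_fn)(void *, size_t);

/* Hypothetical table of the routines in use, defaulting to libc. */
static free_fn    cur_free    = free;
static malloc_fn  cur_malloc  = malloc;
static realloc_fn cur_realloc = realloc;

/* Install a new set of routines; all three must be provided together
 * so that blocks are never freed by a mismatched allocator. */
static int mem_setup(free_fn f, malloc_fn m, realloc_fn r) {
    if (f == NULL || m == NULL || r == NULL)
        return -1;
    cur_free = f;
    cur_malloc = m;
    cur_realloc = r;
    return 0;
}

/* Example replacement set: count the blocks currently alive.
 * (realloc to size 0 is not handled in this sketch.) */
static long live_blocks = 0;

static void *counting_malloc(size_t size) {
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

static void *counting_realloc(void *p, size_t size) {
    void *q = realloc(p, size);
    if (p == NULL && q != NULL)
        live_blocks++;    /* realloc(NULL, n) acts like malloc(n) */
    return q;
}

static void counting_free(void *p) {
    if (p != NULL) {
        live_blocks--;
        free(p);
    }
}
```

<p>Calling <code>mem_setup()</code> before any allocation is the safe order: a block obtained through one set of routines and released through another would corrupt the count here, and could corrupt the heap with a truly different allocator.</p>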
<h2><a name="cleanup">Cleaning up after parsing</a></h2>
<h3><a name="cleanup">Cleaning up after parsing</a></h3>
<p>Libxml is not stateless: a few memory structures need to be
allocated before the parser is fully functional (some encoding structures
for example). This also means that once parsing is finished there is a tiny
amount of memory (a few hundred bytes) which can be reclaimed if you don't
reuse the parser immediately:</p>
<ul>
<li><a href="http://xmlsoft.org/html/libxml-parser.html">xmlCleanupParser
()</a> is a centralized routine to free the parsing states. Note that it
won't deallocate any produced tree if any (use the xmlFreeDoc() and
related routines for this).</li>
<li><a href="http://xmlsoft.org/html/libxml-parser.html">xmlInitParser
()</a> is the dual routine allowing to preallocate the parsing state which
can be useful for example to avoid initialization reentrancy problems when
<li>
<a href="http://xmlsoft.org/html/libxml-parser.html">xmlCleanupParser
()</a>
is a centralized routine to free the parsing states. Note that it won't
deallocate any produced tree if any (use the xmlFreeDoc() and related
routines for this).</li>
<li>
<a href="http://xmlsoft.org/html/libxml-parser.html">xmlInitParser
()</a>
is the dual routine allowing to preallocate the parsing state which can
be useful for example to avoid initialization reentrancy problems when
using libxml in multithreaded applications</li>
</ul>
<p>Generally xmlCleanupParser() is safe: if needed, the state will be rebuilt
at the next invocation of parser routines, but be careful of the consequences
in multithreaded applications.</p>
<h2><a name="Debugging">Debugging routines</a></h2>
<p>When configured using the --with-mem-debug flag (off by default), libxml uses a
set of memory allocation debugging routines keeping track of all allocated
<h3><a name="Debugging">Debugging routines</a></h3>
<p>When configured using the --with-mem-debug flag (off by default), libxml uses
a set of memory allocation debugging routines keeping track of all allocated
blocks and the location in the code where the routine was called. A couple of
other debugging routines allow dumping the memory allocation information to a file or
calling a specific routine when a given block number is allocated:</p>
other debugging routines allow dumping the memory allocation information to a file
or calling a specific routine when a given block number is allocated:</p>
<ul>
<li><a
href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMallocLoc()</a>
<a
href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlReallocLoc()</a>
and <a
href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemStrdupLoc()</a>
<li>
<a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMallocLoc()</a>, <a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlReallocLoc()</a>
and <a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemStrdupLoc()</a>
are the memory debugging replacement allocation routines</li>
<li><a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemoryDump
()</a> dumps all the information about the allocated memory blocks left
in the <code>.memdump</code> file</li>
<li>
<a href="http://xmlsoft.org/html/libxml-xmlmemory.html">xmlMemoryDump
()</a>
dumps all the information about the allocated memory blocks left in the
<code>.memdump</code> file</li>
</ul>
<p>When developing libxml, memory debugging is enabled: the test programs call
xmlMemoryDump () and the "make test" regression tests will check for any
xmlMemoryDump () and the &quot;make test&quot; regression tests will check for any
memory leak during the full regression test sequence. This helps a lot in
ensuring that libxml does not leak memory and in bulletproofing memory allocation
use (some libc implementations are known to be far too permissive, resulting in
major portability problems!). </p>
ensuring that libxml does not leak memory and in bulletproofing memory
allocation use (some libc implementations are known to be far too permissive,
resulting in major portability problems!).</p>
<p>If the .memdump reports a leak, it displays the allocation function and
also tries to give some information about the content and structure of the
allocated blocks left. This is sufficient in most cases to find the culprit,
@ -123,38 +165,33 @@ possible to find more easilly:</p>
allocation and step to see the condition resulting in the missing
deallocation.</li>
</ol>
<p>I used to use a commercial tool to debug libxml memory problems but after
noticing that it was not detecting memory leaks, that simple mechanism was used
and has proved extremely efficient so far.</p>
<h2><a name="General">General memory requirements</a></h2>
<p>How much memory does libxml require? It's hard to tell on average; it depends on
a number of things:</p>
noticing that it was not detecting memory leaks, that simple mechanism was
used and has proved extremely efficient so far.</p>
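<p>The mechanism described in this section, recording where each block was allocated and numbering the blocks so that a leaked one can be traced, can be sketched as follows. All names (<code>malloc_loc()</code>, <code>free_loc()</code>, <code>blocks_left()</code>, <code>first_leak()</code>, <code>MALLOC_HERE</code>) are hypothetical and the fixed-size table is a simplification; this is not the xmlmemory.c implementation.</p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record for one allocated block. */
typedef struct {
    void *ptr;
    const char *file;      /* source file of the allocation call */
    int line;              /* source line of the allocation call */
    unsigned long number;  /* allocation sequence number */
} block_info;

#define MAX_TRACKED 64
static block_info tracked[MAX_TRACKED];
static unsigned long next_number = 0;

/* Allocate a block and remember where the request came from. */
static void *malloc_loc(size_t size, const char *file, int line) {
    int i;
    void *p;

    assert(file != NULL);
    for (i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].ptr == NULL)
            break;
    if (i == MAX_TRACKED)
        return NULL;       /* tracking table full */
    p = malloc(size);
    if (p == NULL)
        return NULL;
    tracked[i].ptr = p;
    tracked[i].file = file;
    tracked[i].line = line;
    tracked[i].number = ++next_number;
    return p;
}

/* Release a block and forget its record. */
static void free_loc(void *p) {
    int i;

    if (p == NULL)
        return;
    for (i = 0; i < MAX_TRACKED; i++) {
        if (tracked[i].ptr == p) {
            tracked[i].ptr = NULL;
            break;
        }
    }
    free(p);
}

/* Number of blocks still allocated, i.e. what a dump would list. */
static int blocks_left(void) {
    int i, count = 0;

    for (i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].ptr != NULL)
            count++;
    return count;
}

/* Sequence number of the oldest block still allocated, 0 if none. */
static unsigned long first_leak(void) {
    int i;
    unsigned long best = 0;

    for (i = 0; i < MAX_TRACKED; i++)
        if (tracked[i].ptr != NULL && (best == 0 || tracked[i].number < best))
            best = tracked[i].number;
    return best;
}

#define MALLOC_HERE(size) malloc_loc((size), __FILE__, __LINE__)
```

<p>Because the sequence numbers are deterministic for a given run, the number reported for a leaked block can be used on the next run to stop exactly when that block is allocated.</p>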
<h3><a name="General4">General memory requirements</a></h3>
<p>How much memory does libxml require? It's hard to tell on average; it depends
on a number of things:</p>
<ul>
<li>the parser itself should work in a fixed amount of memory, except for
information maintained about the stacks of names and entities locations.
The I/O and encoding handlers will probably account for a few KBytes. This
is true for both the XML and HTML parser (though the HTML parser needs more
state).</li>
The I/O and encoding handlers will probably account for a few KBytes.
This is true for both the XML and HTML parser (though the HTML parser
needs more state).</li>
<li>If you are generating the DOM tree then memory requirements will grow
nearly linearly with the size of the data. In general for a balanced
textual document the internal memory requirement is about 4 times the size
of the UTF8 serialization of this document (for example, the XML-1.0
textual document the internal memory requirement is about 4 times the
size of the UTF8 serialization of this document (for example, the XML-1.0
recommendation is a bit more than 150KBytes and takes 650KBytes of main
memory when parsed). Validation will add an amount of memory required for
maintaining the external Dtd state which should be linear with the
complexity of the content model defined by the Dtd</li>
<li>If you don't care about the advanced features of libxml like validation,
DOM, XPath or XPointer, but really need to work fixed memory requirements,
then the SAX interface should be used.</li>
<li>If you don't care about the advanced features of libxml like
validation, DOM, XPath or XPointer, but really need to work fixed memory
requirements, then the SAX interface should be used.</li>
</ul>
<p></p>
<p>
<p><a href="mailto:daniel@veillard.com">Daniel Veillard</a></p>
<p>$Id$</p>
</td></tr></table></td></tr></table></td></tr></table></td>
</tr></table></td></tr></table>
</body>
</html>