mirror of
https://gitlab.gnome.org/GNOME/libxml2.git
synced 2025-08-07 06:43:02 +03:00
- doc/xml.html: applied patch from Ankh
Daniel
@@ -1,3 +1,7 @@
|
|||||||
|
Mon Feb 26 09:30:23 CET 2001 Daniel Veillard <Daniel.Veillard@imag.fr>
|
||||||
|
|
||||||
|
* doc/xml.html: applied patch from Ankh
|
||||||
|
|
||||||
Mon Feb 26 03:34:43 CET 2001 Daniel Veillard <Daniel.Veillard@imag.fr>
|
Mon Feb 26 03:34:43 CET 2001 Daniel Veillard <Daniel.Veillard@imag.fr>
|
||||||
|
|
||||||
* xinclude.c: fixed a problem building on Mac
|
* xinclude.c: fixed a problem building on Mac
|
||||||
|
139
doc/xml.html
@@ -70,17 +70,17 @@ structured documents/data.</p>
|
|||||||
<ul>
|
<ul>
|
||||||
<li>Libxml exports Push and Pull type parser interfaces for both XML and
|
<li>Libxml exports Push and Pull type parser interfaces for both XML and
|
||||||
HTML.</li>
|
HTML.</li>
|
||||||
<li>Libxml can do Dtd validation at parse time, using a parsed document
|
<li>Libxml can do DTD validation at parse time, using a parsed document
|
||||||
instance, or with an arbitrary Dtd.</li>
|
instance, or with an arbitrary DTD.</li>
|
||||||
<li>Libxml now includes a nearly complete <a
|
<li>Libxml now includes nearly complete <a
|
||||||
href="http://www.w3.org/TR/xpath">XPath</a> and <a
|
href="http://www.w3.org/TR/xpath">XPath</a> and <a
|
||||||
href="http://www.w3.org/TR/xptr">XPointer</a> implementations.</li>
|
href="http://www.w3.org/TR/xptr">XPointer</a> implementations.</li>
|
||||||
<li>It is written in plain C, making as few assumptions as possible, and
|
<li>It is written in plain C, making as few assumptions as possible, and
|
||||||
sticking closely to ANSI C/POSIX for easy embedding. Works on
|
sticking closely to ANSI C/POSIX for easy embedding. Works on
|
||||||
Linux/Unix/Windows, ported to a number of other platforms.</li>
|
Linux/Unix/Windows, ported to a number of other platforms.</li>
|
||||||
<li>Basic support for HTTP and FTP client allowing to fetch remote
|
<li>Basic support for HTTP and FTP client allowing aplications to fetch remote
|
||||||
resources</li>
|
resources</li>
|
||||||
<li>The design of modular, most of the extensions can be compiled out.</li>
|
<li>The design is modular, most of the extensions can be compiled out.</li>
|
||||||
<li>The internal document repesentation is as close as possible to the <a
|
<li>The internal document repesentation is as close as possible to the <a
|
||||||
href="http://www.w3.org/DOM/">DOM</a> interfaces.</li>
|
href="http://www.w3.org/DOM/">DOM</a> interfaces.</li>
|
||||||
<li>Libxml also has a <a href="http://www.megginson.com/SAX/index.html">SAX
|
<li>Libxml also has a <a href="http://www.megginson.com/SAX/index.html">SAX
|
||||||
@@ -113,7 +113,7 @@ structured documents/data.</p>
|
|||||||
href="http://www-4.ibm.com/software/developer/library/gnome3/">an article
|
href="http://www-4.ibm.com/software/developer/library/gnome3/">an article
|
||||||
for IBM developerWorks</a> about using libxml.</li>
|
for IBM developerWorks</a> about using libxml.</li>
|
||||||
<li>It is also a good idea to check to <a href="mailto:raph@levien.com">Raph
|
<li>It is also a good idea to check to <a href="mailto:raph@levien.com">Raph
|
||||||
Levien</a> <a href="http://levien.com/gnome/">web site</a> since he is
|
Levien</a>'s <a href="http://levien.com/gnome/">web site</a> since he is
|
||||||
building the <a href="http://levien.com/gnome/gdome.html">DOM interface
|
building the <a href="http://levien.com/gnome/gdome.html">DOM interface
|
||||||
gdome</a> on top of libxml result tree and an implementation of <a
|
gdome</a> on top of libxml result tree and an implementation of <a
|
||||||
href="http://www.w3.org/Graphics/SVG/">SVG</a> called <a
|
href="http://www.w3.org/Graphics/SVG/">SVG</a> called <a
|
||||||
@@ -148,10 +148,10 @@ href="mailto:majordomo@rpmfind.net">majordomo@rpmfind.net</a> with "subscribe
|
|||||||
xml" in the <strong>content</strong> of the message.</p>
|
xml" in the <strong>content</strong> of the message.</p>
|
||||||
|
|
||||||
<p>Alternatively, you can just send the bug to the <a
|
<p>Alternatively, you can just send the bug to the <a
|
||||||
href="mailto:xml@rpmfind.net">xml@rpmfind.net</a> list, if it's really libxml
|
href="mailto:xml@rpmfind.net">xml@rpmfind.net</a> list; if it's really libxml
|
||||||
related I will approve it..</p>
|
related I will approve it..</p>
|
||||||
|
|
||||||
<p>Of course, bugs reports with a suggested patch for fixing them will
|
<p>Of course, bugs reported with a suggested patch for fixing them will
|
||||||
probably be processed faster.</p>
|
probably be processed faster.</p>
|
||||||
|
|
||||||
<p>If you're looking for help, a quick look at <a
|
<p>If you're looking for help, a quick look at <a
|
||||||
@@ -173,7 +173,7 @@ database:</a>:</p>
|
|||||||
<li>provide the diffs when you port libxml to a new platform. They may not
|
<li>provide the diffs when you port libxml to a new platform. They may not
|
||||||
be integrated in all cases but help pinpointing portability problems
|
be integrated in all cases but help pinpointing portability problems
|
||||||
and</li>
|
and</li>
|
||||||
<li>provice documentation fixes (either as patches to the code comments or
|
<li>provide documentation fixes (either as patches to the code comments or
|
||||||
as HTML diffs).</li>
|
as HTML diffs).</li>
|
||||||
<li>provide new documentations pieces (translations, examples, etc ...)</li>
|
<li>provide new documentations pieces (translations, examples, etc ...)</li>
|
||||||
<li>Check the TODO file and try to close one of the items</li>
|
<li>Check the TODO file and try to close one of the items</li>
|
||||||
@@ -227,7 +227,7 @@ platform, get in touch with me to upload the package. I will keep them in the
|
|||||||
href="http://cvs.gnome.org/lxr/source/gnome-xml/ChangeLog">Changelog</a> file
|
href="http://cvs.gnome.org/lxr/source/gnome-xml/ChangeLog">Changelog</a> file
|
||||||
for a really accurate description</h3>
|
for a really accurate description</h3>
|
||||||
|
|
||||||
<p>Item floating around but not actively worked on, get in touch with me if
|
<p>Items floating around but not actively worked on, get in touch with me if
|
||||||
you want to test those</p>
|
you want to test those</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>Implementing <a href="http://xmlsoft.org/XSLT">XSLT</a>, this is done as
|
<li>Implementing <a href="http://xmlsoft.org/XSLT">XSLT</a>, this is done as
|
||||||
@@ -666,21 +666,22 @@ href="http://cvs.gnome.org/lxr/source/libxslt/ChangeLog">Changelog</a></p>
|
|||||||
|
|
||||||
<h2>An overview of libxml architecture</h2>
|
<h2>An overview of libxml architecture</h2>
|
||||||
|
|
||||||
<p>Libxml is made of multiple components, some of them optionals, and most of
|
<p>Libxml is made of multiple components; some of them are optional,
|
||||||
|
and most of
|
||||||
the block interfaces are public. The main components are:</p>
|
the block interfaces are public. The main components are:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>an Input/Output layer</li>
|
<li>an Input/Output layer</li>
|
||||||
<li>FTP and HTTP client layers (optionnal)</li>
|
<li>FTP and HTTP client layers (optional)</li>
|
||||||
<li>an Internationalization layer managing the encodings support</li>
|
<li>an Internationalization layer managing the encodings support</li>
|
||||||
<li>an URI module</li>
|
<li>a URI module</li>
|
||||||
<li>the XML parser and its basic SAX interface</li>
|
<li>the XML parser and its basic SAX interface</li>
|
||||||
<li>an HTML parser using the same SAX interface (optionnal)</li>
|
<li>an HTML parser using the same SAX interface (optional)</li>
|
||||||
<li>a SAX tree module to build an in-memory DOM representation</li>
|
<li>a SAX tree module to build an in-memory DOM representation</li>
|
||||||
<li>a tree module to manipulate the DOM representation</li>
|
<li>a tree module to manipulate the DOM representation</li>
|
||||||
<li>a validation module using the DOM representation (optionnal)</li>
|
<li>a validation module using the DOM representation (optional)</li>
|
||||||
<li>an XPath module for global lookup in a DOM representation
|
<li>an XPath module for global lookup in a DOM representation
|
||||||
(optionnal)</li>
|
(optional)</li>
|
||||||
<li>a debug module (optionnal)</li>
|
<li>a debug module (optional)</li>
|
||||||
</ul>
|
</ul>
|
||||||
|
|
||||||
<p>Graphically this gives the following:</p>
|
<p>Graphically this gives the following:</p>
|
||||||
@@ -697,7 +698,7 @@ returned is an <strong>xmlDocPtr</strong> (i.e., a pointer to an
|
|||||||
as the file name, the document type, and a <strong>children</strong> pointer
|
as the file name, the document type, and a <strong>children</strong> pointer
|
||||||
which is the root of the document (or more exactly the first child under the
|
which is the root of the document (or more exactly the first child under the
|
||||||
root which is the document). The tree is made of <strong>xmlNode</strong>s,
|
root which is the document). The tree is made of <strong>xmlNode</strong>s,
|
||||||
chained in double-linked lists of siblings and with children<->parent
|
chained in double-linked lists of siblings and with a children<->parent
|
||||||
relationship. An xmlNode can also carry properties (a chain of xmlAttr
|
relationship. An xmlNode can also carry properties (a chain of xmlAttr
|
||||||
structures). An attribute may have a value which is a list of TEXT or
|
structures). An attribute may have a value which is a list of TEXT or
|
||||||
ENTITY_REF nodes.</p>
|
ENTITY_REF nodes.</p>
|
||||||
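The paragraph in this hunk describes the tree as xmlNodes chained in double-linked lists of siblings with a children<->parent relationship. A minimal, self-contained sketch of that kind of linkage could look as follows. The `sketch_node` type and the `sketch_*` functions are hypothetical names made up for illustration; they are not the real `xmlNode` definition or any libxml function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type, loosely modelled on the prose description;
 * NOT the real xmlNode structure from libxml. */
typedef struct sketch_node {
    const char *name;
    struct sketch_node *parent;    /* owning node, NULL for the top node */
    struct sketch_node *children;  /* first child */
    struct sketch_node *last;      /* last child */
    struct sketch_node *next;      /* next sibling */
    struct sketch_node *prev;      /* previous sibling */
} sketch_node;

/* Append child at the end of parent's child list, keeping the
 * sibling links consistent in both directions. */
static void sketch_add_child(sketch_node *parent, sketch_node *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->next = NULL;
    child->prev = parent->last;
    if (parent->last != NULL)
        parent->last->next = child;
    else
        parent->children = child;
    parent->last = child;
}

/* Count the direct children by walking the sibling list. */
static int sketch_count_children(const sketch_node *parent)
{
    int n = 0;
    const sketch_node *cur;

    for (cur = parent->children; cur != NULL; cur = cur->next)
        n++;
    return n;
}

/* Small self-check: build EXAMPLE -> (head, chapter) and verify links. */
static int sketch_demo(void)
{
    sketch_node root = { "EXAMPLE", NULL, NULL, NULL, NULL, NULL };
    sketch_node head = { "head", NULL, NULL, NULL, NULL, NULL };
    sketch_node chapter = { "chapter", NULL, NULL, NULL, NULL, NULL };

    sketch_add_child(&root, &head);
    sketch_add_child(&root, &chapter);
    assert(root.children == &head && head.next == &chapter);
    assert(chapter.prev == &head && chapter.parent == &root);
    return sketch_count_children(&root);
}
```

Walking `children` and then `next` is the same pattern the document uses in expressions such as `doc->children->children->next`.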
@@ -711,7 +712,7 @@ should be only one ELEMENT under the root):</p>
|
|||||||
called <strong>xmllint</strong> which parses XML files given as argument and
|
called <strong>xmllint</strong> which parses XML files given as argument and
|
||||||
prints them back as parsed. This is useful for detecting errors both in XML
|
prints them back as parsed. This is useful for detecting errors both in XML
|
||||||
code and in the XML parser itself. It has an option <strong>--debug</strong>
|
code and in the XML parser itself. It has an option <strong>--debug</strong>
|
||||||
which prints the actual in-memory structure of the document, here is the
|
which prints the actual in-memory structure of the document; here is the
|
||||||
result with the <a href="#example">example</a> given before:</p>
|
result with the <a href="#example">example</a> given before:</p>
|
||||||
<pre>DOCUMENT
|
<pre>DOCUMENT
|
||||||
version=1.0
|
version=1.0
|
||||||
@@ -800,7 +801,7 @@ SAX.characters( , 1)
|
|||||||
SAX.endElement(EXAMPLE)
|
SAX.endElement(EXAMPLE)
|
||||||
SAX.endDocument()</pre>
|
SAX.endDocument()</pre>
|
||||||
|
|
||||||
<p>Most of the other functionalities of libxml are based on the DOM
|
<p>Most of the other interfaces of libxml are based on the DOM
|
||||||
tree-building facility, so nearly everything up to the end of this document
|
tree-building facility, so nearly everything up to the end of this document
|
||||||
presupposes the use of the standard DOM tree build. Note that the DOM tree
|
presupposes the use of the standard DOM tree build. Note that the DOM tree
|
||||||
itself is built by a set of registered default callbacks, without internal
|
itself is built by a set of registered default callbacks, without internal
|
||||||
@@ -841,7 +842,7 @@ failure).</p>
|
|||||||
|
|
||||||
<h3 id="Invoking1">Invoking the parser: the push method</h3>
|
<h3 id="Invoking1">Invoking the parser: the push method</h3>
|
||||||
|
|
||||||
<p>In order for the application to keep the control when the document is been
|
<p>In order for the application to keep the control when the document is being
|
||||||
fetched (which is common for GUI based programs) libxml provides a push
|
fetched (which is common for GUI based programs) libxml provides a push
|
||||||
interface, too, as of version 1.8.3. Here are the interface functions:</p>
|
interface, too, as of version 1.8.3. Here are the interface functions:</p>
|
||||||
<pre>xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
|
<pre>xmlParserCtxtPtr xmlCreatePushParserCtxt(xmlSAXHandlerPtr sax,
|
||||||
@@ -876,18 +877,19 @@ int xmlParseChunk (xmlParserCtxtPtr ctxt,
|
|||||||
}
|
}
|
||||||
}</pre>
|
}</pre>
|
||||||
|
|
||||||
<p>Also note that the HTML parser embedded into libxml also has a push
|
<p>The HTML parser embedded into libxml also has a push
|
||||||
interface; the functions are just prefixed by "html" rather than "xml"</p>
|
interface; the functions are just prefixed by "html" rather than "xml".</p>
|
||||||
|
|
||||||
<h3 id="Invoking2">Invoking the parser: the SAX interface</h3>
|
<h3 id="Invoking2">Invoking the parser: the SAX interface</h3>
|
||||||
|
|
||||||
<p>A couple of comments can be made, first this mean that the parser is
|
<p>The tree-building interface makes the parser
|
||||||
memory-hungry, first to load the document in memory, second to build the tree.
|
memory-hungry, first loading the document in memory and then building
|
||||||
|
the tree itself.
|
||||||
Reading a document without building the tree is possible using the SAX
|
Reading a document without building the tree is possible using the SAX
|
||||||
interfaces (see SAX.h and <a
|
interfaces (see SAX.h and <a
|
||||||
href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">James
|
href="http://www.daa.com.au/~james/gnome/xml-sax/xml-sax.html">James
|
||||||
Henstridge's documentation</a>). Note also that the push interface can be
|
Henstridge's documentation</a>). Note also that the push interface can be
|
||||||
limited to SAX. Just use the two first arguments of
|
limited to SAX: just use the two first arguments of
|
||||||
<code>xmlCreatePushParserCtxt()</code>.</p>
|
<code>xmlCreatePushParserCtxt()</code>.</p>
|
||||||
|
|
||||||
<h3><a name="Building">Building a tree from scratch</a></h3>
|
<h3><a name="Building">Building a tree from scratch</a></h3>
|
||||||
@@ -925,14 +927,14 @@ example:</p>
|
|||||||
<pre><code>doc->children->children->children</code></pre>
|
<pre><code>doc->children->children->children</code></pre>
|
||||||
|
|
||||||
<p>points to the title element,</p>
|
<p>points to the title element,</p>
|
||||||
<pre>doc->children->children->next->child->child</pre>
|
<pre>doc->children->children->next->children->children</pre>
|
||||||
|
|
||||||
<p>points to the text node containing the chapter title "The Linux
|
<p>points to the text node containing the chapter title "The Linux
|
||||||
adventure".</p>
|
adventure".</p>
|
||||||
|
|
||||||
<p><strong>NOTE</strong>: XML allows <em>PI</em>s and <em>comments</em> to be
|
<p><strong>NOTE</strong>: XML allows <em>PI</em>s and <em>comments</em> to be
|
||||||
present before the document root, so <code>doc->children</code> may point
|
present before the document root, so <code>doc->children</code> may point
|
||||||
to an element which is not the document Root Element, a function
|
to an element which is not the document Root Element; a function
|
||||||
<code>xmlDocGetRootElement()</code> was added for this purpose.</p>
|
<code>xmlDocGetRootElement()</code> was added for this purpose.</p>
|
||||||
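The NOTE in this hunk says that PIs and comments may precede the root element, so the first child of the document is not necessarily the root. The idea behind a lookup like `xmlDocGetRootElement()` can be sketched as a scan over the top-level siblings for the first element node. The code below is only an illustration under that assumption: the `sk_*` types and functions are hypothetical and do not reproduce libxml's actual implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node kinds and node type, for illustration only. */
typedef enum { SK_ELEMENT, SK_COMMENT, SK_PI } sk_type;

typedef struct sk_node {
    sk_type type;
    const char *name;
    struct sk_node *next;  /* next top-level sibling */
} sk_node;

/* Return the first element node in a sibling list, skipping
 * comments and processing instructions; NULL if there is none. */
static sk_node *sk_first_element(sk_node *first)
{
    sk_node *cur;

    for (cur = first; cur != NULL; cur = cur->next)
        if (cur->type == SK_ELEMENT)
            return cur;
    return NULL;
}

/* Small self-check: comment, PI, then the EXAMPLE element. */
static int sk_root_demo(void)
{
    sk_node root = { SK_ELEMENT, "EXAMPLE", NULL };
    sk_node pi = { SK_PI, "xml-stylesheet", &root };
    sk_node comment = { SK_COMMENT, "comment", &pi };

    assert(sk_first_element(&comment) == &root);
    assert(sk_first_element(NULL) == NULL);
    return 0;
}
```

In application code, calling the library function is preferable to walking `doc->children` by hand.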
|
|
||||||
<h3><a name="Modifying">Modifying the tree</a></h3>
|
<h3><a name="Modifying">Modifying the tree</a></h3>
|
||||||
@@ -959,7 +961,7 @@ elements:</p>
|
|||||||
<dl>
|
<dl>
|
||||||
<dt><code>xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc, const xmlChar
|
<dt><code>xmlNodePtr xmlStringGetNodeList(xmlDocPtr doc, const xmlChar
|
||||||
*value);</code></dt>
|
*value);</code></dt>
|
||||||
<dd><p>This function takes an "external" string and convert it to one text
|
<dd><p>This function takes an "external" string and converts it to one text
|
||||||
node or possibly to a list of entity and text nodes. All non-predefined
|
node or possibly to a list of entity and text nodes. All non-predefined
|
||||||
entity references like &Gnome; will be stored internally as entity
|
entity references like &Gnome; will be stored internally as entity
|
||||||
nodes, hence the result of the function may not be a single node.</p>
|
nodes, hence the result of the function may not be a single node.</p>
|
||||||
@@ -974,8 +976,7 @@ elements:</p>
|
|||||||
argument inLine. If this argument is set to 1, the function will expand
|
argument inLine. If this argument is set to 1, the function will expand
|
||||||
entity references. For example, instead of returning the &Gnome;
|
entity references. For example, instead of returning the &Gnome;
|
||||||
XML encoding in the string, it will substitute it with its value (say,
|
XML encoding in the string, it will substitute it with its value (say,
|
||||||
"GNU Network Object Model Environment"). Set this argument if you want
|
"GNU Network Object Model Environment").</p>
|
||||||
to use the string for non-XML usage like User Interface.</p>
|
|
||||||
</dd>
|
</dd>
|
||||||
</dl>
|
</dl>
|
||||||
|
|
||||||
@@ -1043,7 +1044,7 @@ beginning). Example:</p>
|
|||||||
7 </EXAMPLE></pre>
|
7 </EXAMPLE></pre>
|
||||||
|
|
||||||
<p>Line 3 declares the xml entity. Line 6 uses the xml entity, by prefixing
|
<p>Line 3 declares the xml entity. Line 6 uses the xml entity, by prefixing
|
||||||
it's name with '&' and following it by ';' without any spaces added. There
|
its name with '&' and following it by ';' without any spaces added. There
|
||||||
are 5 predefined entities in libxml allowing you to escape charaters with
|
are 5 predefined entities in libxml allowing you to escape charaters with
|
||||||
predefined meaning in some parts of the xml document content:
|
predefined meaning in some parts of the xml document content:
|
||||||
<strong>&lt;</strong> for the character '<', <strong>&gt;</strong>
|
<strong>&lt;</strong> for the character '<', <strong>&gt;</strong>
|
||||||
@@ -1089,16 +1090,16 @@ suggest that you keep the non-substituting default behaviour and avoid using
|
|||||||
entities in your XML document or data if you are not willing to handle the
|
entities in your XML document or data if you are not willing to handle the
|
||||||
entity references elements in the DOM tree.</p>
|
entity references elements in the DOM tree.</p>
|
||||||
|
|
||||||
<p>Note that at save time libxml enforce the conversion of the predefined
|
<p>Note that at save time libxml enforces the conversion of the predefined
|
||||||
entities where necessary to prevent well-formedness problems, and will also
|
entities where necessary to prevent well-formedness problems, and will also
|
||||||
transparently replace those with chars (i.e., it will not generate entity
|
transparently replace those with chars (i.e. it will not generate entity
|
||||||
reference elements in the DOM tree or call the reference() SAX callback when
|
reference elements in the DOM tree or call the reference() SAX callback when
|
||||||
finding them in the input).</p>
|
finding them in the input).</p>
|
||||||
|
|
||||||
<p><span style="background-color: #FF0000">WARNING</span>: handling entities
|
<p><span style="background-color: #FF0000">WARNING</span>: handling entities
|
||||||
on top of libxml SAX interface is difficult !!! If you plan to use
|
on top of the libxml SAX interface is difficult!!! If you plan to use
|
||||||
non-predefined entities in your documents, then the learning cuvre to handle
|
non-predefined entities in your documents, then the learning cuvre to handle
|
||||||
then using the SAX API may be long. If you plan to use complex document, I
|
then using the SAX API may be long. If you plan to use complex documents, I
|
||||||
strongly suggest you consider using the DOM interface instead and let libxml
|
strongly suggest you consider using the DOM interface instead and let libxml
|
||||||
deal with the complexity rather than trying to do it yourself.</p>
|
deal with the complexity rather than trying to do it yourself.</p>
|
||||||
|
|
||||||
@@ -1115,15 +1116,15 @@ equality operation at the user level.</p>
|
|||||||
<p>I suggest that people using libxml use a namespace, and declare it in the
|
<p>I suggest that people using libxml use a namespace, and declare it in the
|
||||||
root element of their document as the default namespace. Then they don't need
|
root element of their document as the default namespace. Then they don't need
|
||||||
to use the prefix in the content but we will have a basis for future semantic
|
to use the prefix in the content but we will have a basis for future semantic
|
||||||
refinement and merging of data from different sources. This doesn't augment
|
refinement and merging of data from different sources. This doesn't increase
|
||||||
significantly the size of the XML output, but significantly increase its value
|
the size of the XML output significantly, but significantly increases its value
|
||||||
in the long-term. Example:</p>
|
in the long-term. Example:</p>
|
||||||
<pre><mydoc xmlns="http://mydoc.example.org/schemas/">
|
<pre><mydoc xmlns="http://mydoc.example.org/schemas/">
|
||||||
<elem1>...</elem1>
|
<elem1>...</elem1>
|
||||||
<elem2>...</elem2>
|
<elem2>...</elem2>
|
||||||
</mydoc></pre>
|
</mydoc></pre>
|
||||||
|
|
||||||
<p>Concerning the namespace value, this has to be an URL, but the URL doesn't
|
<p>The namespace value has to be an absolute URL, but the URL doesn't
|
||||||
have to point to any existing resource on the Web. It will bind all the
|
have to point to any existing resource on the Web. It will bind all the
|
||||||
element and atributes with that URL. I suggest to use an URL within a domain
|
element and atributes with that URL. I suggest to use an URL within a domain
|
||||||
you control, and that the URL should contain some kind of version information
|
you control, and that the URL should contain some kind of version information
|
||||||
@@ -1135,22 +1136,22 @@ version-independent prefix is installed on the root element of your document,
|
|||||||
and if the version information don't match something you know, warn the user
|
and if the version information don't match something you know, warn the user
|
||||||
and be liberal in what you accept as the input. Also do *not* try to base
|
and be liberal in what you accept as the input. Also do *not* try to base
|
||||||
namespace checking on the prefix value. <foo:text> may be exactly the
|
namespace checking on the prefix value. <foo:text> may be exactly the
|
||||||
same as <bar:text> in another document. What really matter is the URI
|
same as <bar:text> in another document. What really matters is the URI
|
||||||
associated with the element or the attribute, not the prefix string (which is
|
associated with the element or the attribute, not the prefix string (which is
|
||||||
just a shortcut for the full URI). In libxml element and attributes have a
|
just a shortcut for the full URI). In libxml, element and attributes have an
|
||||||
<code>ns</code> field pointing to an xmlNs structure detailing the namespace
|
<code>ns</code> field pointing to an xmlNs structure detailing the namespace
|
||||||
prefix and it's URI.</p>
|
prefix and its URI.</p>
|
||||||
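The advice in this hunk is that namespace checks should compare the URI, never the prefix, since <foo:text> and <bar:text> may be the same element in different documents. A minimal sketch of such a comparison is given below. The `sk_ns` structure and `sk_same_namespace()` are hypothetical stand-ins written for this note; they only assume that a namespace record carries a prefix and a URI, as the text says of the xmlNs structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical namespace record: a prefix and the URI it is bound to. */
typedef struct sk_ns {
    const char *prefix;  /* NULL for a default namespace */
    const char *href;    /* namespace URI, must not be NULL */
} sk_ns;

/* Two namespaces are the same when their URIs match; the prefix
 * is deliberately ignored. Two missing namespaces compare equal. */
static int sk_same_namespace(const sk_ns *a, const sk_ns *b)
{
    if (a == NULL || b == NULL)
        return a == b;
    return strcmp(a->href, b->href) == 0;
}

/* Small self-check with differing prefixes and differing URIs. */
static int sk_ns_demo(void)
{
    sk_ns foo = { "foo", "http://mydoc.example.org/schemas/" };
    sk_ns bar = { "bar", "http://mydoc.example.org/schemas/" };
    sk_ns other = { "foo", "http://other.example.org/schemas/" };

    assert(sk_same_namespace(&foo, &bar));     /* same URI, other prefix */
    assert(!sk_same_namespace(&foo, &other));  /* same prefix, other URI */
    return 0;
}
```

Within a single parsed document, the pointer test `cur->ns == ns` mentioned elsewhere in the page is cheaper; a string comparison like this one is only needed when the records come from different places.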
|
|
||||||
<p>@@Interfaces@@</p>
|
<p>@@Interfaces@@</p>
|
||||||
|
|
||||||
<p>@@Examples@@</p>
|
<p>@@Examples@@</p>
|
||||||
|
|
||||||
<p>Usually people object using namespace in the case of validation, I object
|
<p>Usually people object to using namespaces together with validity checking.
|
||||||
this and will make sure that using namespaces won't break validity checking,
|
I will try to make sure that using namespaces won't break validity checking,
|
||||||
so even is you plan to use or currently are using validation I strongly
|
so even if you plan to use or currently are using validation I strongly
|
||||||
suggest adding namespaces to your document. A default namespace scheme
|
suggest adding namespaces to your document. A default namespace scheme
|
||||||
<code>xmlns="http://...."</code> should not break validity even on less
|
<code>xmlns="http://...."</code> should not break validity even on less
|
||||||
flexible parsers. Now using namespace to mix and differentiate content coming
|
flexible parsers. Using namespaces to mix and differentiate content coming
|
||||||
from multiple DTDs will certainly break current validation schemes. I will try
|
from multiple DTDs will certainly break current validation schemes. I will try
|
||||||
to provide ways to do this, but this may not be portable or standardized.</p>
|
to provide ways to do this, but this may not be portable or standardized.</p>
|
||||||
|
|
||||||
@@ -1159,24 +1160,26 @@ to provide ways to do this, but this may not be portable or standardized.</p>
|
|||||||
<p>Well what is validation and what is a DTD ?</p>
|
<p>Well what is validation and what is a DTD ?</p>
|
||||||
|
|
||||||
<p>Validation is the process of checking a document against a set of
|
<p>Validation is the process of checking a document against a set of
|
||||||
construction rules, a <strong>DTD</strong> (Document Type Definition) is such
|
construction rules; a <strong>DTD</strong> (Document Type Definition) is such
|
||||||
a set of rules.</p>
|
a set of rules.</p>
|
||||||
|
|
||||||
<p>The validation process and building DTDs are the two most difficult parts
|
<p>The validation process and building DTDs are the two most difficult parts
|
||||||
of XML life cycle. Briefly a DTD defines all the possibles element to be
|
of the XML life cycle. Briefly a DTD defines all the possibles element to be
|
||||||
found within your document, what is the formal shape of your document tree (by
|
found within your document, what is the formal shape of your document tree (by
|
||||||
defining the allowed content of an element, either text, a regular expression
|
defining the allowed content of an element, either text, a regular expression
|
||||||
for the allowed list of children, or mixed content i.e. both text and
|
for the allowed list of children, or mixed content i.e. both text and
|
||||||
children). The DTD also defines the allowed attributes for all elements and
|
children). The DTD also defines the allowed attributes for all elements and
|
||||||
the types of the attributes. For more detailed informations, I suggest to read
|
the types of the attributes. For more detailed information,
|
||||||
|
I suggest that you read
|
||||||
the related parts of the XML specification, the examples found under
|
the related parts of the XML specification, the examples found under
|
||||||
gnome-xml/test/valid/dtd and the large amount of books available on XML. The
|
gnome-xml/test/valid/dtd and any of the
|
||||||
|
large number of books available on XML. The
|
||||||
dia example in gnome-xml/test/valid should be both simple and complete enough
|
dia example in gnome-xml/test/valid should be both simple and complete enough
|
||||||
to allow you to build your own.</p>
|
to allow you to build your own.</p>
|
||||||
|
|
||||||
<p>A word of warning, building a good DTD which will fit your needs of your
|
<p>A word of warning, building a good DTD which will fit the needs of your
|
||||||
application in the long-term is far from trivial, however the extra level of
|
application in the long-term is far from trivial; however, the extra level of
|
||||||
quality it can insure is well worth the price for some sets of applications or
|
quality it can ensure is well worth the price for some sets of applications or
|
||||||
if you already have already a DTD defined for your application field.</p>
|
if you already have already a DTD defined for your application field.</p>
|
||||||
|
|
||||||
<p>The validation is not completely finished but in a (very IMHO) usable
|
<p>The validation is not completely finished but in a (very IMHO) usable
|
||||||
@@ -1202,13 +1205,13 @@ core.</p>
|
|||||||
<h2><a name="DOM"></a><a name="Principles">DOM Principles</a></h2>
|
<h2><a name="DOM"></a><a name="Principles">DOM Principles</a></h2>
|
||||||
|
|
||||||
<p><a href="http://www.w3.org/DOM/">DOM</a> stands for the <em>Document Object
|
<p><a href="http://www.w3.org/DOM/">DOM</a> stands for the <em>Document Object
|
||||||
Model</em> this is an API for accessing XML or HTML structured documents.
|
Model</em>; this is an API for accessing XML or HTML structured documents.
|
||||||
Native support for DOM in Gnome is on the way (module gnome-dom), and it will
|
Native support for DOM in Gnome is on the way (module gnome-dom), and will
|
||||||
be based on gnome-xml. This will be a far cleaner interface to manipulate XML
|
be based on gnome-xml. This will be a far cleaner interface to manipulate XML
|
||||||
files within Gnome since it won't expose the internal structure. DOM defines a
|
files within Gnome since it won't expose the internal structure. DOM defines a
|
||||||
set of IDL (or Java) interfaces allowing to traverse and manipulate a
|
set of IDL (or Java) interfaces allowing you to traverse and manipulate a
|
||||||
document. The DOM library will allow accessing and modifying "live" documents
|
document. The DOM library will allow accessing and modifying "live" documents
|
||||||
presents on other programs like this:</p>
|
present in other programs like this:</p>
|
||||||
|
|
||||||
<p><img src="DOM.gif" alt=" DOM.gif "></p>
|
<p><img src="DOM.gif" alt=" DOM.gif "></p>
|
||||||
|
|
||||||
@@ -1287,14 +1290,14 @@ base</a>:</p>
|
|||||||
</gjob:Helping></pre>
|
</gjob:Helping></pre>
|
||||||
|
|
||||||
<p>While loading the XML file into an internal DOM tree is a matter of calling
|
<p>While loading the XML file into an internal DOM tree is a matter of calling
|
||||||
only a couple of functions, browsing the tree to gather the informations and
|
only a couple of functions, browsing the tree to gather the ata and
|
||||||
generate the internals structures is harder, and more error prone.</p>
|
generate the internal structures is harder, and more error prone.</p>
|
||||||
|
|
||||||
<p>The suggested principle is to be tolerant with respect to the input
|
<p>The suggested principle is to be tolerant with respect to the input
|
||||||
structure. For example, the ordering of the attributes is not significant,
|
structure. For example, the ordering of the attributes is not significant,
|
||||||
Cthe XML specification is clear about it. It's also usually a good idea to not
|
the XML specification is clear about it. It's also usually a good idea not to
|
||||||
be dependent of the orders of the children of a given node, unless it really
|
depend on the order of the children of a given node, unless it really
|
||||||
makes things harder. Here is some code to parse the informations for a
|
makes things harder. Here is some code to parse the information for a
|
||||||
person:</p>
|
person:</p>
|
||||||
<pre>/*
|
<pre>/*
|
||||||
* A person record
|
* A person record
|
||||||
@@ -1339,10 +1342,10 @@ DEBUG("parsePerson\n");
|
|||||||
return(ret);
|
return(ret);
|
||||||
}</pre>
|
}</pre>
|
||||||
|
|
||||||
<p>Here is a couple of things to notice:</p>
|
<p>Here are a couple of things to notice:</p>
|
||||||
<ul>
|
<ul>
|
||||||
<li>Usually a recursive parsing style is the more convenient one, XML data
|
<li>Usually a recursive parsing style is the more convenient one: XML data
|
||||||
being by nature subject to repetitive constructs and usualy exibit highly
|
is by nature subject to repetitive constructs and usually exibits highly
|
||||||
stuctured patterns.</li>
|
stuctured patterns.</li>
|
||||||
<li>The two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em>, i.e.
|
<li>The two arguments of type <em>xmlDocPtr</em> and <em>xmlNsPtr</em>, i.e.
|
||||||
the pointer to the global XML document and the namespace reserved to the
|
the pointer to the global XML document and the namespace reserved to the
|
||||||
@@ -1351,7 +1354,7 @@ DEBUG("parsePerson\n");
|
|||||||
application set of data and test that the element and attributes you're
|
application set of data and test that the element and attributes you're
|
||||||
analyzing actually pertains to your application space. This is done by a
|
analyzing actually pertains to your application space. This is done by a
|
||||||
simple equality test (cur->ns == ns).</li>
|
simple equality test (cur->ns == ns).</li>
|
||||||
<li>To retrieve text and attributes value, it is suggested to use the
|
<li>To retrieve text and attributes value, you can use the
|
||||||
function <em>xmlNodeListGetString</em> to gather all the text and entity
|
function <em>xmlNodeListGetString</em> to gather all the text and entity
|
||||||
reference nodes generated by the DOM output and produce an single text
|
reference nodes generated by the DOM output and produce an single text
|
||||||
string.</li>
|
string.</li>
|
||||||
@@ -1411,7 +1414,7 @@ DEBUG("parseJob\n");
|
|||||||
return(ret);
|
return(ret);
|
||||||
}</pre>
|
}</pre>
|
||||||
|
|
||||||
<p>One can notice that once used to it, writing this kind of code is quite
|
<p>Once you are used to it, writing this kind of code is quite
|
||||||
simple, but boring. Ultimately, it could be possble to write stubbers taking
|
simple, but boring. Ultimately, it could be possble to write stubbers taking
|
||||||
either C data structure definitions, a set of XML examples or an XML DTD and
|
either C data structure definitions, a set of XML examples or an XML DTD and
|
||||||
produce the code needed to import and export the content between C data and
|
produce the code needed to import and export the content between C data and
|
||||||
@@ -1447,6 +1450,6 @@ Gnome CVS base under gnome-xml/example</p>
|
|||||||
|
|
||||||
<p><a href="mailto:Daniel.Veillard@w3.org">Daniel Veillard</a></p>
|
<p><a href="mailto:Daniel.Veillard@w3.org">Daniel Veillard</a></p>
|
||||||
|
|
||||||
<p>$Id: xml.html,v 1.67 2001/02/15 15:55:44 veillard Exp $</p>
|
<p>$Id: xml.html,v 1.68 2001/02/24 17:48:53 veillard Exp $</p>
|
||||||
</body>
|
</body>
|
||||||
</html>
|
</html>
|
||||||
|