mirror of
https://gitlab.gnome.org/GNOME/libxml2.git
synced 2025-10-24 13:33:01 +03:00
- changed the way Windows socket stuff get included - removed an indetermination xmLDecl/PI(xml...) - xmlNewNs wasn't checking for double definition - fixed a problem with dist-hook duplicates - fixed the loading of external entities APIs, now xmlLoadExternalEntity() is used everywhere - now the xhtml spec validates with the xhtml DTD. - error.c: fixed crashes in case of no input stream - added the xhtml spec and dtds to the validation tests and results Daniel
1506 lines
56 KiB
HTML
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "dtds/xhtml1-strict.dtd">
|
|
<?xml-stylesheet href="W3C-PR.css" type="text/css"?>
|
|
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
|
|
<head>
|
|
<title>XHTML 1.0: The Extensible HyperText Markup
|
|
Language</title>
|
|
<link rel="stylesheet"
|
|
href="W3C-PR.css" type="text/css" />
|
|
<style type="text/css">
|
|
span.term { font-style: italic; color: rgb(0, 0, 192) }
|
|
code {
|
|
color: green;
|
|
font-family: monospace;
|
|
font-weight: bold;
|
|
}
|
|
|
|
code.greenmono {
|
|
color: green;
|
|
font-family: monospace;
|
|
font-weight: bold;
|
|
}
|
|
.good {
|
|
border: solid green;
|
|
border-width: 2px;
|
|
color: green;
|
|
font-weight: bold;
|
|
margin-right: 5%;
|
|
margin-left: 0;
|
|
}
|
|
.bad {
|
|
border: solid red;
|
|
border-width: 2px;
|
|
margin-left: 0;
|
|
margin-right: 5%;
|
|
color: rgb(192, 101, 101);
|
|
}
|
|
|
|
img {
|
|
color: white;
|
|
border: none;
|
|
}
|
|
|
|
div.navbar { text-align: center; }
|
|
div.contents {
|
|
background-color: rgb(204,204,255);
|
|
padding: 0.5em;
|
|
border: none;
|
|
margin-right: 5%;
|
|
}
|
|
.tocline { list-style: none; }
|
|
table.exceptions { background-color: rgb(255,255,153); }
|
|
</style>
|
|
</head>
|
|
<body>
|
|
<div class="navbar">
|
|
<a href="#toc">table of contents</a>
|
|
<hr />
|
|
</div>
|
|
<div class="head"><p><a href="http://www.w3.org/"><img class="head"
|
|
src="w3c_home.gif" alt="W3C" /></a></p>
|
|
|
|
<h1 class="head"><a name="title" id="title">XHTML</a><sup>™</sup> 1.0:
|
|
The Extensible HyperText Markup Language</h1>
|
|
|
|
<h2>A Reformulation of HTML 4.0 in XML 1.0</h2>
|
|
|
|
<h3>W3C Proposed Recommendation 10 December 1999</h3>
|
|
|
|
<dl>
|
|
<dt>This version:</dt>
|
|
|
|
<dd><a href=
|
|
"http://www.w3.org/TR/1999/PR-xhtml1-19991210">
|
|
http://www.w3.org/TR/1999/PR-xhtml1-19991210</a> <br />
|
|
(<a href="xhtml1.ps">Postscript version</a>,
|
|
<a href="xhtml1.pdf">PDF version</a>,
|
|
<a href="xhtml1.zip">ZIP archive</a>, or
|
|
<a href="xhtml1.tgz">Gzip'd TAR archive</a>)
|
|
</dd>
|
|
|
|
<dt>Latest version:</dt>
|
|
|
|
<dd><a href="http://www.w3.org/TR/xhtml1">
|
|
http://www.w3.org/TR/xhtml1</a></dd>
|
|
|
|
<dt>Previous versions:</dt>
|
|
|
|
<dd><a href=
|
|
"http://www.w3.org/TR/1999/WD-xhtml1-19991124">
|
|
http://www.w3.org/TR/1999/WD-xhtml1-19991124</a></dd>
|
|
<dd><a href=
|
|
"http://www.w3.org/TR/1999/PR-xhtml1-19990824">
|
|
http://www.w3.org/TR/1999/PR-xhtml1-19990824</a></dd>
|
|
|
|
<dt>Authors:</dt>
|
|
|
|
<dd>See <a href="#acks">acknowledgements</a>.</dd>
|
|
</dl>
|
|
|
|
<p class="copyright"><a href=
|
|
"http://www.w3.org/Consortium/Legal/ipr-notice#Copyright">
|
|
Copyright</a> © 1999 <a href="http://www.w3.org/">W3C</a><sup>®</sup>
|
|
(<a href="http://www.lcs.mit.edu/">MIT</a>, <a href=
|
|
"http://www.inria.fr/">INRIA</a>, <a href=
|
|
"http://www.keio.ac.jp/">Keio</a>), All Rights Reserved. <abbr
|
|
title="World Wide Web Consortium">W3C</abbr> <a
|
|
href=
|
|
"http://www.w3.org/Consortium/Legal/ipr-notice#Legal_Disclaimer">
|
|
liability</a>, <a href=
|
|
"http://www.w3.org/Consortium/Legal/ipr-notice#W3C_Trademarks">
|
|
trademark</a>, <a href=
|
|
"http://www.w3.org/Consortium/Legal/copyright-documents">document
|
|
use</a> and <a href=
|
|
"http://www.w3.org/Consortium/Legal/copyright-software">software
|
|
licensing</a> rules apply.</p>
|
|
<hr />
|
|
</div>
|
|
|
|
<h2 class="notoc">Abstract</h2>
|
|
|
|
<p>This specification defines <abbr title="Extensible Hypertext Markup
|
|
Language">XHTML</abbr> 1.0, a reformulation of HTML
|
|
4.0 as an XML 1.0 application, and three <abbr title="Document Type
|
|
Definition">DTDs</abbr> corresponding to
|
|
the ones defined by HTML 4.0. The semantics of the elements and
|
|
their attributes are defined in the W3C Recommendation for HTML
|
|
4.0. These semantics provide the foundation for future
|
|
extensibility of XHTML. Compatibility with existing HTML user
|
|
agents is possible by following a small set of guidelines.</p>
|
|
|
|
<h2>Status of this document</h2>
|
|
|
|
<p><em>This section describes the status of this document at the time
|
|
of its publication. Other documents may supersede this document. The
|
|
latest status of this document series is maintained at the W3C.</em></p>
|
|
|
|
<p>This specification is a Proposed Recommendation of the HTML Working Group. It is
|
|
a revision of the Proposed Recommendation dated <a
|
|
href= "http://www.w3.org/TR/1999/PR-xhtml1-19990824/">24 August
|
|
1999</a> incorporating changes as a result of comments from the Proposed
|
|
Recommendation review, and
|
|
comments and further deliberations of the W3C HTML Working Group. A
|
|
<a href="xhtml1-diff-19991210.html">diff-marked version</a> from the previous
|
|
proposed recommendation is available for comparison purposes.</p>
|
|
|
|
<p>On 10 December 1999, this document enters a
|
|
<a href="http://www.w3.org/Consortium/Process/#RecsPR">
|
|
Proposed Recommendation</a> review period. From that date until 8 January
|
|
2000,
|
|
W3C Advisory Committee representatives are encouraged
|
|
to review this specification and return comments in their completed
|
|
ballots to w3c-html-review@w3.org. Please send any comments of a
|
|
confidential nature in separate email to w3t-html@w3.org, which is
|
|
visible to the Team only.</p>
|
|
|
|
<p>No sooner than 14 days after the end of the review period, the
|
|
Director will announce the document's disposition: it may become a W3C
|
|
Recommendation (possibly with minor changes), it may revert to Working
|
|
Draft status, or it may be dropped as a W3C work item.</p>
|
|
|
|
<p>Publication as a Proposed Recommendation does not imply endorsement
|
|
by the W3C membership. This is still a draft document and may be
|
|
updated, replaced or obsoleted by other documents at any time. It is
|
|
inappropriate to cite a W3C Proposed Recommendation as other than "work
|
|
in progress."</p>
|
|
|
|
<p>This document has been produced as part of the <a href=
|
|
"http://www.w3.org/MarkUp/">W3C HTML Activity</a>. The goals of
|
|
the <a href="http://www.w3.org/MarkUp/Group/">HTML Working
|
|
Group</a> <i>(<a href="http://cgi.w3.org/MemberAccess/">members
|
|
only</a>)</i> are discussed in the <a href=
|
|
"http://www.w3.org/MarkUp/Group/HTMLcharter">HTML Working Group
|
|
charter</a> <i>(<a href="http://cgi.w3.org/MemberAccess/">members
|
|
only</a>)</i>.</p>
|
|
|
|
<p>A list of current W3C Recommendations and other technical documents
|
|
can be found at <a
|
|
href="http://www.w3.org/TR">http://www.w3.org/TR</a>.</p>
|
|
|
|
<p>Public discussion on <abbr title="HyperText Markup
|
|
Language">HTML</abbr> features takes place on the mailing list <a
|
|
href="mailto:www-html@w3.org"> www-html@w3.org</a> (<a href=
|
|
"http://lists.w3.org/Archives/Public/www-html/">archive</a>). The W3C
|
|
staff contact for work on HTML is <a href= "mailto:dsr@w3.org">Dave
|
|
Raggett</a>.</p>
|
|
|
|
<p>Please report errors in this document to <a
|
|
href="mailto:www-html-editor@w3.org">www-html-editor@w3.org</a>.</p>
|
|
|
|
<p>The list of known errors in this specification is available at <a
|
|
href="http://www.w3.org/1999/12/PR-xhtml1-19991210-errata">http://www.w3.org/1999/12/PR-xhtml1-19991210-errata</a>.</p>
|
|
|
|
<h2 class="notoc"><a id="toc" name="toc">Contents</a></h2>
|
|
|
|
<div class="contents">
|
|
<ul class="toc">
|
|
<li class="tocline">1. <a href="#xhtml">What is XHTML?</a>
|
|
|
|
<ul class="toc">
|
|
<li class="tocline">1.1 <a href="#html4">What is HTML 4.0?</a></li>
|
|
|
|
<li class="tocline">1.2 <a href="#xml">What is XML?</a></li>
|
|
|
|
<li class="tocline">1.3 <a href="#why">Why the need for XHTML?</a></li>
|
|
</ul>
|
|
</li>
|
|
|
|
<li class="tocline">2. <a href="#defs">Definitions</a>
|
|
|
|
<ul class="toc">
|
|
<li class="tocline">2.1 <a href="#terms">Terminology</a></li>
|
|
|
|
<li class="tocline">2.2 <a href="#general">General Terms</a></li>
|
|
</ul>
|
|
</li>
|
|
|
|
<li class="tocline">3. <a href="#normative">Normative Definition of XHTML 1.0</a>
|
|
|
|
|
|
<ul class="toc">
|
|
<li class="tocline">3.1 <a href="#docconf">Document Conformance</a></li>
|
|
|
|
<li class="tocline">3.2 <a href="#uaconf">User Agent Conformance</a></li>
|
|
</ul>
|
|
</li>
|
|
|
|
<li class="tocline">4. <a href="#diffs">Differences with HTML 4.0</a>
|
|
|
|
</li>
|
|
|
|
<li class="tocline">5. <a href="#issues">Compatibility Issues</a>
|
|
|
|
<ul class="toc">
|
|
<li class="tocline">5.1 <a href="#media">Internet Media Types</a></li>
|
|
</ul>
|
|
</li>
|
|
|
|
<li class="tocline">6. <a href="#future">Future Directions</a>
|
|
|
|
<ul class="toc">
|
|
<li class="tocline">6.1 <a href="#mods">Modularizing HTML</a></li>
|
|
|
|
<li class="tocline">6.2 <a href="#extensions">Subsets and Extensibility</a></li>
|
|
|
|
<li class="tocline">6.3 <a href="#profiles">Document Profiles</a></li>
|
|
</ul>
|
|
</li>
|
|
|
|
<li class="tocline"><a href="#dtds">Appendix A. DTDs</a></li>
|
|
|
|
<li class="tocline"><a href="#prohibitions">Appendix B. Element
|
|
Prohibitions</a></li>
|
|
|
|
<li class="tocline"><a href="#guidelines">Appendix C. HTML Compatibility Guidelines</a></li>
|
|
|
|
<li class="tocline"><a href="#acks">Appendix D. Acknowledgements</a></li>
|
|
|
|
<li class="tocline"><a href="#refs">Appendix E. References</a></li>
|
|
</ul>
|
|
</div>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="xhtml" id="xhtml">1. What is XHTML?</a></h1>
|
|
|
|
<p>XHTML is a family of current and future document types and modules that
|
|
reproduce, subset, and extend HTML 4.0 <a href="#ref-html4">[HTML]</a>. XHTML family document types are <abbr title="Extensible Markup Language">XML</abbr> based,
|
|
and ultimately are designed to work in conjunction with XML-based user agents.
|
|
The family and its evolution are
discussed in more detail in the section on <a href="#future">Future
Directions</a>.</p>
|
|
|
|
<p>XHTML 1.0 (this specification) is the first document type in the XHTML
|
|
family. It is a reformulation of the three HTML 4.0 document types as
|
|
applications of XML 1.0 <a href="#ref-xml"> [XML]</a>. It is intended
|
|
to be used as a language for content that is both XML-conforming and, if some
|
|
simple <a href="#guidelines">guidelines</a> are followed,
|
|
operates in HTML 4.0 conforming user agents. Developers who migrate
|
|
their content to XHTML 1.0 will realize the following benefits:</p>
|
|
|
|
<ul>
|
|
<li>XHTML documents are XML conforming. As such, they are readily viewed,
|
|
edited, and validated with standard XML tools.</li>
|
|
<li>XHTML documents can be written
to operate as well as or better than they did before in existing
|
|
HTML 4.0-conforming user agents as well as in new, XHTML 1.0 conforming user
|
|
agents.</li>
|
|
<li>XHTML documents can utilize applications (e.g. scripts and applets) that rely
|
|
upon either the HTML Document Object Model or the XML Document Object Model <a
|
|
href="#ref-dom">[DOM]</a>.</li>
|
|
<li>As the XHTML family evolves, documents conforming to XHTML 1.0 will be more
|
|
likely to interoperate within and among various XHTML environments.</li>
|
|
</ul>
|
|
|
|
<p>The XHTML family is the next step in the evolution of the Internet. By
|
|
migrating to XHTML today, content developers can enter the XML world with all
|
|
of its attendant benefits, while still remaining confident in their
|
|
content's backward and future compatibility.</p>
|
|
|
|
<h2><a name="html4" id="html4">1.1 What is HTML 4.0?</a></h2>
|
|
|
|
<p>HTML 4.0 <a href="#ref-html4">[HTML]</a> is an <abbr title="Standard
|
|
Generalized Markup Language">SGML</abbr> (Standard
|
|
Generalized Markup Language) application conforming to
|
|
International Standard <abbr title="Organization for International
|
|
Standardization">ISO</abbr> 8879, and is widely regarded as the
|
|
standard publishing language of the World Wide Web.</p>
|
|
|
|
<p>SGML is a language for describing markup languages,
|
|
particularly those used in electronic document exchange, document
|
|
management, and document publishing. HTML is an example of a
|
|
language defined in SGML.</p>
|
|
|
|
<p>SGML has been around since the mid-1980s and has remained
|
|
quite stable. Much of this stability stems from the fact that the
|
|
language is both feature-rich and flexible. This flexibility,
|
|
however, comes at a price, and that price is a level of
|
|
complexity that has inhibited its adoption in a diversity of
|
|
environments, including the World Wide Web.</p>
|
|
|
|
<p>HTML, as originally conceived, was to be a language for the
|
|
exchange of scientific and other technical documents, suitable
|
|
for use by non-document specialists. HTML addressed the problem
|
|
of SGML complexity by specifying a small set of structural and
|
|
semantic tags suitable for authoring relatively simple documents.
|
|
In addition to simplifying the document structure, HTML added
|
|
support for hypertext. Multimedia capabilities were added
|
|
later.</p>
|
|
|
|
<p>In a remarkably short space of time, HTML became wildly
|
|
popular and rapidly outgrew its original purpose. Since HTML's
|
|
inception, there has been rapid invention of new elements for use
|
|
within HTML (as a standard) and for adapting HTML to vertical,
|
|
highly specialized, markets. This plethora of new elements has
|
|
led to compatibility problems for documents across different
|
|
platforms.</p>
|
|
|
|
<p>As software and platforms rapidly become more heterogeneous,
it is clear that the suitability of 'classic' HTML
4.0 for use on these platforms is somewhat limited.</p>
|
|
|
|
<h2><a name="xml" id="xml">1.2 What is XML?</a></h2>
|
|
|
|
<p>XML<sup>™</sup> is the acronym for Extensible Markup
Language <a href="#ref-xml">[XML]</a>.</p>
|
|
|
|
<p>XML was conceived as a means of regaining the power and
|
|
flexibility of SGML without most of its complexity. Although a
|
|
restricted form of SGML, XML nonetheless preserves most of SGML's
|
|
power and richness, and yet still retains all of SGML's commonly
|
|
used features.</p>
|
|
|
|
<p>While retaining these beneficial features, XML removes many of
|
|
the more complex features of SGML that make the authoring and
|
|
design of suitable software both difficult and costly.</p>
|
|
|
|
<h2><a name="why" id="why">1.3 Why the need for XHTML?</a></h2>
|
|
|
|
<p>The benefits of migrating to XHTML 1.0 are described above. Some of the
|
|
benefits of migrating to XHTML in general are:</p>
|
|
|
|
<ul>
|
|
<li>Document developers and user agent designers are constantly
|
|
discovering new ways to express their ideas through new markup. In XML, it is
|
|
relatively easy to introduce new elements or additional element
|
|
attributes. The XHTML family is designed to accommodate these extensions
|
|
through XHTML modules and techniques for developing new XHTML-conforming
|
|
modules (described in the forthcoming XHTML Modularization specification).
|
|
These modules will permit the combination of existing and
|
|
new feature sets when developing content and when designing new user
|
|
agents.</li>
|
|
|
|
<li>Alternate ways of accessing the Internet are constantly being
|
|
introduced. Some estimates indicate that by the year 2002, 75% of
|
|
Internet document viewing will be carried out on these alternate
|
|
platforms. The XHTML family is designed with general user agent
|
|
interoperability in mind. Through a new user agent and document profiling
|
|
mechanism, servers, proxies, and user agents will be able to perform
|
|
best effort content transformation. Ultimately, it will be possible to
|
|
develop XHTML-conforming content that is usable by any XHTML-conforming
|
|
user agent.</li>
|
|
|
|
</ul>
|
|
<!--OddPage-->
|
|
<h1><a name="defs" id="defs">2. Definitions</a></h1>
|
|
|
|
<h2><a name="terms" id="terms">2.1 Terminology</a></h2>
|
|
|
|
<p>The following terms are used in this specification. These
|
|
terms extend the definitions in <a href="#ref-rfc2119">
|
|
[RFC2119]</a> in ways based upon similar definitions in ISO/<abbr
|
|
title="International Electro-technical Commission">IEC</abbr>
|
|
9945-1:1990 <a href="#ref-posix">[POSIX.1]</a>:</p>
|
|
|
|
<dl>
|
|
<dt>Implementation-defined</dt>
|
|
|
|
<dd>A value or behavior is implementation-defined when it is left
|
|
to the implementation to define [and document] the corresponding
|
|
requirements for correct document construction.</dd>
|
|
|
|
<dt>May</dt>
|
|
|
|
<dd>With respect to implementations, the word "may" is to be
|
|
interpreted as an optional feature that is not required in this
|
|
specification but can be provided. With respect to <a href=
|
|
"#docconf">Document Conformance</a>, the word "may" means that
|
|
the optional feature must not be used. The term "optional" has
|
|
the same definition as "may".</dd>
|
|
|
|
<dt>Must</dt>
|
|
|
|
<dd>In this specification, the word "must" is to be interpreted
|
|
as a mandatory requirement on the implementation or on Strictly
|
|
Conforming XHTML Documents, depending upon the context. The term
|
|
"shall" has the same definition as "must".</dd>
|
|
|
|
<dt>Reserved</dt>
|
|
|
|
<dd>A value or behavior is unspecified, but it is not allowed to
|
|
be used by Conforming Documents nor to be supported by
Conforming User Agents.</dd>
|
|
|
|
<dt>Should</dt>
|
|
|
|
<dd>With respect to implementations, the word "should" is to be
|
|
interpreted as an implementation recommendation, but not a
|
|
requirement. With respect to documents, the word "should" is to
|
|
be interpreted as recommended programming practice for documents
|
|
and a requirement for Strictly Conforming XHTML Documents.</dd>
|
|
|
|
<dt>Supported</dt>
|
|
|
|
<dd>Certain facilities in this specification are optional. If a
|
|
facility is supported, it behaves as specified by this
|
|
specification.</dd>
|
|
|
|
<dt>Unspecified</dt>
|
|
|
|
<dd>When a value or behavior is unspecified, the specification
|
|
defines no portability requirements for a facility on an
|
|
implementation even when faced with a document that uses the
|
|
facility. A document that requires specific behavior in such an
|
|
instance, rather than tolerating any behavior when using that
|
|
facility, is not a Strictly Conforming XHTML Document.</dd>
|
|
</dl>
|
|
|
|
<h2><a name="general" id="general">2.2 General Terms</a></h2>
|
|
|
|
<dl>
|
|
<dt>Attribute</dt>
|
|
|
|
<dd>An attribute is a parameter to an element declared in the
|
|
DTD. An attribute's type and value range, including a possible
|
|
default value, are defined in the DTD.</dd>
|
|
|
|
<dt>DTD</dt>
|
|
|
|
<dd>A DTD, or document type definition, is a collection of XML
|
|
declarations that, as a collection, defines the legal structure,
|
|
<span class="term">elements</span>, and <span class="term">
|
|
attributes</span> that are available for use in a document that
|
|
complies with the DTD.</dd>
|
|
|
|
<dt>Document</dt>
|
|
|
|
<dd>A document is a stream of data that, after being combined
|
|
with any other streams it references, is structured such that it
|
|
holds information contained within <span class="term">
|
|
elements</span> that are organized as defined in the associated
|
|
<span class="term">DTD</span>. See <a href="#docconf">Document
|
|
Conformance</a> for more information.</dd>
|
|
|
|
<dt>Element</dt>
|
|
|
|
<dd>An element is a document structuring unit declared in the
|
|
<span class="term">DTD</span>. The element's content model is
|
|
defined in the <span class="term">DTD</span>, and additional
|
|
semantics may be defined in the prose description of the
|
|
element.</dd>
|
|
|
|
<dt><a name="facilities" id="facilities">Facilities</a></dt>
|
|
|
|
<dd>Functionality includes <span class="term">elements</span>,
|
|
<span class="term">attributes</span>, and the semantics
|
|
associated with those <span class="term">elements</span> and
|
|
<span class="term">attributes</span>. An implementation
|
|
supporting that functionality is said to provide the necessary
|
|
facilities.</dd>
|
|
|
|
<dt>Implementation</dt>
|
|
|
|
<dd>An implementation is a system that provides a collection of
|
|
<span class="term">facilities</span> and services that supports
|
|
this specification. See <a href="#uaconf">User Agent
|
|
Conformance</a> for more information.</dd>
|
|
|
|
<dt>Parsing</dt>
|
|
|
|
<dd>Parsing is the act whereby a <span class="term">
|
|
document</span> is scanned, and the information contained within
|
|
the <span class="term">document</span> is filtered into the
|
|
context of the <span class="term">elements</span> in which the
|
|
information is structured.</dd>
|
|
|
|
<dt>Rendering</dt>
|
|
|
|
<dd>Rendering is the act whereby the information in a <span
|
|
class="term">document</span> is presented. This presentation is
|
|
done in the form most appropriate to the environment (e.g.
|
|
aurally, visually, in print).</dd>
|
|
|
|
<dt>User Agent</dt>
|
|
|
|
<dd>A user agent is an <span class="term">implementation</span>
|
|
that retrieves and processes XHTML documents. See <a href=
|
|
"#uaconf">User Agent Conformance</a> for more information.</dd>
|
|
|
|
<dt>Validation</dt>
|
|
|
|
<dd>Validation is a process whereby <span class="term">
|
|
documents</span> are verified against the associated <span class=
|
|
"term">DTD</span>, ensuring that the structure, use of <span
|
|
class="term">elements</span>, and use of <span class="term">
|
|
attributes</span> are consistent with the definitions in the
|
|
<span class="term">DTD</span>.</dd>
|
|
|
|
<dt><a name="wellformed" id="wellformed">Well-formed</a></dt>
|
|
|
|
<dd>A <span class="term">document</span> is well-formed when it
|
|
is structured according to the rules defined in <a href=
|
|
"http://www.w3.org/TR/REC-xml#sec-well-formed">Section 2.1</a> of
|
|
the XML 1.0 Recommendation <a href="#ref-xml">[XML]</a>.
|
|
Basically, this definition states that elements, delimited by
|
|
their start and end tags, are nested properly within one
|
|
another.</dd>
|
|
</dl>
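
<p>As a rough illustration of how the terms above fit together, the following
sketch shows what DTD declarations for an element and its attributes can look
like. It is a simplified, hypothetical fragment written for this explanation,
not an excerpt from the DTDs in <a href="#dtds">Appendix A</a>; the element
name <code>note</code> and its attributes are assumptions.</p>

<div class="good">
<pre>
&lt;!-- element declaration: the content model allows character data only --&gt;
&lt;!ELEMENT note (#PCDATA)&gt;

&lt;!-- attribute declarations: a type and a default for each attribute --&gt;
&lt;!ATTLIST note
  id        ID          #IMPLIED
  priority  (low|high)  "low"
&gt;
</pre>
</div>

<p>A document that validates against such a DTD could contain
<code>&lt;note priority="high"&gt;text&lt;/note&gt;</code>, but not a
<code>note</code> element with child elements or with a <code>priority</code>
value other than <code>low</code> or <code>high</code>.</p>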
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="normative" id="normative">3. Normative Definition of
|
|
XHTML 1.0</a></h1>
|
|
|
|
<h2><a name="docconf" id="docconf">3.1 Document
|
|
Conformance</a></h2>
|
|
|
|
<p>This version of XHTML provides a definition of strictly
|
|
conforming XHTML documents, which are restricted to tags and
|
|
attributes from the XHTML namespace. See <a href=
|
|
"#well-formed">Section 3.1.2</a> for information on using XHTML
|
|
with other namespaces, for instance, to include metadata
|
|
expressed in <abbr title="Resource Description Framework">RDF</abbr> within XHTML documents.</p>
|
|
|
|
<h3><a name="strict" id="strict">3.1.1 Strictly Conforming
|
|
Documents</a></h3>
|
|
|
|
<p>A Strictly Conforming XHTML Document is a document that
|
|
requires only the facilities described as mandatory in this
|
|
specification. Such a document must meet all of the following
|
|
criteria:</p>
|
|
|
|
<ol>
|
|
<li>
|
|
<p>It must validate against one of the three DTDs found in <a
|
|
href="#dtds">Appendix A</a>.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>The root element of the document must be
<code>&lt;html&gt;</code>.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>The root element of the document must designate the XHTML
|
|
namespace using the <code>xmlns</code> attribute <a href=
|
|
"#ref-xmlns">[XMLNAMES]</a>. The namespace for XHTML is
|
|
defined to be
|
|
<code>http://www.w3.org/1999/xhtml</code>.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>There must be a DOCTYPE declaration in the document prior to
|
|
the root element. The public identifier included in
|
|
the DOCTYPE declaration must reference one of the three DTDs
|
|
found in <a href="#dtds">Appendix A</a> using the respective
|
|
Formal Public Identifier. The system identifier may be changed to reflect
|
|
local system conventions.</p>
|
|
|
|
<pre>
&lt;!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/1999/PR-xhtml1-19991210/DTD/xhtml1-strict.dtd"&gt;

&lt;!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/1999/PR-xhtml1-19991210/DTD/xhtml1-transitional.dtd"&gt;

&lt;!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
     "http://www.w3.org/TR/1999/PR-xhtml1-19991210/DTD/xhtml1-frameset.dtd"&gt;
</pre>
|
|
</li>
|
|
</ol>
|
|
|
|
<p>Here is an example of a minimal XHTML document.</p>
|
|
|
|
<div class="good">
|
|
<pre>
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/1999/PR-xhtml1-19991210/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;
  &lt;head&gt;
    &lt;title&gt;Virtual Library&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;Moved to &lt;a href="http://vlib.org/"&gt;vlib.org&lt;/a&gt;.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</pre>
|
|
</div>
|
|
|
|
<p>Note that in this example, the XML declaration is included. An XML
|
|
declaration like the one above is
|
|
not required in all XML documents. XHTML document authors are strongly encouraged to use XML declarations in all their documents. Such a declaration is required
|
|
when the character encoding of the document is other than the default UTF-8 or
|
|
UTF-16.</p>
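
<p>For instance, a document stored in an encoding other than UTF-8 or UTF-16
would need to begin with a declaration along the following lines. The choice
of ISO-8859-1 here is only an assumption for the sake of the example.</p>

<div class="good">
<pre>
&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;!DOCTYPE html
     PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/1999/PR-xhtml1-19991210/DTD/xhtml1-strict.dtd"&gt;
</pre>
</div>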
|
|
|
|
<h3><a name="well-formed" id="well-formed">3.1.2 Using XHTML with
|
|
other namespaces</a></h3>
|
|
|
|
<p>The XHTML namespace may be used with other XML namespaces
|
|
as per <a href="#ref-xmlns">[XMLNAMES]</a>, although such
|
|
documents are not strictly conforming XHTML 1.0 documents as
|
|
defined above. Future work by W3C will address ways to specify
|
|
conformance for documents involving multiple namespaces.</p>
|
|
|
|
<p>The following example shows the way in which XHTML 1.0 could
|
|
be used in conjunction with the MathML Recommendation:</p>
|
|
|
|
<div class="good">
|
|
<pre>
&lt;html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"&gt;
  &lt;head&gt;
    &lt;title&gt;A Math Example&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;p&gt;The following is MathML markup:&lt;/p&gt;
    &lt;math xmlns="http://www.w3.org/1998/Math/MathML"&gt;
      &lt;apply&gt; &lt;log/&gt;
        &lt;logbase&gt;
          &lt;cn&gt; 3 &lt;/cn&gt;
        &lt;/logbase&gt;
        &lt;ci&gt; x &lt;/ci&gt;
      &lt;/apply&gt;
    &lt;/math&gt;
  &lt;/body&gt;
&lt;/html&gt;
</pre>
|
|
</div>
|
|
|
|
<p>The following example shows the way in which XHTML 1.0 markup
|
|
could be incorporated into another XML namespace:</p>
|
|
|
|
<div class="good">
|
|
<pre>
&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!-- initially, the default namespace is "books" --&gt;
&lt;book xmlns='urn:loc.gov:books'
    xmlns:isbn='urn:ISBN:0-395-36341-6' xml:lang="en" lang="en"&gt;
  &lt;title&gt;Cheaper by the Dozen&lt;/title&gt;
  &lt;isbn:number&gt;1568491379&lt;/isbn:number&gt;
  &lt;notes&gt;
    &lt;!-- make HTML the default namespace for a hypertext commentary --&gt;
    &lt;p xmlns='http://www.w3.org/1999/xhtml'&gt;
        This is also available &lt;a href="http://www.w3.org/"&gt;online&lt;/a&gt;.
    &lt;/p&gt;
  &lt;/notes&gt;
&lt;/book&gt;
</pre>
|
|
</div>
|
|
|
|
<h2><a name="uaconf" id="uaconf">3.2 User Agent
|
|
Conformance</a></h2>
|
|
|
|
<p>A conforming user agent must meet all of the following
|
|
criteria:</p>
|
|
|
|
<ol>
|
|
<li>In order to be consistent with the XML 1.0 Recommendation <a
|
|
href="#ref-xml">[XML]</a>, the user agent must parse and evaluate
|
|
an XHTML document for well-formedness. If the user agent claims
|
|
to be a validating user agent, it must also validate documents
|
|
against their referenced DTDs according to <a href="#ref-xml">
|
|
[XML]</a>.</li>
|
|
|
|
<li>When the user agent claims to support <a href="#facilities">
|
|
facilities</a> defined within this specification or required by
|
|
this specification through normative reference, it must do so in
|
|
ways consistent with the facilities' definition.</li>
|
|
|
|
<li>When a user agent processes an XHTML document as generic XML,
|
|
it shall only recognize attributes of type
|
|
<code>ID</code> (e.g. the <code>id</code> attribute on most XHTML elements)
|
|
as fragment identifiers.</li>
|
|
|
|
<li>If a user agent encounters an element it does not recognize,
|
|
it must render the element's content.</li>
|
|
|
|
<li>If a user agent encounters an attribute it does not
|
|
recognize, it must ignore the entire attribute specification
|
|
(i.e., the attribute and its value).</li>
|
|
|
|
<li>If a user agent encounters an attribute value it doesn't
|
|
recognize, it must use the default attribute value.</li>
|
|
|
|
<li>If a user agent encounters an entity reference (other than one
|
|
of the predefined entities) for which the User Agent has
|
|
processed no declaration (which could happen if the declaration
|
|
is in the external subset which the User Agent hasn't read), the entity
|
|
reference should be rendered as the characters (starting
|
|
with the ampersand and ending with the semi-colon) that
|
|
make up the entity reference.</li>
|
|
|
|
<li>When rendering content, User Agents that encounter
|
|
characters or character entity references that are recognized but not renderable should display the document in such a way that it is obvious to the user that normal rendering has not taken place.</li>
|
|
|
|
<li>
|
|
The following characters are defined in [XML] as whitespace characters:
|
|
|
|
<ul>
<li>Space (&amp;#x0020;)</li>
<li>Tab (&amp;#x0009;)</li>
<li>Carriage return (&amp;#x000D;)</li>
<li>Line feed (&amp;#x000A;)</li>
</ul>
|
|
|
|
<p>
|
|
The XML processor normalizes different systems' line-end codes into a
single line-feed character, which is passed up to the application. In addition,
the XHTML user agent must treat the following characters as whitespace:
|
|
</p>
|
|
|
|
<ul>
<li>Form feed (&amp;#x000C;)</li>
<li>Zero-width space (&amp;#x200B;)</li>
</ul>
|
|
|
|
<p>
|
|
In elements where the 'xml:space' attribute is set to 'preserve', the user
|
|
agent must leave all whitespace characters intact (with the exception of
|
|
leading and trailing whitespace characters, which should be removed).
|
|
Otherwise, whitespace
|
|
is handled according to the following rules:
|
|
</p>
|
|
|
|
<ul>
|
|
<li>
|
|
All whitespace surrounding block elements should be removed.
|
|
</li>
|
|
<li>
|
|
Comments are removed entirely and do not affect whitespace handling. One
|
|
whitespace character on either side of a comment is treated as two
whitespace characters.
|
|
</li>
|
|
<li>
|
|
Leading and trailing whitespace inside a block element must be removed.
|
|
</li>
|
|
<li>Line feed characters within a block element must be converted into a
|
|
space (except when the 'xml:space' attribute is set to 'preserve').
|
|
</li>
|
|
<li>
|
|
A sequence of whitespace characters must be reduced to a single space
|
|
character (except when the 'xml:space' attribute is set to 'preserve').
|
|
</li>
|
|
<li>
|
|
With regard to rendition,
|
|
the User Agent should render the content in a
|
|
manner appropriate to the language in which the content is written.
|
|
In languages whose primary script is Latinate, the ASCII space
|
|
character is typically used to encode both grammatical word boundaries and
|
|
typographic whitespace; in languages whose script is related to Nagari
|
|
(e.g., Sanskrit, Thai, etc.), grammatical boundaries may be encoded using
|
|
the ZW 'space' character, but will not typically be represented by
|
|
typographic whitespace in rendered output; languages using Arabiform scripts
|
|
may encode typographic whitespace using a space character, but may also use
|
|
the ZW space character to delimit 'internal' grammatical boundaries (what
|
|
look like words in Arabic to an English eye frequently encode several words,
|
|
e.g. 'kitAbuhum' = 'kitAbu-hum' = 'book them' == their book); and languages
|
|
in the Chinese script tradition typically neither encode such delimiters nor
|
|
use typographic whitespace in this way.
|
|
</li>
|
|
</ul>
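<p>As an informal sketch of the block-level rules above (the sample text is
hypothetical and not taken from any DTD), the two paragraphs below would be
treated alike once leading and trailing whitespace is removed, line feeds
are converted to spaces, and runs of whitespace are reduced to a single
space:</p>
<div class="good">
<pre>
&lt;p&gt;
   here   is
   some text
&lt;/p&gt;

&lt;p&gt;here is some text&lt;/p&gt;
</pre>
</div>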
|
|
|
|
<p>Whitespace in attribute values is processed according to <a
|
|
href="#ref-xml">[XML]</a>.</p>
|
|
</li>
|
|
</ol>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="diffs" id="diffs">4. Differences with HTML
|
|
4.0</a></h1>
|
|
|
|
<p>Because XHTML is an XML application, certain
|
|
practices that were perfectly legal in SGML-based HTML 4.0 <a
|
|
href="#ref-html4">[HTML]</a> must be changed.</p>
|
|
|
|
<h2><a name="h-4.1" id="h-4.1">4.1 Documents must be
|
|
well-formed</a></h2>
|
|
|
|
<p><a href="#wellformed">Well-formedness</a> is a new concept
|
|
introduced by <a href="#ref-xml">[XML]</a>. Essentially this
|
|
means that all elements must either have closing tags or be
|
|
written in a special form (as described below), and that all the
|
|
elements must nest.</p>
|
|
|
|
<p>Although overlapping is illegal in SGML, it was widely
|
|
tolerated in existing browsers.</p>
|
|
|
|
<div class="good">
|
|
<p><strong><em>CORRECT: nested elements.</em></strong></p>
|
|
|
|
<p>&lt;p&gt;here is an emphasized
&lt;em&gt;paragraph&lt;/em&gt;.&lt;/p&gt;</p>
|
|
</div>
|
|
|
|
<div class="bad">
|
|
<p><strong><em>INCORRECT: overlapping elements</em></strong></p>
|
|
|
|
<p>&lt;p&gt;here is an emphasized
&lt;em&gt;paragraph.&lt;/p&gt;&lt;/em&gt;</p>
|
|
</div>
|
|
|
|
<h2><a name="h-4.2" id="h-4.2">4.2 Element and attribute
|
|
names must be in lower case</a></h2>
|
|
|
|
<p>XHTML documents must use lower case for all HTML element and
|
|
attribute names. This difference is necessary because XML is
|
|
case-sensitive, e.g. &lt;li&gt; and &lt;LI&gt; are different
|
|
tags.</p>
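<p>A brief illustration in the style of the other sections; the element
content and the class value are only placeholders:</p>
<div class="good">
<p><strong><em>CORRECT: lower-case names</em></strong></p>
<p>&lt;li class="note"&gt;item&lt;/li&gt;</p>
</div>
<div class="bad">
<p><strong><em>INCORRECT: upper-case names</em></strong></p>
<p>&lt;LI CLASS="note"&gt;item&lt;/LI&gt;</p>
</div>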
|
|
|
|
<h2><a name="h-4.3" id="h-4.3">4.3 For non-empty elements,
|
|
end tags are required</a></h2>
|
|
|
|
<p>In SGML-based HTML 4.0 certain elements were permitted to omit
the end tag, with the elements that followed implying closure.
|
|
This omission is not permitted in XML-based XHTML. All elements
|
|
other than those declared in the DTD as <code>EMPTY</code> must
|
|
have an end tag.</p>
|
|
|
|
<div class="good">
|
|
<p><strong><em>CORRECT: terminated elements</em></strong></p>
|
|
|
|
<p>&lt;p&gt;here is a paragraph.&lt;/p&gt;&lt;p&gt;here is
another paragraph.&lt;/p&gt;</p>
|
|
</div>
|
|
|
|
<div class="bad">
|
|
<p><strong><em>INCORRECT: unterminated elements</em></strong></p>
|
|
|
|
<p>&lt;p&gt;here is a paragraph.&lt;p&gt;here is another
paragraph.</p>
|
|
</div>
|
|
|
|
<h2><a name="h-4.4" id="h-4.4">4.4 Attribute values must
|
|
always be quoted</a></h2>
|
|
|
|
<p>All attribute values must be quoted, even those which appear
|
|
to be numeric.</p>
|
|
|
|
<div class="good">
|
|
<p><strong><em>CORRECT: quoted attribute values</em></strong></p>
|
|
|
|
<p>&lt;table rows="3"&gt;</p>
|
|
</div>
|
|
|
|
<div class="bad">
|
|
<p><strong><em>INCORRECT: unquoted attribute values</em></strong></p>
|
|
|
|
<p>&lt;table rows=3&gt;</p>
|
|
</div>
|
|
|
|
<h2><a name="h-4.5" id="h-4.5">4.5 Attribute
|
|
Minimization</a></h2>
|
|
|
|
<p>XML does not support attribute minimization. Attribute-value
|
|
pairs must be written in full. Attribute names such as <code>
|
|
compact</code> and <code>checked</code> cannot occur in elements
|
|
without their value being specified.</p>
|
|
|
|
<div class="good">
|
|
<p><strong><em>CORRECT: unminimized attributes</em></strong></p>
|
|
|
|
<p>&lt;dl compact="compact"&gt;</p>
|
|
</div>
|
|
|
|
<div class="bad">
|
|
<p><strong><em>INCORRECT: minimized attributes</em></strong></p>
|
|
|
|
<p>&lt;dl compact&gt;</p>
|
|
</div>
|
|
|
|
<h2><a name="h-4.6" id="h-4.6">4.6 Empty Elements</a></h2>
|
|
|
|
<p>Empty elements must either have an end tag or the start tag must end with <code>/&gt;</code>. For instance,
<code>&lt;br/&gt;</code> or <code>&lt;hr&gt;&lt;/hr&gt;</code>. See <a
|
|
href="#guidelines">HTML Compatibility Guidelines</a> for information on ways to
|
|
ensure this is backward compatible with HTML 4.0 user agents.</p>
|
|
|
|
<div class="good">
|
|
<p><strong><em>CORRECT: terminated empty tags</em></strong></p>
|
|
|
|
<p>&lt;br/&gt;&lt;hr/&gt;</p>
|
|
</div>
|
|
|
|
<div class="bad">
|
|
<p><strong><em>INCORRECT: unterminated empty tags</em></strong></p>
|
|
|
|
<p>&lt;br&gt;&lt;hr&gt;</p>
|
|
</div>
|
|
|
|
<h2><a name="h-4.7" id="h-4.7">4.7 Whitespace handling in
|
|
attribute values</a></h2>
|
|
|
|
<p>User agents will strip leading and
|
|
trailing whitespace from attribute values and map sequences
|
|
of one or more whitespace characters (including line breaks) to
|
|
a single inter-word space (an ASCII space character for western
|
|
scripts). See <a href="http://www.w3.org/TR/REC-xml#AVNormalize">
|
|
Section 3.3.3</a> of <a href="#ref-xml">[XML]</a>.</p>
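<p>For illustration only, assuming a hypothetical table cell whose
<code>headers</code> attribute (declared as <code>IDREFS</code> in the
XHTML DTDs) is written across two lines, the two start tags below would be
seen by the application as carrying the same value, <code>h1 h2</code>:</p>
<div class="good">
<pre>
&lt;td headers="  h1
     h2  "&gt;

&lt;td headers="h1 h2"&gt;
</pre>
</div>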
|
|
|
|
<h2><a name="h-4.8" id="h-4.8">4.8 Script and Style
|
|
elements</a></h2>
|
|
|
|
<p>In XHTML, the script and style elements are declared as having
|
|
<code>#PCDATA</code> content. As a result, <code>&lt;</code> and
<code>&amp;</code> will be treated as the start of markup, and
entities such as <code>&amp;lt;</code> and <code>&amp;amp;</code>
will be recognized by the XML processor as entity references to
<code>&lt;</code> and <code>&amp;</code> respectively. Wrapping
|
|
the content of the script or style element within a <code>
|
|
CDATA</code> marked section avoids the expansion of these
|
|
entities.</p>
|
|
|
|
<div class="good">
|
|
<pre>
|
|
&lt;script&gt;
 &lt;![CDATA[
 ... unescaped script content ...
 ]]&gt;
 &lt;/script&gt;
|
|
</pre>
|
|
</div>
|
|
|
|
<p><code>CDATA</code> sections are recognized by the XML
|
|
processor and appear as nodes in the Document Object Model, see
|
|
<a href=
|
|
"http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-E067D597">
|
|
Section 1.3</a> of the DOM Level 1 Recommendation <a href=
|
|
"#ref-dom">[DOM]</a>.</p>
|
|
|
|
<p>An alternative is to use external script and style
|
|
documents.</p>
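<p>A minimal sketch of the external alternative, assuming hypothetical
files named <code>script.js</code> and <code>style.css</code> stored
alongside the document:</p>
<div class="good">
<pre>
&lt;script type="text/javascript" src="script.js"&gt;&lt;/script&gt;
&lt;link rel="stylesheet" href="style.css" type="text/css" /&gt;
</pre>
</div>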
|
|
|
|
<h2><a name="h-4.9" id="h-4.9">4.9 SGML exclusions</a></h2>
|
|
|
|
<p>SGML gives the writer of a DTD the ability to exclude specific
|
|
elements from being contained within an element. Such
|
|
prohibitions (called "exclusions") are not possible in XML.</p>
|
|
|
|
<p>For example, the HTML 4.0 Strict DTD forbids the nesting of an
|
|
'<code>a</code>' element within another '<code>a</code>' element
|
|
to any descendant depth. It is not possible to spell out such
|
|
prohibitions in XML. Even though these prohibitions cannot be
|
|
defined in the DTD, certain elements should not be nested. A
|
|
summary of such elements and the elements that should not be
|
|
nested in them is found in the normative <a href="#prohibitions">
|
|
Appendix B</a>.</p>
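<p>As an illustration (the file names are placeholders), the fragment below
is well-formed, and a DTD content model is unlikely to reject it because the
inner <code>a</code> is not a direct child of the outer one; it is
nevertheless prohibited:</p>
<div class="bad">
<p><strong><em>INCORRECT: anchor nested at descendant depth</em></strong></p>
<p>&lt;a href="one.html"&gt;see &lt;em&gt;&lt;a
href="two.html"&gt;this&lt;/a&gt;&lt;/em&gt;&lt;/a&gt;</p>
</div>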
|
|
|
|
<h2><a name="h-4.10" id="h-4.10">4.10 The elements with 'id' and 'name'
|
|
attributes</a></h2>
|
|
|
|
<p>HTML 4.0 defined the <code>name</code> attribute for the elements
|
|
<code>a</code>,
|
|
<code>applet</code>, <code>frame</code>,
|
|
<code>iframe</code>, <code>img</code>, and <code>map</code>.
|
|
HTML 4.0 also introduced
|
|
the <code>id</code> attribute. Both of these attributes are designed to be
|
|
used as fragment identifiers.</p>
|
|
<p>In XML, fragment identifiers are of type <code>ID</code>, and
|
|
there can only be a single attribute of type <code>ID</code> per element.
|
|
Therefore, in XHTML 1.0 the <code>id</code>
|
|
attribute is defined to be of type <code>ID</code>. In order to
|
|
ensure that XHTML 1.0 documents are well-structured XML documents, XHTML 1.0
|
|
documents MUST use the <code>id</code> attribute when defining fragment
|
|
identifiers, even on elements that historically have also had a
|
|
<code>name</code> attribute.
|
|
See the <a href="#guidelines">HTML Compatibility
|
|
Guidelines</a> for information on ensuring such anchors are backwards
|
|
compatible when serving XHTML documents as media type <code>text/html</code>.
|
|
</p>
|
|
<p>Note that in XHTML 1.0, the <code>name</code> attribute of these
|
|
elements is formally deprecated, and will be removed in a
|
|
subsequent version of XHTML.</p>
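<p>A short sketch, using a hypothetical identifier <code>intro</code>: the
heading carries the <code>id</code> attribute, and a link elsewhere in the
same document refers to it by fragment identifier.</p>
<div class="good">
<pre>
&lt;h2 id="intro"&gt;Introduction&lt;/h2&gt;
...
&lt;a href="#intro"&gt;back to the introduction&lt;/a&gt;
</pre>
</div>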
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="issues" id="issues">5. Compatibility Issues</a></h1>
|
|
|
|
<p>Although there is no requirement for XHTML 1.0 documents to be
|
|
compatible with existing user agents, in practice this is easy to
|
|
accomplish. Guidelines for creating compatible documents can be
|
|
found in <a href="#guidelines">Appendix C</a>.</p>
|
|
|
|
<h2><a name="media" id="media">5.1 Internet Media Type</a></h2>
|
|
<p>As of the publication of this recommendation, the general
|
|
recommended MIME labeling for XML-based applications
|
|
has yet to be resolved.</p>
|
|
|
|
<p>However, XHTML Documents which follow the guidelines set forth
|
|
in <a href="#guidelines">Appendix C</a>, "HTML Compatibility Guidelines" may be
|
|
labeled with the Internet Media Type "text/html", as they
|
|
are compatible with most HTML browsers. This document
|
|
makes no recommendation about MIME labeling of other
|
|
XHTML documents.</p>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="future" id="future">6. Future Directions</a></h1>
|
|
|
|
<p>XHTML 1.0 provides the basis for a family of document types
|
|
that will extend and subset XHTML, in order to support a wide
|
|
range of new devices and applications, by defining modules and
|
|
specifying a mechanism for combining these modules. This
|
|
mechanism will enable the extension and sub-setting of XHTML 1.0
|
|
in a uniform way through the definition of new modules.</p>
|
|
|
|
<h2><a name="mods" id="mods">6.1 Modularizing HTML</a></h2>
|
|
|
|
<p>As the use of XHTML moves from the traditional desktop user
|
|
agents to other platforms, it is clear that not all of the XHTML
|
|
elements will be required on all platforms. For example, a
hand-held device or a cell phone may only support a subset of XHTML
|
|
elements.</p>
|
|
|
|
<p>The process of modularization breaks XHTML up into a series of
|
|
smaller element sets. These elements can then be recombined to
|
|
meet the needs of different communities.</p>
|
|
|
|
<p>These modules will be defined in a later W3C document.</p>
|
|
|
|
<h2><a name="extensions" id="extensions">6.2 Subsets and
|
|
Extensibility</a></h2>
|
|
|
|
<p>Modularization brings with it several advantages:</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p>It provides a formal mechanism for sub-setting XHTML.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>It provides a formal mechanism for extending XHTML.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>It simplifies the transformation between document types.</p>
|
|
</li>
|
|
|
|
<li>
|
|
<p>It promotes the reuse of modules in new document types.</p>
|
|
</li>
|
|
</ul>
|
|
|
|
<h2><a name="profiles" id="profiles">6.3 Document
|
|
Profiles</a></h2>
|
|
|
|
<p>A document profile specifies the syntax and semantics of a set
|
|
of documents. Conformance to a document profile provides a basis
|
|
for interoperability guarantees. The document profile specifies
|
|
the facilities required to process documents of that type, e.g.
|
|
which image formats can be used, levels of scripting, style sheet
|
|
support, and so on.</p>
|
|
|
|
<p>For product designers this enables various groups to define
|
|
their own standard profile.</p>
|
|
|
|
<p>For authors this will obviate the need to write several
|
|
different versions of documents for different clients.</p>
|
|
|
|
<p>For special groups such as chemists, medical doctors, or
|
|
mathematicians this allows a special profile to be built using
|
|
standard HTML elements plus a group of elements geared to the
|
|
specialist's needs.</p>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="appendices" id="appendices"></a>
|
|
<a name="dtds" id="dtds">Appendix A. DTDs</a></h1>
|
|
|
|
<p><b>This appendix is normative.</b></p>
|
|
|
|
<p>These DTDs and entity sets form a normative part of this
|
|
specification. The complete set of DTD files together with an XML
|
|
declaration and SGML Open Catalog is included in the <a href=
|
|
"xhtml1.zip">zip file</a> for this specification.</p>
|
|
|
|
<h2><a name="h-A1" id="h-A1">A.1 Document Type
|
|
Definitions</a></h2>
|
|
|
|
<p>These DTDs approximate the HTML 4.0 DTDs. It is likely that
|
|
when the DTDs are modularized, a method of DTD construction will
|
|
be employed that corresponds more closely to HTML 4.0.</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p><a href="DTD/xhtml1-strict.dtd" type="text/plain">
|
|
XHTML-1.0-Strict</a></p>
|
|
</li>
|
|
|
|
<li>
|
|
<p><a href="DTD/xhtml1-transitional.dtd" type="text/plain">
|
|
XHTML-1.0-Transitional</a></p>
|
|
</li>
|
|
|
|
<li>
|
|
<p><a href="DTD/xhtml1-frameset.dtd" type="text/plain">
|
|
XHTML-1.0-Frameset</a></p>
|
|
</li>
|
|
</ul>
|
|
|
|
<h2><a name="h-A2" id="h-A2">A.2 Entity Sets</a></h2>
|
|
|
|
<p>The XHTML entity sets are the same as for HTML 4.0, but have
|
|
been modified to be valid XML 1.0 entity declarations. Note the
|
|
entity for the Euro currency sign (<code>&amp;euro;</code> or
<code>&amp;#8364;</code> or <code>&amp;#x20AC;</code>) is defined
|
|
as part of the special characters.</p>
|
|
|
|
<ul>
|
|
<li>
|
|
<p><a href="DTD/xhtml-lat1.ent">Latin-1 characters</a></p>
|
|
</li>
|
|
|
|
<li>
|
|
<p><a href="DTD/xhtml-special.ent">Special characters</a></p>
|
|
</li>
|
|
|
|
<li>
|
|
<p><a href="DTD/xhtml-symbol.ent">Symbols</a></p>
|
|
</li>
|
|
</ul>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="prohibitions" id="prohibitions">Appendix B. Element
|
|
Prohibitions</a></h1>
|
|
|
|
<p><b>This appendix is normative.</b></p>
|
|
|
|
<p>The following elements have prohibitions on which elements
|
|
they can contain (see <a href="#h-4.9">Section 4.9</a>). This
|
|
prohibition applies to all depths of nesting, i.e. it contains
|
|
all the descendant elements.</p>
|
|
|
|
<dl><dt><code class="tag">a</code></dt>
|
|
<dd>
|
|
cannot contain other <code>a</code> elements.</dd>
|
|
<dt><code class="tag">pre</code></dt>
|
|
<dd>cannot contain the <code>img</code>, <code>object</code>,
|
|
<code>big</code>, <code>small</code>, <code>sub</code>, or <code>
|
|
sup</code> elements.</dd>
|
|
|
|
<dt><code class="tag">button</code></dt>
|
|
<dd>cannot contain the <code>input</code>, <code>select</code>,
|
|
<code>textarea</code>, <code>label</code>, <code>button</code>,
|
|
<code>form</code>, <code>fieldset</code>, <code>iframe</code> or
|
|
<code>isindex</code> elements.</dd>
|
|
<dt><code class="tag">label</code></dt>
|
|
<dd>cannot contain other <code class="tag">label</code> elements.</dd>
|
|
<dt><code class="tag">form</code></dt>
|
|
<dd>cannot contain other <code>form</code> elements.</dd>
|
|
</dl>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="guidelines" id="guidelines">Appendix C.
|
|
HTML Compatibility Guidelines</a></h1>
|
|
|
|
<p><b>This appendix is informative.</b></p>
|
|
|
|
<p>This appendix summarizes design guidelines for authors who
|
|
wish their XHTML documents to render on existing HTML user
|
|
agents.</p>
|
|
|
|
<h2>C.1 Processing Instructions</h2>
|
|
<p>Be aware that processing instructions are rendered on some
|
|
user agents. However, also note that when the XML declaration is not included
|
|
in a document, the document can only use the default character encodings UTF-8
|
|
or UTF-16.</p>
|
|
|
|
<h2>C.2 Empty Elements</h2>
|
|
<p>Include a space before the trailing <code>/</code> and <code>
&gt;</code> of empty elements, e.g. <code class="greenmono">
&lt;br /&gt;</code>, <code class="greenmono">
&lt;hr /&gt;</code> and <code class="greenmono">&lt;img
src="karen.jpg" alt="Karen" /&gt;</code>. Also, use the
minimized tag syntax for empty elements, e.g. <code class=
"greenmono">&lt;br /&gt;</code>, as the alternative syntax <code
class="greenmono">&lt;br&gt;&lt;/br&gt;</code> allowed by XML
gives uncertain results in many existing user agents.</p>
|
|
|
|
<h2>C.3 Element Minimization and Empty Element Content</h2>
|
|
<p>Given an empty instance of an element whose content model is
|
|
not <code>EMPTY</code> (for example, an empty title or paragraph)
|
|
do not use the minimized form (e.g. use <code class="greenmono">
&lt;p&gt; &lt;/p&gt;</code> and not <code class="greenmono">
&lt;p /&gt;</code>).</p>
|
|
|
|
<h2>C.4 Embedded Style Sheets and Scripts</h2>
|
|
<p>Use external style sheets if your style sheet uses <code>
|
|
&lt;</code> or <code>&amp;</code> or <code>]]&gt;</code> or <code>--</code>. Use
external scripts if your script uses <code>&lt;</code> or <code>
&amp;</code> or <code>]]&gt;</code> or <code>--</code>. Note that XML parsers
|
|
are permitted to silently remove the contents of comments. Therefore, the historical
|
|
practice of "hiding" scripts and style sheets within comments to make the
|
|
documents backward compatible is unlikely to work as expected in XML-based
|
|
implementations.</p>
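<p>For illustration, the historical idiom in question looks roughly like
the following sketch (the script body is a placeholder); an XML parser may
discard everything between the comment delimiters:</p>
<div class="bad">
<pre>
&lt;script type="text/javascript"&gt;
&lt;!--
  ... script content ...
//--&gt;
&lt;/script&gt;
</pre>
</div>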
|
|
|
|
<h2>C.5 Line Breaks within Attribute Values</h2>
|
|
<p>Avoid line breaks and multiple whitespace characters within
|
|
attribute values. These are handled inconsistently by user
|
|
agents.</p>
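<p>For example (the attribute text is a placeholder), prefer the first form
below to the second:</p>
<div class="good">
<p>&lt;img src="karen.jpg" alt="A photo of Karen" /&gt;</p>
</div>
<div class="bad">
<pre>
&lt;img src="karen.jpg" alt="A photo
     of   Karen" /&gt;
</pre>
</div>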
|
|
|
|
<h2>C.6 Isindex</h2>
|
|
<p>Don't include more than one <code>isindex</code> element in
|
|
the document <code>head</code>. The <code>isindex</code> element
|
|
is deprecated in favor of the <code>input</code> element.</p>
|
|
|
|
<h2>C.7 The <code>lang</code> and <code>xml:lang</code> Attributes</h2>
|
|
<p>Use both the <code>lang</code> and <code>xml:lang</code>
|
|
attributes when specifying the language of an element. The value
|
|
of the <code>xml:lang</code> attribute takes precedence.</p>
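<p>A minimal sketch, assuming a French-language paragraph (the text is only
a placeholder):</p>
<div class="good">
<p>&lt;p lang="fr" xml:lang="fr"&gt;Bonjour&lt;/p&gt;</p>
</div>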
|
|
|
|
<h2>C.8 Fragment Identifiers</h2>
|
|
<p>In XML, <abbr title="Uniform Resource Identifiers">URIs</abbr> [<a href="#ref-rfc2396">RFC2396</a>] that end with fragment identifiers of the form
|
|
<code>"#foo"</code> do not refer to elements with an attribute
|
|
<code>name="foo"</code>; rather, they refer to elements with an
|
|
attribute defined to be of type <code>ID</code>, e.g., the <code>
|
|
id</code> attribute in HTML 4.0. Many existing HTML clients don't
|
|
support the use of <code>ID</code>-type attributes in this way,
|
|
so identical values may be supplied for both of these attributes to ensure
|
|
maximum forward and backward compatibility (e.g., <code class=
"greenmono">&lt;a id="foo" name="foo"&gt;...&lt;/a&gt;</code>).</p>
|
|
|
|
<p>Further, since the set of
|
|
legal values for attributes of type <code>ID</code> is much smaller than
|
|
for those of type <code>CDATA</code>, the type of the <code>name</code>
|
|
attribute has been changed to <code>NMTOKEN</code>. This attribute is
|
|
constrained such that it can only have the same values as type
|
|
<code>ID</code>, or as the <code>Name</code> production in XML 1.0 Section
|
|
2.3, production 5. Unfortunately, this constraint cannot be expressed in the
|
|
XHTML 1.0 DTDs. Because of this change, care must be taken when
|
|
converting existing HTML documents. The values of these attributes
|
|
must be unique within the document, valid, and any references to these
|
|
fragment identifiers (both
|
|
internal and external) must be updated should the values be changed during
|
|
conversion.</p>
|
|
<p>Finally, note that XHTML 1.0 has deprecated the
|
|
<code>name</code> attribute of the <code>a</code>, <code>applet</code>, <code>frame</code>, <code>iframe</code>, <code>img</code>, and <code>map</code>
|
|
elements, and it will be
|
|
removed from XHTML in subsequent versions.</p>
|
|
|
|
<h2>C.9 Character Encoding</h2>
|
|
<p>To specify a character encoding in the document, use both the
|
|
encoding attribute specification on the xml declaration (e.g.
<code class="greenmono">&lt;?xml version="1.0"
encoding="EUC-JP"?&gt;</code>) and a meta http-equiv statement
(e.g. <code class="greenmono">&lt;meta http-equiv="Content-type"
content='text/html; charset="EUC-JP"' /&gt;</code>). The
|
|
value of the encoding attribute of the xml processing instruction
|
|
takes precedence.</p>
|
|
|
|
<h2>C.10 Boolean Attributes</h2>
|
|
<p>Some HTML user agents are unable to interpret boolean
|
|
attributes when these appear in their full (non-minimized) form,
|
|
as required by XML 1.0. Note this problem doesn't affect user
|
|
agents compliant with HTML 4.0. The following attributes are
|
|
involved: <code>compact</code>, <code>nowrap</code>, <code>
|
|
ismap</code>, <code>declare</code>, <code>noshade</code>, <code>
|
|
checked</code>, <code>disabled</code>, <code>readonly</code>,
|
|
<code>multiple</code>, <code>selected</code>, <code>
|
|
noresize</code>, <code>defer</code>.</p>
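<p>For illustration, a hypothetical form control (the <code>name</code>
value is a placeholder) written in the full form required by XML:</p>
<div class="good">
<p>&lt;input type="checkbox" name="subscribe" checked="checked" /&gt;</p>
</div>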
|
|
|
|
<h2>C.11 Document Object Model and XHTML</h2>
|
|
<p>
|
|
The Document Object Model level 1 Recommendation [<a href="#ref-dom">DOM</a>]
|
|
defines document object model interfaces for XML and HTML 4.0. The HTML 4.0
|
|
document object model specifies that HTML element and attribute names are
|
|
returned in upper-case. The XML document object model specifies that
|
|
element and attribute names are returned in the case they are specified. In
|
|
XHTML 1.0, elements and attributes are specified in lower-case. This apparent difference can be
|
|
addressed in two ways:
|
|
</p>
|
|
<ol>
|
|
<li>Applications that access XHTML documents served as Internet media type
|
|
<code>text/html</code>
|
|
via the <abbr title="Document Object Model">DOM</abbr> can use the HTML DOM,
|
|
and can rely upon element and attribute names being returned in
|
|
upper-case from those interfaces.</li>
|
|
<li>Applications that access XHTML documents served as Internet media types
|
|
<code>text/xml</code> or <code>application/xml</code>
|
|
can also use the XML DOM. Elements and attributes will be returned in lower-case.
|
|
Also, some XHTML elements may or may
|
|
not appear
|
|
in the object tree because they are optional in the content model
|
|
(e.g. the <code>tbody</code> element within
|
|
<code>table</code>). This occurs because in HTML 4.0 some elements were
|
|
permitted to be minimized such that their start and end tags are both omitted
|
|
(an SGML feature).
|
|
This is not possible in XML. Rather than require document authors to insert
|
|
extraneous elements, XHTML has made the elements optional.
|
|
Applications need to adapt to this
|
|
accordingly.</li>
|
|
</ol>
|
|
|
|
<h2>C.12 Using Ampersands in Attribute Values</h2>
|
|
<p>
|
|
When an attribute value contains an ampersand, it must be expressed as a character
|
|
entity reference
|
|
(e.g. "<code>&amp;amp;</code>"). For example, when the
<code>href</code> attribute
of the <code>a</code> element refers to a
CGI script that takes parameters, it must be expressed as
<code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;amp;name=user</code>
rather than as
<code>http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;name=user</code>.
|
|
</p>
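<p>Written out as a complete element (the link text is a placeholder), such
an anchor might read:</p>
<div class="good">
<p>&lt;a
href="http://my.site.dom/cgi-bin/myscript.pl?class=guest&amp;amp;name=user"&gt;guest
page&lt;/a&gt;</p>
</div>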
|
|
|
|
<h2>C.13 Cascading Style Sheets (CSS) and XHTML</h2>
|
|
|
|
<p>The Cascading Style Sheets level 2 Recommendation [<a href="#ref-css2">CSS2</a>] defines style
|
|
properties which are applied to the parse tree of the HTML or XML
|
|
document. Differences in parsing will produce different visual or
|
|
aural results, depending on the selectors used. The following hints
|
|
will reduce this effect for documents which are served without
|
|
modification as both media types:</p>
|
|
|
|
<ol>
|
|
<li>
|
|
CSS style sheets for XHTML should use lower case element and
|
|
attribute names.</li>
|
|
|
|
|
|
<li>In tables, the tbody element will be inferred by the parser of an
|
|
HTML user agent, but not by the parser of an XML user agent. Therefore
|
|
you should always explicitly add a tbody element if it is referred to
|
|
in a CSS selector.</li>
|
|
|
|
<li>Within the XHTML name space, user agents are expected to
|
|
recognize the "id" attribute as an attribute of type ID.
|
|
Therefore, style sheets should be able to continue using the
|
|
shorthand "#" selector syntax even if the user agent does not read
|
|
the DTD.</li>
|
|
|
|
<li>Within the XHTML name space, user agents are expected to
|
|
recognize the "class" attribute. Therefore, style sheets should be
|
|
able to continue using the shorthand "." selector syntax.</li>
|
|
|
|
<li>
|
|
CSS defines different conformance rules for HTML and XML documents;
|
|
be aware that the HTML rules apply to XHTML documents delivered as
|
|
HTML and the XML rules apply to XHTML documents delivered as XML.</li>
|
|
</ol>
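<p>The hint about <code>tbody</code> can be sketched as follows, with a
hypothetical style rule and table. Because the <code>tbody</code> element is
written out explicitly, the selector should match whether the document is
parsed as HTML or as XML:</p>
<div class="good">
<pre>
&lt;style type="text/css"&gt;
  tbody td { color: green; }
&lt;/style&gt;
...
&lt;table&gt;
  &lt;tbody&gt;
    &lt;tr&gt;&lt;td&gt;cell&lt;/td&gt;&lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;
</pre>
</div>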
|
|
<!--OddPage-->
|
|
<h1><a name="acks" id="acks">Appendix D.
|
|
Acknowledgements</a></h1>
|
|
|
|
<p><b>This appendix is informative.</b></p>
|
|
|
|
<p>This specification was written with the participation of the
|
|
members of the W3C HTML working group:</p>
|
|
|
|
<dl>
|
|
<dd>Steven Pemberton, CWI (HTML Working Group Chair)<br />
|
|
Murray Altheim, Sun Microsystems<br />
|
|
Daniel Austin, CNET: The Computer Network<br />
|
|
Frank Boumphrey, HTML Writers Guild<br />
|
|
John Burger, Mitre<br />
|
|
Andrew W. Donoho, IBM<br />
|
|
Sam Dooley, IBM<br />
|
|
Klaus Hofrichter, GMD<br />
|
|
Philipp Hoschka, W3C<br />
|
|
Masayasu Ishikawa, W3C<br />
|
|
Warner ten Kate, Philips Electronics<br />
|
|
Peter King, Phone.com<br />
|
|
Paula Klante, JetForm<br />
|
|
Shin'ichi Matsui, W3C/Panasonic<br />
|
|
Shane McCarron, Applied Testing and Technology (The Open Group through August
|
|
1999)<br />
|
|
Ann Navarro, HTML Writers Guild<br />
|
|
Zach Nies, Quark<br />
|
|
Dave Raggett, W3C/HP (W3C lead for HTML)<br />
|
|
Patrick Schmitz, Microsoft<br />
|
|
Sebastian Schnitzenbaumer, Stack Overflow<br />
|
|
Chris Wilson, Microsoft<br />
|
|
Ted Wugofski, Gateway 2000<br />
|
|
Dan Zigmond, WebTV Networks</dd>
|
|
</dl>
|
|
|
|
<!--OddPage-->
|
|
<h1><a name="refs" id="refs">Appendix E. References</a></h1>
|
|
|
|
<p><b>This appendix is informative.</b></p>
|
|
|
|
<dl>
|
|
|
|
<dt><a name="ref-css2" id="ref-css2"><b>[CSS2]</b></a></dt>
|
|
|
|
<dd><a href="http://www.w3.org/TR/REC-CSS2">"Cascading Style Sheets, level 2 (CSS2) Specification"</a>, B.
|
|
Bos, H. W. Lie, C. Lilley, I. Jacobs, 12 May 1998.<br />
|
|
Available at: <a href="http://www.w3.org/TR/REC-CSS2">
|
|
http://www.w3.org/TR/REC-CSS2</a></dd>
|
|
|
|
<dt><a name="ref-dom" id="ref-dom"><b>[DOM]</b></a></dt>
|
|
|
|
<dd><a href="http://www.w3.org/TR/REC-DOM-Level-1">"Document Object Model (DOM) Level 1 Specification"</a>, Lauren
|
|
Wood <i>et al.</i>, 1 October 1998.<br />
|
|
Available at: <a href="http://www.w3.org/TR/REC-DOM-Level-1">
|
|
http://www.w3.org/TR/REC-DOM-Level-1</a></dd>
|
|
|
|
<dt><a name="ref-html4" id="ref-html4"><b>[HTML]</b></a></dt>
|
|
|
|
<dd><a href="http://www.w3.org/TR/1999/PR-html40-19990824">"HTML 4.01 Specification"</a>, D. Raggett, A. Le Hors, I.
|
|
Jacobs, 24 August 1999.<br />
|
|
Available at: <a href="http://www.w3.org/TR/1999/PR-html40-19990824">
|
|
http://www.w3.org/TR/1999/PR-html40-19990824</a></dd>
|
|
|
|
<dt><a name="ref-posix" id="ref-posix"><b>[POSIX.1]</b></a></dt>
|
|
|
|
<dd>"ISO/IEC 9945-1:1990 Information Technology - Portable
|
|
Operating System Interface (POSIX) - Part 1: System Application
|
|
Program Interface (API) [C Language]", Institute of Electrical
|
|
and Electronics Engineers, Inc, 1990.</dd>
|
|
|
|
<dt><a name="ref-rfc2046" id="ref-rfc2046"><b>
|
|
[RFC2046]</b></a></dt>
|
|
|
|
<dd><a href="http://www.ietf.org/rfc/rfc2046.txt">"RFC2046: Multipurpose Internet Mail Extensions (MIME) Part
|
|
Two: Media Types"</a>, N. Freed and N. Borenstein, November
|
|
1996.<br />
|
|
Available at <a href="http://www.ietf.org/rfc/rfc2046.txt">
|
|
http://www.ietf.org/rfc/rfc2046.txt</a>. Note that this RFC
|
|
obsoletes RFC1521, RFC1522, and RFC1590.</dd>
|
|
|
|
<dt><a name="ref-rfc2119" id="ref-rfc2119"><b>
|
|
[RFC2119]</b></a></dt>
|
|
|
|
<dd><a href="http://www.ietf.org/rfc/rfc2119.txt">"RFC2119: Key words for use in RFCs to Indicate Requirement
|
|
Levels"</a>, S. Bradner, March 1997.<br />
|
|
Available at: <a href="http://www.ietf.org/rfc/rfc2119.txt">
|
|
http://www.ietf.org/rfc/rfc2119.txt</a></dd>
|
|
|
|
<dt><a name="ref-rfc2376" id="ref-rfc2376"><b>
|
|
[RFC2376]</b></a></dt>
|
|
|
|
<dd><a href="http://www.ietf.org/rfc/rfc2376.txt">"RFC2376: XML Media Types"</a>, E. Whitehead, M. Murata, July
|
|
1998.<br />
|
|
Available at: <a href="http://www.ietf.org/rfc/rfc2376.txt">
|
|
http://www.ietf.org/rfc/rfc2376.txt</a></dd>
|
|
|
|
<dt><a name="ref-rfc2396" id="ref-rfc2396"><b>
|
|
[RFC2396]</b></a></dt>
|
|
|
|
<dd><a href="http://www.ietf.org/rfc/rfc2396.txt">"RFC2396: Uniform Resource Identifiers (URI): Generic
|
|
Syntax"</a>, T. Berners-Lee, R. Fielding, L. Masinter, August
|
|
1998.<br />
|
|
This document updates RFC1738 and RFC1808.<br />
|
|
Available at: <a href="http://www.ietf.org/rfc/rfc2396.txt">
|
|
http://www.ietf.org/rfc/rfc2396.txt</a></dd>
|
|
|
|
<dt><a name="ref-xml" id="ref-xml"><b>[XML]</b></a></dt>
|
|
|
|
<dd><a href="http://www.w3.org/TR/REC-xml">"Extensible Markup Language (XML) 1.0 Specification"</a>, T.
|
|
Bray, J. Paoli, C. M. Sperberg-McQueen, 10 February 1998.<br />
|
|
Available at: <a href="http://www.w3.org/TR/REC-xml">
|
|
http://www.w3.org/TR/REC-xml</a></dd>
|
|
|
|
<dt><a name="ref-xmlns" id="ref-xmlns"><b>[XMLNAMES]</b></a></dt>
|
|
|
|
<dd><a href="http://www.w3.org/TR/REC-xml-names">"Namespaces in XML"</a>, T. Bray, D. Hollander, A. Layman, 14
|
|
January 1999.<br />
|
|
XML namespaces provide a simple method for qualifying names used
|
|
in XML documents by associating them with namespaces identified
|
|
by URI.<br />
|
|
Available at: <a href="http://www.w3.org/TR/REC-xml-names">
|
|
http://www.w3.org/TR/REC-xml-names</a></dd>
|
|
|
|
</dl>
|
|
<p><a href="http://www.w3.org/WAI/WCAG1AAA-Conformance"
|
|
title="Explanation of Level Triple-A Conformance">
|
|
<img height="32" width="88"
|
|
src="wcag1AAA.gif"
|
|
alt="Level Triple-A conformance icon, W3C-WAI Web Content Accessibility Guidelines 1.0" /></a></p>
|
|
<div class="navbar">
|
|
<hr />
|
|
<a href="#toc">table of contents</a>
|
|
</div>
|
|
</body>
|
|
</html>