git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@87505 13f79535-47bb-0310-9956-ffa450edef68
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>The Apache EBCDIC Port</TITLE>
</HEAD>

<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
 BGCOLOR="#FFFFFF"
 TEXT="#000000"
 LINK="#0000FF"
 VLINK="#000080"
 ALINK="#FF0000"
>
<!--#include virtual="header.html" -->

<blockquote><strong>Warning:</strong>
This document has not been updated to take into account changes
made in the 2.0 version of the Apache HTTP Server. Some of the
information may still be relevant, but please use it
with care.
</blockquote>

<H1 ALIGN="CENTER">Overview of the Apache EBCDIC Port</H1>

<P>
Version 1.3 of the Apache HTTP Server is the first version that
includes a port to a (non-ASCII) mainframe machine which uses
the EBCDIC character set as its native codeset.<BR>
(The platform is the SIEMENS family of mainframes running the
<A HREF="http://www.siemens.de/servers/bs2osd/osdbc_us.htm">BS2000/OSD
operating system</A>. This mainframe OS now features an
SVR4-derived POSIX subsystem.)
</P>

<P>
The port was initially started to
<UL>
<LI> prove the feasibility of porting
     <A HREF="http://dev.apache.org/">the Apache HTTP server</A>
     to this platform,
<LI> find a "worthy and capable" successor for the venerable
     <A HREF="http://www.w3.org/Daemon/">CERN-3.0</A> daemon
     (which had been ported a couple of years earlier), and
<LI> prove that Apache's preforking process model can, on this platform,
     easily outperform the accept-fork-serve model used by CERN by a
     factor of 5 or more.
</UL>
</P>

<P>
This document explains the rationale behind some of the design
decisions of the port to this machine.
</P>

<H2 ALIGN=CENTER>Design Goals</H2>
<P>
One objective of the EBCDIC port was to maintain enough backwards
compatibility with the (EBCDIC) CERN server to make the transition to
the new server attractive and easy. This required the addition of
a configurable method to define whether an HTML document was stored
in ASCII (the only format accepted by the old server) or in EBCDIC
(the native document format in the POSIX subsystem, and therefore
the only realistic format in which other POSIX tools like grep
or sed could operate on the documents). The current solution to
this is a "pseudo-MIME-format" which is intercepted and
interpreted by the Apache server (see below). Future versions
might solve the problem by defining an "ebcdic-handler" for all
documents which must be converted.
</P>

<H2 ALIGN=CENTER>Technical Solution</H2>
<P>
Since all Apache input and output is based upon the BUFF data type
and its methods, the easiest solution was to add the conversion to
the BUFF handling routines. The conversion must be settable at any
time, so a BUFF flag was added which defines whether a BUFF object
currently has conversion enabled or not. This flag is modified at
several points in the HTTP protocol:
<UL>
<LI><STRONG>set</STRONG> before a request is received (because the
    request and the request header lines are always in ASCII
    format)

<LI><STRONG>set/unset</STRONG> when the request body is
    received, depending on the content type of the request body
    (because the request body may contain ASCII text or a binary file)

<LI><STRONG>set</STRONG> before a reply header is sent (because the
    response header lines are always in ASCII format)

<LI><STRONG>set/unset</STRONG> when the response body is
    sent, depending on the content type of the response body
    (because the response body may contain text or a binary file)
</UL>
</P>
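<P>
As a rough illustration of the flag described above, the following C
sketch models an output buffer whose writes pass through an optional
conversion step. The type <CODE>sketch_buff</CODE>, the functions
<CODE>sketch_set_convert()</CODE> and <CODE>sketch_write()</CODE>, and the
toy translation table are all hypothetical and are not the server's BUFF
interface; the table covers only EBCDIC upper-case letters, digits and the
space character.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a BUFF object: just storage plus the flag. */
typedef struct {
    unsigned char data[256];  /* bytes as they would go out on the wire */
    size_t len;               /* number of bytes currently stored */
    int convert;              /* nonzero: translate EBCDIC to ASCII on write */
} sketch_buff;

/* Toy EBCDIC-to-ASCII translation; anything outside the table becomes '?'. */
unsigned char sketch_e2a(unsigned char c)
{
    if (c >= 0xC1 && c <= 0xC9) return (unsigned char)(0x41 + (c - 0xC1)); /* A-I */
    if (c >= 0xD1 && c <= 0xD9) return (unsigned char)(0x4A + (c - 0xD1)); /* J-R */
    if (c >= 0xE2 && c <= 0xE9) return (unsigned char)(0x53 + (c - 0xE2)); /* S-Z */
    if (c >= 0xF0 && c <= 0xF9) return (unsigned char)(0x30 + (c - 0xF0)); /* 0-9 */
    if (c == 0x40) return 0x20;                                            /* space */
    return 0x3F;                                                           /* '?' */
}

/* Switch conversion on or off; may be called between any two writes. */
void sketch_set_convert(sketch_buff *b, int on)
{
    assert(b != NULL);
    b->convert = (on != 0);
}

/* Append up to n bytes, converting them only while the flag is set.
 * Returns the number of bytes actually stored. */
size_t sketch_write(sketch_buff *b, const unsigned char *src, size_t n)
{
    size_t room, i;

    assert(b != NULL && src != NULL);
    room = sizeof b->data - b->len;
    if (n > room)
        n = room;                       /* truncate instead of overflowing */
    if (!b->convert) {
        memcpy(b->data + b->len, src, n);   /* binary data: pass through */
    } else {
        for (i = 0; i < n; i++)
            b->data[b->len + i] = sketch_e2a(src[i]);
    }
    b->len += n;
    return n;
}
```

<P>
With the flag set, the EBCDIC bytes C8 E3 E3 D7 leave this buffer as the
ASCII bytes for "HTTP"; with the flag cleared, bytes such as a GIF
signature are stored unchanged.
</P>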

<H2 ALIGN=CENTER>Porting Notes</H2>
<P>
<OL>
<LI>
The relevant changes in the source are #ifdef'ed into two
categories:
<DL>
<DT><CODE><STRONG>#ifdef CHARSET_EBCDIC</STRONG></CODE>
<DD>Code which is needed for any EBCDIC based machine. This
    includes character translations, differences in
    contiguity of the two character sets, flags which
    indicate which part of the HTTP protocol has to be
    converted and which part doesn't, <EM>etc.</EM>
<DT><CODE><STRONG>#ifdef _OSD_POSIX</STRONG></CODE>
<DD>Code which is needed for the SIEMENS BS2000/OSD
    mainframe platform only. This deals with include file
    differences and socket implementation topics which are
    only required on the BS2000/OSD platform.
</DL>
</LI><BR>

<LI>
The possibility to translate between ASCII and EBCDIC at the
socket level (on BS2000 POSIX, there is a socket option which
supports this) was intentionally <EM>not</EM> chosen, because
the byte stream at the HTTP protocol level consists of a
mixture of protocol related strings and non-protocol related
raw file data. HTTP protocol strings are always encoded in
ASCII (the GET request, any Header: lines, the chunking
information <EM>etc.</EM>) whereas the file transfer parts (<EM>e.g.</EM>, GIF
images, CGI output <EM>etc.</EM>) should usually be just "passed through"
by the server. This separation between "protocol string" and
"raw data" is reflected in the server code by functions like
bgets() or rvputs() for strings, and functions like bwrite()
for binary data. A global translation of everything would
therefore be inadequate.<BR>
(In the case of text files, of course, provisions must be made so
that EBCDIC documents are always served in ASCII.)
</LI><BR>

<LI>
This port therefore features a built-in protocol level conversion
for the server-internal strings (which the compiler translated to
EBCDIC strings) and thus for all server-generated documents.
The hard coded ASCII escapes \012 and \015 which are
ubiquitous in the server code are an exception: they are
already the binary encoding of the ASCII \n and \r and must
not be converted to ASCII a second time. This exception is
only relevant for server-generated strings; <EM>external</EM>
EBCDIC documents are not expected to contain ASCII newline characters.
</LI><BR>

<LI>
By examining the call hierarchy for the BUFF management
routines, I added an "ebcdic/ascii conversion layer" which
would be crossed on every puts/write/get/gets, and a
conversion flag which allowed enabling/disabling the
conversions on-the-fly. Usually, a document crosses this
layer twice on its way from its original source (a file or CGI output) to
its destination (the requesting client): <SAMP>file -&gt;
Apache</SAMP>, and <SAMP>Apache -&gt; client</SAMP>.<BR>
The server can now read the header
lines of a CGI-script output in EBCDIC format, and then find
out that the remainder of the script's output is in ASCII
(as in the case of the output of a WWW Counter program: the
document body contains a GIF image). All header processing is
done in the native EBCDIC format; the server then determines,
based on the type of document being served, whether the
document body (except for the chunking information, of
course) is in ASCII already or must be converted from EBCDIC.
</LI><BR>

<LI>
Text documents (MIME types text/plain, text/html <EM>etc.</EM>)
can either be translated to ASCII implicitly or (if
users prefer to store some documents in raw ASCII form for
faster serving, or because the files reside on an NFS-mounted
directory tree) be served without conversion.
<BR>
<STRONG>Example:</STRONG><BLOCKQUOTE>
To serve files with the suffix .ahtml as a raw ASCII text/html
document without implicit conversion (and suffix .ascii
as ASCII text/plain), use the directives:<PRE>
AddType  text/x-ascii-html  .ahtml
AddType  text/x-ascii-plain .ascii
</PRE></BLOCKQUOTE>
Similarly, any text/XXXX MIME type can be served as "raw ASCII" by
configuring a MIME type "text/x-ascii-XXXX" for it using AddType.
</LI><BR>

<LI>
Non-text documents are always served "binary" without conversion.
This seems to be the most sensible choice for, <EM>e.g.</EM>, GIF/ZIP/AU
file types. This of course requires the user to copy them to the
mainframe host using the "rcp -b" binary switch.
</LI><BR>

<LI>
Server parsed files are always assumed to be in native (<EM>i.e.</EM>,
EBCDIC) format as used on the machine, and are converted after
processing.
</LI><BR>

<LI>
For CGI output, the CGI script determines whether a conversion is
needed or not: by setting the appropriate Content-Type, text files
can be converted, or GIF output can be passed through unmodified.
An example of the latter case is the wwwcount program, which we ported
as well.
</LI><BR>
</OL>
</P>
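<P>
The content-type rules in the notes above can be summarised in a small
decision function. This is only a sketch under stated assumptions:
<CODE>sketch_body_conversion()</CODE> is a hypothetical name and not a
function of the server, the rewriting of "text/x-ascii-XXXX" to
"text/XXXX" for the client is inferred from the AddType example, and the
comparison assumes that type names are given in lower case.
</P>

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the server.
 * Decides whether a body with the configured content type ctype must be
 * converted from EBCDIC to ASCII (returns 1) or passed through (returns 0).
 * The type to announce to the client is written to out. */
int sketch_body_conversion(const char *ctype, char *out, size_t outlen)
{
    static const char raw_prefix[]  = "text/x-ascii-";
    static const char text_prefix[] = "text/";

    assert(ctype != NULL && out != NULL && outlen > 0);

    if (strncmp(ctype, raw_prefix, sizeof raw_prefix - 1) == 0) {
        /* Pseudo-MIME type: the file is already ASCII, so announce the
         * real text/XXXX type and skip the conversion (assumed behaviour). */
        snprintf(out, outlen, "text/%s", ctype + (sizeof raw_prefix - 1));
        return 0;
    }

    /* Every other type is announced as configured. */
    snprintf(out, outlen, "%s", ctype);

    /* Native text is converted; anything else is treated as binary. */
    return strncmp(ctype, text_prefix, sizeof text_prefix - 1) == 0;
}
```

<P>
Under these assumptions, "text/x-ascii-html" is announced as "text/html"
and sent unconverted, "text/plain" is converted, and "image/gif" is passed
through as binary data.
</P>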

<H2 ALIGN=CENTER>Document Storage Notes</H2>
<H3 ALIGN=CENTER>Binary Files</H3>
<P>
All files with a <SAMP>Content-Type:</SAMP> which does not
start with <SAMP>text/</SAMP> are regarded as <EM>binary files</EM>
by the server and are not subject to any conversion.
Examples of binary files are GIF images, gzip-compressed
files and the like.
</P>
<P>
When exchanging binary files between the mainframe host and a
Unix machine or Windows PC, be sure to use the ftp "binary"
(<SAMP>TYPE I</SAMP>) command, or use the
<SAMP>rcp -b</SAMP> command from the mainframe host
(the -b switch is not supported by Unix rcp implementations).
</P>

<H3 ALIGN=CENTER>Text Documents</H3>
<P>
The default assumption of the server is that text files
(<EM>i.e.</EM>, all files whose <SAMP>Content-Type:</SAMP> starts with
<SAMP>text/</SAMP>) are stored in the native character
set of the host, EBCDIC.
</P>
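<P>
Because text is stored in EBCDIC, code that handles it cannot rely on the
contiguous letter ranges of ASCII, which is one of the differences the
<CODE>CHARSET_EBCDIC</CODE> code has to account for. The following C sketch
works on explicit EBCDIC code points so that it behaves the same on any
host; the function names are hypothetical and do not come from the server
source.
</P>

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if c is the EBCDIC code point of a lower-case Latin letter.
 * The alphabet is split into three separate runs. */
int sketch_ebcdic_islower(unsigned char c)
{
    return (c >= 0x81 && c <= 0x89)    /* a-i */
        || (c >= 0x91 && c <= 0x99)    /* j-r */
        || (c >= 0xA2 && c <= 0xA9);   /* s-z */
}

/* The ASCII-style range test ('a' <= c && c <= 'z') applied to EBCDIC
 * code points: it also accepts the non-letters in the gaps between runs. */
int sketch_naive_islower(unsigned char c)
{
    return c >= 0x81 && c <= 0xA9;
}

/* Counts the lower-case letters in an EBCDIC byte buffer of length n. */
size_t sketch_count_lower(const unsigned char *s, size_t n)
{
    size_t i, count = 0;

    assert(s != NULL);
    for (i = 0; i < n; i++)
        if (sketch_ebcdic_islower(s[i]))
            count++;
    return count;
}
```

<P>
The code point 8A, for example, lies between "i" and "j": the naive range
test accepts it, while the three-range test rejects it.
</P>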

<H3 ALIGN=CENTER>Server Side Included Documents</H3>
<P>
SSI documents must currently be stored in EBCDIC only. No
provision is made to convert them from ASCII before processing.
</P>

<H2 ALIGN=CENTER>Apache Modules' Status</H2>
<TABLE BORDER ALIGN=middle>
<TR>
<TH>Module
<TH>Status
<TH>Notes
</TR>

<TR>
<TD ALIGN=LEFT>http_core
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_access
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_actions
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_alias
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_asis
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_auth
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_auth_anon
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_auth_db
<TD ALIGN=CENTER>?
<TD>with own libdb.a
</TR>

<TR>
<TD ALIGN=LEFT>mod_auth_dbm
<TD ALIGN=CENTER>?
<TD>with own libdb.a
</TR>

<TR>
<TD ALIGN=LEFT>mod_autoindex
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_cern_meta
<TD ALIGN=CENTER>?
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_cgi
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_digest
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_dir
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_so
<TD ALIGN=CENTER>-
<TD>no shared libs
</TR>

<TR>
<TD ALIGN=LEFT>mod_env
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_example
<TD ALIGN=CENTER>-
<TD>(test bed only)
</TR>

<TR>
<TD ALIGN=LEFT>mod_expires
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_headers
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_imap
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_include
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_info
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_log_agent
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_log_config
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_log_referer
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_mime
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_mime_magic
<TD ALIGN=CENTER>?
<TD>not ported yet
</TR>

<TR>
<TD ALIGN=LEFT>mod_negotiation
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_proxy
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_rewrite
<TD ALIGN=CENTER>+
<TD>untested
</TR>

<TR>
<TD ALIGN=LEFT>mod_setenvif
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_speling
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_status
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_unique_id
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_userdir
<TD ALIGN=CENTER>+
<TD>
</TR>

<TR>
<TD ALIGN=LEFT>mod_usertrack
<TD ALIGN=CENTER>?
<TD>untested
</TR>
</TABLE>

<H2 ALIGN=CENTER>Third Party Modules' Status</H2>
<TABLE BORDER ALIGN=middle>
<TR>
<TH>Module
<TH>Status
<TH>Notes
</TR>

<TR>
<TD ALIGN=LEFT><A HREF="http://java.apache.org/">mod_jserv</A>
<TD ALIGN=CENTER>-
<TD>JAVA still being ported.
</TR>

<TR>
<TD ALIGN=LEFT><A HREF="http://www.php.net/">mod_php3</A>
<TD ALIGN=CENTER>+
<TD>mod_php3 runs fine, with LDAP and GD and FreeType libraries
</TR>

<TR>
<TD ALIGN=LEFT
 ><A HREF="http://hpwww.ec-lyon.fr/~vincent/apache/mod_put.html">mod_put</A>
<TD ALIGN=CENTER>?
<TD>untested
</TR>

<TR>
<TD ALIGN=LEFT
 ><A HREF="ftp://hachiman.vidya.com/pub/apache/">mod_session</A>
<TD ALIGN=CENTER>-
<TD>untested
</TR>

</TABLE>

<!--#include virtual="footer.html" -->
</BODY>
</HTML>