<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<HTML>
<HEAD>
<TITLE>Connections in FIN_WAIT_2 and Apache</TITLE>
<LINK REV="made" HREF="mailto:marc@apache.org">

</HEAD>

<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
<BODY
 BGCOLOR="#FFFFFF"
 TEXT="#000000"
 LINK="#0000FF"
 VLINK="#000080"
 ALINK="#FF0000"
>
<!--#include virtual="header.html" -->

<H1 ALIGN="CENTER">Connections in the FIN_WAIT_2 state and Apache</H1>
<OL>
<LI><H2>What is the FIN_WAIT_2 state?</H2>
Starting with the Apache 1.2 betas, people are reporting many more
connections in the FIN_WAIT_2 state (as reported by
<CODE>netstat</CODE>) than they saw using older versions. When the
server closes a TCP connection, it sends a packet with the FIN bit
set to the client, which then responds with a packet with the ACK bit
set. The client then sends a packet with the FIN bit set to the
server, which responds with an ACK, and the connection is closed. The
state that the connection is in during the period between when the
server gets the ACK from the client and the server gets the FIN from
the client is known as FIN_WAIT_2. See the <A
HREF="ftp://ds.internic.net/rfc/rfc793.txt">TCP RFC</A> for the
technical details of the state transitions.<P>
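
The sequence just described can be written down as a small transition
table. The sketch below is only an illustration: the names
<CODE>ACTIVE_CLOSE</CODE> and <CODE>walk</CODE> are invented for this page,
the event labels are informal, and the simultaneous-close paths in the RFC
are left out. It also shows the TIME_WAIT state that the RFC places between
the final ACK and the connection being fully closed.

```python
# Simplified view of the TCP states seen by the side that closes first,
# following RFC 793. Keys are (current state, event); values are the
# next state. Simultaneous-close transitions are omitted.
ACTIVE_CLOSE = {
    ("ESTABLISHED", "send FIN"): "FIN_WAIT_1",
    ("FIN_WAIT_1", "recv ACK"): "FIN_WAIT_2",
    ("FIN_WAIT_2", "recv FIN, send ACK"): "TIME_WAIT",
    ("TIME_WAIT", "2*MSL timer expires"): "CLOSED",
}


def walk(state, events):
    """Apply each event in turn and return the list of states visited."""
    visited = [state]
    for event in events:
        state = ACTIVE_CLOSE[(state, event)]
        visited.append(state)
    return visited
```

A server whose client acknowledges the FIN but never sends its own stops
at the third entry of <CODE>walk("ESTABLISHED", ["send FIN", "recv ACK"])</CODE>,
which is FIN_WAIT_2.<P>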

The FIN_WAIT_2 state is somewhat unusual in that there is no timeout
defined in the standard for it. This means that on many operating
systems, a connection in the FIN_WAIT_2 state will stay around until
the system is rebooted. If the system does not have a timeout and
too many FIN_WAIT_2 connections build up, it can fill up the space
allocated for storing information about the connections and crash
the kernel. The connections in FIN_WAIT_2 do not tie up an httpd
process.<P>
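
To see how many connections are sitting in this state, the output of
<CODE>netstat</CODE> can be tallied. The following is a rough sketch, not a
tool shipped with Apache: the function name is made up, and it assumes a
six-column layout (protocol, two queue sizes, local address, foreign
address, state), which is common but not universal. Because some systems
print <CODE>FIN_WAIT2</CODE> and others <CODE>FIN_WAIT_2</CODE>, underscores
are stripped from the state names before counting.

```python
from collections import Counter


def count_tcp_states(netstat_text):
    """Count TCP connections per state in `netstat -an` style output."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Assumed layout: proto, recv-q, send-q, local, foreign, state.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            # Normalize FIN_WAIT2 and FIN_WAIT_2 to the same key.
            counts[fields[5].replace("_", "")] += 1
    return counts
```

With the captured output in a string <CODE>text</CODE>,
<CODE>count_tcp_states(text)["FINWAIT2"]</CODE> gives the number of
connections in FIN_WAIT_2.<P>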

<LI><H2>But why does it happen?</H2>

There are numerous reasons for it happening, and some of them may not
yet be fully clear. What is known follows.<P>

<H3>Buggy clients and persistent connections</H3>

Several clients have a bug which pops up when dealing with
<A HREF="../keepalive.html">persistent connections</A> (aka keepalives).
When the connection is idle and the server closes the connection
(based on the <A HREF="../mod/core.html#keepalivetimeout">
KeepAliveTimeout</A>), the client does
not send back a FIN and ACK to the server. This means that the
connection stays in the FIN_WAIT_2 state until one of the following
happens:<P>
<UL>
<LI>The client opens a new connection to the same or a different
site, which causes it to fully close the older connection on
that socket.
<LI>The user exits the client, which on some (most?) clients
causes the OS to fully shut down the connection.
<LI>The FIN_WAIT_2 state times out, on servers that have a timeout
for this state.
</UL><P>
If you are lucky, this means that the buggy client will fully close the
connection and release the resources on your server. However, there
are some cases where the socket is never fully closed, such as a dialup
client disconnecting from their provider before closing the client.
In addition, a client might sit idle for days without making another
connection, and thus may hold its end of the socket open for days
even though it has no further use for it.
<STRONG>This is a bug in the browser or in its operating system's
TCP implementation.</STRONG> <P>

The clients on which this problem has been verified to exist:<P>
<UL>
<LI>Mozilla/3.01 (X11; I; FreeBSD 2.1.5-RELEASE i386)
<LI>Mozilla/2.02 (X11; I; FreeBSD 2.1.5-RELEASE i386)
<LI>Mozilla/3.01Gold (X11; I; SunOS 5.5 sun4m)
<LI>MSIE 3.01 on the Macintosh
<LI>MSIE 3.01 on Windows 95
</UL><P>

This does not appear to be a problem on:
<UL>
<LI>Mozilla/3.01 (Win95; I)
</UL>
<P>

It is expected that many other clients have the same problem. What a
client <STRONG>should do</STRONG> is periodically check its open
socket(s) to see if they have been closed by the server, and close its
side of the connection if the server has closed. This check need only
occur once every few seconds, and may even be detected by an OS signal
on some systems (<EM>e.g.</EM>, Win95 and NT clients have this capability, but
they seem to be ignoring it).<P>
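
As an illustration of the check described above, a client can poll an idle
socket without blocking and without consuming data. This is a sketch under
assumptions, not code from any of the browsers listed: the helper name
<CODE>peer_has_closed</CODE> is hypothetical, and it treats a socket error
the same as a close by the server.

```python
import select
import socket


def peer_has_closed(sock):
    """Return True if the other side has closed its sending half."""
    # A zero timeout makes select() a non-blocking poll.
    readable, _, _ = select.select([sock], [], [], 0)
    if not readable:
        return False  # nothing pending: the connection is simply idle
    try:
        # MSG_PEEK inspects pending data without removing it; an empty
        # result means end-of-file, i.e. the peer's FIN has arrived.
        return sock.recv(1, socket.MSG_PEEK) == b""
    except OSError:
        return True  # reset or otherwise unusable: treat as closed
```

A client that runs this every few seconds on each idle persistent
connection, and calls <CODE>close()</CODE> when it returns true, sends the FIN
the server is waiting for.<P>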

Apache <STRONG>cannot</STRONG> avoid these FIN_WAIT_2 states unless it
disables persistent connections for the buggy clients, just
like we recommend doing for Navigator 2.x clients due to other bugs.
However, non-persistent connections increase the total number of
connections needed per client and slow retrieval of an image-laden
web page. Since non-persistent connections have their own resource
consumption and a short waiting period after each closure, a busy server
may need persistence in order to best serve its clients.<P>

As far as we know, the client-caused FIN_WAIT_2 problem is present for
all servers that support persistent connections, including Apache 1.1.x
and 1.2.<P>

<H3>A necessary bit of code introduced in 1.2</H3>

While the above bug is a problem, it is not the whole problem.
Some users have observed no FIN_WAIT_2 problems with Apache 1.1.x,
but with 1.2b enough connections build up in the FIN_WAIT_2 state to
crash their server.<P>

The most likely source for additional FIN_WAIT_2 states
is a function called <CODE>lingering_close()</CODE> which was added
between 1.1 and 1.2. This function is necessary for the proper
handling of persistent connections and any request which includes
content in the message body (<EM>e.g.</EM>, PUTs and POSTs).
What it does is read any data sent by the client for
a certain time after the server closes the connection. The exact
reasons for doing this are somewhat complicated, but involve what
happens if the client is making a request at the same time the
server sends a response and closes the connection. Without lingering,
the client might be forced to reset its TCP input buffer before it
has a chance to read the server's response, and thus understand why
the connection has closed.
See the <A HREF="#appendix">appendix</A> for more details.<P>

The code in <CODE>lingering_close()</CODE> appears to cause problems
because of a number of factors, including the change in traffic patterns
that it causes. The code has been thoroughly reviewed and we are
not aware of any bugs in it. It is possible that there is some
problem in the BSD TCP stack, aside from the lack of a timeout
for the FIN_WAIT_2 state, exposed by the <CODE>lingering_close</CODE>
code that causes the observed problems.<P>

<LI><H2>What can I do about it?</H2>

There are several possible workarounds to the problem, some of
which work better than others.<P>

<H3>Add a timeout for FIN_WAIT_2</H3>

The obvious workaround is to simply have a timeout for the FIN_WAIT_2 state.
This is not specified by the RFC, and could be claimed to be a
violation of the RFC, but it is widely recognized as being necessary.
The following systems are known to have a timeout:
<P>
<UL>
<LI><A HREF="http://www.freebsd.org/">FreeBSD</A> versions starting at
2.0 or possibly earlier.
<LI><A HREF="http://www.netbsd.org/">NetBSD</A> version 1.2(?)
<LI><A HREF="http://www.openbsd.org/">OpenBSD</A> all versions(?)
<LI><A HREF="http://www.bsdi.com/">BSD/OS</A> 2.1, with the
<A HREF="ftp://ftp.bsdi.com/bsdi/patches/patches-2.1/K210-027">
K210-027</A> patch installed.
<LI><A HREF="http://www.sun.com/">Solaris</A> as of around version
2.2. The timeout can be tuned by using <CODE>ndd</CODE> to
modify <CODE>tcp_fin_wait_2_flush_interval</CODE>, but the
default should be appropriate for most servers and improper
tuning can have negative impacts.
<LI><A HREF="http://www.linux.org/">Linux</A> 2.0.x and
earlier(?)
<LI><A HREF="http://www.hp.com/">HP-UX</A> 10.x defaults to
terminating connections in the FIN_WAIT_2 state after the
normal keepalive timeouts. This does not
refer to the persistent connection or HTTP keepalive
timeouts, but the <CODE>SO_LINGER</CODE> socket option
which is enabled by Apache. This parameter can be adjusted
by using <CODE>nettune</CODE> to modify parameters such as
<CODE>tcp_keepstart</CODE> and <CODE>tcp_keepstop</CODE>.
In later revisions, there is an explicit timer for
connections in FIN_WAIT_2 that can be modified; contact HP
support for details.
<LI><A HREF="http://www.sgi.com/">SGI IRIX</A> can be patched to
support a timeout. For IRIX 5.3, 6.2, and 6.3,
use patches 1654, 1703 and 1778 respectively. If you
have trouble locating these patches, please contact your
SGI support channel for help.
<LI><A HREF="http://www.ncr.com/">NCR's MP RAS Unix</A> 2.xx and
3.xx both have FIN_WAIT_2 timeouts. In 2.xx it is non-tunable
at 600 seconds, while in 3.xx it defaults to 600 seconds and
is calculated based on the tunable "max keep alive probes"
(default of 8) multiplied by the "keep alive interval" (default
75 seconds).
<LI><A HREF="http://www.sequent.com">Sequent's ptx/TCP/IP for
DYNIX/ptx</A> has had a FIN_WAIT_2 timeout since around
release 4.1 in mid-1994.
</UL>
<P>
The following systems are known to not have a timeout:
<P>
<UL>
<LI><A HREF="http://www.sun.com/">SunOS 4.x</A> does not and
almost certainly never will have one because it is at the
very end of its development cycle for Sun. If you have kernel
source, it should be easy to patch.
</UL>
<P>
There is a
<A HREF="http://www.apache.org/dist/contrib/patches/1.2/fin_wait_2.patch"
>patch available</A> for adding a timeout to the FIN_WAIT_2 state; it
was originally intended for BSD/OS, but should be adaptable to most
systems using BSD networking code. You need kernel source code to be
able to use it. If you do adapt it to work for any other systems,
please drop me a note at <A HREF="mailto:marc@apache.org">marc@apache.org</A>.
<P>
<H3>Compile without using <CODE>lingering_close()</CODE></H3>

It is possible to compile Apache 1.2 without using the
<CODE>lingering_close()</CODE> function. This will result in that
section of code being similar to that which was in 1.1. If you do
this, be aware that it can cause problems with PUTs, POSTs and
persistent connections, especially if the client uses pipelining.
That said, it is no worse than on 1.1, and we understand that keeping your
server running is quite important.<P>

To compile without the <CODE>lingering_close()</CODE> function, add
<CODE>-DNO_LINGCLOSE</CODE> to the end of the
<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE> file,
rerun <CODE>Configure</CODE> and rebuild the server.
<P>
<H3>Use <CODE>SO_LINGER</CODE> as an alternative to
<CODE>lingering_close()</CODE></H3>

On most systems, there is an option called <CODE>SO_LINGER</CODE> that
can be set with <CODE>setsockopt(2)</CODE>. It does something very
similar to <CODE>lingering_close()</CODE>, except that it is broken
on many systems so that it causes far more problems than
<CODE>lingering_close</CODE>. On some systems, it could possibly work
better so it may be worth a try if you have no other alternatives. <P>

To try it, add <CODE>-DUSE_SO_LINGER -DNO_LINGCLOSE</CODE> to the end of the
<CODE>EXTRA_CFLAGS</CODE> line in your <CODE>Configuration</CODE>
file, rerun <CODE>Configure</CODE> and rebuild the server. <P>

<STRONG>NOTE:</STRONG> Attempting to use <CODE>SO_LINGER</CODE> and
<CODE>lingering_close()</CODE> at the same time is very likely to do
very bad things, so don't.<P>
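
For readers unfamiliar with the option, this is roughly what setting
<CODE>SO_LINGER</CODE> through <CODE>setsockopt(2)</CODE> looks like. It is an
illustrative sketch, not the code that <CODE>-DUSE_SO_LINGER</CODE> compiles
into Apache: the helper names are invented here, and the packing of
<CODE>struct linger</CODE> as two C ints is an assumption that holds on Linux
and the BSDs but not on every platform.

```python
import socket
import struct


def set_so_linger(sock, seconds):
    """Enable SO_LINGER so close() may wait up to `seconds` for unsent data."""
    # struct linger is assumed to be two C ints: l_onoff, l_linger.
    linger = struct.pack("ii", 1, seconds)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, linger)


def get_so_linger(sock):
    """Return (l_onoff, l_linger) as reported by the kernel."""
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_LINGER, 8)
    return struct.unpack("ii", raw)
```

Whether the kernel then behaves usefully when the socket is closed is
exactly the system-dependent part warned about above.<P>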

<H3>Increase the amount of memory used for storing connection state</H3>
<DL>
<DT>BSD based networking code:
<DD>BSD stores network data, such as connection states,
in something called an mbuf. When you get so many connections
that the kernel does not have enough mbufs to put them all in, your
kernel will likely crash. You can reduce the effects of the problem
by increasing the number of mbufs that are available; this will not
prevent the problem, it will just make the server go longer before
crashing.<P>

The exact way to increase them may depend on your OS; look
for some reference to the number of "mbufs" or "mbuf clusters". On
many systems, this can be done by adding the line
<CODE>NMBCLUSTERS="n"</CODE>, where <CODE>n</CODE> is the number of
mbuf clusters you want, to your kernel config file and rebuilding your
kernel.<P>
</DL>

<H3>Disable KeepAlive</H3>
<P>If you are unable to do any of the above then you should, as a last
resort, disable KeepAlive. Edit your httpd.conf and change "KeepAlive On"
to "KeepAlive Off".

<LI><H2>Feedback</H2>

If you have any information to add to this page, please contact me at
<A HREF="mailto:marc@apache.org">marc@apache.org</A>.<P>

<LI><H2><A NAME="appendix">Appendix</A></H2>
<P>
Below is a message from Roy Fielding, one of the authors of HTTP/1.1.

<H3>Why the lingering close functionality is necessary with HTTP</H3>

The need for a server to linger on a socket after a close is noted a couple
of times in the HTTP specs, but not explained. This explanation is based on
discussions between myself, Henrik Frystyk, Robert S. Thau, Dave Raggett,
and John C. Mallery in the hallways of MIT while I was at W3C.<P>

If a server closes the input side of the connection while the client
is sending data (or is planning to send data), then the server's TCP
stack will signal an RST (reset) back to the client. Upon
receipt of the RST, the client will flush its own incoming TCP buffer
back to the un-ACKed packet indicated by the RST packet argument.
If the server has sent a message, usually an error response, to the
client just before the close, and the client receives the RST packet
before its application code has read the error message from its incoming
TCP buffer and before the server has received the ACK sent by the client
upon receipt of that buffer, then the RST will flush the error message
before the client application has a chance to see it. The result is
that the client is left thinking that the connection failed for no
apparent reason.<P>

There are two conditions under which this is likely to occur:
<OL>
<LI>sending POST or PUT data without proper authorization
<LI>sending multiple requests before each response (pipelining)
and one of the middle requests resulting in an error or
other break-the-connection result.
</OL>
<P>
The solution in all cases is to send the response, close only the
write half of the connection (what shutdown is supposed to do), and
continue reading on the socket until it is either closed by the
client (signifying it has finally read the response) or a timeout occurs.
That is what the kernel is supposed to do if SO_LINGER is set.
Unfortunately, SO_LINGER has no effect on some systems; on some other
systems, it does not have its own timeout and thus the TCP memory
segments just pile up until the next reboot (planned or not).<P>
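
The procedure described above (send the response, shut down the write
half, keep reading until the client closes or a timeout expires) can be
sketched in user space as follows. This is not Apache's
<CODE>lingering_close()</CODE> source; it is a minimal illustration, and the
two timeout values are assumptions chosen for the example rather than
figures taken from this page.

```python
import select
import socket
import time


def lingering_close(sock, idle_timeout=2.0, max_linger=30.0):
    """Half-close `sock`, drain what the client still sends, then close.

    Returns the number of bytes that were read and discarded.
    """
    discarded = 0
    try:
        sock.shutdown(socket.SHUT_WR)  # send FIN; keep the read half open
        deadline = time.monotonic() + max_linger
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # overall linger budget used up
            wait = min(idle_timeout, remaining)
            readable, _, _ = select.select([sock], [], [], wait)
            if not readable:
                break  # client stayed silent for too long
            chunk = sock.recv(4096)
            if not chunk:
                break  # client closed its side (end-of-file)
            discarded += len(chunk)
    except OSError:
        pass  # connection reset or already gone; nothing left to drain
    finally:
        sock.close()
    return discarded
```

Because the server keeps reading instead of closing its input side, late
data from the client does not provoke the RST described earlier, so the
response already sent stays in the client's buffer until it is read.<P>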

Please note that simply removing the linger code will not solve the
problem; it only replaces it with a different one that is much harder
to detect.
</OL>
<!--#include virtual="footer.html" -->
</BODY>
</HTML>