mirror of
https://github.com/apache/httpd.git
synced 2025-04-18 22:24:07 +03:00
git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@1907148 13f79535-47bb-0310-9956-ffa450edef68
1478 lines
80 KiB
XML
<?xml version="1.0" encoding="UTF-8"?>
|
|
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
|
|
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
|
|
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type" />
|
|
<!--
|
|
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|
|
This file is generated from xml source: DO NOT EDIT
|
|
XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
|
|
-->
|
|
<title>Performance Scaling - Apache HTTP Server Version 2.5</title>
|
|
<link href="../style/css/manual.css" rel="stylesheet" media="all" type="text/css" title="Main stylesheet" />
|
|
<link href="../style/css/manual-loose-100pc.css" rel="alternate stylesheet" media="all" type="text/css" title="No Sidebar - Default font size" />
|
|
<link href="../style/css/manual-print.css" rel="stylesheet" media="print" type="text/css" /><link rel="stylesheet" type="text/css" href="../style/css/prettify.css" />
|
|
<script src="../style/scripts/prettify.min.js" type="text/javascript">
|
|
</script>
|
|
|
|
<link href="../images/favicon.ico" rel="shortcut icon" /></head>
|
|
<body id="manual-page"><div id="page-header">
|
|
<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/quickreference.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p>
|
|
<p class="apache">Apache HTTP Server Version 2.5</p>
|
|
<img alt="" src="../images/feather.png" /></div>
|
|
<div class="up"><a href="./"><img title="&lt;-" alt="&lt;-" src="../images/left.gif" /></a></div>
|
|
<div id="path">
|
|
<a href="http://www.apache.org/">Apache</a> > <a href="http://httpd.apache.org/">HTTP Server</a> > <a href="http://httpd.apache.org/docs/">Documentation</a> > <a href="../">Version 2.5</a> > <a href="./">Miscellaneous Documentation</a></div><div id="page-content"><div id="preamble"><h1>Performance Scaling</h1>
|
|
<div class="toplang">
|
|
<p><span>Available Languages: </span><a href="../en/misc/perf-scaling.html" title="English"> en </a> |
|
|
<a href="../es/misc/perf-scaling.html" hreflang="es" rel="alternate" title="Español"> es </a> |
|
|
<a href="../fr/misc/perf-scaling.html" hreflang="fr" rel="alternate" title="Français"> fr </a></p>
|
|
</div>
|
|
|
|
|
|
<p>The Performance Tuning page in the Apache 1.3 documentation says:
|
|
</p>
|
|
<blockquote><p>
|
|
"Apache is a general webserver, which is designed to be
|
|
correct first, and fast
|
|
second. Even so, its performance is quite satisfactory. Most
|
|
sites have less than 10Mbits of outgoing bandwidth, which
|
|
Apache can fill using only a low end Pentium-based
|
|
webserver."</p>
|
|
</blockquote>
|
|
<p>However, this sentence was written a few years ago, and in the
|
|
meantime several things have happened. On one hand, web server
|
|
hardware has become much faster. On the other hand, many sites now
|
|
are allowed much more than ten megabits per second of outgoing
|
|
bandwidth. In addition, web applications have become more complex.
|
|
The classic brochureware site is alive and well, but the web has
|
|
grown up substantially as a computing application platform and
|
|
webmasters may find themselves running dynamic content in Perl, PHP
|
|
or Java, all of which take a toll on performance.
|
|
</p>
|
|
<p>Therefore, in spite of strides forward in machine speed and
|
|
bandwidth allowances, web server performance and web application
|
|
performance remain areas of concern. In this documentation several
|
|
aspects of web server performance will be discussed.
|
|
</p>
|
|
|
|
</div>
|
|
<div id="quickview"><ul id="toc"><li><img alt="" src="../images/down.gif" /> <a href="#what-will-and-will-not-be-discussed">What Will and Will Not Be Discussed
|
|
</a></li>
|
|
<li><img alt="" src="../images/down.gif" /> <a href="#monitoring-your-server">Monitoring Your Server
|
|
</a></li>
|
|
<li><img alt="" src="../images/down.gif" /> <a href="#configuring-for-performance">Configuring for Performance
|
|
</a></li>
|
|
<li><img alt="" src="../images/down.gif" /> <a href="#caching-content">Caching Content
|
|
</a></li>
|
|
<li><img alt="" src="../images/down.gif" /> <a href="#further-considerations">Further Considerations
|
|
</a></li>
|
|
</ul><h3>See also</h3><ul class="seealso"><li><a href="#comments_section">Comments</a></li></ul></div>
|
|
<div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
|
|
<div class="section">
|
|
<h2><a name="what-will-and-will-not-be-discussed" id="what-will-and-will-not-be-discussed">What Will and Will Not Be Discussed
|
|
</a> <a title="Permanent link" href="#what-will-and-will-not-be-discussed" class="permalink">¶</a></h2>
|
|
|
|
<p>This document focuses on easily accessible configuration and tuning
options for Apache httpd 2.2 and 2.4, as well as on monitoring tools.
Monitoring tools allow you to observe your web server and
gather information about its performance, or lack thereof.
We assume that you do not have an unlimited budget for
server hardware, so the existing infrastructure will have to do the
job, and that you have no desire to compile your own Apache or to recompile
the operating system kernel. We do assume, though, that you have
some familiarity with the Apache httpd configuration file.
</p>
|
|
|
|
</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
|
|
<div class="section">
|
|
<h2><a name="monitoring-your-server" id="monitoring-your-server">Monitoring Your Server
|
|
</a> <a title="Permanent link" href="#monitoring-your-server" class="permalink">¶</a></h2>
|
|
|
|
<p>The first task when sizing or performance-tuning your server is to
|
|
find out how your system is currently performing. By monitoring
|
|
your server under real-world load, or artificially generated load,
|
|
you can extrapolate its behavior under stress, such as when your
|
|
site is mentioned on Slashdot.
|
|
</p>
|
|
|
|
|
|
<h3><a name="monitoring-tools" id="monitoring-tools">Monitoring Tools
|
|
</a></h3>
|
|
|
|
|
|
|
|
<h4><a name="top" id="top">top
|
|
</a></h4>
|
|
|
|
<p>The <code>top</code> tool ships with Linux and FreeBSD; Solaris offers
<code>prstat(1)</code> instead. <code>top</code> collects a number of statistics for the
system and for each running process, then displays them
interactively on your terminal. The data displayed is
refreshed every few seconds and varies by platform, but
typically includes the system load average, the number of processes
and their current states, the percentage of CPU time spent
executing user and system code, and the state of the
virtual memory system. The data displayed for each process
is typically configurable and includes its process name and
ID, priority and nice values, memory footprint, and
percentage CPU usage. The following example shows multiple
httpd processes (with MPM worker and event) running on a
Linux (Xen) system:
</p>
|
|
|
|
<div class="example"><pre>top - 23:10:58 up 71 days, 6:14, 4 users, load average: 0.25, 0.53, 0.47
Tasks: 163 total, 1 running, 162 sleeping, 0 stopped, 0 zombie
Cpu(s): 11.6%us, 0.7%sy, 0.0%ni, 87.3%id, 0.4%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 2621656k total, 2178684k used, 442972k free, 100500k buffers
Swap: 4194296k total, 860584k used, 3333712k free, 1157552k cached

  PID USER     PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
16687 example_ 20   0 1200m 547m 179m S   45 21.4   1:09.59 httpd-worker
15195 www      20   0  441m  33m 2468 S    0  1.3   0:41.41 httpd-worker
    1 root     20   0 10312  328  308 S    0  0.0   0:33.17 init
    2 root     15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root     RT  -5     0    0    0 S    0  0.0   0:00.14 migration/0
    4 root     15  -5     0    0    0 S    0  0.0   0:04.58 ksoftirqd/0
    5 root     RT  -5     0    0    0 S    0  0.0   4:45.89 watchdog/0
    6 root     15  -5     0    0    0 S    0  0.0   1:42.52 events/0
    7 root     15  -5     0    0    0 S    0  0.0   0:00.00 khelper
   19 root     15  -5     0    0    0 S    0  0.0   0:00.00 xenwatch
   20 root     15  -5     0    0    0 S    0  0.0   0:00.00 xenbus
   28 root     RT  -5     0    0    0 S    0  0.0   0:00.14 migration/1
   29 root     15  -5     0    0    0 S    0  0.0   0:00.20 ksoftirqd/1
   30 root     RT  -5     0    0    0 S    0  0.0   0:05.96 watchdog/1
   31 root     15  -5     0    0    0 S    0  0.0   1:18.35 events/1
   32 root     RT  -5     0    0    0 S    0  0.0   0:00.08 migration/2
   33 root     15  -5     0    0    0 S    0  0.0   0:00.18 ksoftirqd/2
   34 root     RT  -5     0    0    0 S    0  0.0   0:06.00 watchdog/2
   35 root     15  -5     0    0    0 S    0  0.0   1:08.39 events/2
   36 root     RT  -5     0    0    0 S    0  0.0   0:00.10 migration/3
   37 root     15  -5     0    0    0 S    0  0.0   0:00.16 ksoftirqd/3
   38 root     RT  -5     0    0    0 S    0  0.0   0:06.08 watchdog/3
   39 root     15  -5     0    0    0 S    0  0.0   1:22.81 events/3
   68 root     15  -5     0    0    0 S    0  0.0   0:06.28 kblockd/0
   69 root     15  -5     0    0    0 S    0  0.0   0:00.04 kblockd/1
   70 root     15  -5     0    0    0 S    0  0.0   0:00.04 kblockd/2</pre></div>
|
|
|
|
<p>Top is a wonderful tool even though it's slightly resource
|
|
intensive (when running, its own process is usually in the
|
|
top ten CPU gluttons). It is indispensable in determining
|
|
the size of a running process, which comes in handy when
|
|
determining how many server processes you can run on your
|
|
machine. How to do this is described in <a href="#sizing-maxClients">sizing MaxClients</a>.
|
|
Top is, however, an interactive tool and running it
|
|
continuously has few if any advantages.
|
|
</p>
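<p>As a rough illustration of how a per-process size read from top feeds into
capacity planning, the following Python sketch divides the memory left over for
httpd by the resident size of one process. The formula, the function name
<code>max_processes</code> and the numbers are assumptions made for
illustration; with a threaded MPM each process serves many connections, so the
result is a count of processes, not of clients.</p>

```python
def max_processes(total_ram_mb, reserved_mb, process_rss_mb):
    """Estimate how many httpd processes fit in RAM without swapping."""
    return (total_ram_mb - reserved_mb) // process_rss_mb

# Illustrative numbers only: 2560 MB of RAM, 512 MB kept for the operating
# system and other services, 33 MB resident size (RES in top) per process.
print(max_processes(2560, 512, 33))  # 62
```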
|
|
|
|
<h4><a name="free" id="free">free
|
|
</a></h4>
|
|
|
|
<p>This command is only available on Linux. It shows how much
|
|
memory and swap space is in use. Linux allocates unused
|
|
memory as file system cache. The free command shows usage
|
|
both with and without this cache. The free command can be
|
|
used to find out how much memory the operating system is
|
|
using, as described in the paragraph <a href="#sizing-maxClients">sizing MaxClients</a>.
|
|
The output of free looks like this:
|
|
</p>
|
|
|
|
<div class="example"><pre>sctemme@brutus:~$ free
             total       used       free     shared    buffers     cached
Mem:       4026028    3901892     124136          0     253144     841044
-/+ buffers/cache:    2807704    1218324
Swap:      3903784      12540    3891244</pre></div>
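<p>The relationship between the two memory lines in this output can be checked
with a little arithmetic. The Python sketch below reproduces the
"-/+ buffers/cache" figures from the "Mem:" line; the function name
<code>without_cache</code> is made up for this example.</p>

```python
def without_cache(used, free, buffers, cached):
    """Return (used, free) with the file system cache counted as free memory."""
    return used - buffers - cached, free + buffers + cached

# Values in kilobytes, copied from the "Mem:" line of the sample output.
used_by_apps, available = without_cache(
    used=3901892, free=124136, buffers=253144, cached=841044
)
print(used_by_apps, available)  # 2807704 1218324
```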
|
|
|
|
|
|
<h4><a name="vmstat" id="vmstat">vmstat
|
|
</a></h4>
|
|
|
|
<p>This command is available on many Unix platforms. It
displays a large number of operating system metrics. Run
without arguments, it displays a status line for that
moment. When a numeric argument is added, the status is
redisplayed at the designated interval. For example,
<code>vmstat 5</code>
causes the information to reappear every five seconds.
<code>vmstat</code> displays the amount of virtual memory in use, how
much memory is swapped in and out each second, the number
of processes currently running and sleeping, the number of
interrupts and context switches per second and the usage
percentages of the CPU.
</p>
|
|
<p>
|
|
The following is <code>vmstat</code> output of an idle server:
|
|
</p>
|
|
|
|
|
|
<div class="example"><pre>[sctemme@GayDeceiver sctemme]$ vmstat 5 3
   procs                       memory    swap        io      system          cpu
 r  b  w   swpd    free   buff  cache  si  so   bi   bo    in    cs  us  sy   id
 0  0  0      0  186252   6688  37516   0   0   12    5    47   311   0   1   99
 0  0  0      0  186244   6696  37516   0   0    0   16    41   314   0   0  100
 0  0  0      0  186236   6704  37516   0   0    0    9    44   314   0   0  100</pre></div>
|
|
|
|
<p>And this is output of a server that is under a load of one
|
|
hundred simultaneous connections fetching static content:
|
|
</p>
|
|
|
|
<div class="example"><pre>[sctemme@GayDeceiver sctemme]$ vmstat 5 3
   procs                       memory    swap        io      system          cpu
 r  b  w   swpd    free   buff  cache  si  so   bi   bo    in    cs  us  sy   id
 1  0  1      0  162580   6848  40056   0   0   11    5   150   324   1   1   98
 6  0  1      0  163280   6856  40248   0   0    0   66  6384  1117  42  25   32
11  0  0      0  162780   6864  40436   0   0    0   61  6309  1165  33  28   40</pre></div>
|
|
|
|
<p>The first line gives averages since the last reboot. The
|
|
subsequent lines give information for five second
|
|
intervals. The second argument tells vmstat to generate
|
|
three reports and then exit.
|
|
</p>
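<p>Because the first data line reports averages since boot, a script that
summarizes vmstat output should skip it. The following Python sketch, with the
made-up helper name <code>mean_idle</code>, averages the idle CPU percentage
over the interval samples only. It assumes the column layout of the sample
above; other platforms print different columns.</p>

```python
def mean_idle(vmstat_output):
    """Average the 'id' (idle CPU %) column, skipping the since-boot line."""
    lines = [ln.split() for ln in vmstat_output.strip().splitlines()]
    # The column-name header is the line that contains the 'id' label.
    header_index = next(i for i, cols in enumerate(lines) if "id" in cols)
    id_col = lines[header_index].index("id")
    # The first data line holds averages since boot; keep the rest.
    samples = lines[header_index + 2:]
    return sum(int(cols[id_col]) for cols in samples) / len(samples)

loaded = """\
procs memory swap io system cpu
r b w swpd free buff cache si so bi bo in cs us sy id
1 0 1 0 162580 6848 40056 0 0 11 5 150 324 1 1 98
6 0 1 0 163280 6856 40248 0 0 0 66 6384 1117 42 25 32
11 0 0 0 162780 6864 40436 0 0 0 61 6309 1165 33 28 40
"""
print(mean_idle(loaded))  # 36.0
```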
|
|
|
|
|
|
|
|
<h4><a name="se-toolkit" id="se-toolkit">SE Toolkit
|
|
</a></h4>
|
|
|
|
<p>The SE Toolkit is a system monitoring toolkit for Solaris.
Its programming language is based on the C preprocessor, and it
comes with a number of sample scripts. It can use both the
command line and the GUI to display information. It can
also be programmed to apply rules to the system data. The
example script shown in Figure 2, Zoom.se, shows green,
orange or red indicators when utilization of various parts
of the system rises above certain thresholds. Another
included script, Virtual Adrian, applies performance tuning
metrics.
</p>
|
|
<p>The SE Toolkit has drifted around for a while and has had
several owners since its inception. It seems that it has
now found a final home at Sunfreeware.com, where it can be
downloaded at no charge. There is a single package for
Solaris 8, 9 and 10 on SPARC and x86, which includes source
code. SE Toolkit author Richard Pettit has started a new
company, Captive Metrics, that plans to bring to market a
multiplatform monitoring tool built on the same principles
as SE Toolkit, written in Java.
</p>
|
|
|
|
|
|
|
|
<h4><a name="dtrace" id="dtrace">DTrace
|
|
</a></h4>
|
|
|
|
<p>Given that DTrace is available for Solaris, FreeBSD and OS
|
|
X, it might be worth exploring it. There's also
|
|
mod_dtrace available for httpd.
|
|
</p>
|
|
|
|
|
|
|
|
<h4><a name="mod_status" id="mod_status">mod_status
|
|
</a></h4>
|
|
|
|
<p>The mod_status module gives an overview of the server
performance at a given moment. It generates an HTML page
with, among other things, the number of Apache processes running
and how many bytes each has served, and the CPU load caused
by httpd and the rest of the system. The Apache Software
Foundation uses <code class="module"><a href="../mod/mod_status.html">mod_status</a></code> on its own
<a href="http://apache.org/server-status">web site</a>.
If you put the <code>ExtendedStatus On</code>
directive in your <code>httpd.conf</code>,
the <code class="module"><a href="../mod/mod_status.html">mod_status</a></code>
page will give you more information at the cost of a little
extra work per request.
</p>
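<p>mod_status can also produce a machine-readable variant of the status page
when <code>?auto</code> is appended to the status URL. The following Python
sketch parses such output into a dictionary. The helper name
<code>parse_status</code> and the sample text are assumptions for
illustration; real output contains more fields.</p>

```python
def parse_status(text):
    """Turn 'Key: value' lines from server-status?auto into a dict."""
    status = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            status[key] = value.strip()
    return status

# Hypothetical sample; real output contains more fields.
sample = """\
Total Accesses: 1024
BusyWorkers: 3
IdleWorkers: 47
Scoreboard: _W_K__________
"""
status = parse_status(sample)
busy, idle = int(status["BusyWorkers"]), int(status["IdleWorkers"])
print(f"{busy} of {busy + idle} workers busy")  # 3 of 50 workers busy
```

<p>In practice the text would be fetched from your own server, for example with
<code>urllib.request.urlopen</code> against a status URL you have enabled and
restricted to trusted clients.</p>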
|
|
|
|
|
|
|
|
|
|
<h3><a name="web-server-log-files" id="web-server-log-files">Web Server Log Files
|
|
</a></h3>
|
|
|
|
<p>Monitoring and analyzing the log files httpd writes is one of
|
|
the most effective ways to keep track of your server health and
|
|
performance. Monitoring the error log allows you to detect
|
|
error conditions, discover attacks and find performance issues.
|
|
Analyzing the access logs tells you how busy your server is,
|
|
which resources are the most popular and where your users come
|
|
from. Historical log file data can give you invaluable insight
|
|
into trends in access to your server, which allows you to
|
|
predict when your performance needs will overtake your server
|
|
capacity.
|
|
</p>
|
|
|
|
|
|
<h4><a name="ErrorLog" id="ErrorLog">Error Log
|
|
</a></h4>
|
|
|
|
<p>The error log will contain messages if the server has
|
|
reached the maximum number of active processes or the
|
|
maximum number of concurrently open files. The error log
|
|
also reflects when processes are being spawned at a
|
|
higher-than-usual rate in response to a sudden increase in
|
|
load. When the server starts, the stderr file descriptor is
|
|
redirected to the error logfile, so any error encountered
|
|
by httpd after it opens its logfiles will appear in this
|
|
log. This makes it good practice to review the error log
|
|
frequently.
|
|
</p>
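<p>Reviewing the error log for capacity problems can be partly automated. The
following Python sketch filters out lines that hint at worker exhaustion. The
message fragments it searches for are assumptions based on wording commonly
seen in httpd error logs; the exact text varies by version and MPM, and the
sample lines are invented.</p>

```python
import re

# Assumed message fragments; adjust to what your own error log contains.
CAPACITY_PATTERN = re.compile(
    r"server reached (MaxClients|MaxRequestWorkers) setting|server seems busy"
)

def capacity_warnings(lines):
    """Return the error log lines that hint at worker exhaustion."""
    return [line.rstrip("\n") for line in lines if CAPACITY_PATTERN.search(line)]

# Invented sample lines in the style of an httpd 2.2 error log.
sample = [
    "[Sat Mar 24 23:05:11 2007] [error] server reached MaxClients setting, "
    "consider raising the MaxClients setting\n",
    "[Sat Mar 24 23:05:12 2007] [error] [client 195.54.228.42] "
    "File does not exist: /var/www/favicon.ico\n",
]
print(len(capacity_warnings(sample)))  # 1
```

<p>An open file object can be passed in place of the list, since the function
only iterates over lines.</p>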
|
|
<p>Before Apache httpd opens its logfiles, any errors will be
written to the stderr stream. If you start httpd manually,
this error information will appear on your terminal and you
can use it directly to troubleshoot your server. If your
httpd is started by a startup script, the destination of
early error messages depends on the design of that script. The
<code>/var/log/messages</code>
file is usually a good bet. On Windows, early error
messages are written to the Application Event Log, which
can be viewed through the Event Viewer in Administrative
Tools.
</p>
|
|
<p>
|
|
The Error Log is configured through the <code class="directive"><a href="../mod/core.html#errorlog">ErrorLog</a></code>
|
|
and <code class="directive"><a href="../mod/core.html#loglevel">LogLevel</a></code>
|
|
configuration directives. The error log of httpd's main
|
|
server configuration receives the log messages that pertain
|
|
to the entire server: startup, shutdown, crashes, excessive
|
|
process spawns, etc. The <code class="directive"><a href="../mod/core.html#errorlog">ErrorLog</a></code>
|
|
directive can also be used in virtual host containers. The
|
|
error log of a virtual host receives only log messages
|
|
specific to that virtual host, such as authentication
|
|
failures and 'File not Found' errors.
|
|
</p>
|
|
<p>On a server that is visible to the Internet, expect to see a
lot of exploit attempts and worm attacks in the error log. Many
of these will be targeted at server platforms
other than Apache, but the current state of affairs is that
attack scripts just throw everything they have at any open
port, regardless of which server is actually running or
what applications might be installed. You could block these
attempts using a firewall or <a href="http://www.modsecurity.org/">mod_security</a>,
but this falls outside the scope of this discussion.
</p>
|
|
<p>
|
|
The <code class="directive"><a href="../mod/core.html#loglevel">LogLevel</a></code>
|
|
directive determines the level of detail included in the
|
|
logs. There are eight log levels as described here:
|
|
</p>
|
|
<table>
<tr><td><p><strong>Level</strong></p></td><td><p><strong>Description</strong></p></td></tr>
<tr><td><p>emerg</p></td><td><p>Emergencies - system is unusable.</p></td></tr>
<tr><td><p>alert</p></td><td><p>Action must be taken immediately.</p></td></tr>
<tr><td><p>crit</p></td><td><p>Critical Conditions.</p></td></tr>
<tr><td><p>error</p></td><td><p>Error conditions.</p></td></tr>
<tr><td><p>warn</p></td><td><p>Warning conditions.</p></td></tr>
<tr><td><p>notice</p></td><td><p>Normal but significant condition.</p></td></tr>
<tr><td><p>info</p></td><td><p>Informational.</p></td></tr>
<tr><td><p>debug</p></td><td><p>Debug-level messages</p></td></tr>
</table>
|
|
<p>The default log level is warn. A production server should
not be run at the debug level, but increasing the level of detail in
the error log can be useful during troubleshooting.
Starting with 2.3.8, <code class="directive"><a href="../mod/core.html#loglevel">LogLevel</a></code>
can be specified on a per-module basis:
</p>
|
|
|
|
<pre class="prettyprint lang-config">LogLevel debug mod_ssl:warn</pre>
|
|
|
|
|
|
<p>
|
|
This puts all of the server in debug mode, except for
|
|
<code class="module"><a href="../mod/mod_ssl.html">mod_ssl</a></code>, which tends to be very noisy.
|
|
</p>
|
|
|
|
|
|
|
|
<h4><a name="AccessLog" id="AccessLog">Access Log
|
|
</a></h4>
|
|
|
|
<p>Apache httpd keeps track of every request it services in its
access log file. In addition to the time and nature of a
request, httpd can log the client IP address, the result of
the request and a host of other information.
The various logging format features are documented in the
manual. This file exists by default for the main server and can be
configured per virtual host by using the <code class="directive"><a href="../mod/mod_log_config.html#transferlog">TransferLog</a></code>
or <code class="directive"><a href="../mod/mod_log_config.html#customlog">CustomLog</a></code>
configuration directive.
</p>
|
|
<p>The access logs can be analyzed with any of several free and
|
|
commercially available programs. Popular free analysis
|
|
packages include Analog and Webalizer. Log analysis should
|
|
be done offline so the web server machine is not burdened
|
|
by processing the log files. Most log analysis packages
|
|
understand the Common Log Format. The fields in the log
|
|
lines are explained in the following:
|
|
</p>
|
|
|
|
|
|
<div class="example"><pre>195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] "GET /sander/feed/ HTTP/1.1" 200 9747
64.34.165.214 - - [24/Mar/2007:23:10:11 -0400] "GET /sander/feed/atom HTTP/1.1" 200 9068
60.28.164.72 - - [24/Mar/2007:23:11:41 -0400] "GET / HTTP/1.0" 200 618
85.140.155.56 - - [24/Mar/2007:23:14:12 -0400] "GET /sander/2006/09/27/44/ HTTP/1.1" 200 14172
85.140.155.56 - - [24/Mar/2007:23:14:15 -0400] "GET /sander/2006/09/21/gore-tax-pollution/ HTTP/1.1" 200 15147
74.6.72.187 - - [24/Mar/2007:23:18:11 -0400] "GET /sander/2006/09/27/44/ HTTP/1.0" 200 14172
74.6.72.229 - - [24/Mar/2007:23:24:22 -0400] "GET /sander/2006/11/21/os-java/ HTTP/1.0" 200 13457</pre></div>
|
|
|
|
<table>
<tr><td><p><strong>Field</strong></p></td><td><p><strong>Content</strong></p></td><td><p><strong>Explanation</strong></p></td></tr>
<tr><td><p>Client IP</p></td><td><p>195.54.228.42</p></td><td><p>IP address where the request originated</p></td></tr>
<tr><td><p>RFC 1413 ident</p></td><td><p>-</p></td><td><p>Remote user identity as reported by their identd</p></td></tr>
<tr><td><p>username</p></td><td><p>-</p></td><td><p>Remote username as authenticated by Apache</p></td></tr>
<tr><td><p>timestamp</p></td><td><p>[24/Mar/2007:23:05:11 -0400]</p></td><td><p>Date and time of request</p></td></tr>
<tr><td><p>Request</p></td><td><p>"GET /sander/feed/ HTTP/1.1"</p></td><td><p>Request line</p></td></tr>
<tr><td><p>Status Code</p></td><td><p>200</p></td><td><p>Response code</p></td></tr>
<tr><td><p>Content Bytes</p></td><td><p>9747</p></td><td><p>Bytes transferred w/o headers</p></td></tr>
</table>
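<p>The fields in the table map directly onto a regular expression. The following
Python sketch parses Common Log Format lines and tallies status codes and bytes
transferred. The pattern and the helper name <code>parse_line</code> are
illustrative and not part of httpd; logs written with a custom
<code>LogFormat</code> need a different pattern.</p>

```python
import re
from collections import Counter

# Common Log Format: host ident user [timestamp] "request" status bytes
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = CLF.match(line)
    return match.groupdict() if match else None

lines = [
    '195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] "GET /sander/feed/ HTTP/1.1" 200 9747',
    '60.28.164.72 - - [24/Mar/2007:23:11:41 -0400] "GET / HTTP/1.0" 200 618',
]
entries = [parse_line(line) for line in lines]
by_status = Counter(entry["status"] for entry in entries if entry)
# A "-" in the bytes field means no content was sent; skip it when summing.
total_bytes = sum(int(e["bytes"]) for e in entries if e and e["bytes"] != "-")
print(by_status["200"], total_bytes)  # 2 10365
```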
|
|
|
|
|
|
<h4><a name="rotating-log-files" id="rotating-log-files">Rotating Log Files
|
|
</a></h4>
|
|
|
|
<p>There are several reasons to rotate logfiles. Even though
|
|
almost no operating systems out there have a hard file size
|
|
limit of two Gigabytes anymore, log files simply become too
|
|
large to handle over time. Additionally, any periodic log
|
|
file analysis should not be performed on files to which the
|
|
server is actively writing. Periodic logfile rotation helps
|
|
keep the analysis job manageable, and allows you to keep a
|
|
closer eye on usage trends.
|
|
</p>
|
|
<p>On unix systems, you can simply rotate logfiles by giving
|
|
the old file a new name using mv. The server will keep
|
|
writing to the open file even though it has a new name.
|
|
When you send a graceful restart signal to the server, it
|
|
will open a new logfile with the configured name. For
|
|
example, you could run a script from cron like this:
|
|
</p>
|
|
|
|
|
|
<div class="example"><p><code>
APACHE=/usr/local/apache2<br />
HTTPD=$APACHE/bin/httpd<br />
mv $APACHE/logs/access_log
$APACHE/logarchive/access_log-`date +%F`<br />
$HTTPD -k graceful
</code></p></div>
|
|
|
|
<p>This approach also works on Windows, just not as smoothly.
While the httpd process on your Windows server will keep
writing to the log file after it has been renamed, the
Windows Service that runs Apache cannot do a graceful
restart. Restarting a Service on Windows means stopping it
and then starting it again. The advantage of a graceful
restart is that the httpd child processes get to finish
responding to their current requests before they exit.
Meanwhile, the httpd server becomes immediately available
again to serve new requests. The stop-start that the
Windows Service has to perform will interrupt any requests
currently in progress, and the server is unavailable until
it is started again. Plan for this when you decide the
timing of your restarts.
</p>
|
|
<p>
|
|
A second approach is to use piped logs. From the
|
|
<code class="directive"><a href="../mod/mod_log_config.html#customlog">CustomLog</a></code>,
|
|
<code class="directive"><a href="../mod/mod_log_config.html#transferlog">TransferLog</a></code>
|
|
or <code class="directive"><a href="../mod/core.html#errorlog">ErrorLog
|
|
</a></code>
|
|
directives you can send the log data into any program using
|
|
a pipe character (<code>|</code>). For instance:
|
|
</p>
|
|
|
|
<div class="example"><p><code>
CustomLog "|/usr/local/apache2/bin/rotatelogs /var/log/access_log 86400" common
</code></p></div>
|
|
|
|
<p>The program on the other end of the pipe will receive the
Apache log data on its stdin stream, and can do with this
data whatever it wants. The rotatelogs program that comes
with Apache seamlessly turns over the log file based on
time elapsed or the amount of data written, and leaves the
old log files with a timestamp suffix added to their names. This
method for rotating logfiles works well on unix platforms,
but is currently broken on Windows.
</p>
|
|
|
|
|
|
|
|
<h4><a name="logging-and-performance" id="logging-and-performance">Logging and Performance
|
|
</a></h4>
|
|
|
|
<p>Writing entries to the Apache log files obviously takes some
|
|
effort, but the information gathered from the logs is so
|
|
valuable that under normal circumstances logging should not
|
|
be turned off. For optimal performance, you should put your
|
|
disk-based site content on a different physical disk than
|
|
the server log files: the access patterns are very
|
|
different. Retrieving content from disk is a read operation
|
|
in a fairly random pattern, and log files are written to
|
|
disk sequentially.
|
|
</p>
|
|
<p>
|
|
Do not run a production server with your error <code class="directive"><a href="../mod/core.html#loglevel">LogLevel</a></code>
|
|
set to debug. This log level causes a vast amount of
|
|
information to be written to the error log, including, in
|
|
the case of SSL access, complete dumps of BIO read and
|
|
write operations. The performance implications are
|
|
significant: use the default warn level instead.
|
|
</p>
|
|
<p>If your server has more than one virtual host, you may give
each virtual host a separate access logfile. This makes it
easier to analyze the logfile later. However, if your
server has many virtual hosts, all the open logfiles put a
resource burden on your system, and it may be preferable to
log to a single file. Use the <code>%v</code>
format character at the start of your <code class="directive"><a href="../mod/mod_log_config.html#logformat">LogFormat</a></code>
and, starting with 2.3.8, of your <code class="directive"><a href="../mod/core.html#errorlog">ErrorLog</a></code>
format to make httpd print the hostname of the virtual host that
received the request or the error at the beginning of each
log line. A simple Perl script can split out the log file
after it rotates: one is included with the Apache source
under <code>support/split-logfile</code>.
</p>
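<p>The idea of splitting a combined log by its leading host name can be
sketched in a few lines of Python. This is an illustration of the principle,
not a reimplementation of <code>support/split-logfile</code>; the function name
<code>split_by_vhost</code> and the sample lines are made up, and the sketch
assumes every line starts with the <code>%v</code> field followed by a
space.</p>

```python
from collections import defaultdict

def split_by_vhost(lines):
    """Group log lines by their first field, assumed to be the %v host name."""
    per_host = defaultdict(list)
    for line in lines:
        host, _, rest = line.partition(" ")
        per_host[host.lower()].append(rest)
    return dict(per_host)

# Hypothetical lines from a combined log whose format starts with %v.
combined = [
    'www.example.com 195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] "GET / HTTP/1.1" 200 618',
    'blog.example.com 64.34.165.214 - - [24/Mar/2007:23:10:11 -0400] "GET /feed/ HTTP/1.1" 200 9068',
    'www.example.com 60.28.164.72 - - [24/Mar/2007:23:11:41 -0400] "GET / HTTP/1.0" 200 618',
]
groups = split_by_vhost(combined)
print({host: len(entries) for host, entries in groups.items()})
# {'www.example.com': 2, 'blog.example.com': 1}
```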
|
|
<p>
|
|
You can use the <code class="directive"><a href="../mod/mod_log_config.html#bufferedlogs">BufferedLogs</a></code>
|
|
directive to have Apache collect several log lines in
|
|
memory before writing them to disk. This might yield better
|
|
performance, but could affect the order in which the
|
|
server's log is written.
|
|
</p>
|
|
|
|
|
|
|
|
|
|
<h3><a name="generating-a-test-load" id="generating-a-test-load">Generating A Test Load
|
|
</a></h3>
|
|
|
|
<p>It is useful to generate a test load to monitor system
performance under realistic operating circumstances. Besides
commercial packages such as <a href="http://learnloadrunner.com/">LoadRunner</a>,
there are a number of freely available tools to generate a
test load against your web server.
</p>
|
|
<ul>
<li>Apache ships with a test program called <code>ab</code>, short for
Apache Bench. It can generate a web server load by
repeatedly asking for the same file in rapid succession.
You can specify a number of concurrent connections and have
the program run for either a given amount of time or a
specified number of requests.
</li>
<li>Another freely available load generator is http_load.
This program works with a URL file and can be compiled with
SSL support.
</li>
<li>The Apache Software Foundation offers a tool named flood.
Flood is a fairly sophisticated program that is
configured through an XML file.
</li>
<li>Finally, JMeter, a Jakarta subproject, is an all-Java
load-testing tool. While early versions of this application
were slow and difficult to use, the current version 2.1.1
seems to be versatile and useful.
</li>
<li>
<p>Projects external to the ASF that have proven to be quite
good: grinder, httperf, tsung, <a href="http://funkload.nuxeo.org/">FunkLoad</a>.
</p>
</li>
</ul>
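<p>To make the principle behind these tools concrete, the following Python
sketch requests the same URL repeatedly with a fixed number of concurrent
workers and collects response times. It is a toy illustration, not a
replacement for any of the tools listed; the names <code>fetch</code> and
<code>run_load</code> are made up for this example.</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen

def fetch(url):
    """Fetch one URL and return the elapsed time in seconds."""
    start = time.perf_counter()
    with urlopen(url) as response:
        response.read()
    return time.perf_counter() - start

def run_load(url, requests=100, concurrency=10, fetch_one=fetch):
    """Issue `requests` fetches using `concurrency` threads; return timings."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch_one, [url] * requests))
```

<p>A call such as <code>run_load("http://localhost/index.html", requests=50,
concurrency=5)</code>, where the URL is a placeholder for a test server of your
own, returns a list of fifty timings that can be averaged or sorted to find the
slowest responses.</p>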
|
|
<p>When you load-test your web server, please keep in mind that if
|
|
that server is in production, the test load may negatively
|
|
affect the server's response. Also, any data traffic you
|
|
generate may be charged against your monthly traffic allowance.
|
|
</p>
|
|
|
|
|
|
|
|
</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
|
|
<div class="section">
|
|
<h2><a name="configuring-for-performance" id="configuring-for-performance">Configuring for Performance
|
|
</a> <a title="Permanent link" href="#configuring-for-performance" class="permalink">¶</a></h2>
|
|
|
|
|
|
|
|
<h3><a name="apache-configuration" id="apache-configuration">Httpd Configuration
|
|
</a></h3>
|
|
|
|
<p>The Apache 2.2 httpd is by default a pre-forking web server.
When the server starts, the parent process spawns a number of
child processes that do the actual work of servicing requests.
Apache httpd 2.0, however, introduced the concept of the
Multi-Processing Module (MPM). Developers can write MPMs to
suit the process or threading architecture of their specific
operating system. Apache 2 comes with special MPMs for Windows,
OS/2, Netware and BeOS. On unix-like platforms, the two most
popular MPMs are Prefork and Worker. The Prefork MPM offers the
same pre-forking process model that Apache 1.3 uses. The Worker
MPM runs a smaller number of child processes, and spawns
multiple request handling threads within each child process. In
2.4, MPMs are no longer hard-wired: like other modules, they can be loaded
via <code class="directive"><a href="../mod/mod_so.html#loadmodule">LoadModule</a></code>.
The default MPM in 2.4 is the event MPM.
</p>
|
|
<p>The maximum number of workers, be they pre-forked child
|
|
processes or threads within a process, is an indication of how
|
|
many requests your server can manage concurrently. It is merely
|
|
a rough estimate because the kernel can queue connection
|
|
attempts for your web server. When your site becomes busy and
|
|
the maximum number of workers is running, the machine
|
|
doesn't hit a hard limit beyond which clients will be
|
|
denied access. However, once requests start backing up, system
|
|
performance is likely to degrade.
|
|
</p>
|
|
<p>Finally, if the httpd server in question is not executing any third-party
|
|
code, via <code>mod_php</code>, <code>mod_perl</code> or similar,
|
|
we recommend the use of <code class="module">mpm_event</code>. This MPM is ideal
|
|
for situations where httpd serves as a thin layer between clients and
|
|
backend servers doing the real job, such as a proxy or cache.
|
|
</p>
|
|
|
|
|
|
<h4><a name="MaxClients" id="MaxClients">MaxClients
|
|
</a></h4>
|
|
|
|
<p>
|
|
The <code>MaxClients</code>
directive in your Apache httpd configuration file specifies
the maximum number of workers your server can create. It
has two related directives, <code>MinSpareServers</code>
and <code>MaxSpareServers</code>,
which specify the number of workers Apache keeps waiting
in the wings ready to serve requests. The absolute maximum
number of processes is configurable through the
<code>ServerLimit</code> directive.
|
|
</p>
|
|
|
|
|
|
|
|
<h4><a name="spinning-threads" id="spinning-threads">Spinning Threads
|
|
</a></h4>
|
|
|
|
<p>For the prefork MPM, the above directives are all there is
to determining the process limit. However, if you are
running a threaded MPM the situation is a little more
complicated. Threaded MPMs support the
<code>ThreadsPerChild</code> directive. Apache requires that
<code>MaxClients</code> is evenly divisible by
<code>ThreadsPerChild</code>. If you set either directive to a
number that doesn't meet this requirement, Apache will send a
message of complaint to the error log and adjust the
<code>ThreadsPerChild</code> value downwards until it is an
even factor of <code>MaxClients</code>.
|
|
</p>
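<p>The adjustment described above can be sketched in a few lines of
Python. This is only an illustration of the rule as stated in this
section, not httpd's own code; the function name
<code>adjust_threads_per_child</code> is an assumption made for the
example, and the exact behavior may differ between httpd versions.</p>

```python
def adjust_threads_per_child(max_clients, threads_per_child):
    """Lower threads_per_child until it divides max_clients evenly."""
    if max_clients < 1 or threads_per_child < 1:
        raise ValueError("both values must be at least 1")
    while max_clients % threads_per_child != 0:
        # Step downwards; the loop always ends because 1 divides everything.
        threads_per_child -= 1
    return threads_per_child


# 25 already divides 150, so it is kept as-is.
print(adjust_threads_per_child(150, 25))
# 64 does not divide 150, so the value is lowered to the next factor.
print(adjust_threads_per_child(150, 64))
```

<p>Choosing a <code>ThreadsPerChild</code> value that already divides
<code>MaxClients</code> avoids the warning in the error log
altogether.</p>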
|
|
|
|
|
|
|
|
<h4><a name="sizing-maxClients" id="sizing-maxClients">Sizing MaxClients
|
|
</a></h4>
|
|
|
|
<p>Optimally, the maximum number of processes should be set so
|
|
that all the memory on your system is used, but no more. If
|
|
your system gets so overloaded that it needs to heavily
|
|
swap core memory out to disk, performance will degrade
|
|
quickly. The formula for determining <code class="directive"><a href="../mod/mpm_common.html#maxrequestworkers">MaxClients</a></code>
|
|
is fairly simple:
|
|
</p>
|
|
|
|
<div class="example"><p><code>
|
|
MaxClients = (total RAM - RAM for OS - RAM for external programs) / RAM per httpd process
|
|
</code></p></div>
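<p>As a worked illustration of the formula, the following Python
sketch plugs in made-up figures. The function name
<code>estimate_max_clients</code> and every number in the example are
assumptions chosen for illustration, not measurements or
recommendations.</p>

```python
def estimate_max_clients(total_ram_mb, os_ram_mb, external_ram_mb, per_process_mb):
    """Apply the sizing formula: usable RAM divided by RAM per httpd process."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    usable_mb = total_ram_mb - os_ram_mb - external_ram_mb
    # Round down: a fractional worker cannot be started, and rounding up
    # would push the machine towards swapping.
    return max(0, int(usable_mb // per_process_mb))


# Hypothetical machine: 8192 MB RAM, 1024 MB for the OS, 1024 MB for
# external programs, and 25 MB of private memory per httpd process.
print(estimate_max_clients(8192, 1024, 1024, 25))
```

<p>Treat the result as an upper bound: the per-process figure should
come from observing the server under a realistic test load.</p>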
|
|
|
|
<p>The various amounts of memory allocated for the OS, external
|
|
programs and the httpd processes are best determined by
observation: use the <code>top</code> and <code>free</code> commands described above
|
|
to determine the memory footprint of the OS without the web
|
|
server running. You can also determine the footprint of a
|
|
typical web server process from top: most top
|
|
implementations have a Resident Size (RSS) column and a
|
|
Shared Memory column.
|
|
</p>
|
|
<p>The difference between these two is the amount of memory
|
|
per-process. The shared segment really exists only once and
|
|
is used for the code and libraries loaded and the dynamic
|
|
inter-process tally, or 'scoreboard,' that Apache
|
|
keeps. How much memory each process takes for itself
|
|
depends heavily on the number and kind of modules you use.
|
|
The best approach to use in determining this need is to
|
|
generate a typical test load against your web site and see
|
|
how large the httpd processes become.
|
|
</p>
|
|
<p>The RAM for external programs parameter is intended mostly
|
|
for CGI programs and scripts that run outside the web
|
|
server process. However, if you have a Java virtual machine
|
|
running Tomcat on the same box it will need a significant
|
|
amount of memory as well. The above assessment should give
|
|
you an idea how far you can push <code>MaxClients</code>,
but it is not an exact science. When in doubt, be
conservative and use a low <code>MaxClients</code>
value. The Linux kernel will put extra memory to good use
for caching disk access. On Solaris you need enough
available real RAM to create any process. If no real
memory is available, httpd will start writing 'No space
left on device' messages to the error log and be unable
to fork additional child processes, so a higher
<code>MaxClients</code> value may actually be a disadvantage.
|
|
</p>
|
|
|
|
|
|
|
|
<h4><a name="selecting-your-mpm" id="selecting-your-mpm">Selecting your MPM
|
|
</a></h4>
|
|
|
|
<p>The prime reason for selecting a threaded MPM is that
|
|
threads consume fewer system resources than processes, and
|
|
it takes less effort for the system to switch between
|
|
threads. This is more true for some operating systems than
|
|
for others. On systems like Solaris and AIX, manipulating
|
|
processes is relatively expensive in terms of system
|
|
resources. On these systems, running a threaded MPM makes
|
|
sense. On Linux, the threading implementation actually uses
|
|
one process for each thread. Linux processes are relatively
|
|
lightweight, but it means that a threaded MPM offers less
|
|
of a performance advantage than in other environments.
|
|
</p>
|
|
<p>Running a threaded MPM can cause stability problems in some
|
|
situations. For instance, should a child process of a
|
|
preforked MPM crash, at most one client connection is
|
|
affected. However, if a threaded child crashes, all the
|
|
threads in that process disappear, which means all the
|
|
clients currently being served by that process will see
|
|
their connection aborted. Additionally, there may be
|
|
so-called "thread-safety" issues, especially with
|
|
third-party libraries. In threaded applications, threads
|
|
may access the same variables indiscriminately, not knowing
|
|
whether a variable may have been changed by another thread.
|
|
</p>
|
|
<p>This has been a sore point within the PHP community. The PHP
|
|
processor heavily relies on third-party libraries and
|
|
cannot guarantee that all of these are thread-safe. The
|
|
good news is that if you are running Apache on Linux, you
|
|
can run PHP in the preforked MPM without fear of losing too
|
|
much performance relative to the threaded option.
|
|
</p>
|
|
|
|
|
|
|
|
<h4><a name="spinning-locks" id="spinning-locks">Spinning Locks
|
|
</a></h4>
|
|
|
|
<p>Apache httpd maintains an inter-process lock around its
|
|
network listener. For all practical purposes, this means
|
|
that only one httpd child process can receive a request at
|
|
any given time. The other processes are either servicing
|
|
requests already received or are 'camping out' on
|
|
the lock, waiting for the network listener to become
|
|
available. This process is best visualized as a revolving
|
|
door, with only one process allowed in the door at any
|
|
time. On a heavily loaded web server with requests arriving
|
|
constantly, the door spins quickly and requests are
|
|
accepted at a steady rate. On a lightly loaded web server,
|
|
the process that currently "holds" the lock may
|
|
have to stay in the door for a while, during which all the
|
|
other processes sit idle, waiting to acquire the lock. At
|
|
this time, the parent process may decide to terminate some
|
|
children based on its <code>MaxSpareServers</code> directive.
|
|
</p>
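<p>The revolving-door behavior can be imitated with a small Python
sketch. It is an analogy only: it uses threads and a
<code>threading.Lock</code> where httpd uses child processes and an
inter-process lock, and the names <code>worker</code> and
<code>run_demo</code> are hypothetical.</p>

```python
import socket
import threading


def worker(listener, accept_mutex, served, quota):
    """Accept `quota` connections, holding the mutex only while in accept()."""
    for _ in range(quota):
        with accept_mutex:                  # one worker at a time in the "door"
            conn, _peer = listener.accept()
        with conn:                          # the request is served outside the lock
            conn.sendall(b"ok")
        served.append(threading.current_thread().name)


def run_demo(workers=3, quota=2):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5.0)                # avoid hanging if something goes wrong
    listener.bind(("127.0.0.1", 0))         # port 0 lets the OS pick a free port
    listener.listen(16)
    port = listener.getsockname()[1]

    accept_mutex = threading.Lock()
    served = []
    threads = [
        threading.Thread(target=worker, args=(listener, accept_mutex, served, quota))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()

    # Act as the clients: one short connection per expected accept().
    replies = []
    for _ in range(workers * quota):
        with socket.create_connection(("127.0.0.1", port), timeout=5.0) as client:
            replies.append(client.recv(16))

    for t in threads:
        t.join()
    listener.close()
    return served, replies


if __name__ == "__main__":
    served, replies = run_demo()
    print(len(served), "connections served")
```

<p>Note that the lock is released as soon as a connection has been
accepted, so serving one request never blocks another worker from
accepting the next.</p>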
|
|
|
|
|
|
|
|
<h4><a name="the-thundering-herd" id="the-thundering-herd">The Thundering Herd
|
|
</a></h4>
|
|
|
|
<p>The function of the 'accept mutex' (as this
|
|
inter-process lock is called) is to keep request reception
|
|
moving along in an orderly fashion. If the lock is absent,
|
|
the server may exhibit the Thundering Herd syndrome.
|
|
</p>
|
|
<p>Consider an American Football team poised on the line of
|
|
scrimmage. If the football players were Apache processes,
|
|
all team members would go for the ball simultaneously at
|
|
the snap. One process would get it, and all the others
|
|
would have to lumber back to the line for the next snap. In
|
|
this metaphor, the accept mutex acts as the quarterback,
|
|
delivering the connection "ball" to the
|
|
appropriate player process.
|
|
</p>
|
|
<p>Moving this much information around is obviously a lot of
|
|
work, and, like a smart person, a smart web server tries to
|
|
avoid it whenever possible. Hence the revolving door
|
|
construction. In recent years, many operating systems,
|
|
including Linux and Solaris, have put code in place to
|
|
prevent the Thundering Herd syndrome. Apache recognizes
|
|
this and if you run with just one network listener, meaning
|
|
one virtual host or just the main server, Apache will
|
|
refrain from using an accept mutex. If you run with
|
|
multiple listeners (for instance because you have a virtual
|
|
host serving SSL requests), it will activate the accept
|
|
mutex to avoid internal conflicts.
|
|
</p>
|
|
<p>
|
|
You can manipulate the accept mutex with the
<code>AcceptMutex</code>
directive. Besides turning the accept mutex off, you can
|
|
select the locking mechanism. Common locking mechanisms
|
|
include fcntl, System V Semaphores and pthread locking. Not
|
|
all are available on every platform, and their availability
|
|
also depends on compile-time settings. The various locking
|
|
mechanisms may place specific demands on system resources:
|
|
manipulate them with care.
|
|
</p>
|
|
<p>There is no compelling reason to disable the accept mutex.
|
|
Apache automatically recognizes the single listener
|
|
situation described above and knows if it is safe to run
|
|
without a mutex on your platform.
|
|
</p>
|
|
|
|
|
|
|
|
|
|
<h3><a name="tuning-the-operating-system" id="tuning-the-operating-system">Tuning the Operating System
|
|
</a></h3>
|
|
|
|
<p>People often look for the 'magic tune-up' that will
|
|
make their system perform four times as fast by tweaking just
|
|
one little setting. The truth is, present-day UNIX derivatives
|
|
are pretty well adjusted straight out of the box and there is
|
|
not a lot that needs to be done to make them perform optimally.
|
|
However, there are a few things that an administrator can do to
|
|
improve performance.
|
|
</p>
|
|
|
|
|
|
<h4><a name="ram-and-swap-space" id="ram-and-swap-space">RAM and Swap Space
|
|
</a></h4>
|
|
|
|
<p>The usual mantra regarding RAM is "more is
|
|
better". As discussed above, unused RAM is put to good
|
|
use as file system cache. The Apache processes get bigger
|
|
if you load more modules, especially if you use modules
|
|
that generate dynamic page content within the processes,
|
|
like PHP and mod_perl. A large configuration file (with many
virtual hosts) also tends to inflate the process footprint.
|
|
Having ample RAM allows you to run Apache with more child
|
|
processes, which allows the server to process more
|
|
concurrent requests.
|
|
</p>
|
|
<p>While the various platforms treat their virtual memory in
|
|
different ways, it is never a good idea to run with less
|
|
disk-based swap space than RAM. The virtual memory system
|
|
is designed to provide a fallback for RAM, but when you
|
|
don't have disk space available and run out of
|
|
swappable memory, your machine grinds to a halt. This can
|
|
crash your box, requiring a physical reboot for which your
|
|
hosting facility may charge you.
|
|
</p>
|
|
<p>Also, such an outage naturally occurs when you least want
|
|
it: when the world has found your website and is beating a
|
|
path to your door. If you have enough disk-based swap space
|
|
available and the machine gets overloaded, it may get very,
|
|
very slow as the system needs to swap memory pages to disk
|
|
and back, but when the load decreases the system should
|
|
recover. Remember, you still have <code>MaxClients</code>
to keep things in hand.
|
|
</p>
|
|
<p>Most unix-like operating systems use designated disk
|
|
partitions for swap space. When a system starts up it finds
|
|
all swap partitions on the disk(s), by partition type or
|
|
because they are listed in the file <code>/etc/fstab</code>,
and automatically enables them. When adding a disk or
|
|
installing the operating system, be sure to allocate enough
|
|
swap space to accommodate eventual RAM upgrades.
|
|
Reassigning disk space on a running system is a cumbersome
|
|
process.
|
|
</p>
|
|
<p>Plan for available hard drive swap space of at least twice
|
|
your amount of RAM, perhaps up to four times in situations
|
|
with frequent peaking loads. Remember to adjust this
|
|
configuration whenever you upgrade RAM on your system. In a
|
|
pinch, you can use a regular file as swap space. For
|
|
instructions on how to do this, see the manual pages for
|
|
the <code>mkswap</code> and <code>swapon</code>
or <code>swap</code> programs.
|
|
</p>
|
|
|
|
|
|
|
|
<h4><a name="ulimit-files-and-processes" id="ulimit-files-and-processes">ulimit: Files and Processes
|
|
</a></h4>
|
|
|
|
<p>Given a machine with plenty of RAM and processor capacity,
|
|
you can run hundreds of Apache processes if necessary. . .
|
|
and if your kernel allows it.
|
|
</p>
|
|
<p>Consider a situation in which several hundred web servers
|
|
are running; if some of these need to spawn CGI processes,
|
|
the maximum number of processes would be reached quickly.
|
|
</p>
|
|
<p>However, you can change this limit with the command
|
|
</p>
|
|
|
|
<div class="example"><p><code>
|
|
ulimit [-H|-S] -u [newvalue]
|
|
</code></p></div>
|
|
|
|
<p>This must be changed before starting the server, since the
|
|
new value will only be available to the current shell and
|
|
programs started from it. In newer Linux kernels the
|
|
default has been raised to 2048. On FreeBSD, the number
|
|
seems to be the rather unusual 513. In the default user
|
|
shell on this system, <code>csh</code>, the equivalent is
<code>limit</code>, which works analogously to the Bourne-like
<code>ulimit</code>:
|
|
</p>
|
|
|
|
<div class="example"><p><code>
|
|
limit [-h] maxproc [newvalue]
|
|
</code></p></div>
|
|
|
|
<p>Similarly, the kernel may limit the number of open files per
|
|
process. This is generally not a problem for pre-forked
|
|
servers, which just handle one request at a time per
|
|
process. Threaded servers, however, serve many requests per
|
|
process and much more easily run out of available file
|
|
descriptors. You can increase the maximum number of open
|
|
files per process by running the following command:
</p>
<div class="example"><p><code>ulimit -n [newvalue]
</code></p></div>
<p>Once again, this must be done prior to starting
|
|
Apache.
|
|
</p>
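<p>To see which limits a process actually inherited, the values can be
read back programmatically. The following is a minimal sketch using
Python's standard <code>resource</code> module, which is available on
Unix-like systems only; the helper name <code>describe_limit</code> is
an assumption made for this example.</p>

```python
import resource


def describe_limit(name, limit_id):
    """Return the soft and hard value of one resource limit as text."""
    soft, hard = resource.getrlimit(limit_id)

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


# Counterpart of "ulimit -n": open files per process.
print(describe_limit("open files", resource.RLIMIT_NOFILE))

# Counterpart of "ulimit -u"; this constant is not defined on every platform.
if hasattr(resource, "RLIMIT_NPROC"):
    print(describe_limit("processes", resource.RLIMIT_NPROC))
```

<p>The numbers reported are those of the process running the script, so
run it from the same shell or startup environment that launches the web
server.</p>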
|
|
|
|
|
|
|
|
<h4><a name="setting-user-limits-on-system-startup" id="setting-user-limits-on-system-startup">Setting User Limits on System Startup
|
|
</a></h4>
|
|
|
|
<p>Under Linux, you can set the ulimit parameters on bootup by
|
|
editing the <code>/etc/security/limits.conf</code>
file. This file allows you to set soft and hard limits on a
per-user or per-group basis; the file contains commentary
explaining the options. To enable this, make sure that the
file <code>/etc/pam.d/login</code> contains the line
|
|
</p>
|
|
|
|
<div class="example"><p><code>session required /lib/security/pam_limits.so
|
|
</code></p></div>
|
|
|
|
<p>All items can have a 'soft' and a 'hard'
|
|
limit: the first is the default setting and the second the
|
|
maximum value for that item.
|
|
</p>
|
|
<p>
|
|
In FreeBSD's <code>/etc/login.conf</code>
these resources can be limited or extended system wide,
|
|
analogously to <code>limits.conf</code>.
|
|
'Soft' limits can be specified with <code>-cur</code>
|
|
and 'hard' limits with <code>-max</code>.
|
|
</p>
|
|
<p>Solaris has a similar mechanism for manipulating limit
|
|
values at boot time: In <code>/etc/system</code>
|
|
you can set kernel tunables valid for the entire system at
|
|
boot time. These are the same tunables that can be set with
|
|
the <code>mdb</code>
|
|
kernel debugger during run time. The hard and soft limits
corresponding to <code>ulimit -n</code> can be set via:
|
|
</p>
|
|
|
|
<div class="example"><p><code>
|
|
set rlim_fd_max=65536<br />
|
|
set rlim_fd_cur=2048
|
|
</code></p></div>
|
|
|
|
<p>Solaris calculates the maximum number of allowed processes
|
|
per user (<code>maxuprc</code>) based on the total amount of
available memory on the system (<code>maxusers</code>).
You can review the numbers with the following command, but
changing them is not recommended:
</p>
<div class="example"><p><code>sysdef -i | grep maximum
</code></p></div>
|
|
|
|
|
|
|
|
<h4><a name="turn-off-unused-services-and-modules" id="turn-off-unused-services-and-modules">Turn Off Unused Services and Modules
|
|
</a></h4>
|
|
|
|
<p>Many UNIX and Linux distributions come with a slew of
|
|
services turned on by default. You probably need few of
|
|
them. For example, your web server does not need to be
|
|
running sendmail, nor is it likely to be an NFS server,
|
|
etc. Turn them off.
|
|
</p>
|
|
<p>On Red Hat Linux, the chkconfig tool will help you do this
|
|
from the command line. On Solaris systems <code>svcs</code>
|
|
and <code>svcadm</code>
|
|
will show which services are enabled and disable them
|
|
respectively.
|
|
</p>
|
|
<p>In a similar fashion, cast a critical eye on the Apache
|
|
modules you load. Most binary distributions of Apache
|
|
httpd, and pre-installed versions that come with Linux
|
|
distributions, have their modules enabled through the
|
|
<code class="directive">LoadModule</code> directive.
|
|
</p>
|
|
<p>Unused modules may be culled: if you don't rely on
|
|
their functionality and configuration directives, you can
|
|
turn them off by commenting out the corresponding
|
|
<code class="directive">LoadModule</code>
|
|
lines. Read the documentation on each module's
|
|
functionality before deciding whether to keep it enabled.
|
|
While the performance overhead of an unused module is
|
|
small, it's also unnecessary.
|
|
</p>
|
|
|
|
|
|
|
|
|
|
</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
|
|
<div class="section">
|
|
<h2><a name="caching-content" id="caching-content">Caching Content
|
|
</a> <a title="Permanent link" href="#caching-content" class="permalink">¶</a></h2>
|
|
|
|
<p>Requests for dynamically generated content usually take
|
|
significantly more resources than requests for static content.
|
|
Static content consists of simple files (pages, images, etc.) on disk
|
|
that are very efficiently served. Many operating systems also
|
|
automatically cache the contents of frequently accessed files in
|
|
memory.
|
|
</p>
|
|
<p>Processing dynamic requests, on the contrary, can be much more
|
|
involved. Running CGI scripts, handing off requests to an external
|
|
application server and accessing database content can introduce
|
|
significant latency and processing load to a busy web server. Under
|
|
many circumstances, performance can be improved by turning popular
|
|
dynamic requests into static requests. In this section, two
|
|
approaches to this will be discussed.
|
|
</p>
|
|
|
|
|
|
<h3><a name="making-popular-pages-static" id="making-popular-pages-static">Making Popular Pages Static
|
|
</a></h3>
|
|
|
|
<p>By pre-rendering the response pages for the most popular queries
|
|
in your application, you can gain a significant performance
|
|
improvement without giving up the flexibility of dynamically
|
|
generated content. For instance, if your application is a
|
|
flower delivery service, you would probably want to pre-render
|
|
your catalog pages for red roses during the weeks leading up to
|
|
Valentine's Day. When the user searches for red roses,
|
|
they are served the pre-rendered page. Queries for, say, yellow
|
|
roses will be generated directly from the database. The
|
|
mod_rewrite module included with Apache is a great tool to
|
|
implement these substitutions.
|
|
</p>
|
|
|
|
|
|
<h4><a name="example-a-statically-rendered-blog" id="example-a-statically-rendered-blog">Example: A Statically Rendered Blog
|
|
</a></h4>
|
|
|
|
|
|
|
|
<p>Blosxom is a lightweight web log package that runs as a CGI.
|
|
It is written in Perl and uses plain text files for entry
|
|
input. Besides running as CGI, Blosxom can be run from the
|
|
command line to pre-render blog pages. Pre-rendering pages
|
|
to static HTML can yield a significant performance boost in
|
|
the event that large numbers of people actually start
|
|
reading your blog.
|
|
</p>
|
|
<p>To run blosxom for static page generation, edit the CGI
|
|
script according to the documentation. Set the <code>$static_dir</code>
|
|
variable to the <code class="directive">DocumentRoot</code>
|
|
of the web server, and run the script from the command line
|
|
as follows:
|
|
</p>
|
|
|
|
<div class="example"><p><code>$ perl blosxom.cgi -password='whateveryourpassword'
|
|
</code></p></div>
|
|
|
|
<p>This can be run periodically from Cron, after you upload
|
|
content, etc. To make Apache substitute the statically
|
|
rendered pages for the dynamic content, we'll use
|
|
mod_rewrite. This module is included with the Apache source
|
|
code, but is not compiled by default. It can be built with
|
|
the server by passing the option <code>--enable-rewrite[=shared]</code>
|
|
to the configure command. Many binary distributions of
|
|
Apache come with <code class="module"><a href="../mod/mod_rewrite.html">mod_rewrite</a></code> included. The following is an
|
|
example of an Apache virtual host that takes advantage of
|
|
pre-rendered blog pages:
|
|
</p>
|
|
|
|
<pre class="prettyprint lang-config">Listen *:8001
&lt;VirtualHost *:8001&gt;
    ServerName blog.sandla.org:8001
    ServerAdmin sander@temme.net
    DocumentRoot "/home/sctemme/inst/blog/httpd/htdocs"
    &lt;Directory "/home/sctemme/inst/blog/httpd/htdocs"&gt;
        Options +Indexes
        Require all granted
        RewriteEngine on
        # Hand the request to Blosxom only if no static file or directory exists
        RewriteCond "%{REQUEST_FILENAME}" "!-f"
        RewriteCond "%{REQUEST_FILENAME}" "!-d"
        RewriteRule "^(.*)$" "/cgi-bin/blosxom.cgi/$1" [L,QSA]
    &lt;/Directory&gt;
    ErrorLog "/home/sctemme/inst/blog/httpd/logs/error_log"
    # In httpd 2.4 and later, rewrite logging is configured through LogLevel;
    # the RewriteLog and RewriteLogLevel directives are no longer available
    LogLevel debug rewrite:trace8
    CustomLog "/home/sctemme/inst/blog/httpd/logs/access_log" common
    ScriptAlias "/cgi-bin/" "/home/sctemme/inst/blog/bin/"
    &lt;Directory "/home/sctemme/inst/blog/bin"&gt;
        Options +ExecCGI
        Require all granted
    &lt;/Directory&gt;
&lt;/VirtualHost&gt;</pre>
|
|
|
|
|
|
<p>
|
|
The <code class="directive">RewriteCond</code>
|
|
and <code class="directive">RewriteRule</code>
|
|
directives say that, if the requested resource does not
|
|
exist as a file or a directory, its path is passed to the
|
|
Blosxom CGI for rendering. Blosxom uses Path Info to
|
|
specify blog entries and index pages, so this means that if
|
|
a particular path under Blosxom exists as a static file in
|
|
the file system, the file is served instead. Any request
|
|
that isn't pre-rendered is served by the CGI. This
|
|
means that individual entries, which show the comments, are
|
|
always served by the CGI which in turn means that your
|
|
comment spam is always visible. This configuration also
|
|
hides the Blosxom CGI from the user-visible URL in their
|
|
Location bar. mod_rewrite is a fantastically powerful and
|
|
versatile module: investigate it to arrive at a
|
|
configuration that is best for your situation.
|
|
</p>
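<p>The decision these rewrite directives make can be expressed as a
short Python sketch. It only models the logic described above (serve
what exists on disk, otherwise hand the path to the CGI); it is not how
mod_rewrite is implemented, and the function name
<code>route_request</code> and the file names are hypothetical.</p>

```python
import os
import tempfile


def route_request(document_root, url_path, cgi_url="/cgi-bin/blosxom.cgi"):
    """Return ("static", file_path) or ("cgi", rewritten_url) for a request path."""
    relative = url_path.lstrip("/")
    candidate = os.path.join(document_root, relative)
    # Equivalent of the two RewriteCond tests: "!-f" and "!-d".
    if os.path.isfile(candidate) or os.path.isdir(candidate):
        return ("static", candidate)
    # Equivalent of the RewriteRule: pass the path to the CGI as path info.
    return ("cgi", cgi_url + "/" + relative)


# Demonstration with a throwaway document root.
with tempfile.TemporaryDirectory() as docroot:
    with open(os.path.join(docroot, "index.html"), "w") as f:
        f.write("pre-rendered page")
    print(route_request(docroot, "/index.html")[0])
    print(route_request(docroot, "/2004/05/entry.html"))
```

<p>The sketch ignores details such as query strings and path
normalization, which the real module handles.</p>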
|
|
|
|
|
|
|
|
|
|
<h3><a name="caching-content-with-mod_cache" id="caching-content-with-mod_cache">Caching Content With mod_cache
|
|
</a></h3>
|
|
|
|
<p>The mod_cache module provides intelligent caching of HTTP
|
|
responses: it is aware of the expiration timing and content
|
|
requirements that are part of the HTTP specification. The
|
|
mod_cache module caches URL response content. If content sent
|
|
to the client is considered cacheable, it is saved to disk.
|
|
Subsequent requests for that URL will be served directly from
|
|
the cache. The provider module for mod_cache, mod_cache_disk,
|
|
determines how the cached content is stored on disk. Most
|
|
server systems will have more disk available than memory, and
|
|
it's good to note that some operating system kernels cache
|
|
frequently accessed disk content transparently in memory, so
|
|
replicating this in the server is not very useful.
|
|
</p>
|
|
<p>To enable efficient content caching and avoid presenting the
|
|
user with stale or invalid content, the application that
|
|
generates the actual content has to send the correct response
|
|
headers. Without headers like <code>ETag:</code>,
<code>Last-Modified:</code> or <code>Expires:</code>,
<code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code> cannot make the right decision on whether to cache
|
|
the content, serve it from cache or leave it alone. When
|
|
testing content caching, you may find that you need to modify
|
|
your application or, if this is impossible, selectively disable
|
|
caching for URLs that cause problems. The mod_cache modules are
|
|
not compiled by default, but can be enabled by passing the
|
|
option <code>--enable-cache[=shared]</code>
|
|
to the configure script. If you use a binary distribution of
|
|
Apache httpd, or it came with your port or package collection,
|
|
it may have <code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code> already included.
|
|
</p>
|
|
|
|
|
|
<h4><a name="example-wiki" id="example-wiki">Example: wiki.apache.org
|
|
</a></h4>
|
|
|
|
|
|
<p>
|
|
The Apache Software Foundation Wiki is served by
|
|
MoinMoin. MoinMoin is written in Python and runs as
|
|
a CGI. To date, all attempts to run it under
mod_python have been unsuccessful. The CGI proved to
|
|
place an untenably high load on the server machine,
|
|
especially when the Wiki was being indexed by search
|
|
engines like Google. To lighten the load on the
|
|
server machine, the Apache Infrastructure team
|
|
turned to mod_cache. It turned out MoinMoin needed a
|
|
small patch to ensure proper behavior behind the
|
|
caching server: certain requests can never be cached
|
|
and the corresponding Python modules were patched to
|
|
send the proper HTTP response headers. After this
|
|
modification, the cache in front of the Wiki was
|
|
enabled with the following configuration snippet in
|
|
<code>httpd.conf</code>:
|
|
</p>
|
|
|
|
<pre class="prettyprint lang-config">CacheRoot /raid1/cacheroot
|
|
CacheEnable disk /
|
|
# A page modified 100 minutes ago will expire in 10 minutes
|
|
CacheLastModifiedFactor .1
|
|
# Always check again after 6 hours
|
|
CacheMaxExpire 21600</pre>
|
|
|
|
|
|
<p>This configuration will try to cache any and all content
|
|
within its virtual host. It will never cache content for
|
|
more than six hours (the <code class="directive"><a href="../mod/mod_cache.html#cachemaxexpire">CacheMaxExpire</a></code>
|
|
directive). If no <code>Expires:</code>
|
|
header is present in the response, <code class="module"><a href="../mod/mod_cache.html">mod_cache</a></code> will compute
|
|
an expiration period from the <code>Last-Modified:</code>
|
|
header. The computation using <code class="directive"><a href="../mod/mod_cache.html#cachelastmodifiedfactor">CacheLastModifiedFactor</a></code>
|
|
is based on the assumption that if a page was recently
|
|
modified, it is likely to change again in the near future
|
|
and will have to be re-cached.
|
|
</p>
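<p>The arithmetic behind the two directives can be checked with a small
Python sketch. The function name
<code>heuristic_expiry_seconds</code> is an assumption made for this
example; the calculation follows the description above, with the
expiration period being the time since the last modification multiplied
by the factor and capped by the maximum.</p>

```python
def heuristic_expiry_seconds(age_seconds, last_modified_factor, max_expire_seconds):
    """Estimate how long a response may be cached when no Expires header is sent."""
    estimate = age_seconds * last_modified_factor
    # CacheMaxExpire takes precedence over a longer computed period.
    return min(estimate, max_expire_seconds)


# A page modified 100 minutes ago, with the settings from the snippet above.
minutes = heuristic_expiry_seconds(100 * 60, 0.1, 21600) / 60
print(f"expires in about {minutes:.0f} minutes")

# A page untouched for 10 days hits the six-hour ceiling.
print(heuristic_expiry_seconds(10 * 24 * 3600, 0.1, 21600))
```

<p>The second call shows why the ceiling matters: without it, content
that has not changed for a long time would stay in the cache for a
correspondingly long time.</p>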
|
|
<p>
|
|
Do note that it can pay off to <em>disable</em>
|
|
the <code>ETag:</code>
|
|
header: For files smaller than 1k the server has to
|
|
calculate the checksum (usually MD5) and then send out a
|
|
<code>304 Not Modified</code>
|
|
response, which will use up some CPU and still consume
the same amount of network resources for the transfer (one
TCP packet). For resources larger than 1k it might prove
CPU-intensive to calculate the header for each request.
Unfortunately, there is currently no way to cache
these headers.
|
|
</p>
|
|
<pre class="prettyprint lang-config"><FilesMatch "\.(jpe?g|png|gif|js|css|x?html|xml)">
|
|
FileETag None
|
|
</FilesMatch></pre>
|
|
|
|
|
|
<p>
|
|
This will disable the generation of the <code>ETag:</code>
|
|
header for most static resources. The server does not
|
|
calculate these headers for dynamic resources.
|
|
</p>
|
|
|
|
|
|
|
|
|
|
</div><div class="top"><a href="#page-header"><img alt="top" src="../images/up.gif" /></a></div>
|
|
<div class="section">
|
|
<h2><a name="further-considerations" id="further-considerations">Further Considerations
|
|
</a> <a title="Permanent link" href="#further-considerations" class="permalink">¶</a></h2>
|
|
|
|
<p>Armed with the knowledge of how to tune a system to deliver the
|
|
desired performance, we will soon discover that <em>one</em>
|
|
system might prove a bottleneck. How to make a system fit for
|
|
growth, or how to put a number of systems into tune will be
|
|
discussed in <a href="http://wiki.apache.org/httpd/PerformanceScalingOut">PerformanceScalingOut</a>.
|
|
</p>
|
|
</div></div>
|
|
<div class="bottomlang">
|
|
<p><span>Available Languages: </span><a href="../en/misc/perf-scaling.html" title="English"> en </a> |
|
|
<a href="../es/misc/perf-scaling.html" hreflang="es" rel="alternate" title="Español"> es </a> |
|
|
<a href="../fr/misc/perf-scaling.html" hreflang="fr" rel="alternate" title="Français"> fr </a></p>
|
|
</div><div class="top"><a href="#page-header"><img src="../images/up.gif" alt="top" /></a></div><div id="footer">
|
|
<p class="apache">Copyright 2023 The Apache Software Foundation.<br />Licensed under the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache License, Version 2.0</a>.</p>
|
|
<p class="menu"><a href="../mod/">Modules</a> | <a href="../mod/quickreference.html">Directives</a> | <a href="http://wiki.apache.org/httpd/FAQ">FAQ</a> | <a href="../glossary.html">Glossary</a> | <a href="../sitemap.html">Sitemap</a></p></div><script type="text/javascript"><!--//--><![CDATA[//><!--
|
|
if (typeof(prettyPrint) !== 'undefined') {
|
|
prettyPrint();
|
|
}
|
|
//--><!]]></script>
|
|
</body></html> |