How Does Apache Process HTTP Requests?
This article explores the internal mechanisms of the Apache HTTP Server to explain exactly how it manages incoming web traffic. We will walk through the lifecycle of an HTTP request, from the moment a connection is established on a listening port to the final delivery of the response back to the client. Key concepts covered include the role of Multi-Processing Modules (MPMs), the step-by-step request processing phases, and how Apache translates and serves the final content.
Establishing the Connection
Before any request can be processed, Apache must first be ready to
receive it. When the Apache service starts, it binds to specific IP
addresses and ports (typically port 80 for HTTP and port 443 for HTTPS)
as defined by the Listen directive in its configuration
files. The server continuously listens on these network sockets for
incoming Transmission Control Protocol (TCP) connections from web
clients, such as browsers. Once a client initiates a connection, a TCP
handshake occurs, and a network channel is established for data
transfer.
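The listening step can be sketched with Python's standard socket module. This is a simplified illustration and not Apache's own code (Apache is written in C); the function name open_listener and the default address and port are assumptions made for the example.

```python
import socket


def open_listener(host: str = "127.0.0.1", port: int = 8080,
                  backlog: int = 128) -> socket.socket:
    """Bind a TCP socket and start listening, roughly what a Listen directive asks for."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow a quick restart without waiting for old sockets to leave TIME_WAIT.
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    # After listen(), the kernel completes TCP handshakes and queues the
    # established connections until the server calls accept().
    listener.listen(backlog)
    return listener
```

A caller would then loop on listener.accept(), which returns one established connection per call together with the client's address.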
The Role of Multi-Processing Modules (MPMs)
To serve many concurrent connections, Apache delegates the work of binding to network ports, accepting connections, and dispatching them to child processes or threads to a Multi-Processing Module (MPM). Only one MPM is active at a time, and the choice of MPM determines how the server allocates processes and threads to concurrent requests.
- Prefork MPM: This is a non-threaded model. Apache creates a single parent process that forks multiple child processes. Each child process handles exactly one connection at a time. This model isolates requests from one another, making it suitable for running non-thread-safe libraries (like mod_php), but it consumes more memory.
- Worker MPM: This model uses a combination of multiple processes and multiple threads per process. Threads are lighter than separate processes, allowing Worker to handle a larger volume of concurrent requests with a smaller memory footprint compared to Prefork.
- Event MPM: The modern default for most Apache installations. It builds upon the Worker model but uses a dedicated listener thread in each child process to watch idle connections (Keep-Alive connections). This prevents worker threads from being tied up waiting for a client to send its next request, which improves scalability under heavy load.
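As a rough analogy for the threaded models, the sketch below hands a batch of connections to a fixed pool of threads inside one process. It only illustrates the idea of a bounded thread pool; it is not how Apache implements its MPMs. The names handle_connection and serve_with_worker_threads are hypothetical, and threads_per_child is an illustrative parameter loosely inspired by the ThreadsPerChild directive.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def handle_connection(conn_id: int) -> str:
    # Stand-in for reading a request and producing a response.
    return f"connection {conn_id} served by {threading.current_thread().name}"


def serve_with_worker_threads(conn_ids, threads_per_child: int = 4):
    """Serve every connection using at most threads_per_child threads."""
    with ThreadPoolExecutor(max_workers=threads_per_child,
                            thread_name_prefix="worker") as pool:
        # map() keeps results in input order even though work runs concurrently.
        return list(pool.map(handle_connection, conn_ids))
```

With a pool of this kind, memory use is bounded by the number of threads rather than the number of connections, which is the trade-off the Worker and Event descriptions above refer to.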
The Request Processing Lifecycle
Once an MPM thread or process accepts a connection and reads the HTTP request headers, Apache routes the request through a structured, multi-phase lifecycle. This modular architecture allows Apache modules to hook into specific phases, and administrators control their behavior through configuration directives.
1. URI Translation
The first major step is translating the requested Uniform Resource
Identifier (URI) into a physical file path or an internal handler.
Apache evaluates directives like DocumentRoot,
Alias, ProxyPass, and RewriteRule
(via mod_rewrite). For example, a request for
/images/logo.png might be translated to
/var/www/html/images/logo.png on the server’s
filesystem.
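The mapping from URI to file path can be sketched as follows. This is a minimal illustration assuming only a document root and simple prefix aliases; it does not model RewriteRule or ProxyPass, and the function name translate_uri and its parameters are hypothetical.

```python
import posixpath


def translate_uri(uri_path, document_root="/var/www/html", aliases=None):
    """Map a request path to a filesystem path (simplified sketch)."""
    # Collapse "." and ".." segments so the result cannot climb above the root.
    clean = posixpath.normpath("/" + uri_path.lstrip("/"))
    # Alias-style prefixes are checked before falling back to the document root.
    for prefix, target in (aliases or {}).items():
        prefix = prefix.rstrip("/")
        if clean == prefix or clean.startswith(prefix + "/"):
            return target.rstrip("/") + clean[len(prefix):]
    return document_root.rstrip("/") + clean


print(translate_uri("/images/logo.png"))  # /var/www/html/images/logo.png
```

Normalizing the path before joining it to the document root is the design choice worth noting: a request such as /../../etc/passwd resolves to a path under the document root instead of escaping it.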
2. Header Parsing and Access Control
Apache gives modules a chance to examine the HTTP request headers and
decide whether the server can fulfill the request under its
configuration. (The virtual host is normally selected earlier, from the
Host header, when the request is first read.) It then checks access
controls (configured via Require directives, for example Require ip) to
see if the client’s IP address or network is allowed to access the
translated resource.
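An address-based access check of this kind can be sketched with Python's ipaddress module. The function name client_allowed and the example networks are assumptions for illustration; Apache's own evaluation of Require directives is more elaborate.

```python
import ipaddress


def client_allowed(client_ip: str, allowed_networks) -> bool:
    """Return True if client_ip falls inside any of the allowed networks."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(net) for net in allowed_networks)


print(client_allowed("192.168.1.20", ["192.168.1.0/24"]))  # True
print(client_allowed("10.0.0.5", ["192.168.1.0/24"]))      # False
```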
3. Authentication and Authorization
If the requested directory is protected, Apache checks if the user has provided valid credentials.
- Authentication: Verifying who the user is (e.g., checking a username and password against a database using mod_auth_basic).
- Authorization: Verifying if the authenticated user has the necessary permissions to view the specific file or directory.
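The two steps can be sketched for HTTP Basic authentication, where the client sends an Authorization header containing a Base64-encoded "user:password" pair. The function names and the in-memory users dictionary are hypothetical, and storing plain-text passwords is a simplification for the example only; real deployments check against hashed credentials.

```python
import base64
import hmac


def parse_basic_auth(header_value: str):
    """Return (user, password) from a Basic Authorization header, or None."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        return None
    try:
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:  # covers malformed Base64 and invalid UTF-8
        return None
    user, sep, password = decoded.partition(":")
    return (user, password) if sep else None


def authenticate(header_value: str, users: dict):
    """Authentication: return the user name if the credentials match, else None."""
    creds = parse_basic_auth(header_value)
    if creds is None:
        return None
    user, password = creds
    expected = users.get(user)
    # compare_digest avoids leaking information through comparison timing.
    if expected is not None and hmac.compare_digest(password.encode(), expected.encode()):
        return user
    return None


def authorize(user: str, allowed_users) -> bool:
    """Authorization: is this authenticated user permitted to see the resource?"""
    return user in allowed_users
```

Keeping the two functions separate mirrors the distinction in the list above: authenticate answers who the client is, while authorize answers whether that identity may access the resource.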
4. MIME Type Checking
Before delivering a file, Apache determines the content type of the
resource being served so it can inform the client’s browser how to
handle it. It maps the file extension (e.g., .html,
.css, .php) to a specific MIME type (e.g.,
text/html, text/css) using the
mime.types file and AddType directives.
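The extension lookup can be sketched as a small table with overrides. The table below is a tiny hand-written stand-in for the mime.types file, and the added_types parameter is a hypothetical analogue of AddType; neither reflects Apache's actual data structures.

```python
import posixpath

# A tiny stand-in for the mime.types file (illustrative subset only).
DEFAULT_TYPES = {
    ".html": "text/html",
    ".css": "text/css",
    ".png": "image/png",
}


def content_type_for(path: str, added_types=None):
    """Return the MIME type for a path, or None if the extension is unknown."""
    # Entries in added_types take precedence over the defaults.
    types = {**DEFAULT_TYPES, **(added_types or {})}
    ext = posixpath.splitext(path)[1].lower()
    return types.get(ext)


print(content_type_for("/var/www/html/index.html"))  # text/html
print(content_type_for("/var/www/html/style.css"))   # text/css
```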
5. Content Generation and Response
Once all checks are passed, Apache determines how to generate the content.
- Static Content: If the request is for a static file (like an image or HTML document), the core Apache module simply reads the file from the disk and prepares to send it.
- Dynamic Content: If the resource is dynamic (like a PHP or Python script), Apache passes the request to the appropriate handler or external process (via CGI or FastCGI) to execute the code and generate the output.
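The choice between static and dynamic handling can be sketched as a dispatch on the file extension. This is a deliberately simplified model: the handlers mapping and the function name generate_content are hypothetical, and a real handler would execute a script or talk to an external process rather than call a Python function.

```python
import os


def generate_content(path: str, handlers=None) -> bytes:
    """Return the response body for a translated path (simplified sketch)."""
    ext = os.path.splitext(path)[1].lower()
    handler = (handlers or {}).get(ext)
    if handler is not None:
        # Dynamic content: delegate to whatever is registered for this extension.
        return handler(path)
    # Static content: read the file from disk as-is.
    with open(path, "rb") as f:
        return f.read()
```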
Filtering and Output
Before the final output is sent back over the network, it passes
through Apache’s output filter chain. Modules can intercept the response
data to modify it on the fly. A common example is
mod_deflate, which compresses the outgoing text, HTML, or
CSS data into GZIP format to save bandwidth and improve load times.
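A filter chain of this kind can be sketched as a list of functions that each receive the headers and body and return possibly modified versions. The names deflate_filter and run_output_filters are hypothetical, and the sketch omits details a real compression filter handles, such as checking the client's Accept-Encoding header and streaming data in chunks.

```python
import gzip


def deflate_filter(headers: dict, body: bytes):
    """Compress text responses with gzip and update the headers accordingly."""
    if not headers.get("Content-Type", "").startswith("text/"):
        return headers, body  # leave images and other binary data untouched
    compressed = gzip.compress(body)
    new_headers = dict(headers)
    new_headers["Content-Encoding"] = "gzip"
    new_headers["Content-Length"] = str(len(compressed))
    return new_headers, compressed


def run_output_filters(headers: dict, body: bytes, filters):
    """Pass the response through each filter in order."""
    for output_filter in filters:
        headers, body = output_filter(headers, body)
    return headers, body
```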
Finally, the completed HTTP response, containing the appropriate
status code (like 200 OK or 404 Not Found),
the response headers, and the body content, is transmitted back to the
client over the established TCP connection. Depending on the HTTP
Keep-Alive settings, the MPM will then either close the connection or
keep it open, waiting for the client’s next request.
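The shape of the final response on the wire can be sketched as follows. The function name build_response and its parameters are assumptions for illustration; the sketch covers only an HTTP/1.1 response with a fixed-length body.

```python
def build_response(status: int, reason: str, headers: dict,
                   body: bytes, keep_alive: bool) -> bytes:
    """Serialize a status line, headers, and body into HTTP/1.1 wire format."""
    all_headers = dict(headers)
    all_headers["Content-Length"] = str(len(body))
    # The Connection header tells the client whether the TCP connection stays open.
    all_headers["Connection"] = "keep-alive" if keep_alive else "close"
    lines = [f"HTTP/1.1 {status} {reason}"]
    lines += [f"{name}: {value}" for name, value in all_headers.items()]
    # A blank line separates the header section from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("iso-8859-1") + body
```

For example, build_response(200, "OK", {"Content-Type": "text/html"}, b"<h1>Hi</h1>", True) yields bytes that begin with the status line HTTP/1.1 200 OK and end with the body after a blank line.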