An Introduction to the Apache Configuration File
The Apache web server is usually run as a system service (known in Unix as a daemon process). Because it implements the hypertext transfer protocol (HTTP) and runs as a daemon, the executable program that runs the Apache web server is called the HTTP daemon, or httpd (httpd.exe on Windows). In common with many applications that originated in the Unix world, httpd is configured by a text-based configuration file, which by default is named httpd.conf. In this document, we shall introduce httpd.conf using the XAMPP platform configuration on Windows and Linux as an example. Other platforms, for example Ubuntu, SuSE, Fedora or Mac OS X, will have slightly different settings, but these differences are largely due to variations in the installation locations used by those distributions. Please feel free to extend this page to describe these.
Some Important Definitions
Before we explore the configuration of an Apache web server, there are some important concepts that we need to define.
- A URI (uniform resource identifier) takes the form <protocol>://<host>/<resource> and uniquely identifies a resource on the Internet. For example, http://mycompany.com/products/ identifies the resource products/, which is located on host mycompany.com and can be accessed using the hypertext transfer protocol.
- The Content-Type field identifies the type of data contained in an HTTP response and is used by the browser to decide how to render the content of the response. Its values are MIME types. Examples: text/html, image/jpeg, application/pdf.
- The Host field in an HTTP request identifies the host to which the request is directed. Normally this will be the fully-qualified domain name of the host on which the web server is running. Its presence makes virtual hosting possible: a web server feature that allows many web hosts to exist at a single IP address.
Locating the configuration file
In order to explore httpd.conf, you first need to find it!
- In XAMPP on Windows it is c:\xampp\apache\conf\httpd.conf.
- For XAMPP on Linux, you'll find it in /opt/lampp/apache/conf/httpd.conf.
- In a typical Linux installation, you'll find it under /etc, for example /etc/apache2/httpd.conf; the exact directory and file name vary between distributions.
When you have found the file, open it in your favourite text editor.1)
Exploring the Apache Configuration
A typical Apache web server installation will consist of four directories:
- conf contains the configuration file, usually called httpd.conf, which tells Apache how to respond to different kinds of requests.
- htdocs contains the resources (documents, images, data, and so forth) that you want to serve to your clients.
- logs contains the log files that record what happens as the server is running. You should consult logs/error_log (Windows XAMPP: c:\xampp\apache\logs\error.log) whenever your server fails to work as planned.
- cgi-bin contains any CGI scripts that are needed. If you don't use scripts you don't need this directory.
Standard settings
The configuration of the Apache web server will have several predefined settings. These are:
- ServerRoot is the location of the httpd program and its configuration files. On XAMPP for Windows this is usually c:\xampp\apache.
- Listen is the port number on which Apache should listen for web requests. By default this is port 80.
- ServerAdmin is the email address of the web server administrator (webmaster). The value of this setting is used in the server's error messages, for example the message for 404 Not Found.
- ServerName is the name of the web server. You would normally set this to the host name of your web server, or to its IP address if your site has no DNS entry. By default ServerName is set to localhost:80.
- DocumentRoot is the default location for resources that the web server can deliver. It is normally set to the location of the htdocs directory (c:\xampp\htdocs on XAMPP for Windows). DocumentRoot is represented by the / between the fully-qualified domain name and the resource address. That is, in http://mycompany.com/ the / maps to htdocs, and in http://mycompany.com/resource, resource is a file or directory (folder) within the htdocs directory.2)
Server Directives
Apache provides a large number of directives for controlling how the server will work:
- <VirtualHost> groups the settings (ServerName, administrator email, DocumentRoot, log-file locations and log settings) that apply to one virtual host.
- <Directory> modifies the behaviour of certain directories, identified by absolute path. By default, only htdocs is a trusted location for web resources. Use of the Directory directive makes other locations (or virtual directories) on the host file system available to the web server.
- <Files> modifies the behaviour for certain files within DocumentRoot or within a directory defined in a Directory directive.
- <Location> modifies the behaviour of certain locations (i.e. resource directories), identified by URL path rather than file system path.
- <Limit> restricts the access controls it encloses to particular HTTP methods (such as GET or POST). Together with directives such as Require, it controls who (person or group) has access to a resource, or which host name, domain or IP address has access.
- Options modifies the default behaviour of a directory: e.g. turn on server-side includes, allow directory index listings, allow files to be executed as CGI scripts, etc. Unlike the directives above, it is a plain directive rather than a container.
It is also possible to redefine properties on a per-directory basis by use of an .htaccess file.
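The following sketch shows how a few of these directives might be combined. It assumes Apache 2.4, where access is granted or denied with Require (Apache 2.2 used Order, Allow and Deny instead). The htdocs path is the XAMPP for Linux location, and /private is a hypothetical URL path chosen only for illustration.

```apache
# Settings for one directory tree, identified by file system path
<Directory "/opt/lampp/htdocs">
    # Allow directory listings and server-side includes
    Options Indexes Includes
    # Do not let .htaccess files override these settings
    AllowOverride None
    # Apache 2.4 syntax: anyone may read this tree
    Require all granted
</Directory>

# Settings for files matched by name, wherever they are found:
# never serve .htaccess or .htpasswd files to clients
<Files ".ht*">
    Require all denied
</Files>

# Settings for a URL path rather than a file system path
# (/private is a hypothetical location): local requests only
<Location "/private">
    Require ip 127.0.0.1
</Location>
```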
Directory Aliasing
To avoid everything having to be located in DocumentRoot you use an alias:
Alias /marketing /home/marketing
This maps a URI path onto a different file system location. If the URI /marketing is requested, the server will look for files in /home/marketing rather than htdocs/marketing. A <Directory> directive would normally be required to specify access rights for /home/marketing.
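A minimal sketch of the alias together with its <Directory> block might look like this. It assumes Apache 2.4 with mod_alias loaded; the choice of Options is illustrative rather than required.

```apache
# Map the URL path /marketing to a directory outside DocumentRoot
Alias /marketing /home/marketing

# Without this block, the default configuration would normally deny access
<Directory "/home/marketing">
    Options Indexes
    AllowOverride None
    # Apache 2.4 syntax; Apache 2.2 used "Order allow,deny" and "Allow from all"
    Require all granted
</Directory>
```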
User Directories
UserDir is a special directive which allows you to set up user-owned web sites. These are indicated by the special location /~user. The actual directory used will depend on your operating system. In Linux it is usually /home/user/public_html. Global settings can be defined by the server and overridden (if allowed) by an .htaccess file in ~/public_html.
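One way this could be configured on Linux is sketched below. It assumes Apache 2.4 with mod_userdir loaded and home directories under /home; the particular AllowOverride and Options values are illustrative choices, not requirements.

```apache
# Serve http://host/~user/ from the public_html directory in each user's home
UserDir public_html

# Settings applied to every user's web directory
<Directory "/home/*/public_html">
    # Let users override authentication, file-type and index settings in .htaccess
    AllowOverride AuthConfig FileInfo Indexes
    Options Indexes SymLinksIfOwnerMatch
    Require all granted
</Directory>
```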
Authentication
Apache provides a simple mechanism for authentication and authorisation. Additional modules add sophistication.
An example:
Alias /marketing /home/marketing
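<Directory /home/marketing>
    AuthType Basic
    AuthName "sales people"
    AuthUserFile some_dir/sales
    AuthGroupFile some_dir/groups
    # Any user listed in AuthUserFile may log in. To restrict access to
    # members of a group instead, use: Require group <group-name>
    Require valid-user
</Directory>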
The example <Directory> directive shows the settings that define “basic authentication”. When a client attempts to access a resource under /marketing without credentials, a “401 Unauthorized” response is sent back, carrying a WWW-Authenticate field that names the realm (“sales people”). The client's browser shows a simple login dialog (user name and password). The user's credentials are then sent to the server in the Authorization field of a repeated request. If the password supplied matches user_name’s password (stored in some_dir/sales), authentication is passed and access is granted.
The password file is created using <ServerRoot>/bin/htpasswd. It is similar in format to the password file (/etc/passwd) used in Unix. The groups file is just a file containing group records, each of which names a group and lists the users who belong to it, in the form:
marketing: chris ellie joe
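To restrict access to the members of that group rather than to any valid user, the earlier example could be adapted roughly as follows. This sketch assumes mod_authz_groupfile is loaded and reuses the placeholder file locations from the example above.

```apache
<Directory "/home/marketing">
    AuthType Basic
    AuthName "sales people"
    AuthUserFile some_dir/sales
    AuthGroupFile some_dir/groups
    # Only authenticated users listed under "marketing:" in the groups file
    Require group marketing
</Directory>
```

With the group record shown above, chris, ellie and joe would be admitted once they supply their passwords; other users in the password file would not.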
Server Logs
By default, the Apache web server keeps a log of every request it handles and of any errors that occur. The location of the log files is defined in httpd.conf.
The types of data that can be logged include:
- Referrer – where was the client's browser before it requested this page (useful to find out who is linking to your web site)
- Error – which requests caused problems.
- Access – log of every request the server handles, with its outcome.
Usually, each log is kept in a separate file (e.g. error.log, access.log, referrer.log) but you can use server directives to turn off certain logs, provide more or less detail, or even to direct all logging messages to a single file.
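A sketch of the directives involved is shown below. It assumes mod_log_config is loaded; the file names are illustrative and, being relative, are resolved against ServerRoot.

```apache
# Errors and diagnostics; LogLevel controls how much detail is recorded
ErrorLog "logs/error_log"
LogLevel warn

# Named formats: "common" is the basic access log layout,
# "referer" records which page linked to the requested URL
LogFormat "%h %l %u %t \"%r\" %>s %b" common
LogFormat "%{Referer}i -> %U" referer

# One file per log type
CustomLog "logs/access_log" common
CustomLog "logs/referer_log" referer

# Alternative: a single access log that also carries referrer and browser details
# LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
# CustomLog "logs/access_log" combined
```

Removing a CustomLog line turns that log off, and raising or lowering LogLevel changes how much detail reaches the error log.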
Once your web site has been running for a while, you will, as a webmaster, want to examine the log files. The types of analysis you can do include:
- Browser stats: which user agents are being used by your visitors – useful for website designers
- Visitor location, i.e. the IP address of the client
- Quantity of data transferred
- Sites which link to your site
- Peak times
- Unique “eyeballs”, e.g. for marketing
On XAMPP, such data is accessible through the Webalizer.
Rest of the Story
Apache contains many more directives. You will see some first hand in the EG-253 lab exercises.
A Typical VirtualHost Definition
Virtual hosts can be set up to give the appearance of multiple hosts on a single IP address. Here's an example that sets up the fully qualified domain name marketing.mycompany.com. This host will have its own settings for ServerAdmin, DocumentRoot, ServerName, ErrorLog and TransferLog.
<VirtualHost marketing.mycompany.com>
    ServerAdmin sales@mycompany.com
    DocumentRoot /opt/lampp/htdocs/marketing
    ServerName marketing.mycompany.com
    ErrorLog /opt/lampp/logs/marketing/error_log
    TransferLog /opt/lampp/logs/marketing/access_log
</VirtualHost>
A corresponding HTTP request for a resource located on this virtual host would be:3)
GET /catalogue.pdf HTTP/1.1
Host: marketing.mycompany.com
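To show how the Host field selects between several hosts, here is a sketch of two name-based virtual hosts sharing one address. It uses the <VirtualHost *:80> form, which is common in current Apache configurations; support.mycompany.com and its directory are hypothetical names added for illustration, and Apache 2.2 would additionally need a NameVirtualHost *:80 line.

```apache
# Both hosts share one IP address and port; Apache picks the block
# whose ServerName matches the Host field of the request
<VirtualHost *:80>
    ServerName marketing.mycompany.com
    DocumentRoot /opt/lampp/htdocs/marketing
</VirtualHost>

<VirtualHost *:80>
    # support.mycompany.com is a hypothetical second host
    ServerName support.mycompany.com
    DocumentRoot /opt/lampp/htdocs/support
</VirtualHost>
```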
Homework Exercise
A real httpd.conf file is fairly complex, but it is usually well documented and, as it is text, it's fairly easy to read. Examine the configuration file of your web server (XAMPP for Linux: /opt/lampp/httpd.conf; XAMPP for Windows: c:\xampp\apache\conf\httpd.conf). Open the file, and then:
- Determine the User and Group of your server
- Note the location of your log files, mime types, ServerRoot, DocumentRoot etc.