An Introduction to the Apache Configuration File

The Apache web server is usually run as a system service (known in Unix as a daemon process). Because it implements the hypertext transfer protocol (HTTP) and runs as a daemon, the executable program that runs the Apache web server is called the http daemon, or httpd (httpd.exe on Windows). In common with many applications that originated in the Unix world, httpd is configured by a text-based configuration file which is by default named httpd.conf. In this document, we shall introduce httpd.conf using the XAMPP platform configuration on Windows and Linux as an example. Other platforms, for example Ubuntu, SuSE, Fedora or Macintosh OS X, will have slightly different settings, but these differences are largely due to variations in the installation locations used by those distributions. Please feel free to extend this page to describe these.

Some Important Definitions

Before we explore the configuration of an Apache web server, there are some important concepts that we need to define.

  • A URI (uniform resource identifier) takes the form: <protocol>://<host>/<resource> and uniquely identifies a resource on the Internet. For example: http://mycompany.com/products/ identifies the resource products/ which is located on host mycompany.com and can be accessed using the hypertext transfer protocol.
  • The Content-Type field identifies the type of data contained in an HTTP response and is used by the browser to render the data content of a response. It uses MIME standard specifications. Examples: text/html, image/jpeg, application/pdf.
  • The Host field in an HTTP request identifies the host to which the request is directed. Normally this will be the fully-qualified domain name of the host on which the web server is running. Its presence also makes Virtual Hosting possible: a web server feature that allows many web hosts to exist at a single IP address.
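
As a hedged illustration of where the Content-Type value comes from on the server side: Apache's mod_mime module maps file extensions to MIME types. The fragment below is a sketch rather than an extract from the XAMPP configuration, and the mime.types path is an assumption.

```apache
# Load the standard extension-to-MIME-type table.
# The path is an assumption; a relative path is resolved against ServerRoot.
TypesConfig conf/mime.types

# Serve files ending in .pdf with Content-Type: application/pdf
AddType application/pdf .pdf
```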

Locating the configuration file

In order to explore httpd.conf, you first need to find it!

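  • In XAMPP on Windows it is c:\xampp\apache\conf\httpd.conf.
  • For XAMPP on Linux, you'll find it in /opt/lampp/apache/conf/httpd.conf.
  • In a distribution-packaged Linux installation the location varies: for example, /etc/apache2/httpd.conf on SuSE, /etc/apache2/apache2.conf on Debian and Ubuntu, or /etc/httpd/conf/httpd.conf on Fedora.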

When you have found the file, open it in your favourite text editor.1)

Exploring the Apache Configuration

A typical Apache web server installation will consist of four directories:

  • conf contains the configuration file, usually called httpd.conf, which tells Apache how to respond to different kinds of requests.
  • htdocs contains the resources (documents, images, data, and so forth) that you want to serve to your clients.
  • logs contains the log files that record what happens as the server is running. You should consult logs/error_log (Windows XAMPP c:\xampp\apache\logs\error.log) whenever your server fails to work as planned.
  • cgi-bin contains any CGI scripts that are needed. If you don’t use scripts you don’t need this directory.
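
The cgi-bin directory only becomes special once the configuration says so. A minimal sketch of the usual mapping follows; the file system path is an assumption based on the XAMPP for Linux layout, so check your own installation.

```apache
# Map URLs starting with /cgi-bin/ onto the cgi-bin directory and
# treat every file found there as a CGI script.
# The path below is an assumption, not taken from this page.
ScriptAlias /cgi-bin/ "/opt/lampp/cgi-bin/"
```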

Standard settings

The configuration of the Apache web server will have several predefined settings. These are:

  • ServerRoot is the location of the Apache installation: the httpd program, its configuration files and its logs. Relative paths elsewhere in httpd.conf are resolved against it. On XAMPP for Windows this is usually c:\xampp\apache.
  • Listen is the port number on which Apache listens for web requests. By default this is port 80.
  • ServerAdmin is the email address of the web server administrator (Webmaster). The value of this setting is used in the server's error messages, for example the message for 404 Not Found.
  • ServerName is the name of the web server. You would normally set this to the host name of your web server, or to its IP address if the host does not have a registered DNS name. By default ServerName is set to localhost:80.
  • DocumentRoot is the default location for resources that the web server can deliver. It is normally set to the location of the htdocs directory (c:\xampp\htdocs on XAMPP for Windows). DocumentRoot is represented by the / that follows the fully-qualified domain name in a URI. That is, in http://mycompany.com/ the final / maps to htdocs, and in http://mycompany.com/resource, resource is a file or directory (folder) within the htdocs directory.2)
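
Put together, these settings might look as follows in a XAMPP for Windows httpd.conf. This is a sketch using the values quoted above; the ServerAdmin address is hypothetical, and the paths use forward slashes, which Apache accepts on Windows.

```apache
# Top of the Apache installation; relative paths are resolved against it
ServerRoot "C:/xampp/apache"

# Accept HTTP requests on port 80
Listen 80

# Contact address shown in server-generated error pages (hypothetical address)
ServerAdmin webmaster@mycompany.com

# Name and port the server uses to identify itself
ServerName localhost:80

# Directory that the URI path / maps to
DocumentRoot "C:/xampp/htdocs"
```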

Server Directives

Apache provides a large number of directives for controlling how the server will work:

  • <VirtualHost> defines the settings (ServerName, administrator email, DocumentRoot, log-file locations and log settings) for one virtual host.
  • <Directory> modifies the behaviour of certain directories, identified by absolute path. By default, only htdocs is a trusted location for web resources. Use of the Directory directive makes other locations (or Virtual Directories) on the host file system available to the web server.
  • <Files> modifies the behaviour for certain files, matched by name, within DocumentRoot or within a directory defined in a Directory directive.
  • <Location> modifies the behaviour of certain locations (i.e. resource directories), identified by URI path rather than by file system path.
  • <Limit> restricts the access-control directives it encloses (which person or group, or which host name, domain or IP address, has access to a resource) to the HTTP methods it lists.
  • Options (a simple directive rather than a container) provides modifications to default behaviour: e.g. turn on server-side includes; whether directory listings are generated; whether files can be executed as CGI scripts, etc. Which files are index files is set separately with DirectoryIndex. It is also possible to redefine properties on a per-directory basis by use of an .htaccess file.
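
To make the list above more concrete, here is a hedged sketch combining a Directory container, the Options directive and a Files container. The path is assumed from the XAMPP for Linux layout, and the access-control lines use the Apache 2.2 syntax (Order/Allow/Deny); Apache 2.4 expresses the same thing with "Require all granted" and "Require all denied".

```apache
# Settings for the document tree (path is an assumption)
<Directory "/opt/lampp/htdocs">
    # Allow directory listings, symbolic links and server-side includes
    Options Indexes FollowSymLinks Includes
    # Ignore .htaccess files in this tree
    AllowOverride None
    # Apache 2.2 access control: let every client in
    Order allow,deny
    Allow from all
</Directory>

# Never serve files whose names start with .ht (e.g. .htaccess)
<Files ".ht*">
    Order allow,deny
    Deny from all
</Files>
```

Note that comments sit on their own lines: Apache does not allow a comment on the same line as a directive.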

Directory Aliasing

So that not everything has to be located in DocumentRoot, you can use an alias:

Alias /marketing /home/marketing

This maps a URI path onto a directory outside DocumentRoot. If the URI /marketing is requested, the server will look for files in /home/marketing rather than htdocs/marketing. The mapping happens inside the server; the client is not redirected. A <Directory> directive would normally be required to specify access rights for /home/marketing.
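
A minimal sketch of the alias together with its Directory block. The Options and AllowOverride values are illustrative choices, and the access-control lines use the Apache 2.2 syntax (in Apache 2.4 they would be replaced by "Require all granted").

```apache
Alias /marketing /home/marketing

<Directory "/home/marketing">
    # No special features and no .htaccess overrides (illustrative choices)
    Options None
    AllowOverride None
    # Apache 2.2 access control: allow all clients
    Order allow,deny
    Allow from all
</Directory>
```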

User Directories

UserDir is a special directive which allows you to set up user-owned web sites. These are indicated by the special location /~user. The actual directory used will depend on your operating system. In Linux it is usually /home/user/public_html. Global settings can be defined by the server and overridden (if allowed) by an .htaccess file in ~/public_html.
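
A sketch of a typical Linux set-up, assuming mod_userdir is loaded and home directories live under /home. The particular AllowOverride and Options values are illustrative; they decide what users may change from their own .htaccess files.

```apache
# Requests for /~user/... are served from public_html in that user's home directory
UserDir public_html

<Directory "/home/*/public_html">
    # Categories of directive that users may override in .htaccess
    AllowOverride FileInfo AuthConfig Limit Indexes
    # Features enabled by default in user directories
    Options MultiViews Indexes SymLinksIfOwnerMatch IncludesNoExec
</Directory>
```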

Authentication

Apache provides a simple mechanism for authentication and authorisation. Additional modules add sophistication.

An example:

Alias /marketing /home/marketing
<Directory /home/marketing>
  AuthType Basic
  AuthName "sales people"
  AuthUserFile some_dir/sales
  AuthGroupFile some_dir/groups
  # Any user listed in AuthUserFile may log in;
  # use "Require group <name>" to restrict access to a group instead
  Require valid-user
</Directory>

The example <Directory> directive shows the settings that define “basic authentication”. When a client attempts to access a resource under /marketing without credentials, a “401 Unauthorized” response is sent back, containing a WWW-Authenticate field that names the realm (“sales people”). The client's browser then shows a simple login dialog (user_name, password) and repeats the request, sending the user's credentials to the server in an Authorization field. If the password supplied matches user_name’s password (stored in some_dir/sales), authentication succeeds and access is granted.

The password file is created using <ServerRoot>/bin/htpasswd. It is similar in format to the password file (/etc/passwd) used in Unix. The groups file simply contains group records, each of which names a group and lists the users who belong to it, in the form:

 marketing: chris ellie joe 
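
To use such a group record, the Require line names the group rather than accepting any valid user. The sketch below reuses the placeholder file names from the example above and assumes the group-file authorisation module (mod_authz_groupfile) is loaded.

```apache
<Directory "/home/marketing">
    AuthType Basic
    AuthName "sales people"
    AuthUserFile some_dir/sales
    AuthGroupFile some_dir/groups
    # Only authenticated users listed in the "marketing" group record may enter
    Require group marketing
</Directory>
```

With the group record shown above, chris, ellie and joe would be admitted once they have supplied a correct password.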

Server Logs

By default, the Apache web server keeps a log of every request it handles and of any errors that occur. The location of the log files is defined in httpd.conf.

The types of data that can be logged include:

  • Referrer – where was the client's browser before it requested this page (useful to find out who is linking to your web site)
  • Error – which requests caused problems.
  • Access – log of every request the server handles, together with its outcome.

Usually, each log is kept in a separate file (e.g. error.log, access.log, referrer.log) but you can use server directives to turn off certain logs, provide more or less detail, or even to direct all logging messages to a single file.
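
A hedged sketch of the directives involved. The file names follow the Windows XAMPP naming mentioned earlier and are resolved relative to ServerRoot; the "combined" format shown here folds the referrer and user-agent into the access log rather than using a separate file.

```apache
# Where errors are written, and how much detail to record
ErrorLog "logs/error.log"
LogLevel warn

# Define a log format named "combined":
# client host, identity, user, time, request line, status, bytes sent,
# then the Referer and User-Agent request fields
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

# Write one line per request to the access log using that format
CustomLog "logs/access.log" combined
```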

Once your web site has been running for a while, you will, as a web master, want to examine the log files. The types of analysis you can do include:

  • Browser stats: which user agents are being used by your visitors – useful for website designers
  • Visitor location: i.e. the IP address of the client
  • Quantity of data transferred
  • Sites which link to your site
  • Peak times
  • Unique “eyeballs”, e.g. for marketing

On XAMPP, such data is accessible through the Webalizer.

Rest of the Story

Apache contains many more directives. You will see some first hand in the EG-253 lab exercises.

A Typical VirtualHost Definition

Virtual hosts allow multiple hosts to appear to exist at a single IP address. Here's an example that sets up the fully qualified domain name marketing.mycompany.com. This host will have its own settings for ServerAdmin, DocumentRoot, ServerName, ErrorLog and TransferLog.

 <VirtualHost marketing.mycompany.com>
   ServerAdmin sales@mycompany.com
   DocumentRoot /opt/lampp/htdocs/marketing
   ServerName marketing.mycompany.com
   ErrorLog /opt/lampp/logs/marketing/error_log
   TransferLog /opt/lampp/logs/marketing/access_log
 </VirtualHost>

A corresponding HTTP request for a resource located on this virtual host would be:3)

  GET /catalogue.pdf HTTP/1.1
  Host: marketing.mycompany.com
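
The way the Host field selects a virtual host can be sketched with two name-based hosts sharing one address. This is an illustration under assumptions, not the lab configuration: www.mycompany.com is a hypothetical second host, and the NameVirtualHost line is the Apache 2.2 form (Apache 2.4 no longer needs it).

```apache
# Apache 2.2: enable name-based virtual hosting on port 80
NameVirtualHost *:80

# The first block listed is used when no ServerName matches the Host field
<VirtualHost *:80>
    ServerName www.mycompany.com
    DocumentRoot /opt/lampp/htdocs
</VirtualHost>

# Chosen when the request carries "Host: marketing.mycompany.com"
<VirtualHost *:80>
    ServerName marketing.mycompany.com
    DocumentRoot /opt/lampp/htdocs/marketing
</VirtualHost>
```

Under this sketch, the GET request shown above would be answered from /opt/lampp/htdocs/marketing/catalogue.pdf.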

Homework Exercise

A real httpd.conf file is fairly complex, but it is usually well documented and, as it is text, fairly easy to read. Examine the configuration file of your web server (the XAMPP locations for Linux and Windows are listed under Locating the configuration file above). Open the file, and then:

  1. Determine the User and Group of your server
  2. Note the location of your log files, mime types, ServerRoot, DocumentRoot etc.

1) You may need to be an administrator to access the file.
2) This can be changed by use of Virtual Directories – see the Alias and Directory directives.
3) Assumes registration of the host IP for marketing with the authoritative name server for mycompany.com.
eg-253/httpd_conf.txt · Last modified: 2011/01/14 12:45 by 127.0.0.1