====== An Introduction to the Apache Configuration File ======

The Apache web server is usually run as a system service (known in Unix as a //daemon process//). Because it implements the hypertext transfer protocol (HTTP) and runs as a //daemon// process, the executable program that runs the Apache web server is called the //http daemon// or **httpd** (//httpd.exe// on Windows). In common with many applications that originated in the Unix world, //httpd// is configured by a text-based configuration file, which is named //httpd.conf// by default.

In this document, we introduce //httpd.conf// using the XAMPP platform configuration on Windows and Linux as an example. Other platforms, for example Ubuntu, SuSE, Fedora or Macintosh OS X, will have slightly different settings, but these differences are largely due to variations in the installation locations used by these distributions. Please feel free to extend this page to describe these.

===== Some Important Definitions =====

Before we explore the configuration of an Apache web server, there are some important concepts that we need to define.

  * A **URI** (uniform resource identifier) takes the form ''%%<protocol>://<host>/<resource>%%'' and uniquely identifies a resource on the Internet. For example, ''http://mycompany.com/products/'' identifies the resource ''products/'', which is located on host ''mycompany.com'' and can be accessed using the hypertext transfer protocol.
  * The ''Content-Type'' field identifies the type of data contained in an HTTP response and is used by the browser to decide how to render that data. Its values are MIME types, for example ''text/html'', ''image/jpeg'' and ''application/pdf''.
  * The ''Host'' field in an HTTP request identifies the host to which the request is directed. Normally this will be the fully-qualified domain name of the host on which the web server is running.
The presence of the ''Host'' field allows //Virtual Hosting//, a web server feature that allows many web hosts to exist at a single IP address.

===== Locating the configuration file =====

In order to explore //httpd.conf//, you first need to find it!

  * In XAMPP on Windows it is ''c:\xampp\apache\conf\httpd.conf''.
  * In XAMPP on Linux it is ''/opt/lampp/apache/conf/httpd.conf''.
  * In a typical Linux installation, you'll find it in ''/etc/apache2/httpd.conf''.

When you have found the file, open it in your favourite text editor.((You may need to be an administrator to access the file.))

===== Exploring the Apache Configuration =====

A typical Apache web server installation will consist of four directories:

  * ''conf'' contains the configuration file, usually called ''httpd.conf'', which tells Apache how to respond to different kinds of requests.
  * ''htdocs'' contains the //resources//: documents, images, data, and so forth that you want to serve to your clients.
  * ''logs'' contains the log files that record what happens as the server is running. You should consult ''logs/error_log'' (Windows XAMPP ''c:\xampp\apache\logs\error.log'') whenever your server fails to work as planned.
  * ''cgi-bin'' contains any CGI scripts that are needed. If you don’t use scripts you don’t need this directory.

==== Standard settings ====

The configuration of the Apache web server will have several predefined settings. These are:

  * ''ServerRoot'' is the location of the //httpd// program and its configuration files. On XAMPP for Windows this is usually ''c:\xampp\apache''.
  * ''Listen'' is the port number on which Apache listens for web requests. By default this is port 80.
  * ''ServerAdmin'' is the email address of the web server administrator (Webmaster). The value of this setting is used in the server error messages, for example the message for ''404 not found''.
  * ''ServerName'' is the name of the web server.
  * ''ServerName'' would normally be set to the host name of your web server, or to its IP address if your site does not have a registered DNS name. By default ''ServerName'' is set to ''localhost:80''.
  * ''DocumentRoot'' is the default location for resources that the web server can deliver. It is normally set to the location of the //htdocs// directory (''c:\xampp\htdocs'' on XAMPP for Windows). ''DocumentRoot'' is represented by the ''/'' between the fully-qualified domain name and the resource address. That is, in ''http://mycompany.com/'' the ''/'' maps to //htdocs//, and in ''http://mycompany.com/resource'' the name ''resource'' is a directory (folder) within the //htdocs// directory.((This can be changed by use of //Virtual Directories// -- see the //Alias// and //Directory// directives))

==== Server Directives ====

Apache provides a large number of directives for controlling how the server will work:

  * ''%%<VirtualHost>%%'' defines the settings for a virtual host (''ServerName'', administrator email, ''DocumentRoot'', log-file locations and log settings).
  * ''%%<Directory>%%'' modifies the behaviour of certain directories defined by absolute path. By default, only //htdocs// is a trusted location for web resources. Use of the ''Directory'' directive makes other locations (or //Virtual Directories//) on the host file system available to the web server.
  * ''%%<Files>%%'' modifies the behaviour for certain files within ''DocumentRoot'' or within a directory defined in a ''Directory'' directive.
  * ''%%<Location>%%'' modifies the behaviour of certain locations (i.e. resource directories).
  * Access-control directives control who (person or group) has access to a resource, or which host name, domain or IP address has access.
  * Further directives modify the default behaviour: e.g. turning on server-side includes, choosing which files are index files, or allowing files to be executed as CGI scripts.

It is also possible to redefine properties on a per-directory basis by use of an //.htaccess// file.
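
As a rough illustration of how container directives such as ''%%<Directory>%%'' and ''%%<Files>%%'' fit together, here is a minimal sketch in Apache 2.4 syntax. The path ''/opt/lampp/htdocs'' assumes the XAMPP on Linux layout, and the option values are illustrative choices rather than XAMPP's shipped defaults. Apache 2.2 would use ''Order'', ''Allow'' and ''Deny'' in place of ''Require''.

<code apache>
# Sketch only: path and option values are assumptions, not XAMPP defaults
<Directory "/opt/lampp/htdocs">
    # Allow directory listings, server-side includes and symbolic links
    Options Indexes Includes FollowSymLinks
    # Let .htaccess files in this tree override these settings
    AllowOverride All
    # Apache 2.4 access control: any client may read this tree
    Require all granted
</Directory>

# Files served when a client requests a directory
DirectoryIndex index.html index.php

# Never serve files whose names begin with .ht (such as .htaccess) to clients
<Files ".ht*">
    Require all denied
</Files>
</code>

Setting ''AllowOverride None'' instead would make the server ignore //.htaccess// files in that tree, which is the safer choice when per-directory overrides are not needed.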
==== Directory Aliasing ====

To avoid everything having to be located in ''DocumentRoot'', you can use an alias:

<code apache>
Alias /marketing /home/marketing
</code>

This maps a URI path onto a location in the file system outside ''DocumentRoot''. If the URI ''/marketing'' is requested, the server will look for files in ''/home/marketing'' rather than ''htdocs/marketing''. A ''%%<Directory>%%'' directive would normally be required to specify access rights for ''/home/marketing''.

==== User Directories ====

''UserDir'' is a special directive which allows you to set up user-owned web sites. These are reached through the special location ''/~user''. The actual directory used will depend on your operating system. In Linux it is usually ''/home/user/public_html''. Global settings can be defined by the server and overridden (if allowed) by an //.htaccess// file in ''~/public_html''.

==== Authentication ====

Apache provides a simple mechanism for authentication and authorisation. Additional modules add sophistication. An example:

<code apache>
Alias /marketing /home/marketing
<Directory /home/marketing>
    AuthType basic
    AuthName "sales people"
    AuthUserFile some_dir/sales
    AuthGroupFile some_dir/groups
    # Any user listed in the password file may log in
    Require valid-user
    # To restrict access to one group instead: Require group marketing
</Directory>
</code>

The example ''%%<Directory>%%'' directive shows the settings that define "basic authentication". When a client attempts to access a resource in directory ''/marketing'', a "401 unauthorized" message is sent back, carrying a ''WWW-Authenticate'' field that names the authentication scheme and realm. The client's browser shows a simple login dialog (user name and password) and repeats the request, sending the user's credentials to the server in the ''Authorization'' field. If the password supplied matches the password stored for ''user_name'' in ''//some_dir/sales//'', authentication is passed and access granted. The password file is created using ''/bin/htpasswd''. It is similar in format to the //password file// (''/etc/passwd'') used in Unix.
The //Groups// file simply contains group records, each of which names a group and lists the users who belong to it, in the form:

<code>
marketing: chris ellie joe
</code>

==== Server Logs ====

By default, the Apache web server keeps a log of every request it processes and of any errors that occur. The location of the log files is defined in //httpd.conf//. The types of data that can be logged include:

  * //Referrer// -- where the client's browser was before it requested this page (useful for finding out who is linking to your web site).
  * //Error// -- which requests caused problems.
  * //Access// -- a record of every request the server processed.

Usually, each log is kept in a separate file (e.g. //error.log//, //access.log//, //referrer.log//), but you can use server directives to turn off certain logs, provide more or less detail, or even direct all logging messages to a single file.

Once your web site has been running for a while, you will, as a web master, want to examine the log files. The types of analysis you can do include:

  * Browser stats: which user agents are being used by your visitors -- useful for website designers
  * Visitor location, i.e. the IP address of the client
  * Quantity of data transferred
  * Sites which link to your site
  * Peak times
  * Unique "eyeballs", e.g. for marketing

On XAMPP, such data is accessible through the Webalizer.

==== Rest of the Story ====

Apache contains many more directives. You will see some first hand in the EG-253 lab exercises.

===== A Typical VirtualHost Definition =====

Virtual hosts give the appearance of multiple hosts on a single IP address. Here's an example that sets up a host with the fully qualified domain name ''marketing.mycompany.com''. This host will have its own settings for ''ServerAdmin'', ''DocumentRoot'', ''ServerName'', ''ErrorLog'' and ''TransferLog''.
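
<code apache>
# Assumes the server listens on port 80 (the default Listen setting)
<VirtualHost *:80>
    ServerAdmin sales@mycompany.com
    DocumentRoot /opt/lampp/htdocs/marketing
    ServerName marketing.mycompany.com
    ErrorLog /opt/lampp/logs/marketing/error_log
    TransferLog /opt/lampp/logs/marketing/access_log
</VirtualHost>
</code>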
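
To see how the ''Host'' field selects between virtual hosts, the following sketch places a second host next to the marketing one on the same address and port. The host name ''www.mycompany.com'' and its ''DocumentRoot'' are hypothetical, chosen only for illustration, and the syntax assumes Apache 2.4 (Apache 2.2 would additionally need ''NameVirtualHost *:80'').

<code apache>
# Sketch only: www.mycompany.com and its DocumentRoot are hypothetical
# The first virtual host listed also answers requests whose Host field
# matches no ServerName
<VirtualHost *:80>
    ServerName www.mycompany.com
    DocumentRoot /opt/lampp/htdocs
</VirtualHost>

# Chosen when the request carries "Host: marketing.mycompany.com"
<VirtualHost *:80>
    ServerName marketing.mycompany.com
    DocumentRoot /opt/lampp/htdocs/marketing
</VirtualHost>
</code>

Both names must resolve to the server's IP address, for example through DNS or a hosts file entry, before a browser can reach either site.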