A Glossary of Terms Used in Internet and Web Applications Technology
Created in DokuWiki because Blackboard's glossary wasn't quite good enough! Please feel free to get an account and contribute.
Apache
The Apache Web Server is the most widely deployed web server in the world, responsible for serving around 50% of all publicly accessible web sites and web applications on the Internet (as of August 2008), as well as running on countless intranet web servers behind corporate firewalls.
The name “Apache Webserver” is a pun on “a patchy webserver”: a name that refers to the web server's origin as a bug-fixed (or “patched”) version of the NCSA webserver. The Apache Webserver is now an Open Source project maintained by the Apache Software Foundation (ASF), a source of numerous other pieces of open source software including the Tomcat Webserver, numerous development tools and libraries used by professional Java developers, and an industry standard set of tools for XML parsing.
bind
The common name of the server process that performs domain name look-up within an ISP, and therefore implements DNS. The principal jobs that bind does are 1) to act as the authoritative naming service for a domain and 2) to cooperate with other domain name servers in iterative or recursive name look-ups, including those involving the backbone name servers. There are usually at least two servers in a domain that run bind. One typically acts as a master and the other as a slave, the idea being that if one should fail, the other can continue to provide DNS services both within the domain and to the wider Internet.
Bind is so named because it binds a domain name to an IP address (or vice versa), rather than because it is a daemon that implements a “bin” protocol (see also httpd).
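As a sketch of the forward and reverse look-ups that a name server performs for its clients, Python's standard socket module can ask the local resolver to do both. The name localhost is used here only so the example works without a live DNS server; in practice you would pass a fully qualified domain name.

```python
import socket

# Forward look-up: bind a host name to an IP address.
address = socket.gethostbyname("localhost")
print(address)  # typically 127.0.0.1

# Reverse look-up: bind an IP address back to a host name
# (the kind of query a PTR record answers).
name, aliases, addresses = socket.gethostbyaddr("127.0.0.1")
print(name)
```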
DHCP
Dynamic Host Configuration Protocol. A protocol used by most modern home network routers, network providers and ISPs to automatically assign an IP address, on demand, to a host that wishes to connect to the Internet. In the context of home (and some institutional) networks, DHCP allocates addresses from a pool of private IP addresses (e.g. 192.168.1.100). Such hosts cannot connect directly to the Internet and must instead connect through what is called a NAT service. In larger ISPs, such as swan.ac.uk or your broadband supplier, IP addresses are allocated via DHCP from a pool of IP addresses that the ISP has purchased from a higher-level ISP. Thus, when connecting your registered laptop to the university via the wired Ethernet, you will be assigned an IP address that starts 137.44. However, when you connect via wireless, you will be allocated a private IP address that starts 192.168 and your messages will be routed to the Internet via a NAT server inside the wireless router.
DHCP can be set up to grant an IP address assignment request only to network interface cards with a known physical (MAC) address. This is why you have to register your laptop with the University before it can be connected to the wired or wireless service.
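The distinction between private and public address pools can be seen with Python's standard ipaddress module. The two addresses below are illustrative examples only, not addresses handed out by any real DHCP server.

```python
import ipaddress

# An address a home router's DHCP service might hand out: private,
# so the host must reach the Internet through NAT.
private = ipaddress.ip_address("192.168.1.100")

# An illustrative public address of the kind a large ISP allocates.
public = ipaddress.ip_address("137.44.0.1")

print(private.is_private)  # True
print(public.is_private)   # False
```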
DocumentRoot
A term, originally named after the configuration parameter DocumentRoot defined for the Apache web server, which has come to mean the top-level directory of a web server. It is usually a real physical directory (or folder) on the hosting server's file system, and any documents or directories placed in DocumentRoot will themselves become resources under the web site's URI. The URI “/” is mapped to the physical directory defined by the DocumentRoot parameter, and any other static documents, resources and directories contained within DocumentRoot will be mapped to the file-path part of the URI relative to DocumentRoot. So, for example, if you install the Drupal CMS in a web server, DocumentRoot will contain a folder drupal which contains index.php: the URI will be http://host.domain/drupal/index.php or, more simply, http://host.domain/drupal/. Although the file-path part of the URI hints at a traditional (Unix-like) file structure, the modern web server's support for Virtual Directories, Virtual Hosts, CGI scripts, web applications and URL rewriting means that this will not necessarily be the case. In a sense, web server URIs are similar to Unix file systems in that they appear to form a single rooted file hierarchy which in fact hides the actual location of the resources that a web server provides.
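The mapping from the file-path part of a URI to a file under DocumentRoot can be sketched in a few lines of Python. The root /var/www/html is an assumed example, not a real configuration, and the check for paths that climb out of the root is the kind of safeguard a real server performs.

```python
import posixpath

DOCUMENT_ROOT = "/var/www/html"  # assumed example root, not a real configuration

def resolve(uri_path):
    """Map the file-path part of a URI to a path under DocumentRoot,
    refusing paths (e.g. containing "..") that climb out of the root."""
    candidate = posixpath.normpath(posixpath.join(DOCUMENT_ROOT, uri_path.lstrip("/")))
    if candidate != DOCUMENT_ROOT and not candidate.startswith(DOCUMENT_ROOT + "/"):
        raise ValueError("path escapes DocumentRoot")
    return candidate

print(resolve("/drupal/index.php"))  # /var/www/html/drupal/index.php
```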
HTML
HyperText Markup Language. A text-based markup language originally designed by Tim Berners-Lee to make technical scientific documents easy to write with a text editor and, by means of hyperlinks, easy to link to other documents to produce a web of information. The markup was originally not formally defined, but was based on a document archiving and exchange standard that existed at the time called the Standard Generalized Markup Language (SGML). HTML borrows its tag format from SGML – specially marked-up instructions that define a document's structure (called elements) are easily separated from textual content without the need for any special binary formats. Thus some emphasized text would be marked up like <em>this</em>. HTML extends the idea to structural elements such as <head>, <body> and <title>, paragraphs (<p>), headings (<h1>, <h2>, …), lists etc.
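The way markup separates structure (elements) from textual content can be seen with Python's standard html.parser module, which reports tags and text as distinct events:

```python
from html.parser import HTMLParser

# A tiny parser that records each structural event it sees:
# start tags, end tags, and the plain text between them.
class ShowStructure(HTMLParser):
    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("text", data))

p = ShowStructure()
p.feed("<p>Some <em>emphasized</em> text</p>")
print(p.events)
```

Note how the <em> element arrives as separate start and end events with the content "emphasized" between them: the structure never mixes with the text itself.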
With the invention of graphical browsers, elements for embedding images were added. When Microsoft and Netscape engaged in the so-called browser wars, more elements for defining text effects and embedding multimedia into web pages were added. Eventually, a standards body called the World Wide Web Consortium (W3C) was established and tried to formalise the definition of HTML. Initially it defined HTML as it existed at the time as a formal SGML application (HTML 3.2). Later, with HTML 4, it attempted to separate structure from presentation by introducing a separate style-sheet language called CSS. More recently, with the emergence of XML as an easier-to-use version of SGML, the W3C has tried – with limited success – to promote XHTML, which is essentially HTML 4 defined as an XML document type.
In something of an industry backlash against the W3C's over-formalisation of web technologies as XML applications, a new body called WHATWG has proposed HTML 5 as an evolutionary follow-up to HTML 4, the language that is used (often badly) to code most of the web pages that exist in the world today.
HTTP
HyperText Transfer Protocol: the application protocol that defines the World-Wide Web. It is a very simple protocol that allows a user agent (or client) – usually, but not always, a web browser – to request a resource from a web server.
Two messages are defined: the HTTP request, which consists essentially of a verb (GET, POST, HEAD, PUT, DELETE) and some additional fields and data which act as adverbs, and the HTTP response, which provides a status message (200 OK, 404 Not Found, 304 Not Modified), a content-type declaration (text/html, text/css, image/jpeg, etc.) that allows the client to correctly identify and render the resource, and the resource itself. The resource is simply carried as a binary data payload whose size is specified in bytes.
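The request/response pair can be seen end to end with Python's standard http.client and http.server modules. The one-page server below is purely a stand-in so the exchange works without touching the real Internet; its body and headers are illustrative only.

```python
import http.client
import http.server
import threading

# A stand-in server that answers any GET with a one-line HTML body.
class Handler(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"<em>hello</em>"
        self.send_response(200)                             # status line: 200 OK
        self.send_header("Content-Type", "text/html")       # how to render the resource
        self.send_header("Content-Length", str(len(body)))  # payload size in bytes
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side: a GET request names a verb and a resource.
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
conn.request("GET", "/")
response = conn.getresponse()
payload = response.read()

print(response.status, response.reason)    # 200 OK
print(response.getheader("Content-Type"))  # text/html
print(payload)                             # the resource itself
server.shutdown()
```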
HTTP is a stateless protocol: as far as the web server is concerned, each request is handled separately, as if it had come from a different web client. HTTP version 1.1 allows the client to keep a socket open after the first request so that subsequent requests to the same host (e.g. for additional resources from the same web site) can be made without the overhead of establishing a new TCP connection.
As web applications typically allow a web user to engage in a conversation with the web server, the stateless nature of HTTP makes the handling of state a challenge that has to be met using browser and server tricks such as sessions and session cookies.
httpd
The name of the daemon process that is permanently resident in a web server host's memory, waiting for connection requests to come in on port 80. This program accepts a single HTTP request and responds with a single HTTP response. The name follows the common Unix convention of adding the letter 'd' (for daemon) to a server protocol, as in ftpd, sshd, telnetd, etc. – but not bind! More often than not, httpd is actually Apache's process name.
Internet
Literally an interconnection of networks, the Internet (capital I) is the global network of networks that provides the infrastructure for, among many other things, the World-Wide Web, email, and video-on-demand systems like the BBC's iPlayer. In physical terms, the Internet is the collection of hosts, routers and links that allow heterogeneous networks to connect to each other and participate in the transfer of information from end-system (host) to end-system. In protocol terms, IP is the network protocol that defines the encapsulation, addressing and routing scheme needed to transmit packets of data from host to router and router to router. For many applications, reliable data transfer over IP is provided by TCP, and so the Internet is often said to be a TCP/IP network. However, strictly speaking, IP defines the Internet and is the protocol that all hosts and routers must implement to participate in it.
The Internet is a Packet-Switched Network in which long messages are broken up into small packets of data that are addressed, transmitted and routed independently of each other (rather like letters in envelopes are moved by the Royal Mail) by the packet switches (routers) that act as sorting offices for the network. See also IP and IP address.
ISP
Internet Service Provider. Either an institution or a paid-for service provider that literally gives its clients access to the Internet. An ISP will typically buy a block of IP addresses from a higher-level ISP (e.g. for UK academia that is JANET, which runs the ac.uk domain), which it can allocate to its customers and name as it wishes. ISPs are usually associated with a domain name, e.g. swan.ac.uk, and, to be good citizens on the Internet, must provide domain naming services (DNS) and authoritative naming for the fully qualified domain names (FQDN) of all the hosts (IP addresses) that the ISP has ownership of.
In modern systems, ISPs allocate their supply of IP addresses to hosts on demand using a protocol known as DHCP.
NCSA
The National Center for Supercomputing Applications, located at the University of Illinois at Urbana-Champaign in the USA. It is famous in the development of the World-Wide Web as the place where the first widely distributed web server (which became the basis of Apache) and the first graphical web browser, Mosaic, were developed.
W3C
World Wide Web Consortium – the standards body responsible for defining XML, HTML and CSS, and for advising the web developer community on such issues as accessibility, internationalisation etc. The W3C web site http://www.w3.org should be in every web developer's favourites. The documents there are rather dry, but they are certainly definitive. Unfortunately the W3C has no teeth, so the standards it proposes are often ignored, or at best poorly applied, by web developers, browser developers and tool developers. This unfortunate situation accounts for the frustration that any conscientious web developer feels when he or she tries to produce valid, standards-conforming documents that will actually work on all browsers across all platforms. See also WHATWG.
WHATWG
Web Hypertext Application Technology Working Group. An open consortium formed by industry experts, browser and tool developers and others in an attempt to create web standards that would actually be used and would provide features, needed by the modern breed of web applications, that could actually be implemented by browser developers. Formed in response to the closed, slow-moving and commercially lobbied process that is the common perception of the W3C's official standards work, WHATWG is seeking to define a new version of HTML (HTML 5) that will be an evolution of HTML 4, which is still the lingua franca of most of the WWW despite the existence of XHTML. Some progress has been made, with modern browsers starting to implement features of HTML 5.
HTML 5 is now being jointly developed by WHATWG and the W3C.
WWW
World-Wide Web: the term coined by Tim Berners-Lee for the global web of interconnected [scientific] information that he envisaged when he invented the web at the European Nuclear Research Centre (CERN) in the late 1980s. Often simply called “the web”, WWW (sometimes written W3) is often incorrectly thought to be a synonym for the Internet.