~~SLIDESHOW~~ ====== Fundamentals of Web Applications Technology ====== **Background for Contact Hour 2**: To be discussed on Tuesday 31st January, 2012. **Lecturer**: [[C.P.Jobling@Swansea.ac.uk|Dr Chris P. Jobling]]. Setting the scene for the EG-259 Web Applications Technology module. ===== Fundamentals of Web Applications Technology ===== Setting the scene for the EG-259 Web Applications Technology module. //The slides used in this lecture are based on//: * Chapter 1 of Robert W. Sebesta, //Programming the World-Wide Web//, 3rd Edition, Addison Wesley, 2006. * Chapter 2 of James F. Kurose and Keith W. Ross, //Computer Networking: A Top-down Approach Featuring the Internet//, Addison-Wesley, 2005. * Additional material from Jennifer Niederst Robbins, //Web Design in a Nutshell//, 3rd Ed. O'Reilly, 2006. ===== Contents of this Lecture ===== * [[eg-259:lecture1#origins_of_the_internet|Revision of the History of the Internet, Application Protocols and the World Wide Web]] * [[eg-259:lecture1#applications_and_application-layer_protocols|Applications and Application-Layer Protocols]] * [[eg-259:lecture1#the_world_wide_web|The World-Wide Web]] * [[eg-259:lecture1#web_browsers|Web Browsers]] and [[eg-259:lecture1#web_servers|Web Servers]] * [[eg-259:lecture1#uniform_resource_identifier_uri|Uniform Resource Identifiers (URIs)]] * [[eg-259:lecture1#multipurpose_internet_mail_extensions_mime|Multipurpose Internet Mail Extensions (MIME)]] * [[eg-259:lecture1#the_hyper_text_transfer_protocol|The Hyper Text Transfer Protocol]] * [[eg-259:lecture1#web_programming|Web Programming]] * [[The Mobile Web]] ===== Learning Outcomes for this Lecture (1) ===== //At the end of this lecture you should be able to answer this selection of lecture review questions//: * What protocol is used by all computer connections to the Internet? * What is the task of a domain name server? * In what common situation is the document returned by a Web server created **after** the request is received? 
* What is meant by the terms //document root//, //server root//, //virtual host// when applied to a web server? ===== Learning Outcomes for this Lecture (2) ===== //At the end of this lecture you should be able to answer this selection of lecture review questions//: * What is the purpose of a MIME type specification in a request/response transaction between a browser and a server? * Prior to HTTP 1.1, how long were connections between browsers and servers normally maintained? * What is the purpose of the Common Gateway Interface? * Where is the code for JavaScript, Java Applet, Java Servlet, Perl CGI Script, and PHP script interpreted? ===== History of the Internet (Video) =====

History of the Internet from PICOL on Vimeo.

---- "History of the Internet" is an animated documentary explaining the inventions from time-sharing to filesharing, from Arpanet to Internet. The history is told using the PICOL icons on [[http://www.picol.org/picol.org|picol.org]]. You can already download a pre-release of all picol icons on [[http://blog.picol.org/downloads/icons|blog.picol.org/downloads/icons/]]. You can see the credits and more information on this movie on [[http://lonja.de/motion/mo_history_internet.html|lonja.de/motion/mo_history_internet.html]]. ===== Origins of the Internet ===== * ARPAnet -- late 1960s and early 1970s * BITNET, CS net -- late 1970s & early 1980s * NSF net -- 1986 * NSF net eventually became known as the Internet ---- **//Notes//** * ARPAnet: * Network reliability * For ARPA-funded research organizations * BITNET, CS net: * email and file transfer for other institutions * NSF net: * Originally for non-DOD funded places * Initially connected five supercomputer centers * By 1990, it had replaced ARPAnet for non-military uses * Soon became the network for all (by the early 1990s) ===== What is the Internet? ===== * A world-wide network of computer networks * At the lowest level, since 1982, all connections use the Internet Protocol (IP) * IP hides the differences among devices connected to the Internet ===== Applications and Application-Layer Protocols ===== {{ eg-259:l1-fig1.png?500 |Application protocols act peer to peer! Figure (c) Kurose and Ross.}} ---- * //Application//: communicating, distributed processes * e.g., e-mail, Web, P2P file sharing, instant messaging * running in end systems (hosts) * exchange messages to implement application * //Application-layer protocols// * one //piece// of an application * define messages exchanged by applications and actions taken * use communication services provided by transport layer protocols (TCP, UDP) **//Network applications: some jargon//** * //Process//: program running within a host. 
* within same host, two processes communicate using //interprocess communication// (defined by operating system). * processes running in different hosts communicate with an application-layer protocol * //User agent//: interfaces with user "above" and network "below". * implements user interface & application-level protocol * //Web//: browser * //E-mail//: mail reader * //streaming audio/video//: media player **//Application-layer protocol defines//** * Types of messages exchanged, e.g., request and response messages * Syntax of message types: what fields in messages and how fields are delineated * Semantics of the fields, ie, meaning of information in fields * Rules for when and how processes send & respond to messages * //Public-domain protocols//: * defined in RFCs * allows for interoperability * e.g., HTTP, SMTP * //Proprietary protocols//: * e.g., KaZaA ===== Client-Server Paradigm ===== * Typical network application has two pieces //client// and //server// {{ eg-259:l1-fig2.png?500 |Client-server architecture. Figure (c) Kurose and Ross}} ---- * //Client//: * initiates contact with server ("speaks first") * typically requests service from server, * Web: client implemented in browser * //Server//: * provides requested service to client * e.g., Web server sends requested Web page ===== Processes communicating across network ===== * Process sends/receives messages to/from its socket {{ eg-259:l1-fig3.png?500 |Processes communicating through sockets. 
Figure (c) Kurose and Ross}} ---- * Socket analogous to door * sending process pushes message out of the door * sending process assumes transport infrastructure on other side of door which brings message to socket at receiving process ===== Addressing Processes ===== * For a process to receive messages, it must have a //globally unique// identifier * Every node has a unique **32-bit IP address** * Every process running on the host has a unique **port number** * Process identifier includes both the **IP address** and **port number** associated with the process on the host ---- **Notes** * Organizations are assigned groups of IP addresses for their computers * The new standard, IPv6, uses 128 bits (1998) for host addresses. ===== Domain names ===== **//Form//**: **''host-name.domain-names''** * First domain is the smallest; last is the largest * Last domain specifies the type of organization (in USA) or geographical region (rest of the world) **//Fully qualified domain name//** * The host name and all of the domain names **//DNS servers//** * Convert fully qualified domain names to IP addresses ---- * The //last domain// is controlled by naming authorities associated with the top-level ISPs. * Examples are ''com'', ''org'', ''edu'', ''gov'' -- assigned (mostly to) US commercial institutions, not-for-profit organizations, educational institutions, and government institutions respectively; * Examples of Geographical domains are ''uk'', ''fr'', ''ie'' for countries in Europe, and ''us'', ''za'', ''ca'' for other locations. * The naming authorities assign lower level domain names to ISPs or to large institutions. * Examples of ISP domains are ''ac.uk'' (the UK Joint Academic Network), ''ntlworld.com'' (a UK cable broadband supplier), ''blogspot.com'' (Google's blog hoster). * Examples of institutional domains are ''swan.ac.uk'' (Swansea University), ''microsoft.com'', ''swansea.gov.uk'' (City and County of Swansea). 
* Within an institutional domain, the ISP or naming authority will usually assign a group of IP addresses that can be freely used. The institution acts as its own naming authority and is free to assign host names as it wishes (in fact several host names may be assigned to a single IP address). * An institutional Domain Naming Service (DNS) server (called the **Authoritative Name Server**) is used to map local host names to IP addresses. ===== Transport Services ===== What transport service does an application need? * //Data loss// * //Timing// * //Bandwidth// ---- * //Data loss//: * some applications (e.g., audio) can tolerate some loss * other applications (e.g., file transfer, telnet) require 100% reliable data transfer * //Timing//: * some applications (e.g., Internet telephony, interactive games) require low delay to be "effective" * //Bandwidth//: * some apps (e.g., multimedia) require minimum amount of bandwidth to be "effective" * other apps ("elastic apps") make use of whatever bandwidth they get ===== Internet transport protocols services: TCP ===== * //Connection-oriented//: setup required between client and server processes * //Reliable transport// between sending and receiving process * //Flow control//: sender won't overwhelm receiver * //Congestion control//: throttle sender when network overloaded * //Does not provide//: timing, minimum bandwidth guarantees * **Web Uses TCP** ===== Internet transport protocols services: UDP ===== * Unreliable data transfer between sending and receiving process * //Does not provide//: connection setup, reliability, flow control, congestion control, timing, or bandwidth guarantee * **DNS uses UDP** ===== Origins of the Web ===== * **Problem** * By the mid-1980s, several different protocols had been invented and were being used on the Internet, all with different user interfaces (Telnet, FTP, Usenet, email, Gopher) * **Possible Solution** * The World-Wide Web is a possible solution to the proliferation of different 
protocols being used on the Internet. ===== The World Wide Web ===== * Tim Berners-Lee at CERN proposed the Web in 1989 * Document form: hypertext * Objects? Pages? Documents? Resources? * Hypermedia---more than just text---images, sound, etc. * Web or Internet? ---- //**Notes**// The original purpose of the world-wide web was to allow scientists to have access to many databases of scientific work through their own computers. Objects? Pages? Documents? Resources? We'll call them //documents//. Web or Internet? The Web uses the application protocol, **HTTP**, that runs on the Internet -- there are many others (telnet, ftp, email, etc.) ===== Web Browsers ===== * //Mosaic// -- NCSA (Univ. of Illinois), in early 1993 * Browsers are clients -- always initiate, servers react (although sometimes servers require responses) * Most requests are for existing documents, using HyperText Transfer Protocol (HTTP) * But some requests are for program execution, with the output being returned as a document ---- //**Notes**// //Mosaic// was the second browser to use a GUI (the first, a graphical browser that Tim Berners-Lee developed at CERN, did not have a wide distribution) and led to an explosion of Web use. 
Initially for X-Windows, under UNIX, but was ported to other platforms by late 1993 ===== Web Servers ===== * Provide responses to browser requests, either existing documents or dynamically built documents * Browser-server connection is now maintained through more than one request-response cycle * All communications between browsers and servers use Hypertext Transfer Protocol (HTTP) * Web servers run as background processes in the operating system ---- //**Notes**// * Web servers monitor a communications port on the host, accepting HTTP messages when they appear * All current Web servers came from either - The original from CERN - The second one, from NCSA ===== Web Server Configuration ===== * Web servers have two main directories conventionally called the //Document Root// and the //Server Root// * //Document root// is accessed indirectly by clients * Virtual document trees * Virtual hosts * Proxy servers * Web servers now support other Internet protocols ---- //**Notes**// * //Server root// contains the server system software * //Document root// contains the servable documents * Its actual location is set by the server configuration file * Requests are mapped to the actual location * Virtual documents trees are "aliases" for part of the document tree that is not actually located in the document root. * A //virtual host// is a web server which appears to have a different host name from the primary host. Such a host will have its own document root. DNS is used to map the virtual host name to the IP address of the actual web server. The web server will then recognize the virtual host name from the HTTP request and serve documents from the right place. * Such a facility is very useful if you are hosting many web sites as each virtual host can be administered separately from the main web server's configuration. * Proxy web servers are web servers which can serve documents which are in the document root of another web server. 
They are often used to reduce the traffic from an institution to and from the Internet. For example, see [[reference:network#WWW Caching|UWS Proxy Server]]. * Web servers now support other Internet protocols: * For example ftp, Gopher, News and mail. Nearly all web servers can interact with database systems through the Common Gateway Interface (CGI) programs and server-side scripts. ===== Modern Web Servers ===== **You are Invited to Update this Page by providing a link to the January 2012 results and updating the tables** Market Share for Top Servers Across All Domains November 1995 - January 2011 [[http://news.netcraft.com/archives/2011/01/31/january_2011_web_server_survey.html|{{:eg-259:wpid-overallc2-jan2011.png|Netcraft Server Share Statistics}}]] ---- //**January 2011 Data**// Data from Netcraft Server Share Statistics ([[http://news.netcraft.com/archives/web_server_survey.html|Netcraft Survey]] [[http://news.netcraft.com/archives/2011/01/31/january_2011_web_server_survey.html|January 2011]]) ^ Developer ^ December 2010 ^ Percent ^ January 2011 ^ Percent ^ Change ^ Last Year (August 2009) ^ | Apache | 151,516,152 | 59.35% | 161,591,445 | 59.13% | -0.23 | 46.30% | | Microsoft | 56,723,544 | 22.22% | 57,392,351 | 21.00% | -1.22 | 21.94% | | nginx | 16,910,205 | 6.62% | 20,504,634 | 7.50% | 0.88 | 5.09% | | Google | 14,933,865 | 5.85% | 15,112,532 | 5.53% | -0.32 | 6.29% | | lighttpd | 1,308,935 | 0.51% | 1,866,872 | 0.68% | 0.17 | 0.90% | //**September 2009 Data**// Data from Netcraft Server Share Statistics ([[http://news.netcraft.com/archives/web_server_survey.html|Netcraft Survey]] [[http://news.netcraft.com/archives/2009/08/31/august_2009_web_server_survey.html|August 2009]]) ^ Developer ^ July 2009 ^ Percent ^ August 2009 ^ Percent ^ Change ^ Last Year ^ | Apache | 113,019,868 | 47.17% | 104,611,555 | 46.30% | -0.87 | 49.82% | | Microsoft | 55,918,254 | 23.34% | 49,579,507 | 21.94% | -1.39 | 34.88% | | qq.com | 30,447,369 | 12.71% | 30,278,988 | 13.40% | 
0.69 | -- | | Google | 14,226,904 | 5.94% | 14,213,976 | 6.29% | 0.35 | 5.94% | | nginx | 10,174,573 | 4.25% | 11,502,109 | 5.09% | 0.84 | -- | | lighttpd | 2,942,469 | 0.55% | 2,025,521 | 0.90% | 0.34 | 1.65% | //**September 2008 Data**// Data from Netcraft Server Share Statistics ([[http://news.netcraft.com/archives/web_server_survey.html|Netcraft Survey]] [[http://news.netcraft.com/archives/2008/08/31/august_2008_web_server_survey.html|August 2008]]) ^ Developer ^ July 2008 ^ Percent ^ August 2008 ^ Percent ^ Change ^ Last Year ^ | Apache | 86,845,154 | 49.49% | 88,047,801 | 49.82% | 0.33 | 50.48% | | Microsoft | 62,411,537 | 35.57% | 61,646,837 | 34.88% | -0.69 | 34.94% | | Google | 10,001,763 | 5.70% | 10,502,299 | 5.94% | 0.24 | 4.90% | | lighttpd | 2,942,469 | 1.68% | 2,914,867 | 1.65% | -0.03 | 1.12% | //**September 2007 Data**// Data from Netcraft Server Share Statistics ([[http://news.netcraft.com/archives/2007/09/03/september_2007_web_server_survey.html|September 2007]]) ^ Developer ^ August 2007 ^ Percent ^ September 2007 ^ Percent ^ Change ^ Last Year ^ | Apache | 65,153,417 | 50.96% | 68,228,561 | 50.48% | -0.49 | 65.52% | | Microsoft | 43,861,854 | 34.31% | 47,232,300 | 34.94% | 0.63 | 30.13% | | Google | 5,702,456 | 4.46% | 6,616,713 | 4.90% | 0.43 | - | | Sun | 2,195,495 | 1.72% | 2,212,821 | 1.64% | -0.08 | 0.37% | | lighttpd | 1,500,126 | 1.17% | 1,515,963 | 1.12% | -0.05 | - | A lot of the change would appear to be due to the growth of blogging and community web sites such as Microsoft Live Spaces. Google used to host the Blogger service on Apache web servers, it now uses its own server. Live Spaces is hosted on the Microsoft's IIS. Google and Microsoft are also busy competing in the ISP space which is fueling a general growth in the number of web sites. 
//**September 2006 Data**// Data from Server Share Statistics ([[http://news.netcraft.com/archives/2006/09/05/september_2006_web_server_survey.html|September 2006]]) ^ Server ^ August 2006 ^ Percent ^ September 2006 ^ Percent ^ Change ^ 2005 ^ | Apache | 57,906,817 | 62.52 | 59,699,872 | 61.64 | -0.88 | 69.15% | | Microsoft | 27,905,439 | 30.13 | 30,272,249 | 31.26 | 1.13 | 20.36% | | Zeus | 521,619 | 0.56 | 515,670 | 0.53 | -0.03 | 0.82% | | Sun | 344,862 | 0.37 | 345,834 | 0.36 | -0.01 | 2.61% | ===== Some Important Web Servers ===== * Apache (more later) * Microsoft Internet Information Server (IIS) * Apache Tomcat * lighttpd ---- //**Notes**// * Microsoft Internet Information Server (IIS) * Standard on Windows Professional * Windows only * Standard web services (files and CGI) * ASP technology provides server scripting in multiple languages * FrontPage server extensions * Key component of .NET * Operation is maintained through a program with a graphical user interface (GUI) * Apache Tomcat * A web server written in Java (runs on any platform that supports Java) * Standard web services (files and CGI) * A "servlet container" which uses Java as a web application programming language and Java Server Pages (JSP) for interactivity. * Often runs //behind// (i.e. is fronted by) an Apache or IIS web server, so rarely appears in server statistics. * Key component of Java Enterprise Edition (Java EE). * Lighttpd (pronounced "lighty") * "web server designed to be secure, fast, standards-compliant, and flexible while being optimized for speed-critical environments." [[wp>Lighthttpd|Wikipedia]] * small memory footprint = low server impact * can efficiently handle lots of connections * full featured * used for popular sites like YouTube and Wikipedia * Open source: released under BSD license. 
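The "standard web services (files)" offered by each of these servers come down to mapping the path in a request onto a file under the document root. The following is a minimal Python sketch of that mapping, not the code of any real server. The function name ''resolve_document'', the document root ''/var/www/html'' and the index file ''index.html'' are all assumptions made for illustration.

```python
import posixpath


def resolve_document(document_root, request_path, index="index.html"):
    """Map the path part of a URL onto a file path under document_root.

    Returns None when the request tries to climb out of the document root.
    Illustrative only: real servers also handle aliases, virtual hosts
    and percent-encoded characters.
    """
    # A path ending in "/" names a directory, so serve its index file.
    if request_path.endswith("/"):
        request_path += index

    root = posixpath.normpath(document_root)
    # Join the request path onto the root and collapse "." and ".." segments.
    candidate = posixpath.normpath(
        posixpath.join(root, request_path.lstrip("/"))
    )

    # Refuse anything that no longer lies inside the document root.
    if candidate != root and not candidate.startswith(root + "/"):
        return None
    return candidate
```

For example, under these assumptions a request for ''/ugcourses/'' would be served from ''/var/www/html/ugcourses/index.html'', while a request such as ''/../../etc/passwd'' is rejected.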
===== Introducing the Apache Web Server ===== Still the most popular web server in use today * First web server was built by Tim Berners-Lee at CERN * First really popular web server was developed by NCSA and was available to all. * Apache was originally developed to fix bugs in NCSA Web Server version 1.3 in 1995. * It is open source and is developed and maintained by a group of volunteers. * Runs on most common platforms. ---- //**Notes**// * Apache market share: * September 2006: 62% of the market, 59.7 Million hosts. * September 2007: 50% of the market, 68.2 Million hosts. * August 2008: just less than 50% of the market, 88.0 Million hosts. * August 2009: around 46% of the market, 104.6 Million hosts. ===== The Apache Web Server ===== * Official name //httpd// (HTTP //daemon//) * Open source, fast, reliable. Latest version 2.2. * Directives (operation control): ''ServerName'', ''ServerRoot'', ''ServerAdmin'', ''DocumentRoot'', ''Alias'', ''Redirect'', ''DirectoryIndex'', ''UserDir'' * Available from http://httpd.apache.org. ---- Apache configuration is usually done by editing configuration files with a text editor. ===== Uniform Resource Identifier (URI) ===== * A [[wp>Uniform_Resource_Identifier|URI]] is used to uniquely identify a //resource// on the global Internet. * General form: ''scheme:object-address'' **Scheme** * The scheme is often a communications protocol, such as ''telnet'', ''ftp'', ''http'' and ''https'' (secure HTTP) **Object-address** * For the ''http'' protocol, the object-address is: ''fully qualified domain name/document path'' * For the ''file'' protocol, only the ''document path'' is needed ===== URI Object Address ===== * Qualified domain name may include a port number, as in ''www.swan.ac.uk:80'' * URIs cannot include spaces or any of a collection of other special characters (semicolons, colons, ...) 
* The ''document path'' may be abbreviated as a //partial path// * If the ''document path'' ends with ''/'', it means it is a directory ---- **//Notes//** * Port 80 is the default port for a web server, so in ''www.swan.ac.uk:80'' this is redundant and is usually abbreviated to ''www.swan.ac.uk''. * //Partial paths// -- the rest of the path is furnished by the server configuration. Partial paths are also known as //relative paths//. * Directory "object": often the ''/'' character will be mapped by the web server to a file such as ''index.html'', ''index.php'' or ''Default.asp''. ===== Multipurpose Internet Mail Extensions (MIME) ===== * Originally developed for email * Used to specify to the browser the form of a file returned by the server (attached by the server to the beginning of the document) * Type specifications * Form: ''type/subtype'' * Server usually gets type from the requested file name’s suffix (''.html'' implies ''text/html'') * Browser gets the type explicitly from the server * Experimental types: subtype begins with ''x-'': e.g., ''video/x-msvideo'' ---- //**Notes**// * Examples of type specifications: ''text/plain'', ''text/html'', ''image/gif'', ''image/jpeg''. * Experimental types require the server to send a helper application or plug-in so the browser can deal with the file. ===== The Hyper Text Transfer Protocol ===== The protocol used by **ALL** web communications {{ eg-259:l1-fig4.png?500 |HTTP Request and Response. 
Figure (c) Kurose and Ross.}} ---- * HTTP is web's application layer protocol * client/server model * //client//: browser that requests, receives, "displays" documents * //server//: Web server sends objects in response to requests * HTTP 1.0: [[http://www.faqs.org/rfcs/rfc1945.html|RFC 1945]] * HTTP 1.1: [[http://www.faqs.org/rfcs/rfc2068.html|RFC 2068]] ===== Key Facts about HTTP ===== * //HTTP Uses TCP// * //HTTP is "stateless"// * //HTTP Supports two modes of connection// * //Non-persistent HTTP// * //Persistent HTTP// ---- //**Details**// * //HTTP Uses TCP//: * client initiates TCP connection (creates socket) to server, port 80; * server accepts TCP connection from client; * HTTP messages (application-layer protocol messages) exchanged between browser (HTTP client) and Web server (HTTP server); * TCP connection closed. * //HTTP is "stateless"// * server maintains no information about past client requests. * //HTTP Supports two modes of connection// * //Non-persistent HTTP// * At most one object is sent over a TCP connection. * HTTP/1.0 uses non-persistent HTTP. * //Persistent HTTP// * Multiple objects can be sent over single TCP connection between client and server. * HTTP/1.1 uses persistent connections in default mode. ===== Response Time Modeling ===== {{ eg-259:l1-fig5.png?500 |Response Time Modelling. Figure (c) Kurose and Ross}} ---- **Calculation of Response Time** * Definition of Round-Trip Time (RTT): time for a small data packet to travel from client to server and back. 
* //Response time//: * one RTT to initiate TCP connection * one RTT for HTTP request and first few bytes of HTTP response to return * file transmission time: total = 2 RTT + transmit time * //Non-persistent HTTP issues//: * requires 2 RTTs per object * operating system must work and allocate host resources for each TCP connection * but browsers often open parallel TCP connections to fetch referenced objects * //Persistent HTTP// * server leaves connection open after sending response * subsequent HTTP messages between same client/server are sent over connection * //Persistent without pipelining//: * client issues new request only when previous response has been received * one RTT for each referenced object * //Persistent with pipelining//: * default in HTTP/1.1 * client sends requests as soon as it encounters a referenced object * as little as one RTT for all the referenced objects ===== HTTP Request Phase ===== * Form: HTTP method domain part of URL HTTP version Header fields blank line Message body * An example of the first line of a request: GET /ugcourses/ HTTP/1.1 ---- **//Notes//** Most commonly used request methods: * ''GET'' -- Fetch a document * ''POST'' -- Execute the document, using the data in body * ''HEAD'' -- Fetch just the header of the document * ''PUT'' -- Store a new document on the server * ''DELETE'' -- Remove a document from the server Note that although servers support all these requests, browsers only issue ''GET'' and ''POST''. This has implications for so-called //RESTful web applications// which we will explore in a later lecture. Four categories of header fields: //General//, //request//, //response// and //entity//. 
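The request form above (request line, header fields, blank line, message body) can be assembled by hand. The following is a minimal Python sketch, assuming HTTP/1.1, in which the ''Host'' header field is mandatory. The helper name ''build_request'' is hypothetical and is not part of any library.

```python
def build_request(method, path, host, headers=None, body=""):
    """Assemble an HTTP/1.1 request message as a text string."""
    # Request line: HTTP method, path part of the URL, HTTP version.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]

    # Any further header fields, one "Name: value" pair per line.
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")

    # Lines are separated by CRLF; a blank line ends the header section,
    # and the (possibly empty) message body follows it.
    return "\r\n".join(lines) + "\r\n\r\n" + body


# The example request line from this lecture, with one request field.
message = build_request("GET", "/ugcourses/", "www.swan.ac.uk",
                        {"Accept": "text/*"})
```

Printing ''message'' shows the text a browser would send over its TCP connection to the server for this request.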
* Common request fields: Accept: text/plain Accept: text/* If-Modified-Since: date * You can communicate using HTTP without a browser: telnet www.swan.ac.uk http GET /ugcourses/ HTTP/1.1 Host: www.swan.ac.uk * Linux users have access to a useful command line tool [[wp>CURL|cURL]] that can issue any web server request from the command line and gives full //programmatic// access to the header fields. cURL is a very useful addition to the web developer's toolbox and is available for Windows via //Cygwin// (see [[eg-259:practicals:0|Practical 0]]). cURL supports other protocols besides HTTP. ===== HTTP Response Phase ===== * Form: Status line Response header fields blank line Response body * Status line format: HTTP version status code explanation * Example: HTTP/1.1 200 OK ---- //**Notes**// * Status code is a three-digit number; first digit specifies the general status: 1 => Informational 2 => Success 3 => Redirection 4 => Client error 5 => Server error * The header fields ''Content-type'' and ''Content-length'' are required: * The ''Content-type'' field's value is a content-type specification in MIME format * The ''Content-length'' field's value is the size of the returned document in bytes. 
* Common header response fields: Content-length: 488 Content-type: text/html * An example of a complete response header: HTTP/1.1 200 OK Date: Tue, 18 May 2004 16:45:13 GMT Server: Apache (Red-Hat/Linux) Last-modified: Tue, 18 May 2004 16:38:38 GMT Etag: "841fb-4b-3d1a0179" Accept-ranges: bytes Content-length: 364 Connection: close Content-type: text/html; charset=ISO-8859-1 * Both request headers and response headers must be followed by a blank line ===== Web Programming ===== Concerned with three "layers" of the current web standards //stack// * [[eg-259:lecture1#the_structural_layer|Structural Layer]] * [[eg-259:lecture1#the_presentation_layer|Presentation Layer]] * [[eg-259:lecture1#the_behavioural_layer|Behavioural Layer]] ===== The Structural Layer ===== * Marked up document which is the foundation for the other layers. * Current standards are: * //HTML 4.01// * //XML 1.0// (//Extensible Markup Language//). See [[http://www.w3.org/XML/|XML]]. * //XHTML 1.0// (//Extensible Hypertext Markup Language//) and //XHTML 1.1//. See [[http://www.w3.org/MarkUp/|Mark Up]]. 
* //HTML5// See [[http://dev.w3.org/html5/spec/Overview.html|HTML5: A vocabulary and associated APIs for HTML and XHTML]] ===== HTML ===== * HTML describes the general form and layout of documents * Tools for creating HTML documents * HTML editors -- make document creation easier * WYSIWYG HTML editors * Plug-ins * Filters ---- //**Notes**// * An HTML document is a mix of //content// and //controls// * Controls are //tags// and their //attributes// * Tags often delimit //content// and specify something about how the content should be arranged in the document * Attributes provide additional information about the content of a tag * //HTML editors// -- make document creation easier by providing shortcuts to typing tag names, spell-checker, * //WYSIWYG HTML editors// are useful in that developers need not know HTML to create HTML documents * //Plug-ins// are often integrated into tools like word processors, effectively converting them to WYSIWYG HTML editors * //Filters// convert documents in other formats to HTML * Advantages of both filters and plug-ins: * Existing documents produced with other tools can be converted to XHTML documents * Use a tool you already know to produce HTML * Disadvantages of both filters and plug-ins: * HTML output of both is not perfect -- must be fine tuned * HTML may be non-standard * You have two versions of the document, which are difficult to synchronize ===== XML ===== * A //meta-markup language// * Used to create a new markup language for a particular purpose or area * Because the tags are designed for a specific area, they can be meaningful * No presentation details * A simple and universal way of representing data of any textual kind ===== The Presentation Layer ===== * Provides instructions on how a document should look on the screen, sound when it is read aloud and look when it is printed * Current [[http://www.w3.org/Style/|CSS]] standards are: * //Cascading Style Sheets (CSS) Level 1// * //CSS Level 2.1// * //CSS Level 3// ---- 
**//Notes//** CSS Level 1 has been a Recommendation since 1996 and is now fully supported by current browsers. Level 1 contains rules that control the display of text, margins and borders. CSS Level 2 is best known for the addition of absolute positioning of web page elements. Level 2 reached Recommendation status in 1998, and the 2.1 revision is currently a Candidate Recommendation. Support for CSS 2.1 is inconsistent in current browsers. CSS Level 3 builds on level 2 but is modularized to make future expansion simpler and to allow different devices to support logical subsets. This version is still in development but browsers are gradually supporting more and more of the standard, often with the use of browser-specific attributes. ===== The Behavioural Layer ===== * The scripting and programming of the behavioural layer adds interactivity and dynamic effects to a site. * Client-Side * [[eg-259:lecture1#document_object_models|Document Object Models]] * [[eg-259:lecture1#scripting_in_javascript|JavaScript]] * Server Side * [[eg-259:lecture1#java|Java]] * [[eg-259:lecture1#perl|Perl]] * [[eg-259:lecture1#php|PHP]] * [[eg-259:lecture1#ruby on rails|Ruby on Rails]] ===== Document Object Models ===== * Document Object Model (DOM) allows scripts and applications to access and update the content, structure and style of a document. * Achieved by formally naming each part of the document, its attributes, and how the document may be manipulated. * Originally specified incompatibly by each browser, now standardized by the W3C. * //Document Object Model (DOM) Level 1 (Core)// * //DOM Level 2// ---- //**Notes**// * //Document Object Model (DOM) Level 1 (Core)// covers core HTML and XML documents as well as document management and manipulation. See [[http://www.w3.org/TR/REC-DOM-Level-1/|DOM 1]]. * //DOM Level 2// includes a style sheet object model, making it possible to manipulate style information. See [[http://www.w3.org/DOM/DOMTR|DOM 2]]. 
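The idea of accessing and updating a document through named parts can be tried outside a browser. The sketch below uses Python's standard ''xml.dom.minidom'' module, which implements a small subset of the W3C DOM interface. The sample markup and the ''highlight'' class name are invented for illustration, and in a browser the same operations would normally be written in JavaScript.

```python
from xml.dom.minidom import parseString

# A tiny, well-formed sample document (invented for this example).
SOURCE = (
    '<html><body>'
    '<h1 id="title">Old heading</h1>'
    '<p>Some text.</p>'
    '</body></html>'
)

document = parseString(SOURCE)

# Access: locate an element by its tag name.
heading = document.getElementsByTagName("h1")[0]

# Update content: the heading's first child is a text node.
heading.firstChild.data = "New heading"

# Update an attribute on the element.
heading.setAttribute("class", "highlight")

# Update structure: create a new paragraph and append it to the body.
body = document.getElementsByTagName("body")[0]
paragraph = document.createElement("p")
paragraph.appendChild(document.createTextNode("Added by a script."))
body.appendChild(paragraph)

# Serialise the modified tree back to markup.
result = document.documentElement.toxml()
```

After this runs, ''result'' holds the modified markup: the heading text has changed, the heading carries a new attribute, and the body contains a second paragraph.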
===== Scripting in JavaScript ===== * A client-side HTML-embedded scripting language * Only related to Java through syntax * Dynamically typed and not object-oriented * Provides a way to access elements of HTML documents and dynamically change them * Current Standard: //JavaScript 1.5/[[http://www.ecma-international.org/publications/standards/ECMA-262.htm|ECMAScript 262]]// ---- **//Notes//** Netscape introduced its web scripting language, JavaScript, with its Navigator 2.0 browser. It was originally called "LiveScript" but was later co-branded by Sun, and "Java" was added to the name. Microsoft countered with its own JScript while supporting some level of JavaScript in its Version 3.0 browser. The need for a cross-browser standard was clear! The W3C is developing a standardized version of JavaScript in coordination with ECMA International, an international industry association dedicated to the standardization of information and communication systems. According to the Mozilla site, Netscape's JavaScript is a superset of the ECMAScript standard scripting language, with only mild differences from the published standard. In general practice, most developers simply refer to "JavaScript" and the standard implementation is implied. ===== Java ===== * General purpose object-oriented programming language * Based on C++, but simpler and safer * Client-side: //Applets// -- compiled Java programs that are downloaded from a web server and execute in the browser * Server-side: //Servlets// -- Java programs that execute in the server. Essentially manipulate the HTTP request and return an HTTP response. JSP template mark up allows Java code to be embedded in HTML pages (similar to Microsoft's ASP) * Sebesta's book covers both applets and servlets, but we will not have time to cover Java in this module. 
===== Perl =====

  * Provides server-side computation for HTML documents, through CGI
  * Perl is good for CGI programming because it offers:
    * Direct access to operating system functions
    * Powerful character string pattern-matching operations
    * Access to database systems
  * Perl is highly platform independent, and has been ported to all common platforms
  * Perl is not just for CGI

----

Perl is a useful general-purpose system administrator's tool. In fact, that was Larry Wall's intention when he originally developed the language. It can be used to manage server configuration and server logs. In many ways it is the ultimate shell programming tool, with the advantage that it works on systems other than Unix!

===== PHP =====

  * A server-side scripting language
  * An alternative to CGI
  * Similar (in programming style) to JavaScript
  * Great for form processing and database access through the Web

===== Ruby on Rails =====

  * //Ruby// is a scripting language
  * //Rails// is a web applications development framework written in Ruby
  * Rails exploits Ruby's features to make web app development as easy as possible
  * Supports a RESTful development style "out of the box"

===== Summary of this Lecture =====

  * [[eg-259:lecture1#origins_of_the_internet|Revision of the History of the Internet, Application Protocols and the World Wide Web]]
  * [[eg-259:lecture1#applications_and_application-layer_protocols|Applications and Application-Layer Protocols]]
  * [[eg-259:lecture1#the_world_wide_web|The World-Wide Web]]
  * [[eg-259:lecture1#web_browsers|Web Browsers]] and [[eg-259:lecture1#web_servers|Web Servers]]
  * [[eg-259:lecture1#uniform_resource_identifier_uri|Uniform Resource Identifiers (URIs)]]
  * [[eg-259:lecture1#multipurpose_internet_mail_extensions_mime|Multipurpose Internet Mail Extensions (MIME)]]
  * [[eg-259:lecture1#the_hyper_text_transfer_protocol|The Hyper Text Transfer Protocol]]
  * [[eg-259:lecture1#web_programming|Web Programming]]

===== Learning Outcomes for this Lecture (1) =====

//At the end of
this session you should be able to answer this selection of review questions//:

  * What protocol is used by all computer connections to the Internet?
  * What is the task of a domain name server?
  * In what common situation is the document returned by a Web server created **after** the request is received?
  * What is meant by the terms //document root//, //server root//, //virtual host// when applied to a web server?

===== Learning Outcomes for this Lecture (2) =====

//At the end of this lecture you should be able to answer this selection of lecture review questions//:

  * What is the purpose of a MIME type specification in a request/response transaction between a browser and a server?
  * Prior to HTTP 1.1, how long were connections between browsers and servers normally maintained?
  * What is the great advantage of XML over XHTML for describing data?
  * What is the purpose of the Common Gateway Interface?
  * Where is the code for JavaScript, Java Applet, Java Servlet, Perl CGI Script, and PHP script interpreted?

After writing up your notes for this lecture, you should also be able to answer all the [[eg-259:review:lecture1|Review Questions]]. You should also try the [[eg-259:homework:1|Homework Exercises]].

===== What's Next? =====

**The Structural and Presentation Layers**

//Revision//

  * [[eg-259:xhtml|The Structural Layer: XHTML]]
  * [[eg-259:css|The Presentation Layer: CSS]]

//New material//

  * [[eg-259:html5|XHTML 2 and HTML 5]]
  * [[eg-259:css3|CSS 3]]
  * Watch the video //before the session//!

[[eg-259:lecture0|Previous session]] | [[eg-259:practicals:0|Set up your web development toolkit]] | [[eg-259:home|Home]] | [[eg-259:lecture2|Next session]]