Contact Hour 10: To be discussed on Tuesday 19th February, 2013.
Lecturer: Dr Chris P. Jobling.
The slides and notes for this lecture are based on Chapter 4 of Robert W. Sebesta, Programming the World-Wide Web, 3rd Edition, Addison Wesley, 2006. There is a good discussion of JavaScript regular expressions in Sections 7.2 and 9.9 of Chris Bates, Web Programming: Building Internet Applications, 3rd Edition, John Wiley, 2006. A good website that introduces this topic is Regular-Expressions.info.
Text processing with regular expressions
At the end of this lecture you should be able to answer these questions:
What does the i pattern modifier do?
What does the String method replace do?
What does the String method match do?
RegExp objects
String objects
A Little History
Regular expression pattern matching is a technique that was first developed for the text editors ed and sed, which were (and still are) part of the Unix system. The ideas were extended in the program awk and eventually reached their full potential in the Perl programming language. Perl regular expressions are the inspiration for JavaScript's, and variations of the Perl form of regular expression are found in many other contexts, such as the text editors vi and emacs, most scripting languages, and even the standard Java library.
If you are interested, the article "Regular expression" has more to say on the subject.
The tools illustrated are both based on the JavaScript regular expression engine, whose syntax follows Perl's, as does the Perl Compatible Regular Expressions (PCRE) library that is used in many modern scripting languages, programmers' editors and even the Apache web server.
There is a version of RegexPal on the Blackboard site that you can download and which gives you access to the global search as a switchable option.
Normal characters (match themselves)
/ee/ matches need, greed, weed, but not wed or dead
Meta-characters have special meanings in patterns – they do not match themselves:
\ | ( ) [ ] { } ^ $ * + ? .
A meta-character can be matched literally by preceding it with a backslash (\).
The period (.) is a special meta-character – it matches any character except newline
/c.t/ matches Ascot, cat, cut and crt but not act or cart.
[abcd] matches any of the letters 'a', 'b', 'c', or 'd'.
[a-z] matches any lower-case letter (in the English alphabet).
[^0-9] matches any character that is not a decimal digit.
Abbr. | Equiv. | Pattern Matches |
---|---|---|
\d | [0-9] | a digit |
\D | [^0-9] | not a digit |
\w | [A-Za-z_0-9] | a word character |
\W | [^A-Za-z_0-9] | not a word character |
\s | [ \r\t\n\f] | a whitespace character |
\S | [^ \r\t\n\f] | not a whitespace character |
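The classes and abbreviations above can be tried out with the test method of a pattern, which returns true or false. This is only a small sketch; the variable names and sample strings are invented for the illustration.

```javascript
// Character classes and their abbreviations in action.
// Variable names and sample strings are invented for this illustration.
var hasDigit = /\d/;               // same as /[0-9]/
var hasNonWord = /\W/;             // same as /[^A-Za-z_0-9]/
var vowelThenSpace = /[aeiou]\s/;  // a vowel followed by whitespace
var cAnyT = /c.t/;                 // c, any one character, t

console.log(hasDigit.test("room 101"));         // true
console.log(hasDigit.test("no numbers"));       // false
console.log(hasNonWord.test("snake_case9"));    // false: only word characters
console.log(hasNonWord.test("two words"));      // true: the space is not a word character
console.log(vowelThenSpace.test("two words"));  // true: "o " in "two words"
console.log(cAnyT.test("Ascot"));               // true: "cot"
console.log(cAnyT.test("cart"));                // false
```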
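(JavaScript) variables are not interpolated in pattern literals such as /abc/; to build a pattern from the value of a variable, pass a string to the RegExp constructor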
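In JavaScript, one way to get the value of a variable into a pattern is to build the pattern from a string with the RegExp constructor. The sketch below assumes invented variable names and sample text; note that backslashes have to be doubled inside a string literal.

```javascript
// Build a pattern from a variable with the RegExp constructor.
// Variable names and sample text are invented for this illustration.
var word = "rab";
var pattern = new RegExp(word, "g");   // behaves like /rab/g

var result = "Some rabbits are rabid".replace(pattern, "tim");
console.log(result);                   // "Some timbits are timid"

// Backslashes must be doubled inside a string literal.
var threeDigits = new RegExp("\\d{3}");   // behaves like /\d{3}/
var hasThreeDigits = threeDigits.test("abc123");
console.log(hasThreeDigits);              // true
```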
Quantifiers in braces
Quantifier | Meaning |
---|---|
{n} | exactly n repetitions |
{m,} | at least m repetitions |
{m,n} | at least m but not more than n repetitions (no space after the comma) |
The following are just abbreviations for the most commonly used quantifiers:
* means zero or more repetitions, e.g., \d* means zero or more digits
+ means one or more repetitions, e.g., \d+ means one or more digits
? means zero or one, e.g., \d? means zero or one digit
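A short sketch of the quantifiers in use. The variable names and sample strings are invented for the illustration, and each pattern is wrapped in ^ and $ so that it has to cover the whole string; otherwise a quantified pattern could match just part of a longer string.

```javascript
// Quantifiers: braces and their abbreviations.
// Variable names and sample strings are invented for this illustration.
var exactlyThree = /^\d{3}$/;    // exactly three digits
var atLeastTwo = /^\d{2,}$/;     // two or more digits
var twoOrThree = /^\d{2,3}$/;    // two or three digits
var zeroOrMoreB = /^ab*c$/;      // a, any number of b's, c
var oneOrMoreB = /^ab+c$/;       // a, at least one b, c
var optionalU = /^colou?r$/;     // the u is optional

console.log(exactlyThree.test("123"));    // true
console.log(exactlyThree.test("1234"));   // false
console.log(atLeastTwo.test("1234"));     // true
console.log(twoOrThree.test("1234"));     // false: more than three digits
console.log(zeroOrMoreB.test("ac"));      // true: * allows zero b's
console.log(oneOrMoreB.test("ac"));       // false: + needs at least one b
console.log(optionalU.test("color"));     // true
console.log(optionalU.test("colour"));    // true
```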
The pattern can be forced to match only at the start with ^ or at the end with $
/^Lee/ matches “Lee Ann” but not “Mary Lee Ann”
/Lee Ann$/ matches “Mary Lee Ann”, but not “Mary Lee Ann is nice”
The anchor operators (^ and $) do not match characters in the string – they match positions, at the beginning or end
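The anchor examples can be checked directly, as in the sketch below (variable names are invented for the illustration). The last lines hint at why anchors are said to match positions: replacing ^ inserts text at the start without consuming any characters.

```javascript
// Anchors: ^ matches at the start of the string, $ at the end.
// Variable names are invented for this illustration.
var startsWithLee = /^Lee/;
var endsWithLeeAnn = /Lee Ann$/;
var onlyDigits = /^\d+$/;   // both anchors: the whole string must be digits

console.log(startsWithLee.test("Lee Ann"));                // true
console.log(startsWithLee.test("Mary Lee Ann"));           // false
console.log(endsWithLeeAnn.test("Mary Lee Ann"));          // true
console.log(endsWithLeeAnn.test("Mary Lee Ann is nice"));  // false
console.log(onlyDigits.test("2013"));                      // true
console.log(onlyDigits.test("19 Feb"));                    // false

// An anchor matches a position, not a character.
var prefixed = "Ann".replace(/^/, "Lee ");
console.log(prefixed);                                     // "Lee Ann"
```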
The i modifier tells the matcher to ignore the case of letters
/oak/i matches “OAK” and “Oak”
The x modifier (in Perl) tells the matcher to ignore whitespace in the pattern (allows comments in patterns); JavaScript's regular expressions do not support it
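A brief sketch of the i modifier, with invented variable names. It also shows that modifiers can be combined (here with g, the "global" modifier used with the String methods) and that they can be given as the second argument of the RegExp constructor.

```javascript
// The i modifier makes matching case-insensitive.
// Variable names are invented for this illustration.
var caseSensitive = /oak/;
var ignoreCase = /oak/i;

console.log(caseSensitive.test("OAK"));     // false
console.log(ignoreCase.test("OAK"));        // true
console.log(ignoreCase.test("Oak tree"));   // true

// Modifiers can be combined: g finds every match, i ignores case.
var allOaks = "Oak, OAK, oak".match(/oak/gi);
console.log(allOaks.join("|"));             // "Oak|OAK|oak"

// The same modifier passed to the RegExp constructor.
var built = new RegExp("oak", "i");
console.log(built.test("OAK"));             // true
```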
In JavaScript we can use regular expressions with the String methods search, replace, split and match.
The most common use of these is in form validation.
search(pattern)
Returns the position in the object string of the first match of the pattern (positions start at zero), or -1 if there is no match.
    var str = "Gluckenheimer";
    var position = str.search(/n/);
    /* position is now 6 */
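A few more calls to search, as a sketch with invented variable names, including the case where nothing matches and the method returns -1.

```javascript
// search returns the index of the first match, or -1 if there is none.
// Variable names are invented for this illustration.
var surname = "Gluckenheimer";

var found = surname.search(/n/);             // first "n"
var firstVowel = surname.search(/[aeiou]/);  // first lower-case vowel
var notFound = surname.search(/z/);          // no "z" in the string

console.log(found);        // 6
console.log(firstVowel);   // 2
console.log(notFound);     // -1

// A typical check: -1 means "no match".
if (surname.search(/\d/) === -1) {
    console.log("no digits in " + surname);
}
```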
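replace(pattern, string)
Finds a substring that matches the pattern and returns a new string in which it is replaced by the given string (the g modifier can be used).
The g modifier means “replace globally”: all matched substrings will be replaced.
Substrings matched by parenthesised parts of the pattern can be used in the replacement string as $1, $2, etc.
    var str = "Some rabbits are rabid";
    str = str.replace(/rab/g, "tim");
    // str is now "Some timbits are timid"
    // (replace returns a new string, so the result must be assigned back to str;
    // this pattern has no parentheses, so there is nothing for $1 and $2 to refer to)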
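The sketch below, with invented variable names and sample strings, shows $1 and $2 referring to parenthesised parts of the pattern, the difference the g modifier makes, and the fact that replace leaves the original string untouched.

```javascript
// replace with captured groups, and with and without the g modifier.
// Variable names and sample strings are invented for this illustration.
var fullName = "Chris Jobling";

// (\w+) (\w+) captures two words; $1 and $2 refer to them in the replacement.
var swapped = fullName.replace(/(\w+) (\w+)/, "$2, $1");
console.log(swapped);    // "Jobling, Chris"
console.log(fullName);   // "Chris Jobling" (the original string is unchanged)

var text = "Some rabbits are rabid";
var firstOnly = text.replace(/rab/, "tim");   // no g: first match only
var everyOne = text.replace(/rab/g, "tim");   // g: all matches
console.log(firstOnly);  // "Some timbits are rabid"
console.log(everyOne);   // "Some timbits are timid"
```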
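split(parameter)
Splits the object string at each match of the parameter, which can be a string or a pattern, and returns the pieces in an array.
    var str = "grapes:apples:oranges";
    var fruit = str.split(/:/);
    // fruit is set to ["grapes", "apples", "oranges"]
Here the parameters ":" and /:/ are equivalent.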
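A pattern becomes more useful than a plain string when the separator varies. The following sketch, with invented variable names and sample strings, splits on runs of whitespace and on commas with optional surrounding spaces.

```javascript
// split with a pattern as the separator.
// Variable names and sample strings are invented for this illustration.
var line = "the quick  brown\tfox";
var words = line.split(/\s+/);         // one or more whitespace characters
console.log(words.join("|"));          // "the|quick|brown|fox"

var list = "grapes, apples ,oranges";
var fruit = list.split(/\s*,\s*/);     // a comma with optional spaces around it
console.log(fruit.join("|"));          // "grapes|apples|oranges"

// The pieces are an ordinary array, so they can be sorted.
console.log(words.sort().join(" "));   // "brown fox quick the"
```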
match(pattern)
With the g modifier, it returns an array of all of the substrings that matched.
Without the g modifier, the first element of the returned array has the matched substring, the other elements have the values of $1, …
    var str = "My 3 kings beat your 2 aces";
    var matches = str.match(/[ab]/g);
    // matches is set to ["b", "a", "a"]
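The two behaviours of match can be compared side by side. In this sketch the variable names are invented; it also shows that match returns null when nothing matches, which is worth checking before indexing the result.

```javascript
// match with and without the g modifier.
// Variable names are invented for this illustration.
var hand = "My 3 kings beat your 2 aces";

// Without g: element 0 is the whole match, then one element per parenthesised group.
var first = hand.match(/(\d) (\w+)/);
console.log(first[0]);      // "3 kings"
console.log(first[1]);      // "3"
console.log(first[2]);      // "kings"
console.log(first.index);   // 3 (position of the match in the string)

// With g: every matched substring, but no group values.
var all = hand.match(/\d \w+/g);
console.log(all.join("|")); // "3 kings|2 aces"

// No match at all gives null, not an empty array.
var none = hand.match(/queen/);
console.log(none);          // null
```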
A common use of JavaScript is to check the validity of user inputs on forms
Markup:
    <!DOCTYPE html>
    <!-- forms_check.html
         A function test_phone_number is defined and tested.
         This function checks the validity of phone number input from a form
    -->
    <html class="no-js" lang="en">
    <head>
      <meta charset="utf-8">
      <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
      <title> Phone number tester </title>
      <meta name="viewport" content="width=device-width">
      <link rel="stylesheet" href="css/bootstrap.min.css">
      <link rel="stylesheet" href="css/bootstrap-responsive.min.css">
      <link rel="stylesheet" href="css/main.css">
      <script src="js/vendor/modernizr-2.6.2-respond-1.1.0.min.js"></script>
    </head>
    <body>
      <div class="container">
        <h1>Phone Number Tester</h1>
        <p>An example of the use of Regular Expressions for form validation. View source to see the HTML code and use your browser's development tools to view the JavaScript.</p>
        <p>Phone numbers should match the pattern 3 digits followed by a dash followed by four digits. The regular expression for this is <code>/\d{3}-\d{4}/</code>.</p>
        <p>The example uses the DOM 0 event model which will be discussed in the next session.</p>
        <!-- "return" passes the result of validate() back to the browser,
             so that returning false cancels the submission -->
        <form id="phone" method="post" action="/cgi-bin/echo_form.cgi" onsubmit="return validate();">
          <label for="phone_number">Phone number: </label>
          <input id="phone_number" type="text" name="phone_number" placeholder="444-4444" title="Please enter phone number using the pattern ddd-dddd."/>
          <input type="submit" name="Submit" value="Submit" />
        </form>
      </div> <!-- /container -->
      <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
      <script>window.jQuery || document.write('<script src="js/vendor/jquery-1.9.0.min.js"><\/script>')</script>
      <script src="js/vendor/bootstrap.min.js"></script>
      <script src="js/plugins.js"></script>
      <!-- The actual forms_check script -->
      <script src="forms_check.js"></script>
    </body>
    </html>
The script (the validation function validate() will be explained later)
    /* Function test_phone_number
       Parameter: A string
       Result: Returns true if the parameter has the form of a legal
               seven-digit phone number (3 digits, a dash, 4 digits)
    */
    function test_phone_number(num) {
        // Anchor the pattern with ^ and $ so that the whole string must be
        // 3 digits, a dash and 4 digits (extra characters are rejected);
        // search then returns 0 on success and -1 on failure
        var ok = num.search(/^\d{3}-\d{4}$/);
        return ok === 0;
    } // end of function test_phone_number

    /* Actual form validation. Called when the form is submitted */
    var validate = function() {
        var phoneNumber = document.getElementById("phone_number");
        if (test_phone_number(phoneNumber.value)) {
            return true;
        } else {
            alert("Phone number is invalid. Please use format ddd-dddd.");
            // prevent submission
            return false;
        }
    };
Test code for test_phone_number
    // Test test_phone_number
    var test_phone_number_test = function() {
        var tests = ["444-5432", "444-r432", "44-1234"];
        // declare i with var so that it does not become a global variable
        for (var i = 0; i < tests.length; i++) {
            var test = test_phone_number(tests[i]);
            if (test) {
                console.log(tests[i] + " is a legal phone number");
            } else {
                console.error("Error in test_phone_number: " + tests[i] + " is not a legal phone number");
            }
        }
    };
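A variation on the test code is to pair each input with the result we expect, so that the test itself can report a pass or a failure. The sketch below is self-contained and uses a hypothetical function isPhoneNumber (not part of forms_check.js) written with the test method and an anchored pattern; the extra case "444-54321" is an invented input that an unanchored pattern would wrongly accept.

```javascript
// Hypothetical variant of the phone-number check, for illustration only.
// ^ and $ force the pattern to cover the whole string.
function isPhoneNumber(num) {
    return /^\d{3}-\d{4}$/.test(num);
}

// Each case pairs an input with the result we expect.
var cases = [
    { input: "444-5432",  expected: true },
    { input: "444-r432",  expected: false },
    { input: "44-1234",   expected: false },
    { input: "444-54321", expected: false }   // too many digits
];

var failures = 0;
for (var i = 0; i < cases.length; i++) {
    var actual = isPhoneNumber(cases[i].input);
    if (actual === cases[i].expected) {
        console.log("ok: " + cases[i].input);
    } else {
        failures++;
        console.error("FAILED: " + cases[i].input + " gave " + actual);
    }
}
console.log(failures + " failure(s)");
```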
A regular expression validator is built in to HTML5
The pattern attribute can be used on some modern browsers
The value of the attribute is treated as /^pattern$/ by the JavaScript engine, so the whole input value must match.
    <input id="phone_number" type="text" name="phone_number" placeholder="444-4444" pattern="\d{3}-\d{4}" />
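For browsers without support for the pattern attribute, a JavaScript fallback could imitate the same whole-value matching. The following is a minimal sketch, not a complete fallback: matchesPattern is a hypothetical helper name, and it simply wraps the attribute's value between ^ and $ before testing.

```javascript
// Hypothetical helper, for illustration only: checks a value the way the
// pattern attribute does, by anchoring the pattern at both ends.
function matchesPattern(pattern, value) {
    // (?: ... ) groups the pattern without capturing, so that alternatives
    // such as a|b are anchored as a whole
    var anchored = new RegExp("^(?:" + pattern + ")$");
    return anchored.test(value);
}

// In a string literal each backslash is doubled; the pattern is \d{3}-\d{4}
var phonePattern = "\\d{3}-\\d{4}";

console.log(matchesPattern(phonePattern, "444-5432"));    // true
console.log(matchesPattern(phonePattern, "x444-5432"));   // false
console.log(matchesPattern(phonePattern, "444-54321"));   // false
```

In a page, the two arguments would typically come from the input element itself, for example from its pattern attribute (read with getAttribute) and its value property.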
    <!DOCTYPE html>
    <!-- forms_check.html
         The HTML5 pattern attribute checks the validity of phone number
         input from a form
    -->
    <html class="no-js" lang="en">
    <head>
      <meta charset="utf-8">
      <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
      <title> Phone number tester (HTML5)</title>
      <meta name="viewport" content="width=device-width">
      <link rel="stylesheet" href="css/bootstrap.min.css">
      <link rel="stylesheet" href="css/bootstrap-responsive.min.css">
      <link rel="stylesheet" href="css/main.css">
      <script src="js/vendor/modernizr-2.6.2-respond-1.1.0.min.js"></script>
    </head>
    <body>
      <div class="container">
        <h1>Phone Number Tester (HTML5)</h1>
        <p>An example of the use of Regular Expressions for form validation. View source to see the code.</p>
        <p>Phone numbers should match the pattern 3 digits followed by a dash followed by four digits. The regular expression for this is <code>/\d{3}-\d{4}/</code>.</p>
        <p>HTML5 provides a new form attribute <em>pattern</em> whose value is a regular expression (without the slashes). When supported, this can be used instead of JavaScript for form validation.</p>
        <p>In production, you would normally need to provide a JavaScript fallback for browsers that don't yet support the <em>pattern</em> attribute.</p>
        <!-- No need for onsubmit validator now -->
        <form id="phone" method="post" action="http://www.cpjobling.me/cgi-bin/echo_form.cgi">
          <label for="phone_number">Phone number: </label>
          <input id="phone_number" type="text" name="phone_number" pattern="\d{3}-\d{4}" placeholder="444-4444" title="Please enter phone number using the pattern ddd-dddd."/>
          <input type="submit" name="Submit" value="Submit" />
        </form>
      </div> <!-- end of container -->
      <script src="//ajax.googleapis.com/ajax/libs/jquery/1.9.0/jquery.min.js"></script>
      <script>window.jQuery || document.write('<script src="js/vendor/jquery-1.9.0.min.js"><\/script>')</script>
      <script src="js/vendor/bootstrap.min.js"></script>
      <script src="js/plugins.js"></script>
      <!-- Look no scripts! -->
    </body>
    </html>
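To see script errors in Internet Explorer:
Select Internet Options from the Tools menu
Choose the Advanced tab
Uncheck the Disable script debugging box
Check the Display a notification about every script error box
In browsers with a built-in console: select Tools → JavaScript Console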
Text processing with regular expressions
At the end of this lecture you should be able to answer these questions:
What does the i pattern modifier do?
What does the String method replace do?
What does the String method match do?
Write, test and debug (if necessary) HTML files that include JavaScript scripts for the following problems. When required to write functions, you must include a script to test the function with at least two different data sets.
1. Input: a string, using prompt; Output: either legal name or Illegal name, depending on whether the input string fits the required format, which is: Last name, first name, middle initial where neither of the names can have more than 15 characters.
2. Input: text, using prompt; Output: The words of the input text, in alphabetical order
3. Function: tst_name; Parameter: a string; Returns: true if the given string has the form: string1, string2, letter where both strings must be all lowercase letters except the first letter, and letter must be uppercase; false otherwise.
4. Use the HTML5 pattern attribute to validate the name as defined in Exercise 3.
Manipulating web documents through the Document Object Model (DOM) and the JavaScript event model.
onsubmit event