A regular expression is a pattern: a sequence of characters made up of regular (literal) characters and metacharacters. The pattern serves as a template for finding (and possibly replacing) a desired arrangement of characters in a body of text.

This section provides JavaScript-specific details about regular expressions. For more general details (e.g. metacharacters), see my section on Regular Expressions.

Of the JavaScript built-in global objects, RegExp is a latecomer, since it wasn't implemented until JavaScript 1.2.

Basic Usage

Here is the syntax for regular expressions in JavaScript.

var RegExpLiteral = /pattern/[flags];
var RegExpObject = new RegExp("pattern"[, "flags"]);

pattern contains a regular expression pattern.

flags can have any combination of the following:

  • g. Global match, i.e. find all matches. Without g, the RegExp finds just the first match.
  • i. Ignore case, as in upper or lower case.
  • m. Match over multiple lines, i.e. "^" and "$" change from matching at only the start or end of the entire string to also matching at the start or end of any line within the string.
  • y. Sticky. (Implemented: JavaScript 1.8; JScript not; ECMA-262 not.) Matches only from the index indicated by the lastIndex property of this regular expression in the target string.
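
As a rough illustration of how the flags change a match, here is a minimal sketch. The sample text and variable names are made up for this example, and the sticky part assumes an engine that supports the y flag.

```javascript
// Hypothetical three-line sample text.
var text = "Cat\ncat\nCAT";

// No flags: only the first case-sensitive match is found.
var firstOnly = text.match(/cat/);      // match array for "cat" at index 4

// g: every case-sensitive match.
var allExact = text.match(/cat/g);      // ["cat"]

// g + i: every match, ignoring case.
var allAnyCase = text.match(/cat/gi);   // ["Cat", "cat", "CAT"]

// Without m, "^" matches only at the start of the whole string.
var stringStart = text.match(/^c/gi);   // ["C"]

// With m, "^" also matches at the start of each line.
var lineStarts = text.match(/^c/gim);   // ["C", "c", "C"]

// y (assumes sticky support): the match must begin exactly at lastIndex.
var sticky = /cat/y;
sticky.lastIndex = 4;
var stickyHit = sticky.test(text);      // true: "cat" starts at index 4
sticky.lastIndex = 1;
var stickyMiss = sticky.test(text);     // false: no "cat" starts at index 1
```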

Use the RegExp literal when the pattern is constant (i.e. not constructed) or you do not need to access properties of the regular expression, since the literal performs better than the constructor function.
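
The following sketch contrasts the two forms. The search term and test strings are hypothetical, and the constructed pattern is only one plausible reason for choosing the constructor.

```javascript
// Literal: the pattern is fixed when the script is written.
var digits = /\d+/g;
var numbers = "a1b22".match(digits);           // ["1", "22"]

// Constructor: the pattern is assembled at run time from a string.
// Backslashes must be doubled inside the string literal.
var word = "cat";                              // hypothetical search term
var wholeWord = new RegExp("\\b" + word + "\\b", "i");

var found = wholeWord.test("The Cat sat.");    // true
var notFound = wholeWord.test("concatenate");  // false: "cat" is not a whole word here
```

If the constructed part can contain metacharacters, it would need escaping before being concatenated into the pattern. That step is left out of this sketch.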

RegExp instances have 2 major methods. Furthermore, 4 methods of String instances can take a regular expression. For perspective, all 6 methods are compared here:

  • myRegExp.exec(SearchedString). Looks for a match in the searched string. If the g flag is set, then this method must be applied multiple times. Returns a special array with features not in regular arrays.
  • myString.match(RegExp). With the g flag, like multiple applications of exec(), except that it returns a plain array of all the matches instead of the special array returned by .exec(). Without the g flag, it returns the same special array as a single .exec().
  • myRegExp.test(SearchedString). Looks for matches in the searched string. Returns true if found, otherwise returns false.
  • myString.search(RegExp). Like test() method, except it returns the index of the match or -1.
  • myString.replace(RegExp, newSubStringOrFunction). Looks for matches in the searched string, replaces matches with the newSubStringOrFunction, and returns the result as a new string. Operates much like the Perl s///e (substitute) operator.
  • myString.split(StringOrRegExSeparator). Tries to split the string into substrings according to the StringOrRegExSeparator, and returns an array of the substrings. If the separator is not specified or not found, then the method returns an array with 1 element for the entire string.
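
To make the comparison above concrete, here is a small sketch that runs all six methods against one string. The sample string and variable names are assumptions for illustration only.

```javascript
var str = "one two three";                          // hypothetical sample text

var execResult = /t(\w+)/.exec(str);                // ["two", "wo"], with .index 4
var matchResult = str.match(/t\w+/g);               // ["two", "three"]
var testResult = /t\w+/.test(str);                  // true
var searchResult = str.search(/t\w+/);              // 4
var replaceResult = str.replace(/t(\w+)/g, "T$1");  // "one Two Three"
var splitResult = str.split(/\s+/);                 // ["one", "two", "three"]
var splitNoSeparator = str.split();                 // ["one two three"]
```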

The RegExp underwent a major change with JavaScript 1.5, when many of the results from .exec were moved from the RegExp object to the RegExp instance. The .index property, the .compile() method, and all the .$ properties and their full name versions (except for .multiline) "went away" in JavaScript 1.5 (but JScript still has them).

var re1 = /d/g;
RegExp.global; // returns true for JavaScript 1.4
re1.global; // returns true for JavaScript 1.5

Arrays returned from RegExp related methods (re.exec(), and str.match() without the g flag) have properties that regular arrays don't have.

var re = /d(b+)(d)/ig;
var ar = re.exec("cdbBdbsbz");

// Features of the special RegExp related array:
ar.toSource(); // returns '["dbBd", "bB", "d"]'
ar[0]; // returns the last matched characters: 'dbBd'
ar[1]; // returns the 1st parenthesized substring match: 'bB'
ar[2]; // returns the 2nd parenthesized substring match: 'd'. There can be many of these
ar.index; // returns the 0-based index of the match in the searched string: 1
ar.input; // returns the searched string: 'cdbBdbsbz'

// Properties of the RegExp instance:
re.global; // returns state of the g flag: true
re.ignoreCase; // returns state of the i flag: true
re.multiline; // returns state of the m flag: false
re.source; // returns the pattern: 'd(b+)(d)'
re.lastIndex; // returns the index where to start the next match: 5

// Static properties of the RegExp object available in JScript or prior to JavaScript 1.5:
RegExp.lastMatch; // returns the last matched characters: 'dbBd' (same as ar[0])
RegExp.leftContext; // returns the substring preceding the most recent match: 'c'
RegExp.rightContext; // returns the substring following the most recent match: 'bsbz'
RegExp.$1; // returns the 1st of the last 9 parenthesized substrings: 'bB' (weaker than using ar[n])
RegExp.$2; // returns the 2nd of the last 9 parenthesized substrings: 'd'
RegExp.lastParen; // returns the last of the last 9 parenthesized substrings: 'd'
RegExp.index; // returns the 0-based index of the match in the searched string: 1 (same as ar.index)
RegExp.input; // returns the searched string: 'cdbBdbsbz' (same as ar.input)

// Since the g flag was set, re.exec() could be run again (resuming at .lastIndex) as needed.

While $_, $+, and $* are deprecated, $1, ..., $9, $&, $`, and $' live on in the str.replace() method.
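
A short sketch of those replacement patterns inside str.replace() follows. The input strings and variable names are invented for the example.

```javascript
// $1 and $2 refer to the parenthesized sub-strings.
var swapped = "John Smith".replace(/(\w+)\s(\w+)/, "$2, $1"); // "Smith, John"

// $& is the matched text, $` the text before it, $' the text after it.
var matched = "abc".replace(/b/, "[$&]");   // "a[b]c"
var before = "abc".replace(/b/, "[$`]");    // "a[a]c"
var after = "abc".replace(/b/, "[$']");     // "a[c]c"

// A function may be supplied instead of a replacement string.
// It receives the matched text and returns the replacement.
var bracketed = "a1b22".replace(/\d+/g, function (match) {
  return "<" + match + ">";
});                                         // "a<1>b<22>"
```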

Properties

.$1, ..., $9

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Read only
Static
Returns the last nine parenthesized (captured) sub-strings in a match.

.global

Implemented: JS 1.2; JS 1.5 global is a property of a RegExp instance, not the RegExp object; ECMA-262 3.
Read only
Returns true if the global match flag g was set with the RegExp instance. If true, then finds all matches, otherwise the RegExp finds just the first match.

.ignoreCase

Implemented: JS 1.2; JS 1.5 ignoreCase is a property of a RegExp instance, not the RegExp object; ECMA-262 3.
Read only
Returns true if the ignore case flag i was set with the RegExp instance. If true, then matching does not distinguish between uppercase and lowercase.

.index

Implemented: JScript 5.5.
Read only
Static
Integer indicating the position in the searched string where the 1st successful match begins. Initially -1 in JScript.

.input or .$_

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Read only
Static
Returns the body of text or target string that the RegExp is matching against.

.lastIndex

Implemented: JS 1.2; JS 1.5 lastIndex is a property of a RegExp instance, not the RegExp object; ECMA-262 3.
Static in JScript or prior to JavaScript 1.5
Gets the index of where the last search left off or 0. The lastIndex can also be set. The next search will start at the lastIndex. The following rules apply:

  • If lastIndex is greater than the length of the string, regexp.test and regexp.exec fail, and lastIndex is set to 0. (The IE documentation says -1 but they do 0.)
  • If lastIndex is equal to the length of the string and if the regular expression matches the empty string, then the regular expression matches input starting at lastIndex.
  • If lastIndex is equal to the length of the string and if the regular expression does not match the empty string, then the regular expression mismatches input, and lastIndex is reset to 0. (The IE documentation says -1 but they do 0.)
  • Otherwise (i.e. if lastIndex is less than the length of the string), lastIndex is set to the next position following the most recent match.

var re = /d/g;
re.lastIndex; // returns 0 because not searched yet
re.exec("0d234d567"); // returns 'd' matched at index 1
re.lastIndex; // returns 2
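
The lastIndex rules above can be sketched as follows, assuming a standards-following engine (not JScript). The variable names and the /x*/ pattern are made up for the example.

```javascript
var s = "0d234d567";      // length 9, with "d" at indexes 1 and 5
var re = /d/g;

// lastIndex greater than the string length: exec fails and lastIndex resets to 0.
re.lastIndex = 100;
var tooFar = re.exec(s);              // null
var afterTooFar = re.lastIndex;       // 0

// lastIndex equal to the length, pattern can match the empty string: it matches there.
var empty = /x*/g;
empty.lastIndex = s.length;
var emptyHit = empty.exec(s);         // [""], with .index 9

// lastIndex equal to the length, pattern cannot match the empty string: fail and reset.
re.lastIndex = s.length;
var atEnd = re.exec(s);               // null
var afterAtEnd = re.lastIndex;        // 0

// Otherwise each exec resumes at lastIndex, so a loop visits every match.
var indexes = [];
var hit;
while ((hit = re.exec(s)) !== null) {
  indexes.push(hit.index);
}
// indexes is [1, 5]; the final failed exec leaves lastIndex at 0.
```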

.lastMatch or .$&

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Read only
Static
The last matched characters.

.lastParen or .$+

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Read only
Static
The last parenthesized (captured) substring match, if any.

.leftContext or .$`

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Read only
Static
The substring preceding the most recent match.

.multiline or .$*

Implemented: JS 1.2; JS 1.5 multiline is a property of a RegExp instance, not the RegExp object, and .$* deprecated; ECMA-262 3.
Read only
Returns true if the multiline flag m was set with the RegExp instance. "^" and "$" change from matching at only the start or end of the entire string to the start or end of any line within the string.

.rightContext or .$'

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Read only
Static
The substring following the most recent match.

.source

Implemented: JS 1.2; JS 1.5 source is a property of a RegExp instance, not the RegExp object; ECMA-262 3.
Read only
Returns the text of the pattern. They should have called this property .pattern similar to the way VB does.

var re = /(hi)?/g;
re.source; // returns '(hi)?'
re.toSource(); // returns '/(hi)?/g'

Methods

.compile(pattern[, flag])

Implemented: JS 1.2; JS 1.5 deprecated; JScript 3.0; ECMA-262 1; ECMA-262 3 deprecated.
Compiles a RegExp object. Use if pattern and flag will remain the same for multiple uses of the RegExp object. Also use compile() to change the pattern or modifier of a RegExp object.

.exec(TargetString) or .(TargetString)

Implemented: JavaScript 1.2; JScript 3.0; ECMA-262.
Searches for a match and returns a result array. For JavaScript proper (not JScript or ECMAScript), .exec is the default method for RegExp instances, so myRegEx(TargetString) is equivalent to myRegEx.exec(TargetString). If you are executing a match simply to find true or false, use test() or the String search() instead. No match returns null, otherwise the resulting array has these properties:

  • index. The zero-based index of the match in the string.
  • input. The original or searched string.
  • [0]. The portion of the string that was matched last.
  • [1], [2], ..., [n]. The parenthesized substring matches, if such exist.
  • The regular expression object itself has these properties (only lastIndex is updated by a match; the others reflect how the RegExp was created):
    • source. The pattern's text itself.
    • ignoreCase. Indicates if the i flag was used.
    • global. Indicates if the g flag was used.
    • multiline. Indicates if the m flag was used.
    • lastIndex. The index at which to start the next match.
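
Because .exec() returns null when nothing matches, the result is usually checked before its properties are read. A minimal sketch, with a made-up pattern and input strings:

```javascript
var range = /(\d+)-(\d+)/g;           // hypothetical pattern: two numbers joined by "-"

// No match: exec returns null, so there is no array to read from.
var miss = range.exec("no digits here");        // null

// Match: the array carries the whole match, the captures, index and input.
var hit = range.exec("range 10-20 and 30-40");
var summary = null;
if (hit !== null) {
  summary = {
    whole: hit[0],                    // "10-20"
    low: hit[1],                      // "10"
    high: hit[2],                     // "20"
    index: hit.index,                 // 6
    next: range.lastIndex             // 11: where the next exec would resume
  };
}
```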

.test(TargetString)

Implemented: JS 1.2; JScript 3.0; ECMA-262.
Searches for a match and returns true or false. Comparable to the String search() method which returns the index of the first match or -1.
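
A brief sketch comparing .test() with the String search() method follows. The sample string is invented. The last part shows that a RegExp with the g flag remembers lastIndex between .test() calls, which follows from the lastIndex rules rather than from anything specific to .test().

```javascript
var s = "find the needle";            // hypothetical sample text

var hasNeedle = /needle/.test(s);     // true
var needleAt = s.search(/needle/);    // 9
var pinAt = s.search(/pin/);          // -1

// With the g flag, test() resumes from lastIndex on the next call.
var g = /needle/g;
var firstCall = g.test(s);            // true; lastIndex is now 15
var secondCall = g.test(s);           // false; the search resumed at the end of the string
var afterSecond = g.lastIndex;        // 0: reset after the failed search
```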



GeorgeHernandez.com. Some rights reserved.