A number of characters have special meanings within regular expressions.
Wildcards
The . character is a wildcard: it matches any single character at that position in the pattern, except a line terminator such as a newline.
Example of a Wildcard
<script language="javascript" type="text/javascript">
var mystring = "dog";
var myregexp = new RegExp("d.g");
// This prints out true for dog, dig, dug etc.
document.write(myregexp.test(mystring));
</script>
If you wish to use the . character itself within the pattern then it needs to be escaped with the \ character, as does the \ character itself if used.
Because the pattern passed to new RegExp() is an ordinary string, the string escape rules are applied first. A \ must therefore be written as \\ in the string for a single \ to reach the regular expression, e.g. "d\\.g". String escape sequences are covered in the Strings Tutorial. There are also some tips on using escape sequences at the end of this tutorial.
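As a minimal sketch of the escaping rule, the following compares an unescaped and an escaped . character. The variable names and sample strings are illustrative only, and console.log is used in place of document.write so the snippet also runs outside a browser.

```javascript
// Unescaped: . is a wildcard, so both strings match
var wildcard = new RegExp("d.g");
console.log(wildcard.test("dog")); // true
console.log(wildcard.test("d.g")); // true

// Escaped: the string "d\\.g" holds the four characters d\.g
var literalDot = new RegExp("d\\.g");
console.log(literalDot.test("dog")); // false
console.log(literalDot.test("d.g")); // true

// Pitfall: with a single backslash the string is just d.g,
// so the . reaches the regular expression as a wildcard again
var singleBackslash = new RegExp("d\.g");
console.log(singleBackslash.source); // d.g
console.log(singleBackslash.test("dog")); // true
```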
Character Lists
Instead of using a wildcard to match any character, a list of characters inside square brackets can be specified within a pattern. The list matches any one of the characters it contains.
Example of a Character List
<script language="javascript" type="text/javascript">
var mystring = "dog";
var myregexp = new RegExp("d[aeiou]g");
// This prints out true for dog, dig, dug etc. but not dyg
document.write(myregexp.test(mystring));
</script>
Specifying a Range of Characters
A range of characters can be tested for using the - character between the first and last character in the range, e.g. [a-z] matches any one lowercase letter from a to z.
Example of a Range of Characters
<script language="javascript" type="text/javascript">
var mystring = "d2g";
var myregexp = new RegExp("d[0-9]g");
/* This prints out true for any digit from 0 to 9
between d and g, but not for dog, dug etc. */
document.write(myregexp.test(mystring));
</script>
Note: If the - character is used at the beginning or the end of the character list then it's treated like any other character e.g. [-09] matches characters -, 0, or 9.
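The difference the position of the - character makes can be sketched as follows. The variable names are illustrative, and console.log stands in for document.write so the snippet also runs outside a browser.

```javascript
// - between two characters defines a range: any digit from 0 to 9
var range = new RegExp("[0-9]");
// - at the start of the list is an ordinary character: -, 0 or 9
var listed = new RegExp("[-09]");

console.log(range.test("5"));  // true
console.log(range.test("-"));  // false
console.log(listed.test("5")); // false
console.log(listed.test("-")); // true
```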
The ^ Negated Operator
A character list can instead specify the characters that must not appear at that position, by using the ^ negated operator as the first character in the list.
Using the ^ Negated Operator
<script language="javascript" type="text/javascript">
var mystring = "dig";
var myregexp = new RegExp("d[^o]g");
// This prints out true for dig, dug etc. but not dog
document.write(myregexp.test(mystring));
</script>
Note: If the ^ character is used anywhere within the character list, except as the first character, then it's treated like any other character, e.g. [i^] matches either i or ^.
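A short sketch of the two meanings of ^ inside a character list, using made-up sample strings and console.log in place of document.write:

```javascript
// ^ first in the list negates it: any character except i
var negated = new RegExp("d[^i]g");
// ^ elsewhere in the list is an ordinary character: i or ^
var literalCaret = new RegExp("d[i^]g");

console.log(negated.test("dig"));      // false
console.log(negated.test("dog"));      // true
console.log(literalCaret.test("dig")); // true
console.log(literalCaret.test("d^g")); // true
console.log(literalCaret.test("dog")); // false
```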
Matching Position
A regular expression can specify that the pattern occur at the start or the end of a source string.
The ^ character anchors the pattern to the start of the source string.
The $ character anchors the pattern to the end of the source string.
Example of Matching Positions
<script language="javascript" type="text/javascript">
var mystring = "www.domain.com";
// The \ is doubled so that a single \ reaches the regular expression
var myregexp = new RegExp("^www\\.");
// This prints out true if string starts with www.
document.write(myregexp.test(mystring));
var myregexp = new RegExp("\\.com$");
// This prints out true if string ends with .com
document.write(myregexp.test(mystring));
</script>
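Using both anchors in one pattern requires the whole source string to match, which the following sketch illustrates. The sample strings are assumptions chosen for illustration, and console.log replaces document.write so the snippet also runs outside a browser.

```javascript
// Without anchors the pattern may match anywhere in the string
var anywhere = new RegExp("dog");
// With ^ and $ the pattern must account for the entire string
var wholeString = new RegExp("^dog$");

console.log(anywhere.test("hotdog"));    // true
console.log(wholeString.test("hotdog")); // false
console.log(wholeString.test("dogs"));   // false
console.log(wholeString.test("dog"));    // true
```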
Repeating Characters
JavaScript provides three operators in regular expressions that match a repeated character. Each one applies to the character immediately before it.
The ? Operator
This matches zero or one occurrence of the preceding character, e.g. "dro?p" returns true for drp or drop but false for droop.
The * Operator
This matches zero or more occurrences of the preceding character, e.g. "dro*p" returns true for drp, drop, droop, drooop etc.
The + Operator
This matches one or more occurrences of the preceding character, e.g. "dro+p" returns false for drp but true for drop, droop, drooop etc.
The ?, * and + operators can also be used with a wildcard or a list of characters.
Matching Positions and Using Wildcards
<script language="javascript" type="text/javascript">
var mystring = "www.domain.com";
var myregexp = new RegExp("^www\\..*\\.com$");
// Prints out true if string starts with www. and ends with .com
document.write(myregexp.test(mystring));
</script>
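The ?, * and + operators can be compared side by side with a small loop. This is only a sketch: the results object and the loop are illustrative scaffolding, and console.log is used in place of document.write.

```javascript
var words = ["drp", "drop", "droop"];
var patterns = ["dro?p", "dro*p", "dro+p"];
var results = {};

for (var i = 0; i < patterns.length; i++) {
  var row = [];
  for (var j = 0; j < words.length; j++) {
    // Test each word against the current pattern
    row.push(new RegExp(patterns[i]).test(words[j]));
  }
  results[patterns[i]] = row;
  console.log(patterns[i] + ": " + row.join(", "));
}
// dro?p: true, true, false
// dro*p: true, true, true
// dro+p: false, true, true
```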
Repeating Groups of Characters
The ?, * and + operators can be used on character groups as well as individual characters by enclosing the group in parentheses.
Example of Groups of Characters
<script language="javascript" type="text/javascript">
var mystring = "abc";
var myregexp = new RegExp("(abc)+");
// Prints out true for abc, abcabc, abcabcabc etc.
document.write(myregexp.test(mystring));
</script>
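To see what the parentheses change, the sketch below compares a grouped and an ungrouped pattern. The anchors ^ and $ are added here only so that the whole string has to match; the sample strings are illustrative.

```javascript
// + applies to the whole group abc
var grouped = new RegExp("^(abc)+$");
// + applies only to the character c
var ungrouped = new RegExp("^abc+$");

console.log(grouped.test("abcabc"));   // true
console.log(grouped.test("abccc"));    // false
console.log(ungrouped.test("abcabc")); // false
console.log(ungrouped.test("abccc"));  // true
```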
The | or Operator
The | or operator allows alternatives within a pattern.
Example of the | or Operator
<script language="javascript" type="text/javascript">
var mystring = "Dear Sir";
var myregexp = new RegExp("^Dear (Sir|Madame)");
/* Prints out true if mystring starts with
Dear Sir or Dear Madame */
document.write(myregexp.test(mystring));
</script>
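The parentheses in the pattern above limit how far the | operator reaches. As a sketch with an illustrative sample string, leaving them out makes the alternatives "^Dear Sir" and "Madame", so the second alternative is no longer tied to the start of the string:

```javascript
// Alternatives are Sir and Madame, both after "Dear " at the start
var grouped = new RegExp("^Dear (Sir|Madame)");
// Alternatives are "^Dear Sir" and "Madame" anywhere in the string
var ungrouped = new RegExp("^Dear Sir|Madame");

console.log(grouped.test("Dear Madame"));          // true
console.log(grouped.test("Hello, Madame Chair"));   // false
console.log(ungrouped.test("Hello, Madame Chair")); // true
```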
The {} Braces Syntax
A fixed number of occurrences of characters from a list can be specified in braces,
e.g. '[1-5]{2}' returns true if a string contains two consecutive digits that are each in the range 1 to 5.
You can also specify a minimum and maximum number of occurrences within the braces,
e.g. '[1-6]{2,4}' matches a run of two to four consecutive digits that are each in the range 1 to 6. Because the pattern is not anchored, it also returns true for a string containing a longer run; use the ^ and $ characters to enforce the maximum.
Example of Using {} Braces
<script language="javascript" type="text/javascript">
var mystring = "54";
var myregexp = new RegExp("[1-5]{2,4}");
// This prints out true
document.write(myregexp.test(mystring));
var mystring = "9";
var myregexp = new RegExp("[1-5]{2,4}");
// This prints out false
document.write(myregexp.test(mystring));
</script>
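Since test() looks for the pattern anywhere in the string, the upper limit in the braces only restricts the string as a whole when the pattern is anchored. A minimal sketch with illustrative sample strings:

```javascript
var unanchored = new RegExp("[1-5]{2,4}");
var anchored = new RegExp("^[1-5]{2,4}$");

// Five digits still contain a run of two to four digits
console.log(unanchored.test("12345")); // true
// Anchored, the whole string must be two to four digits long
console.log(anchored.test("12345"));   // false
console.log(anchored.test("1234"));    // true
console.log(anchored.test("1"));       // false
// 9 is outside the range 1 to 5
console.log(anchored.test("19"));      // false
```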
Escape Sequences
One of the biggest problems when using regular expressions is readability. The problem is compounded when characters within patterns are the same as the special characters used as operators.
To avoid these characters being interpreted as operators, they must be escaped and this makes patterns less readable.
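One way to avoid unreadable escape sequences is to place characters with special meanings within a list, where most of them don't need to be escaped. The exceptions are the \ and ] characters, and the ^ and - characters when their position in the list gives them a special meaning.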
Using Lists to Escape Special Characters
<script language="javascript" type="text/javascript">
var mystring = "Operator ?*+";
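// The characters ?, * and + need to be escaped; each \ is doubled
// because the pattern is written as a string
var myregexp = new RegExp("\\?\\*\\+");
// But in a list they don't; this matches any one of ?, * or +
var myregexp = new RegExp("[?*+]");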
</script>
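To match the same sequence of characters both ways, each special character can be given its own list. The following sketch assumes the aim is to find ?*+ in that exact order; the sample strings are illustrative and console.log replaces document.write.

```javascript
// Escaped form: each \ is doubled inside the string
var escaped = new RegExp("\\?\\*\\+");
// List form: one single-character list per special character
var listed = new RegExp("[?][*][+]");

console.log(escaped.test("Operator ?*+")); // true
console.log(listed.test("Operator ?*+"));  // true
// A different order matches neither pattern
console.log(escaped.test("Operator ?+*")); // false
console.log(listed.test("Operator ?+*"));  // false
```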