Working with Regular Expressions (RegExp) in JavaScript: Features and Practical Examples

Rafa Romero Dios

29 Nov 2021

Today we're gonna talk about one of the most useful and powerful tools available to any developer: Regular Expressions (a.k.a. RegExp or Regexp).

Regexp is the kind of tool that you like more the more you use it. It's very common to see junior developers writing complex blocks of code for "find and replace" string operations that could probably be done with a Regexp in a couple of lines.

So be ready to travel through the Regexp world. In our journey we will see the main features that the Regexp tools provide, such as search options, quantifiers, operators, character classes and more... and of course practical examples.

It's important to clarify that the core Regexp syntax is largely language agnostic, so most of what you learn in this tutorial will be useful to you whatever language you're working with.

However, this tutorial is JavaScript oriented, so the code snippets, as well as the practical use cases at the end of the tutorial, are written in JavaScript. All code snippets are functional, so you can copy, paste, and see them working.

We'll also provide you with some useful resources when working with RegExp.

Definition

A regular expression is a pattern that the regular expression engine attempts to match in input text. A pattern consists of one or more character literals, operators, or constructs.

Available methods in JavaScript

Now that we know what Regexp means, and before starting with the theory, we need to know the main and most commonly used JavaScript methods related to regular expressions.

RegExp.prototype.test()

This is the main method that the RegExp API provides. The test() method executes a search for a match between a regular expression and a specified string, and returns true or false. Its syntax is:

/regexp/.test(stringToTest:string)

Here is a simple example of test usage:

/hello/i.test('hello my friend') // true
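
As a minimal sketch of how test() fits into everyday code, the snippet below wraps it in a small helper. The name containsGreeting is hypothetical and only used for illustration.

```javascript
// Hypothetical helper: true when the text contains "hello" in any letter case.
function containsGreeting(text) {
  return /hello/i.test(text);
}

console.log(containsGreeting('Hello my friend'));   // true
console.log(containsGreeting('Goodbye my friend')); // false
```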

String.prototype.match()

This method does not belong to the Regexp API, but to the String API. However, it's still commonly used when working with Regexp. The match() method retrieves the result of matching a string against a regular expression. It returns an Array with the first match (without the global (g) flag) or with all matches (with the g flag), or null if no matches are found. Don't worry if you don't understand what we're talking about with "the global flag" stuff, we'll look at it soon.

Here is a simple example of match usage:

'This is a Test'.match(/Test/) // ["Test"]
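
Since match() returns null instead of an empty array when nothing is found, it is worth guarding against it before using the result. The sketch below shows one way to do it; the helper name countMatches is hypothetical, and it assumes the pattern carries the g flag.

```javascript
// Hypothetical helper: counts how many times a global (g) pattern appears.
// match() returns null, not an empty array, when nothing is found,
// so we fall back to [] before reading .length.
function countMatches(text, regexp) {
  const matches = text.match(regexp) || [];
  return matches.length;
}

console.log(countMatches('This is a Test', /Test/g)); // 1
console.log(countMatches('This is a Test', /Exam/g)); // 0
```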

Capturing group

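Capturing groups are the way we have to group part of our search criteria so that it is treated as a single unit, for instance, so that a quantifier applies to a whole sequence of characters instead of just the last one.

Note that the following examples return totally different results: in the first one the + applies to the whole group (ba), while in the second one it only applies to the a.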

"ba baba baa baabaa babaaba bababa".match(/(ba)+/g) // ["ba", "baba", "ba", "ba", "ba", "baba", "ba", "bababa"] 
"ba baba baa baabaa babaaba bababa".match(/ba+/g) // ["ba", "ba", "ba", "baa", "baa", "baa", "ba", "baa", "ba", "ba", "ba", "ba"] 

Search modifiers

Search modifiers are flags that allow us to set how the search is performed.

Case insensitive modifier [i]

A case insensitive match is performed, meaning capital letters will be matched by non-capital letters and vice versa.

'There is a cat in the street'.match(/cat/) // ['cat']
'There is a Cat in the street'.match(/cat/) // null
'There is a Cat in the street'.match(/cat/i) // ['Cat']

Global modifier [g]

Also known as global mode, it tells the engine not to stop after the first match has been found, i.e., to continue until no more matches can be found.

'I have one cat and she has two cats'.match(/cat/) // ['cat']
'I have one cat and she has two cats'.match(/cat/g) // ['cat', 'cat']
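
Modifiers can be combined by writing them one after another. The following sketch, with an illustrative sentence of our own, compares g alone with g and i together:

```javascript
const sentence = 'Cats are great: I have one cat and she has two cats';

// g alone is still case sensitive, so "Cats" is skipped.
const globalOnly = sentence.match(/cat/g);
// g and i together find every occurrence in any letter case.
const globalInsensitive = sentence.match(/cat/gi);

console.log(globalOnly);        // ["cat", "cat"]
console.log(globalInsensitive); // ["Cat", "cat", "cat"]
```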

Operators

Operators are the modifiers that allow us to define our search criteria.

Length result operator [?]

By default, quantifiers are greedy: they always look for the longest possible match. We can switch to the shortest one (a lazy match) by adding the operator "?" right after the quantifier:

"tarantula".match(/t[a-z]*a/) // ["tarantula"]
"tarantula".match(/t[a-z]*?a/) // ["ta"]

Matching start of a string [^]

The caret character (^) inside a character set is used to create a negated character set in the form [^thingsThatWillNotBeMatched]. Outside of a character set, the caret is used to search for patterns at the beginning of strings.

"california".match(/^ca/) // ["ca"]
"california".match(/^ca.*/) // ["california"]
"state of california".match(/^ca/) // null

Matching end of a string [$]

The dollar character ($) is used to search for patterns at the end of strings.

"california".match(/fornia$/) // ["fornia"]
"california".match(/.*fornia$/) // ["california"]
"california is great".match(/.*fornia$/) // null

Quantifiers

Quantifiers are the tools that allow us to define our search criteria in terms of character repetition

One or more times matches [+]

Matches one or more consecutive occurrences of the preceding character or group. It will always return as many characters as possible.

"my sample senteence".match(/e/g); // ["e", "e", "e", "e", "e"]
"my sample senteence".match(/e+/g) // ["e", "e", "ee", "e"]

Zero or more times matches [*]

Matches zero or more consecutive occurrences of the preceding character or group. It will always return as many characters as possible.

'a ba baa aaa ba b'.match(/ba/g) // ["ba", "ba", "ba"]
'a ba baa aaa ba b'.match(/ba*/g) // ["ba", "baa", "ba", "b"]

Zero or one matches [?]

Matches zero or one occurrence of the preceding character or group. It will always return as many characters as possible.

'a ba baa baba aaa ba b'.match(/ba/g) // ["ba", "ba", "ba", "ba", "ba"]
'a ba baa baba aaa ba b'.match(/ba?/g) // ["ba", "ba", "ba", "ba", "ba", "b"]
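
To compare the three quantifiers side by side, here is a small sketch that applies them to the same sample text (the text and variable names are our own):

```javascript
const words = 'color colour colouur';

// ? allows zero or one "u"
const zeroOrOne = words.match(/colou?r/g);
// * allows zero or more "u"
const zeroOrMore = words.match(/colou*r/g);
// + requires at least one "u"
const oneOrMore = words.match(/colou+r/g);

console.log(zeroOrOne);  // ["color", "colour"]
console.log(zeroOrMore); // ["color", "colour", "colouur"]
console.log(oneOrMore);  // ["colour", "colouur"]
```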

Number of matches

We can set the number of consecutive matches by specifying it between curly braces. Let's see the available forms before the code snippets:

  • {2,4} between 2 and 4 matches
  • {2,} at least 2 matches
  • {3} exactly 3 matches

It becomes clearer with some examples:

"ba baba baa baabaa babaaba bababa".match(/ba{2,}/g); // ["baa", "baa", "baa", "baa"] 
"ba baba baa baabaa babaaba bababa".match(/(ba){2,}/g); // ["baba", "baba", "bababa"]
"ba baba baa baabaa babaaba bababa".match(/(ba){3}/g); // ["bababa"] 
"ba baba baa baabaa babaaba bababa".match(/(ba){1,3}/g); // ["ba", "baba", "ba", "ba", "ba", "baba", "ba", "bababa"]

Character classes

Character classes are the way we have to search for a literal pattern with some flexibility. Character classes allow you to define a group of characters you wish to match by placing them inside square brackets [ and ].

Alphanumerical characters

We can search for character groups (alphabet based) very easily. Let's see some examples:

  • [a-d] matches the following characters: a, b, c, d
"my sample sentence is admirable".match(/[a-d]/g) // ["a", "c", "a", "d", "a", "b"]

"my sample sentence is admirable".match(/[a-d]+/g) // ["a", "c", "ad", "ab"]
  • [1-7] matches the following characters: 1, 2, 3, 4, 5, 6, 7
"My favourite player has scored 35 goals and wears the number 9".match(/[1-7]/g) // ["3", "5"]

"My favourite player has scored 35 goals and wears the number 9".match(/[1-7]+/g) // ["35"]
  • [a-d1-7] matches the following characters: a, b, c, d, 1, 2, 3, 4, 5, 6, 7
"I like to play badminton with 2 more people at 5 o'clock".match(/[a-d1-7]/g) // ["a", "b", "a", "d", "2", "a", "5", "c", "c"]

"I like to play badminton with 2 more people at 5 o'clock".match(/[a-d1-7]+/g) // ["a", "bad", "2", "a", "5", "c", "c"]

Here you have some character matching shorthands:

\w === [A-Za-z0-9_]
\W === [^A-Za-z0-9_]
\d === [0-9]
\D === [^0-9]

Let's see some related examples:

// get non alphanumerical characters
"myemail@gmail.com".match(/\W/) // ["@"]

// get numerical characters
"my favourite numbers are 1, 3, and 7".match(/\d/g) // ["1", "3", "7"]

// get non numerical characters
"my favourite numbers are 1, 3, and 7".match(/\D/g) // ["m", "y", " ", "f", "a", "v", "o", "u", "r", "i", "t", "e", " ", "n", "u", "m", "b", "e", "r", "s", " ",  "a", "r", "e", " ", ",", " ", ",", " ", "a", "n", "d", " "]

Match whitespace with [\s]

We can match whitespace, but also carriage return, tab, form feed, and new line characters with [\s]. [\s] matches any whitespace character (equivalent to [\r\n\t\f\v \u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff])

The opposite is [\S], which matches any non-whitespace character (equivalent to [^\s]).

// count number of spaces in a sentence
"this is a sample sentence".match(/\s/g).length // 4

// get all non space characters
"Hello friend".match(/\S/g) // ["H", "e", "l", "l", "o", "f", "r", "i", "e", "n", "d"]

// count all non space characters
"Hello friend".match(/\S/g).length // 11

Negating character classes [^]

There is a way to use character classes backwards, by defining a group of characters that you don't want to match, just by adding ^ at the beginning of the character class definition:

"The sample sentence where there are lots of e vowel".match(/[^e]/g); // returns all characters that are not 'e'

Lookarounds

Last but not least, to finish this tutorial let's see one of the tricky parts of RegExp: LookAheads and LookBehinds.

Lookarounds are the way we have to match a pattern only when it is followed or preceded by another pattern, without including that other pattern in the match.

LookAhead

LookAheads are the way we have to search just before some criteria, i.e. patterns that are followed by another pattern.

Positive LookAhead

As always, it is easier to understand with an example. Let's see how we can find currency values in a text, and not just any single number, for instance, to get the product price.

To do that we should match numbers followed by a currency symbol, for instance, € or $.

"I have bought 3 bananas for 6€ and 2 apples for 4$ in the international market".match(/\d(?=€|\$)/g) // ["6", "4"]

Combining the regular expression with the reduce() method of JS, we can get the total amount:

"I have bought 3 bananas for 6€ and 2 apples for 4€ in the international market".match(/\d(?=€)/g).reduce((previousValue, currentValue) => parseInt(previousValue) +  parseInt(currentValue), 0) + "€" // 10€

Negative LookAhead

Let's continue with the last example to explain the negative LookAhead. In that case, instead of getting the total cost, we want to get the total quantity of bought items.

"I have bought 3 bananas for 6€ and 2 apples for 4$ in the international market".match(/\d(?!€|\$)/g) // ["3", "2"]

Also, combining the regular expression with the reduce() method of JS, we can get the total amount of bought items

"I have bought 3 bananas for 6€ and 2 apples for 4€ in the international market".match(/\d(?!€)/g).reduce((previousValue, currentValue) => parseInt(previousValue) +  parseInt(currentValue), 0) // 5

LookBehind

Once we know the concept of LookAhead, we can briefly explain what LookBehind is.

LookBehinds are the way we have to search just after some criteria, i.e. patterns that are preceded by another pattern.

So let's see both a positive and a negative LookBehind example:

// positive LookBehind - get the domain name
"fakeemail@gmail.com".match(/(?<=@)\D*/g) // ["gmail.com"]

// negative LookBehind - get the quantity (other currency format). Note that we have to escape the dollar character
"I have bought 3 bananas for $6 and 2 apples for $4".match(/(?<!\$)\d/g) // ["3", "2"]

Capturing group

So far, in the LookAround samples we've seen, the pattern inside the LookAround is not included in the match result. If we want to get that pattern as well, we just need to wrap it in parentheses. To illustrate that, let's revisit one of the last samples, now using a capturing group:

"I have bought 3 bananas for 6€".match(/\d(?=(€|\$))/) // ["6", "€"]

Closing

We have taken a very comprehensive look at RegExp, with lots of examples. From now on, you will not stop using RegExp in your code hehe!

To finish we're gonna share some useful resources related to RegExp, including games!!

Documentation about RegExp:

Online Testers:

  • Regex101 (Includes also documentation and examples)
  • Regexr (Includes also documentation and examples)
  • Regex Tester (Includes also documentation and examples)

Cheat Sheets:

Games:

Rafa Romero Dios

Software Engineer specialized in Front End. Back To The Future fan
