A regular expression (or Regex) is a pattern that can be used to search a string for matches to that pattern. It is commonly used in find-and-replace operations, as well as form input validations and different types of string manipulation.
Regular expressions can look strange when you’re just getting started, and they take some getting used to.
To fully understand this topic, it is important that we explore some of the reasons why we use regular expressions in our projects.
Oftentimes, we are tasked with validating an input that must conform to a pattern. One way to go about this (I’d argue not a good one) would be to write a series of if statements; but as the rules grow, that code becomes difficult to read and maintain. Hence the need for a better solution, like regular expressions (or RegExp).
In this article, we’re going to be taking a look at regular expressions and how you can use Magic-RegExp to improve and simplify creating regular expressions for your projects.
Let’s jump in!
Creating a RegExp
There are primarily two ways we create regular expressions in JavaScript.
The first, which is the preferred way, is by using the literal notation where we write our expression in between two forward slashes:
const pattern = /[a-zA-Z]+ing$/;
The second is to create an instance of the RegExp object using the new keyword and its constructor function:
const pattern = new RegExp('[a-zA-Z]+ing');
We primarily use the second method when the pattern is going to be entered as input by the user or otherwise built at runtime, because the constructor accepts a string and can therefore be assembled from variables, which isn’t possible with the literal notation.
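As a minimal sketch of that second use case, the snippet below builds a pattern from a variable. The suffix value is a hypothetical stand-in for user input, and the variable names are chosen only for illustration.

```javascript
// Hypothetical value standing in for text typed by a user.
const suffix = "ing";

// The literal notation cannot interpolate a variable,
// so the pattern is assembled from strings instead.
const endsWithSuffix = new RegExp("[a-zA-Z]+" + suffix + "$");

console.log(endsWithSuffix.test("coding")); // true
console.log(endsWithSuffix.test("coded"));  // false

// Inside a string, every backslash has to be doubled
// so that a single backslash reaches the RegExp engine.
const digitsOnly = new RegExp("^\\d+$");
console.log(digitsOnly.test("2023")); // true
```

Note that text coming from a user may itself contain special characters, so real code would usually escape it before building the pattern.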
Limitations of working with RegExp
Validating a URL
Suppose we want to validate a URL. Let’s look at the different formats a valid URL can come in:
http://www.google.com
www.google.com
google.com
telegram.org
egghead.io
w3schools.com
https://www.w3schools.com
https://www.typescriptlang.org
We will limit our exploration of valid URLs to the formats above for the sake of brevity and simplicity within the scope of this article.
Developing a Pseudocode
Let’s develop pseudocode that will help us write a regular expression to validate patterns like these.
- It can start with http or https (this seems to be optional)
- It is followed by a colon : and two forward slashes // (this is optional as well)
- It is immediately followed by www. (this seems to be optional as well)
- It is followed by letters that can contain digits
- It is then followed by a . (period)
- It can end with either io, org, or com
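To see why a chain of if statements gets unwieldy, here is a rough sketch that walks through the six steps by hand. The function name looksLikeUrl is hypothetical, and this is only one possible way to write such checks, not code from any library.

```javascript
// Hypothetical helper: follows the six pseudocode steps without a RegExp.
function looksLikeUrl(input) {
  let rest = input.toLowerCase();

  // Steps 1 and 2: optional "http://" or "https://"
  if (rest.startsWith("https://")) rest = rest.slice(8);
  else if (rest.startsWith("http://")) rest = rest.slice(7);

  // Step 3: optional "www."
  if (rest.startsWith("www.")) rest = rest.slice(4);

  // Steps 4 and 5: a name made of letters, digits, or underscores, then a period
  const dot = rest.indexOf(".");
  if (dot < 1) return false;
  for (const ch of rest.slice(0, dot)) {
    const isWordChar =
      (ch >= "a" && ch <= "z") || (ch >= "0" && ch <= "9") || ch === "_";
    if (!isWordChar) return false;
  }

  // Step 6: must end with io, org, or com
  return ["io", "org", "com"].includes(rest.slice(dot + 1));
}

console.log(looksLikeUrl("http://www.google.com")); // true
console.log(looksLikeUrl("lovelife"));              // false
```

Every new rule adds another branch to this function, which is the maintenance burden a single pattern is meant to avoid.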
To develop our pattern from the above pseudocode, we have to familiarize ourselves with some regular expression syntax.
To follow along and see the matches as they happen in real time, you can use the tool, regex101. Feel free to add more URLs and play around with the tool.
Pseudocodes 1 and 2
It can start with http or https, followed by a colon and two forward slashes.
We use the caret symbol ^ to indicate the start, followed by the letters http, so we have ^http. We can then have an optional s. To make a letter optional, we append the special character ? after it, so our expression is now ^https?.
This is then followed by a colon :, which we simply attach, giving ^https?:. Next come the two forward slashes //. Ideally, we would add these in just like we did the colon, but a forward slash also marks the start and end of a regular expression literal, so we have to escape each one with a backslash \.
Hence, our expression now becomes ^https?:\/\/. From the URL samples above, however, we can see that a URL may or may not start with “https://” or “http://”, meaning the entire prefix is optional. We have already established that we make a letter optional by appending a ? after it. If we did that here, only the last character would become optional; what we want is to make the entire prefix optional. Hence we wrap it in parentheses to make it a capturing group (in effect, to make it function as a unit) and then append the ?. The caret stays outside the group so that the start anchor itself is not optional.
Here is our resulting expression:
^(https?:\/\/)?
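The difference between making one character optional and making a whole group optional can be checked with a short sketch. The variable names and sample strings are only illustrative.

```javascript
// ? after a single letter: only the "s" is optional.
const optionalS = /^https?/;
console.log(optionalS.test("http://google.com"));  // true
console.log(optionalS.test("https://google.com")); // true

// ? after the last slash: only that slash is optional, "http" is still required.
const lastSlashOptional = /^https?:\/\/?/;
console.log("www.google.com".match(lastSlashOptional)); // null

// ? after the group: the whole prefix is optional.
// The caret is kept outside the group so the anchor itself is not optional.
const schemeOptional = /^(https?:\/\/)?/;
console.log("https://google.com".match(schemeOptional)[0]); // "https://"
console.log("www.google.com".match(schemeOptional)[0]);     // "" (prefix skipped)
```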
Pseudocode 3
We append “www” to our previous expression and we have ^(https?:\/\/)?www. This is then followed by a period, which is a special character (unescaped, it matches any character), hence we have to escape it. Our expression will now be ^(https?:\/\/)?www\. with the escaped period at the end. From our analysis, the entire “www.” is optional, so we put it in a group and then make it optional:
^(https?:\/\/)?(www\.)?
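A quick sketch shows why the period needs escaping. The sample strings are made up for illustration.

```javascript
// Unescaped, the period matches any single character.
const unescapedDot = /^www./;
// Escaped, it matches only a literal period.
const escapedDot = /^www\./;

console.log(unescapedDot.test("wwwXgoogle")); // true
console.log(escapedDot.test("wwwXgoogle"));   // false
console.log(escapedDot.test("www.google"));   // true
```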
Pseudocode 4
A valid URL is then followed by a word which can contain digits. A word character can be a-z (a, b, c to z), A-Z (A to Z), 0-9, or even an underscore _. Grouped together, these look like this: [A-Za-z0-9_]. There is also a shorthand for the same set: \w.
Adding either one to our expression gives:
^(https?:\/\/)?(www\.)?[A-Za-z0-9_]
^(https?:\/\/)?(www\.)?\w
The two patterns above are equivalent; however, we will stick to the latter for the sake of brevity.
If we run a test using an online tool like Regex101, we will see that it matches just one word character; what we need is to match one or more occurrences of any word character. There is a special character that lets us do this, the plus +, which means one or more occurrences of the preceding token.
Hence, our expression will now be:
^(https?:\/\/)?(www\.)?\w+
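The effect of the plus can be seen by comparing what each variant actually captures. This is a small sketch with illustrative variable names.

```javascript
// Without +, only a single word character is matched.
const oneWordChar = /^\w/;
// With +, the match extends over every consecutive word character.
const manyWordChars = /^\w+/;

console.log("google.com".match(oneWordChar)[0]);   // "g"
console.log("google.com".match(manyWordChars)[0]); // "google"

// The explicit character class behaves the same way as \w here.
console.log("w3schools.com".match(/^[A-Za-z0-9_]+/)[0]); // "w3schools"
```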
Pseudocode 5
It is then followed by a period, which is a special character that we have to escape. Adding this in modifies our expression to the following:
^(https?:\/\/)?(www\.)?\w+\.
Pseudocode 6
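This says our URL can end with “com”, “io”, or “org”. For this, we need another special character, the pipe |, which is used to indicate alternatives within a group. We also append a dollar sign $ to mark the end of the input, so that nothing may follow the ending.
We now have:
^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$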
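As a quick check of the alternation and the end anchor, the sketch below runs the finished pattern against a few made-up inputs.

```javascript
const urlPattern = /^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/;

// Any one of the three alternatives is accepted.
console.log(urlPattern.test("egghead.io"));   // true
console.log(urlPattern.test("telegram.org")); // true

// An ending that is not listed in the group is rejected.
console.log(urlPattern.test("example.net")); // false

// The $ anchor rejects anything after the ending.
console.log(urlPattern.test("google.com/search")); // false
```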
In addition, we can add flags that modify the default behavior of our RegExp. Three commonly used flags are as follows:
- i: ignoreCase means the match should be case insensitive (a pattern of “hi” would also match “HI”, “Hi”, or “hI”)
- g: global means it should make as many matches as possible; by default, it stops after the first match
- m: multiline means ^ and $ match at the start and end of every line, not just at the start and end of the whole string
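The three flags can be observed one at a time with a short sketch. The sample strings are made up for illustration.

```javascript
// i: case-insensitive matching
console.log(/hi/.test("Hi"));  // false
console.log(/hi/i.test("Hi")); // true

// g: return every match instead of stopping at the first one
console.log("a1b2c3".match(/\d/)[0]); // "1"
console.log("a1b2c3".match(/\d/g));   // [ '1', '2', '3' ]

// m: ^ and $ also match at the start and end of each line
const twoLines = "google.com\ntelegram.org";
console.log(twoLines.match(/^\w+\.(com|org)$/g));  // null
console.log(twoLines.match(/^\w+\.(com|org)$/gm)); // [ 'google.com', 'telegram.org' ]
```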
Adding these to the mix gives the expression:
/^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/mig
We can test that our expression works as expected using a couple of methods:
const pattern = /^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/mig;
pattern.test('www.google.com'); // returns true
pattern.test('lovelife'); // returns false

`http://www.google.com
www.google.com
google.com
telegram.org
egghead.io
w3schools.com
https://www.w3schools.com
https://www.typescriptlang.org`.match(/^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/gmi);
// ['http://www.google.com', 'www.google.com', 'google.com', 'telegram.org', 'egghead.io', 'w3schools.com', 'https://www.w3schools.com', 'https://www.typescriptlang.org']

The match method returns an array of all matches.
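One thing worth knowing when combining the g flag with test: a global RegExp remembers where its last match ended in its lastIndex property and resumes from there on the next call. The sketch below illustrates this with the pattern from this article; the input strings are only examples.

```javascript
const pattern = /^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/gi;

console.log(pattern.test("www.google.com")); // true
console.log(pattern.lastIndex);              // 14, the end of the previous match

// The next search starts at index 14, past the end of this shorter string.
console.log(pattern.test("google.com"));     // false
// The failed search reset lastIndex to 0, so the same call now succeeds.
console.log(pattern.test("google.com"));     // true

// Without the g flag, test does not carry state between calls.
const stateless = /^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/i;
console.log(stateless.test("www.google.com")); // true
console.log(stateless.test("google.com"));     // true
```

For a simple yes-or-no validation of a single value, leaving out the g flag avoids this surprise.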
From the above, it is evident that working with RegExp can be a little unsettling at first, as one needs to understand a fair bit of jargon and how to apply it. This is where Magic-RegExp comes in.
Getting Started with Magic-RegExp
Magic-RegExp simplifies the process of creating regular expressions by making them read like plain English. It is a compiled-away, type-safe, readable RegExp alternative.
To get started, we can simply install it using our favorite package manager (npm, pnpm, or yarn):
npm install magic-regexp
We can also include it in popular frameworks like Nuxt (Vue.js) or Next.js (React). I would suggest visiting the official page for more details.
We could attempt to recreate the regular expression above using the pseudocode above, but first we will spin up a new Node.js application to test drive this.
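mkdir regExp              # make a project directory
cd regExp                 # go into the directory
touch app.js              # create an entry point for our application
npm init -y               # set default options
npm pkg set type=module   # treat .js files as ES modules so that import works in app.js
npm i magic-regexp        # install the library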
Now that we have our project all set, we can open up app.js and start writing our code.
import {
  createRegExp,
  exactly,
  wordChar,
  oneOrMore,
  anyOf,
} from "magic-regexp";

const regExp = createRegExp(
  exactly("http")
    .and(exactly("s").optionally())
    .and("://")
    .optionally()
    .and(exactly("www.").optionally())
    .and(oneOrMore(wordChar))
    .and(exactly("."))
    .and(anyOf("com", "org", "io")),
  ["g", "m", "i"]
);

console.log(regExp);
// /(https?:\/\/)?(www\.)?\w+\.(com|org|io)/gmi
We import the createRegExp function and use the accompanying helper functions to compose our expression as seen above.
From the above, we can see that it reads much closer to plain English, and we can easily write it just from the pseudocode.
Even though the generated RegExp wasn’t exactly the same (the differences are the opening ^ and the ending $), the two are similar enough, and it passes all our test conditions without issue.
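The missing anchors do matter for inputs outside our test list. As a small sketch in plain JavaScript, with a made-up sentence as input, the unanchored form also accepts a URL embedded in longer text, while the anchored form does not.

```javascript
// Same pattern with and without the ^ and $ anchors.
const unanchored = /(https?:\/\/)?(www\.)?\w+\.(com|org|io)/i;
const anchored = /^(https?:\/\/)?(www\.)?\w+\.(com|org|io)$/i;

// Both agree on the inputs tested earlier.
console.log(unanchored.test("www.google.com")); // true
console.log(anchored.test("www.google.com"));   // true
console.log(unanchored.test("lovelife"));       // false

// They differ when the URL is surrounded by other text.
console.log(unanchored.test("visit google.com today")); // true
console.log(anchored.test("visit google.com today"));   // false
```

Whether that difference is a problem depends on whether the pattern is used to validate a whole input or to find URLs inside text.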
It is simple to put together, as it requires little understanding of how RegExp works internally. Neat, huh?
For a more comprehensive list of what is possible with magic-regexp, visit the documentation.
Conclusion
It is clear that we can achieve a lot with little or no knowledge of RegExp by using the Magic-RegExp library.
It is faster to come up with expressions, easier to debug and read, and while it has the overhead cost of bundle size, this is minimal and can be easily negated.
The post Understanding Magic-RegExp as a RegExp alternative appeared first on LogRocket Blog.
from LogRocket Blog https://ift.tt/FGmrEZj