mirror of
https://github.com/dholerobin/Lecture_Notes.git
synced 2025-09-13 13:52:12 +00:00
reorganize files
114
Akash Articles/rendered/RegEx/Alternation.html
Normal file
@@ -0,0 +1,114 @@
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
<h2 id="alternationoroperator">Alternation (OR operator)</h2>
|
||||
|
||||
<p>A <strong>character class</strong> matches a single character out of several possible characters. Alternation is more general: it matches one expression out of several possible expressions.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtoa" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>In the above example, <code>cat|dog|lion</code> means 'either cat or dog or lion'. Here, we have used literal words (cat, dog and lion) as alternatives, but each alternative can be any regular expression. For example,</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtod" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<h3 id="problem">Problem</h3>
|
||||
|
||||
<ul>
|
||||
<li>Find a regex to match boot or bot.
Answer: There is more than one possible answer: <code>boot|bot</code>, <code>b(o|oo)t</code>. The last expression uses a group.</li>
|
||||
</ul>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtog" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<h3 id="problemwithoroperator">Problem with OR operator:</h3>
|
||||
|
||||
<p>Suppose you want to match the two words <strong>Set</strong> and <strong>SetValue</strong>. What will the regular expression be?</p>
|
||||
|
||||
<p>From what we have learned so far, you might say that <code>Set|SetValue</code> is the answer. But it is not correct.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtoj" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>If you try <code>SetValue|Set</code> instead, it works.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtom" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Can you observe anything from it?</p>
|
||||
|
||||
<p>The <strong>OR operator</strong> tries its alternatives from left to right, starting with the first word (or expression) in the regex. As soon as one alternative matches at a position in the text, it does not try the next word (or expression) at the same position.</p>
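<p>The left-to-right behaviour can be checked with a small JavaScript sketch. The variable names are only illustrative; <code>String.prototype.match</code> with the <code>g</code> modifier returns all matches as an array.</p>

```javascript
const text = "SetValue";

// The shorter alternative comes first: "Set" matches at position 0,
// so "SetValue" is never tried at that position.
const shortFirst = text.match(/Set|SetValue/g);

// The longer alternative comes first: "SetValue" gets the first chance.
const longFirst = text.match(/SetValue|Set/g);

console.log(shortFirst); // [ 'Set' ]
console.log(longFirst);  // [ 'SetValue' ]
```

<p>When one alternative is a prefix of another, placing the longer one first is a common way to avoid this problem.</p>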
|
||||
|
||||
<h3 id="problem-1">Problem</h3>
|
||||
|
||||
<p>Find a regex that matches every word in the following set: <code>{bat, cat, hat, mat, nat, oat, pat, Pat, ot}</code>. The regex should be as short as possible.</p>
|
||||
|
||||
<p><strong>Hint:</strong> Use a character class, ranges and the OR operator together.</p>
|
||||
|
||||
<p>Answer: <code>[b-chm-pP]at|ot</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtos" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
214
Akash Articles/rendered/RegEx/Appendix_BonusProblem.html
Normal file
@@ -0,0 +1,214 @@
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
|
||||
table {
|
||||
border:1px solid black;
|
||||
border-collapse: collapse;
|
||||
min-width:60%;
|
||||
}
|
||||
|
||||
th, td{
|
||||
border:1px solid black;
|
||||
border-collapse: collapse;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
tr:nth-child(even) {background-color: #f2f2f2;}
|
||||
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
<h2 id="characterswithspecialmeaning">Characters with special meaning</h2>
|
||||
|
||||
<p>We have seen that <code>*</code>, <code>+</code>, <code>.</code>, <code>$</code>, etc. have special meanings. To match these characters literally, we have to escape them with the escape character, the backslash (<code>\</code>).</p>
|
||||
|
||||
<p>Below is a table of these characters, their usages and their escaped versions.</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<td>Character</td>
|
||||
<td>Usage</td>
|
||||
<td>Escaped version</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>\</td>
|
||||
<td>escape character</td>
|
||||
<td>\\</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>.</td>
|
||||
<td>predefined character class</td>
|
||||
<td>\.</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>|</td>
|
||||
<td>OR operator</td>
|
||||
<td>\|</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>*</td>
|
||||
<td>as quantifier</td>
|
||||
<td>\*</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>+</td>
|
||||
<td>as quantifier</td>
|
||||
<td>\+</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>?</td>
|
||||
<td>as quantifier</td>
|
||||
<td>\?</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>^</td>
|
||||
<td>boundary matcher</td>
|
||||
<td>\^</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>$</td>
|
||||
<td>boundary matcher</td>
|
||||
<td>\$</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>{</td>
|
||||
<td>in quantifier notation</td>
|
||||
<td>\{</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>}</td>
|
||||
<td>in quantifier notation</td>
|
||||
<td>\}</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>[</td>
|
||||
<td>in character class notation</td>
|
||||
<td>\[</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>]</td>
|
||||
<td>in character class notation</td>
|
||||
<td>\]</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>(</td>
|
||||
<td>in group notation</td>
|
||||
<td>\(</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>)</td>
|
||||
<td>in group notation</td>
|
||||
<td>\)</td>
|
||||
|
||||
</tr>
|
||||
<tr>
|
||||
<td>-</td>
|
||||
<td>range operator</td>
|
||||
<td>\- (only needed inside a character class)</td>
|
||||
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
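<p>Sometimes the forward slash (<code>/</code>) also has to be escaped, as <code>\/</code>. For example, this is needed in a JavaScript regex literal, where <code>/</code> marks the start and end of the pattern.</p>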
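<p>As a minimal sketch of why escaping matters, the JavaScript below searches a made-up receipt string (an assumption for illustration) with and without escaping <code>$</code> and <code>.</code>.</p>

```javascript
// Hypothetical sample text.
const receipt = "Total: $5.00 (was $6.50)";

// Unescaped: $ is an end-of-input anchor and . matches any character,
// so this pattern cannot match anywhere in the text.
const unescapedFound = /$5.00/.test(receipt);

// Escaped: \$ and \. match the literal characters.
const prices = receipt.match(/\$\d+\.\d+/g);

console.log(unescapedFound); // false
console.log(prices);         // [ '$5.00', '$6.50' ]
```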
|
||||
|
||||
<h2 id="bonusproblems">Bonus Problems</h2>
|
||||
|
||||
<ol>
|
||||
<li><p>Predict the output of the following regex:</p>
|
||||
|
||||
<p><strong>RegEx:</strong> <code>\b(0|(1(01*0)*1))*\b</code> <br>
|
||||
<strong>Text:</strong> <code>This RegEx denotes the set of binary numbers divisible by 3:
|
||||
0,11,1010, 1100, 1111, 1001</code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqu" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p>Find a regular expression to match whole lines in the text that contain apple, orange or grape as a word.</p>
|
||||
|
||||
<p>Hint: Use <code>^</code> and <code>$</code> to match whole lines. With the multiline modifier, they match at the start and end of each line.</p>
|
||||
|
||||
<p>Answer: <code>^.*\b(apple|orange|grape)\b.*$</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtra" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p>Find an expression which matches <code>cat.</code>, <code>896.</code>, <code>?=++</code> but not <code>abc1</code>.</p>
|
||||
|
||||
<p>Answer: <code>...\W</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu8e" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
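<p>The claim in the first problem, that the expression describes binary numbers divisible by 3, can be checked with a short JavaScript sketch. As an assumption for this check, the word boundaries <code>\b</code> are replaced by <code>^</code> and <code>$</code>, so that a whole string is tested rather than words inside a longer text.</p>

```javascript
// Anchored variant of the regex from problem 1 (^...$ replaces \b...\b).
const divisibleBy3 = /^(0|(1(01*0)*1))*$/;

// Compare the regex with plain arithmetic for the numbers 0..63.
const mismatches = [];
for (let n = 0; n < 64; n++) {
    const binary = n.toString(2);
    if (divisibleBy3.test(binary) !== (n % 3 === 0)) {
        mismatches.push(n);
    }
}

console.log(mismatches.length === 0 ? "all agree" : mismatches); // all agree
```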
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
132
Akash Articles/rendered/RegEx/Backreferencing.html
Normal file
@@ -0,0 +1,132 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
<h2 id="backreferencing">Backreferencing</h2>
|
||||
|
||||
<p>Backreferencing is used to match the same text again. A backreference matches the same text that was previously matched by a capturing group. Let's look at an example:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuap" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p><strong>Note:</strong> <code>\/</code> is the escaped <code>/</code> character; see the appendix for details.</p>
|
||||
|
||||
<p>The first capturing group is <code>(\w+)</code>. We can reuse it with the backreference <code>\1</code> in the closing tag, which matches the same text that the group <code>(\w+)</code> captured.</p>
|
||||
|
||||
<p>You can backreference any captured group by using <code>\group_no</code>.</p>
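<p>Here is a small JavaScript sketch of a backreference. The sentence and the pattern are illustrative assumptions, not the ones used in the playground: the pattern finds words that are accidentally typed twice in a row.</p>

```javascript
// Hypothetical sample text with two doubled words.
const sentence = "This is is a test test.";

// (\w+) captures a word; \1 must then match exactly the same text again.
const doubledWords = sentence.match(/\b(\w+) \1\b/g);

console.log(doubledWords); // [ 'is is', 'test test' ]
```

<p>Note that <code>\1</code> repeats the text that was captured, not the pattern <code>\w+</code>, so "is a" is not reported as a match.</p>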
|
||||
|
||||
<p>Let's have two more examples:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuas" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vube" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<h3 id="backreferencingandcharacterclass">Backreferencing and character class</h3>
|
||||
|
||||
<p>A backreference cannot be used inside a character class. Let's see an example:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu7p" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<h3 id="backreferencingandquantifiers">Backreferencing and Quantifiers</h3>
|
||||
|
||||
<p>We have to be careful when using a backreference to a group that involves quantifiers. Let's observe it:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuet" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Note that <code>(\d)+</code> and <code>(\d+)</code> are different. So, what will happen with the expression <code>(\d)+ -- \1</code> on the same text as above?</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu85" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Can you observe something?</p>
|
||||
|
||||
<p>For the expression <code>(\d)+ -- \1</code> and the string <code>123 -- 3</code>, the group first captures 1, then 2, and finally 3, each time overwriting the previous value of \1. So, it shows a match if and only if the last digit before <code>--</code> is exactly the same as the character after <code>--</code>.</p>
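<p>The difference between the two groupings can be sketched in JavaScript. As an assumption for this sketch, both patterns are anchored with <code>^</code> and <code>$</code>, so that the whole string has to match; without anchors, <code>(\d+) -- \1</code> would still find the substring <code>3 -- 3</code> inside <code>123 -- 3</code>.</p>

```javascript
// The group captures the whole number.
const wholeNumber = /^(\d+) -- \1$/;

// The group is repeated, so it keeps only the last digit it matched.
const lastDigit = /^(\d)+ -- \1$/;

console.log(wholeNumber.test("123 -- 123")); // true
console.log(wholeNumber.test("123 -- 3"));   // false
console.log(lastDigit.test("123 -- 3"));     // true
console.log(lastDigit.test("123 -- 123"));   // false

// The captured value of group 1 after the repeated group has finished.
console.log(lastDigit.exec("123 -- 3")[1]);  // 3
```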
|
||||
|
||||
<p><strong>Problems:</strong></p>
|
||||
|
||||
<ol>
|
||||
<li><p>Match any palindrome string of length 6, having only lowercase letters.
|
||||
Answer: <code>([a-z])([a-z])([a-z])\3\2\1</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuoc" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong>RegEx</strong>: <code>(\w+)oo\1le</code> <br>
|
||||
<strong>Text:</strong> <code>google, doodle jump, ggooggle, ssoosle</code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vubk" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
|
||||
|
||||
<p><strong>Note:</strong> For group numbers greater than 9, the backreference syntax differs between regex flavors, so check the documentation of the engine you are using.</p>
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
190
Akash Articles/rendered/RegEx/Groups_capturing.html
Normal file
@@ -0,0 +1,190 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
|
||||
<h2 id="groupscapturing">Groups & Capturing</h2>
|
||||
|
||||
<p>Grouping is one of the most useful features of regex. A group is created by placing a regular expression inside parentheses. In this article, we will see how to extract and replace data using groups. </p>
|
||||
|
||||
<p>A group combines the regular expressions inside it into a single unit. Let's look at its uses one by one:</p>
|
||||
|
||||
<ol>
|
||||
<li><p>It makes the regular expression more readable, and sometimes it is unavoidable.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuaa" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Suppose we want to match both the sentences in the above text. Then grouping is unavoidable.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuad" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p>To apply quantifiers to one or more expressions.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuag" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Similarly, you can use other quantifiers.</p></li>
|
||||
|
||||
<li><p>To extract and replace substrings using groups. We call such groups <strong>capturing groups</strong>, because we are capturing data (substrings) with them.</p>
|
||||
|
||||
<p><strong>Data Extraction</strong></p>
|
||||
|
||||
<p>Observe the code below.</p>
|
||||
|
||||
<pre><code class="js language-js">var str = "2020-01-20";

// Pattern string
var pattern = /(\d{4})-(\d{2})-(\d{2})/g;
//             ^       ^       ^
// group-no:   1       2       3

var result = pattern.exec(str);

// printing
console.log(result);

// Data extraction
console.log(result[1]); // First group
console.log(result[2]); // Second group
console.log(result[3]); // Third group
</code></pre>
|
||||
|
||||
The output will be:
|
||||
<pre><code class="js language-js">[
|
||||
'2020-01-20',
|
||||
'2020',
|
||||
'01',
|
||||
'20',
|
||||
index: 0,
|
||||
input: '2020-01-20',
|
||||
groups: undefined
|
||||
]
|
||||
2020
|
||||
01
|
||||
20
|
||||
|
||||
</code></pre>
|
||||
|
||||
<p>In the output array, the first element is the full matched string, followed by the captured groups in order.</p>
|
||||
|
||||
<p><strong>Data Replacement</strong></p>
|
||||
|
||||
<p><code>replace</code> is a string method that can be used to replace and rearrange data using a regex. Observe the code below.</p>
|
||||
|
||||
<pre><code class="js language-js">var str = "2020-01-20";

// Pattern string
var pattern = /(\d{4})-(\d{2})-(\d{2})/g;
//             ^       ^       ^
// group-no:   1       2       3

// Data replacement using $group_no
var ans = str.replace(pattern, '$3-$2-$1');

console.log(ans);
// Output will be: 20-01-2020
</code></pre>
|
||||
|
||||
<p>As you can see, we have used <code>$group_no</code> to indicate the capturing group.</p></li>
|
||||
</ol>
|
||||
|
||||
<h2 id="problems">Problems</h2>
|
||||
|
||||
<p>Predict the output of the following regex:</p>
|
||||
|
||||
<ol>
|
||||
<li><p><strong>RegEx</strong>: <code>([abc]){2,}(one|two)</code> <br>
|
||||
<strong>Text</strong>:
|
||||
<code>aone
|
||||
cqtwo
|
||||
abone
|
||||
actwo
|
||||
abcbtwoone
|
||||
abbcccone
|
||||
</code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuaj" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong>RegEx</strong>: <code>([\dab]+(r|c)){2}</code> <br>
|
||||
<strong>Text</strong>:
|
||||
<code>1r2c
|
||||
ar4ccc
|
||||
12abr12abc
|
||||
acac, accaca, acaaca
|
||||
aaar1234234c, aaa1234234c
|
||||
194brar, 134bcbb-c </code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuam" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
122
Akash Articles/rendered/RegEx/Intro_Basic.html
Normal file
@@ -0,0 +1,122 @@
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
<h2 id="regularexpressionregex">Regular Expression (RegEx)</h2>
|
||||
<p>While filling in online forms, haven't you come across errors like "Please enter valid email address" or "Please enter valid phone number"?</p>
|
||||
<p>Annoying as they may be, there's a lot of black magic that the computer does before it determines that the details you've entered are incorrect.</p>
|
||||
<p>Can you figure out what that black magic is? If you are familiar with algorithms, you will say that we can write an algorithm for it.</p>
|
||||
<p>Yes, we can write an algorithm to verify different things, but we have a standard tool designed for similar kinds of purposes.</p>
|
||||
<p>It is <strong>Regular Expression</strong>. We call it <strong>RegEx</strong> for short. RegEx makes our work a lot easier. Let's see some basic examples where RegEx becomes handy.</p>
|
||||
<p>Suppose you want to find the average price of a particular product on Amazon. The following regular expression finds any price with a decimal part (e.g. <code>$75.50</code>) on the webpage: <code>\$([0-9]+)\.([0-9]+)</code>. To also match whole-dollar prices such as <code>$12</code>, the decimal part can be made optional: <code>\$([0-9]+)(\.[0-9]+)?</code>. In the image below, the yellowish part shows the matched prices.</p>
|
||||
<p><img src="https://lh3.googleusercontent.com/zH7_h0NJ9s5Kf22US-NJDnXL7-QqPXrHc45dc8pCITaIWqJcupyoBuh-ukQiCRCpQQgAIe0UIpBL=s1000" alt="enter image description here" /></p>
|
||||
<p>Quite interesting!</p>
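<p>As a rough sketch of the price search in JavaScript, the snippet below runs the price pattern on a made-up product list (the text is an assumption, not taken from a real page). It also shows a variant with an optional decimal part, because the original pattern requires a <code>.</code> followed by digits.</p>

```javascript
// Hypothetical page text.
const page = "Mug $12, Lamp $75.50, Desk $120.00";

// The pattern from the article: a decimal part is required.
const withDecimals = page.match(/\$([0-9]+)\.([0-9]+)/g);

// Variant with an optional decimal part, so whole-dollar prices match too.
const allPrices = page.match(/\$[0-9]+(\.[0-9]+)?/g);

console.log(withDecimals); // [ '$75.50', '$120.00' ]
console.log(allPrices);    // [ '$12', '$75.50', '$120.00' ]
```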
|
||||
<p>Let's look at another example. You have a long list of documents with different kinds of extensions. You are particularly looking for data files having <strong>.dat</strong> extension. </p>
|
||||
<p><code>^.*\.dat$</code> is a regular expression that represents the set of strings ending with <strong>.dat</strong>. Regular expressions are a standardized way to encode such patterns. </p>
|
||||
<p>Below in the image, you can see that all three files having .dat extension are extracted from the list of five files.</p>
|
||||
<p><img src="https://lh3.googleusercontent.com/6tFqDs2lLl9h2GiDtPrKZxPGk2hsaykfNwzLQLm1bhW1fOw16uTALMwCKfsKEsnXi5v_hm9NgI23=s1000" alt="enter image description here" /></p>
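<p>The same filtering can be sketched in JavaScript. The file names below are hypothetical and only stand in for the list shown in the image.</p>

```javascript
// Hypothetical file list: three .dat files out of five.
const files = ["report.dat", "notes.txt", "data.dat", "image.png", "backup.dat"];

// Keep only the names that match the pattern from the text.
const datFiles = files.filter((name) => /^.*\.dat$/.test(name));

console.log(datFiles); // [ 'report.dat', 'data.dat', 'backup.dat' ]
```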
|
||||
<p>So what does the name <strong>Regular Expression (RegEx)</strong> mean? A regular expression is a sequence of characters that defines a search pattern.</p>
|
||||
<p>RegEx is a standardized tool for the following tasks:</p>
|
||||
<ol>
|
||||
<li>Find and verify patterns in a string.</li>
|
||||
<li>Extract particular data present in the text.</li>
|
||||
<li>Replace, split and rearrange particular parts of a string.</li>
|
||||
</ol>
|
||||
<p>We are going to look at all the three things above.</p>
|
||||
<p>Let's begin the journey with RegEx!</p>
|
||||
<p><strong>Note:</strong> </p>
|
||||
<ol>
|
||||
<li>An <strong>alphanumeric character</strong> belongs to one of the ranges 0-9, A-Z or a-z.</li>
|
||||
<li>String is a sequence of characters and substring is a contiguous part of a string.</li>
|
||||
</ol>
|
||||
<p><strong>In this article series, we are going to show all examples using a live interactive playground, so that you can play with regex.</strong> You can activate a playground by just clicking on it.</p>
|
||||
<h2 id="simplealphanumericcharactermatching">Simple Alpha-numeric character matching</h2>
|
||||
<p>Simple matching of a specific word can be done as follows:</p>
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4v1lb" class="embed"></iframe>
|
||||
</div>
|
||||
<p>As you can see it matches "Reg" in the text. Similarly, what will be the match for "Ex" in the same text above?</p>
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtfd" class="embed"></iframe>
|
||||
</div>
|
||||
<p>Do you notice anything? The matching is <strong>case sensitive</strong>.</p>
|
||||
<h2 id="implementation">Implementation</h2>
|
||||
<p>Most programming languages have libraries for RegEx, with broadly similar syntax. Here, we will see how to use it in <strong>JavaScript</strong>.</p>
|
||||
<p>Below is basic JavaScript code showing how to use a regex. Patterns are written between slashes, as in <code>/_____/g</code>, where <code>g</code> is a modifier that finds all matches rather than stopping at the first match.</p>
|
||||
<p>The function <strong>exec</strong> returns null if there is no match, and the match data otherwise.</p>
|
||||
<pre><code class="js language-js">var text_to_search_in = "RegEx stands for Regular Expression!";

var pattern = /Reg/g;
var result;

// Print the match data for every match across the whole string.
// With the g modifier, each exec call resumes after the previous match
// and returns null once there are no more matches.
while ((result = pattern.exec(text_to_search_in)) !== null) {
    console.log(result);
}
</code></pre>
|
||||
The output will be:
|
||||
<pre><code class="js language-js">[
|
||||
'Reg',
|
||||
index: 0,
|
||||
input: 'RegEx stands for Regular Expression!',
|
||||
groups: undefined
|
||||
]
|
||||
[
|
||||
'Reg',
|
||||
index: 17,
|
||||
input: 'RegEx stands for Regular Expression!',
|
||||
groups: undefined
|
||||
]
|
||||
</code></pre>
|
||||
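<p><strong>Note:</strong> The <code>groups</code> entry in the above output refers to groups, a RegEx concept. It is <code>undefined</code> here because the pattern defines no named groups.</p>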
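<p>To see both outcomes of <code>exec</code> and the case sensitivity mentioned earlier, here is a short sketch on the same text. The variable names are illustrative only.</p>

```javascript
const text = "RegEx stands for Regular Expression!";

// "Ex" occurs in the text, so exec returns match data.
const upper = /Ex/.exec(text);

// "ex" in lowercase does not occur anywhere, so exec returns null.
const lower = /ex/.exec(text);

console.log(upper.index); // 3
console.log(lower);       // null
```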
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
158
Akash Articles/rendered/RegEx/Practial_applications.html
Normal file
@@ -0,0 +1,158 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
|
||||
|
||||
table {
|
||||
border:1px solid black;
|
||||
border-collapse: collapse;
|
||||
min-width:60%;
|
||||
}
|
||||
|
||||
th, td{
|
||||
border:1px solid black;
|
||||
border-collapse: collapse;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
tr:nth-child(even) {background-color: #f2f2f2;}
|
||||
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
<h2 id="practicalapplicationsofregex">Practical Applications of RegEx</h2>
|
||||
|
||||
<ol>
|
||||
<li>Syntax highlighting systems</li>
|
||||
|
||||
<li>Data scraping and wrangling</li>
|
||||
|
||||
<li>Find-and-replace features of text editors</li>
|
||||
</ol>
|
||||
|
||||
<p>Let's look at some classic examples of RegEx.</p>
|
||||
|
||||
<h2 id="numberranges">Number Ranges</h2>
|
||||
|
||||
<p>Can you find a regex matching all integers from 0 to 255?</p>
|
||||
|
||||
<p>First, let's look at how we can match all integers from 0 to 59:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vubt" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>As you can see, we have used the <code>?</code> quantifier to make the first digit (0-5) optional. Now, can you solve it for 0-255?</p>
|
||||
|
||||
<p>Hint: Use the OR operator.</p>
|
||||
|
||||
<p>We can divide the range 0-255 into three ranges: 0-199, 200-249 and 250-255. Creating an expression for each of them independently is easy.</p>
|
||||
|
||||
<table>
|
||||
<tr>
|
||||
<td>Range</td>
|
||||
<td>RegEx</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>0-199</td>
|
||||
<td>[01][0-9][0-9]</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>200-249</td>
|
||||
<td>2[0-4][0-9]</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>250-255</td>
|
||||
<td>25[0-5]</td>
|
||||
</tr>
|
||||
</table>
|
||||
|
||||
<p>Now, by using the OR operator, we can match the whole 0-255 range.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuc0" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>As you can see, the above regex matches 000 but not 0. So, how can you modify the regex so that it also matches 0 and 1, rather than only the zero-padded forms 000 and 001?</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuc3" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>We have just used the <code>?</code> quantifier to make the leading digits optional.</p>
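<p>The same idea can be checked in code. The sketch below is a minimal example, assuming one possible way to combine the three sub-ranges with the OR operator; the exact expression and the sample values are assumptions and may differ from the embedded demo. The <code>^</code> and <code>$</code> anchors make the pattern cover the whole string.</p>

```javascript
// Assumed combined pattern for 0-255 (leading zeroes allowed, as in 000 or 042).
const byteRange = /^(25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])$/;

// Illustrative sample values, not taken from the article.
const samples = ["0", "000", "7", "42", "199", "249", "255", "256", "1000"];

// Keep only the strings that the pattern accepts as a whole.
const accepted = samples.filter((s) => byteRange.test(s));

console.log(accepted);
```

<p>Here "256" and "1000" are rejected because no alternative can cover the whole string.</p>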
|
||||
|
||||
<h2 id="validateanipaddress">Validate an IP address</h2>
|
||||
|
||||
<p>An IP address consists of four numbers, each from 0 to 255, separated by three dots (<code>.</code>). The valid IP address format is <br> (0-255).(0-255).(0-255).(0-255).</p>
|
||||
|
||||
<p>For example, 10.10.11.4, 255.255.255.255, 234.9.64.43 and 1.2.3.4 are valid IP addresses.</p>
|
||||
|
||||
<p>Can you find a regex to match an IP-address?</p>
|
||||
|
||||
<p>We have already seen how to match number ranges, and to match a dot we use an escaped dot (<code>\.</code>). But in an IP address, we don't allow numbers with leading zeroes, like 001.</p>
|
||||
|
||||
<p>So, we have to divide the range into four sub-ranges: 0-99, 100-199, 200-249 and 250-255. Finally, we combine them with the OR operator.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuc9" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>So, the regex to match an IP address is as below:
<img src="https://lh3.googleusercontent.com/n2CaC-8Q8NH-H5RkDrCM4AQQkV2PamIAlA3dwljdRsW33WWoj18qJEIN5iyzjLzfifHj-dh-IW-u=s1600" alt="RegEx to match an IP address" /></p>
|
||||
|
||||
<p><strong>Note:</strong> The whole expression is contiguous; it is shown this way only for the sake of easy understanding.</p>
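<p>A minimal sketch of the same approach in JavaScript is given here. It builds one expression for a single number from the four sub-ranges and repeats it four times with escaped dots in between. The names <code>octet</code>, <code>ipAddress</code> and <code>isValidIp</code> are hypothetical, and the expression is one possible spelling rather than a copy of the one in the image.</p>

```javascript
// One number without leading zeroes: 250-255 | 200-249 | 100-199 | 0-99
const octet = "(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";

// Four numbers separated by escaped dots, anchored to the whole string.
const ipAddress = new RegExp(`^${octet}\\.${octet}\\.${octet}\\.${octet}$`);

// Hypothetical helper: returns true when the whole string is a valid IP address.
function isValidIp(candidate) {
  return ipAddress.test(candidate);
}

console.log(isValidIp("234.9.64.43")); // valid example from the text
console.log(isValidIp("001.2.3.4"));   // leading zeroes are rejected
```

<p>Building the expression from a single <code>octet</code> string keeps the four repeated parts identical and easier to read.</p>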
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
158
Akash Articles/rendered/RegEx/Quantifier.html
Normal file
@@ -0,0 +1,158 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
|
||||
<h2 id="quantifiersrepetition">Quantifiers (Repetition)</h2>
|
||||
|
||||
<p>To match 3-digit patterns, we can use <code>[0-9][0-9][0-9]</code>. What if we have n-digit patterns? We would have to write <code>[0-9]</code> n times, which is tedious. This is where quantifiers come to help.</p>
|
||||
|
||||
<ol>
|
||||
<li><p><strong>Limiting repetitions (<code>{min,max}</code>):</strong> To match n-digit patterns, we can simply write <code>[0-9]{n}</code>. By providing minimum and maximum values instead of n, as in <code>[0-9]{min,max}</code>, we can match a pattern repeating min to max times. Do not put a space after the comma: in JavaScript, <code>{min, max}</code> is not treated as a quantifier.</p>
|
||||
|
||||
<p>Let's see an example that matches all numbers from 1 to 999.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtp2" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p><strong>Note:</strong> If you omit the upper bound (<code>{min,}</code>), there is no limit on the maximum number of repetitions.</p>
|
||||
</li>
|
||||
|
||||
<li><p><strong><code>+</code> quantifier:</strong> It is equivalent to <code>{1,}</code>, i.e. at least one occurrence.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtp5" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong><code>*</code> quantifier:</strong> It is equivalent to <code>{0,}</code>, i.e. zero or more occurrences.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtpb" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong><code>?</code> quantifier:</strong> It is equivalent to <code>{0,1}</code>, i.e. either zero or one occurrence. <code>?</code> is very useful for optional parts of a pattern.</p>
|
||||
|
||||
<p>Let's see an example to match negative and positive numbers.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtph" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
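<p>The four quantifiers can be compared side by side in a short script. This is a minimal sketch; the sample strings and variable names are assumptions chosen for illustration.</p>

```javascript
const digits = "2020";

// {min,max}: greedy, so it takes as many digits as allowed (at most 3 here).
const upToThree = digits.match(/[0-9]{1,3}/)[0];

// {n}: exactly two digits.
const exactlyTwo = digits.match(/[0-9]{2}/)[0];

// {min,}: no upper limit.
const atLeastTwo = digits.match(/[0-9]{2,}/)[0];

// ?: an optional minus sign, followed by + (one or more digits).
const signedNumbers = "7 -15 +3 2020".match(/-?[0-9]+/g);

// * allows zero b's, + requires at least one.
const candidates = ["ac", "abc", "abbbc"];
const zeroOrMore = candidates.filter((s) => /ab*c/.test(s));
const oneOrMore = candidates.filter((s) => /ab+c/.test(s));

console.log(upToThree, exactlyTwo, atLeastTwo);
console.log(signedNumbers, zeroOrMore, oneOrMore);
```

<p>Note that <code>+3</code> is reported as <code>3</code>, because the pattern only allows an optional minus sign.</p>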
|
||||
|
||||
<p><strong>Now, you may be wondering how to match characters like <code>*, ?, +, {, }</code> in the text, since they are special characters. Check it out in the appendix.</strong></p>
|
||||
|
||||
<h3 id="problems">Problems</h3>
|
||||
|
||||
<ol>
|
||||
<li><p>Find a regex to match positive integers or floating-point numbers with exactly two digits after the decimal point.</p>
|
||||
|
||||
<p>Answer: <code>\d+(\.\d\d)?</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtpk" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p>Predict the output of the following regex:<br>
|
||||
<strong>RegEx</strong>: <code>[abc]{2,}</code> <br>
|
||||
<strong>Text</strong>:
|
||||
<code>aaa
|
||||
abc
|
||||
abbccc
|
||||
avbcc
|
||||
</code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtpt" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
|
||||
|
||||
<h3 id="natureofquantifiers">Nature of Quantifiers</h3>
|
||||
|
||||
<p>An HTML element is written as <code>&lt;tag_name&gt;some text&lt;/tag_name&gt;</code>. For example, <code>&lt;title&gt;Regular expression&lt;/title&gt;</code></p>
|
||||
|
||||
<p>So, can you figure out an expression that will match both <code>&lt;tag_name&gt;</code> and <code>&lt;/tag_name&gt;</code>?</p>
|
||||
|
||||
<p>Most people will say it is <code>&lt;.*&gt;</code>. But it gives a different result.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtq0" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
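<p>Rather than matching only up to the first <code>&gt;</code>, it matches the whole element, from the first <code>&lt;</code> to the last <code>&gt;</code>. This happens because quantifiers are greedy by default: they match as many characters as possible. This behaviour is called <strong>greediness</strong>.</p>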
|
||||
|
||||
<p>To solve this issue, we put <code>?</code> right after the quantifier, as in <code>.*?</code>; this is called <strong>lazy matching</strong>. We will discuss it next.</p>
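<p>The difference between the greedy and the lazy form can be seen with a short script. This is a minimal sketch using the title example from above; the variable names are assumptions.</p>

```javascript
const html = "<title>Regular expression</title>";

// Greedy: .* runs to the last ">" in the text, so there is one long match.
const greedy = html.match(/<.*>/g);

// Lazy: .*? stops at the first ">" it can, so each tag is matched separately.
const lazy = html.match(/<.*?>/g);

console.log(greedy); // [ '<title>Regular expression</title>' ]
console.log(lazy);   // [ '<title>', '</title>' ]
```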
|
||||
|
||||
<p>Predict the output of the following regex: <br>
|
||||
<strong>RegEx</strong>: <code>(var|let)\s[a-zA-Z0-9_]\w* =\s"?\w+"?;</code> <br>
|
||||
<strong>Text</strong>:
|
||||
<code>var carname = "volvo";
|
||||
console.log(carname);
|
||||
let age = 8;
|
||||
var date = "23-03-2020";</code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtq3" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
178
Akash Articles/rendered/RegEx/boundary_matchers.html
Normal file
@@ -0,0 +1,178 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
|
||||
<h2 id="boundarymatchers">Boundary Matchers</h2>
|
||||
|
||||
<p>Now, we will learn how to match patterns at specific positions, such as before, after or between some characters. For this purpose we use special characters like <code>^</code>, <code>$</code>, <code>\b</code> and <code>\B</code>, <code>\A</code>, <code>\z</code> and <code>\Z</code>, which are known as anchors.</p>
|
||||
|
||||
<p><strong>Notes:</strong> </p>
|
||||
|
||||
<ul>
|
||||
<li><p>A line is a string that ends at a line break, i.e. the newline character <code>\n</code>.</p></li>
|
||||
|
||||
<li><p>There is a slight change in the JavaScript code we have been using so far. Instead of <code>/____/g</code>, we will now use <code>/____/gm</code>. The modifier <code>m</code> performs a multiline search, so that <code>^</code> and <code>$</code> match at the start and end of every line. Notice it in the next images!</p></li>
|
||||
|
||||
<li><p>A word character is any character in <code>[A-Za-z0-9_]</code>.</p></li>
|
||||
</ul>
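<p>The effect of the <code>m</code> modifier can be checked with a small script. This is a minimal sketch with an assumed two-line sample text; here <code>^</code> stands for the start of a line and <code>$</code> for the end of a line.</p>

```javascript
// Assumed sample text with two lines separated by \n.
const text = "first line\nsecond line";

// Without m, ^ matches only at the very start of the whole text.
const withoutM = text.match(/^\w+/g);

// With m, ^ also matches right after every line break.
const withM = text.match(/^\w+/gm);

// With m, $ matches at the end of every line as well.
const lineEnds = text.match(/\w+$/gm);

console.log(withoutM); // [ 'first' ]
console.log(withM);    // [ 'first', 'second' ]
console.log(lineEnds); // [ 'line', 'line' ]
```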
|
||||
|
||||
<ol>
|
||||
<li><p><strong>Anchor <code>^</code></strong>: It is used to match patterns at the very start of a line.
|
||||
For example,</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuof" class="embed"></iframe>
|
||||
</div>

<p>It shows a match only if the pattern occurs at the start of a line.</p></li>
|
||||
|
||||
|
||||
<li><p><strong>Anchor <code>$</code></strong>: Similarly, <code>$</code> is used to match patterns at the very end of a line.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuol" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>It shows a match only if the pattern occurs at the end of a line.</p>
|
||||
|
||||
<p>An example using both <code>^</code> and <code>$</code>:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtsb" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
|
||||
<li><p><strong>Anchors <code>\b</code> and <code>\B</code></strong>: <code>\b</code> is called the <strong>word boundary character</strong>. </p>
|
||||
|
||||
<p>Below is a list of positions that qualify as a <strong>boundary</strong> for <code>\b</code>.
If the regex pattern ends (or starts) with:</p>
|
||||
<ul>
|
||||
<li>A word character, then the boundary is that word character itself. Let's call it a <strong>word boundary</strong>.</li>
|
||||
|
||||
<li>A non-word character, then the boundary is the word character next to it. Let's call it a <strong>non-word boundary</strong>.</li></ul>
|
||||
<p>So, in short, <code>\b</code> matches only where a word character meets a non-word character (or the start or end of the text), which is why it is called the "word boundary character".</p>
|
||||
|
||||
<p>Let's first observe some examples to understand how it works:</p></li>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu99" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>What did you observe? Our regex pattern starts and ends with a word character. So, a match occurs only if there is a substring that starts and ends at word characters, which in our regex are <code>[a-z]</code> and <code>\d</code> respectively.</p>
|
||||
|
||||
<p>Now, let's look at one more example.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu9c" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Here <code>\+</code> matches a literal <code>+</code>; check it out in the appendix.</p>
|
||||
|
||||
<p>What did you observe? <br>
|
||||
<strong>First observation:</strong> Our pattern starts with a non-word character and ends with a word character. So, a match occurs only if there is a substring with a non-word boundary at the start and a word boundary at the end.</p>
|
||||
|
||||
<p><strong>Second observation:</strong> A non-word character after a word boundary does not affect the result. </p>
|
||||
|
||||
<p><code>\b</code> need not be used in pairs. You can use a single <code>\b</code>. </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu9o" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p><code>\B</code> is just the complement of <code>\b</code>. <code>\B</code> matches at every position that is not a word boundary. Observe the two examples below:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu9r" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu9u" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
</ol>
|
||||
<p><strong>Note:</strong> <code>\A</code>, <code>\z</code> and <code>\Z</code> are other anchors: <code>\A</code> matches at the very start of the input text, while <code>\z</code> and <code>\Z</code> match at its very end. They are not supported in JavaScript.</p>
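<p>The behaviour of <code>\b</code> and <code>\B</code> can also be verified in code. This is a minimal sketch; the sample sentence and variable names are assumptions chosen so that "cat" appears as a whole word, at the start of a word, and at the end of a longer word.</p>

```javascript
// Assumed sample text.
const text = "cat concat cats category";

// \b on both sides: only the standalone word "cat".
const wholeWord = text.match(/\bcat\b/g);

// A single \b: every word that starts with "cat".
const startsWithCat = text.match(/\bcat\w*/g);

// \B in front: "cat" must be preceded by another word character,
// which is true only for the "cat" at the end of "concat".
const insideWordAt = text.search(/\Bcat\b/);

console.log(wholeWord);     // [ 'cat' ]
console.log(startsWithCat); // [ 'cat', 'cats', 'category' ]
console.log(insideWordAt);  // 7
```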
|
||||
|
||||
<p>Predict the output of the following regex:</p>
|
||||
|
||||
<ol>
|
||||
<li><strong>RegEx:</strong> <code>^[\w…</code> <br>
|
||||
<strong>Text:</strong>
|
||||
<code>This is matching passwords of length between 6 to 18:
|
||||
Abfah45$
|
||||
gadfaJ%33
|
||||
Abjapda454&1 spc
|
||||
bjaphgu12$
|
||||
</code><br>
|
||||
Answer:
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vua4" class="embed"></iframe>
|
||||
</div>
|
||||
</li>
|
||||
|
||||
<li><strong>RegEx:</strong> <code>\b\w+:\B</code> <br>
|
||||
<strong>Text:</strong> <code>1232: , +1232:, abc:, abc:a, abc89, (+abc::)</code><br>
|
||||
Answer:
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vua7" class="embed"></iframe>
|
||||
</div>
|
||||
</li>
|
||||
</ol>
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
233
Akash Articles/rendered/RegEx/character_classes.html
Normal file
@@ -0,0 +1,233 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
<h2 id="characterclasses">Character classes</h2>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtj8" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>What if you want to match both "soon" and "moon" or basically words ending with "oon"?</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtjb" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>What did you observe? You can see that adding <code>[sm]</code> matches both "soon" and "moon". Here <code>[sm]</code> is called a character class, which is basically a list of characters we want to match.</p>
|
||||
|
||||
<p>More formally, <code>[abc]</code> is basically 'either a or b or c'.</p>
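<p>The same character class can be tried in a short script. This is a minimal sketch; the sample sentence is an assumption.</p>

```javascript
// Assumed sample sentence containing "moon", "soon" and "noon".
const text = "The moon will rise soon, around noon.";

// [sm] accepts exactly one character, either "s" or "m", before "oon".
const matches = text.match(/[sm]oon/g);

console.log(matches); // [ 'moon', 'soon' ]
```

<p>"noon" is not matched, because "n" is not listed in the character class.</p>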
|
||||
|
||||
<p>Predict the output of the following:</p>
|
||||
|
||||
<ol>
|
||||
<li><p><strong>RegEx:</strong> <code>[ABC][12]</code> <br>
|
||||
<strong>Text</strong>: <code>A1 grade is the best, but I scored A2.</code></p>
|
||||
|
||||
<p>Answer:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtk3" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong>RegEx:</strong> <code>[0123456789][12345]:[abcdef][67890]:[1234589][abcdef]</code><br>
|
||||
<strong>Text</strong>: <code>Let's match 14:f6:3c mac address type of pattern.
|
||||
Other patterns are 51:a6:c5, 44:t6:3d, 72:c8:8e.</code></p>
|
||||
|
||||
<p>Answer:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtkf" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
|
||||
|
||||
<h3 id="negation">Negation</h3>
|
||||
|
||||
<p>Now, if we put <code>^</code> as the first character inside the brackets, the class matches any character other than the ones listed in the brackets.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtl1" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Predict the output for the following:</p>
|
||||
|
||||
<p><strong>RegEx:</strong> <code>[^13579]A[^abc]z3[590*-]</code>
|
||||
<br> <strong>Text</strong>: <code>1Abz33 will match or 2Atz30 and 8Adz3*.</code></p>
|
||||
|
||||
<p>Answer:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtl7" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Writing out every character (like <code>[0123456789]</code> or <code>[abcd]</code>) is slow and error-prone. What is the shortcut?</p>
|
||||
|
||||
<h2 id="ranges">Ranges</h2>
|
||||
|
||||
<p>Ranges make our work easier. Consecutive characters can be included in a character class using the dash operator. For example, the digits from 0 to 9 can be written simply as <code>0-9</code>. Similarly, <code>abcdef</code> can be replaced by <code>a-f</code>.</p>
|
||||
|
||||
<p>Examples: <code>456</code> --> <code>4-6</code>, <code>abc3456</code> --> <code>a-c3-6</code>, <code>c367980</code> --> <code>c36-90</code>.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtld" class="embed"></iframe>
|
||||
</div>
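<p>Ranges and negation can be combined and checked in code. This is a minimal sketch with an assumed sample string; it also shows that a reversed range is rejected when the expression is created.</p>

```javascript
// Assumed sample text: letter-digit pairs separated by spaces.
const text = "a1 b7 d4 e2 c9";

// A letter from a to d followed by a digit from 1 to 5.
const inRange = text.match(/[a-d][1-5]/g);

// Any character except a to d, followed by a digit from 1 to 5.
const negated = text.match(/[^a-d][1-5]/g);

// A reversed range cannot be compiled; the engine throws an error.
let reversedRangeError = null;
try {
  new RegExp("[9-0]");
} catch (err) {
  reversedRangeError = err.name;
}

console.log(inRange);            // [ 'a1', 'd4' ]
console.log(negated);            // [ 'e2' ]
console.log(reversedRangeError); // SyntaxError
```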
|
||||
|
||||
<p>Predict the output of the following regex:</p>
|
||||
|
||||
<ol>
|
||||
<li><p><strong>RegEx:</strong> <code>[a-d][^l-o][12][^5-7][l-p]</code>
|
||||
<br> <strong>Text</strong>: co13i, ae14p, eo30p, ce33l, dd14l.</p>
|
||||
|
||||
<p>Answer:</p>
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtlj" class="embed"></iframe>
|
||||
|
||||
</div>
|
||||
<p><strong>Note:</strong> If you write a range in reverse order (e.g. <code>9-0</code>), it is an error.</p></li>
|
||||
<li><strong>RegEx:</strong> <code>[a-zB-D934][A-Zab0-9]</code><br>
|
||||
<strong>Text:</strong> t9, da, A9, zZ, 99, 3D, aCvcC9.
|
||||
Answer:
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtlm" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
|
||||
|
||||
<h2 id="predefinedcharacterclasses">Predefined Character Classes</h2>
|
||||
|
||||
<p>Some character classes are used so frequently that there are shorthand notations defined for them. Let's see one by one.</p>
|
||||
|
||||
<ol>
|
||||
<li><p><strong><code>\w</code> and <code>\W</code></strong>: <code>\w</code> is just a short form of the character class <code>[A-Za-z0-9_]</code>. <code>\w</code> is called the word character class.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtls" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p><code>\W</code> is equivalent to <code>[^\w]</code>. <code>\W</code> matches everything other than word characters.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtm2" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong><code>\d</code> and <code>\D</code></strong>: <code>\d</code> matches any digit character. It is equivalent to the character class <code>[0-9]</code>.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtm5" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p><code>\D</code> is equivalent to <code>[^\d]</code>. <code>\D</code> matches everything other than digits.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtmk" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong><code>\s</code> and <code>\S</code></strong>: <code>\s</code> matches whitespace characters. Tab (<code>\t</code>), newline (<code>\n</code>) and space (<code> </code>) are whitespace characters. They are often called non-printable characters.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtmn" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Similarly, <code>\S</code> is equivalent to <code>[^\s]</code>. <code>\S</code> matches everything other than whitespace characters.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtmq" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p><strong>dot (<code>.</code>)</strong>: Dot matches any character except <code>\n</code> (the line-break or new-line character) and <code>\r</code> (the carriage-return character). Dot (<code>.</code>) is known as a <strong>wildcard</strong>.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtmt" class="embed"></iframe>
|
||||
</div></li>
|
||||
</ol>
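<p>The predefined classes can be compared on one sample string. This is a minimal sketch; the sample text and variable names are assumptions.</p>

```javascript
// Assumed sample text containing two spaces and one tab (\t).
const text = "Room 42, floor_3\tEast";

// \w+ : runs of letters, digits and underscores.
const words = text.match(/\w+/g);

// \d+ : runs of digits.
const numbers = text.match(/\d+/g);

// \s : single whitespace characters.
const whitespace = text.match(/\s/g);

// The dot skips the line break between "a" and "b".
const dotMatches = "a\nb".match(/./g);

console.log(words);             // [ 'Room', '42', 'floor_3', 'East' ]
console.log(numbers);           // [ '42', '3' ]
console.log(whitespace.length); // 3
console.log(dotMatches);        // [ 'a', 'b' ]
```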
|
||||
|
||||
<p><strong>Note:</strong> Windows-style line endings consist of <code>\r</code> followed by <code>\n</code> (<code>\r\n</code>).</p>
|
||||
|
||||
<h3 id="problems">Problems</h3>
|
||||
|
||||
<ol>
|
||||
<li><p>Predict the output of the following regex:
|
||||
<strong>RegEx:</strong> <code>[01][01][0-1]\W\s\d</code>
|
||||
<br> <strong>Text</strong>: <code>Binary to decimal data: 001- 1, 010- 2, 011- 3, a01- 4, 100- 4.</code></p>
|
||||
|
||||
<p>Answer: </p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtn0" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
|
||||
<li><p>Write a regex to match 28th February of any year. Date is in dd-mm-yyyy format.</p>
|
||||
|
||||
<p>Answer: <code>28-02-\d\d\d\d</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtn3" class="embed"></iframe>
|
||||
</div></li>
|
||||
|
||||
<li><p>Write a regex to match dates that are not in March. Assume that the dates are valid and that the separator is not fixed, i.e. a date can be written as dd.mm.yyyy, dd\mm\yyyy or dd/mm/yyyy.</p>
|
||||
|
||||
<p>Answer: <code>\d\d\W[10][^3]\W\d\d\d\d</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtn9" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>Note that the above regex also matches strings with mixed separators, such as dd-mm.yyyy or dd/mm\yyyy. This can be prevented with backreferencing, a regex concept that lets a later part of the pattern require the same text that an earlier group matched.</p></li>
|
||||
</ol>
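<p>As a rough sketch of the backreferencing idea mentioned in the last problem, the first separator can be captured in a group and reused with <code>\1</code>. The anchors <code>^</code> and <code>$</code>, the helper name <code>isNonMarchDate</code> and the sample dates are assumptions added for this illustration.</p>

```javascript
// (\W) captures the first separator; \1 requires the second separator
// to be exactly the same character.
const notMarch = /^\d\d(\W)[10][^3]\1\d\d\d\d$/;

// Hypothetical helper, used only to keep the checks below short.
function isNonMarchDate(date) {
  return notMarch.test(date);
}

console.log(isNonMarchDate("12.04.2020"));   // true: same separator, April
console.log(isNonMarchDate("25-12-1999"));   // true: same separator, December
console.log(isNonMarchDate("12\\05\\2020")); // true: backslash used twice
console.log(isNonMarchDate("12-04.2020"));   // false: mixed separators
console.log(isNonMarchDate("12/03/2020"));   // false: March
```

<p>Like the original answer, this still relies on the dates being valid, because <code>[^3]</code> accepts any character other than <code>3</code>.</p>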
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
147
Akash Articles/rendered/RegEx/lazy_matching.html
Normal file
@@ -0,0 +1,147 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
|
||||
|
||||
table {
|
||||
border:1px solid black;
|
||||
border-collapse: collapse;
|
||||
min-width:60%;
|
||||
}
|
||||
|
||||
th, td{
|
||||
border:1px solid black;
|
||||
border-collapse: collapse;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
tr:nth-child(even) {background-color: #f2f2f2;}
|
||||
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
|
||||
<h2 id="lazymatching">Lazy matching:</h2>
|
||||
|
||||
<p>As we have seen, quantifiers are greedy by default: they match as many characters as possible.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqc" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>To make a quantifier lazy, we put a <code>?</code> right after it. The regex engine then matches as few characters as possible while still satisfying the expression.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqf" class="embed"></iframe>
|
||||
</div>
|
||||
|
||||
<p>The table below shows the lazy version of each quantifier:</p>
|
||||
<table>
|
||||
<tr>
|
||||
<td>Quantifier</td>
|
||||
<td>Lazy version</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>{n,m}</td>
|
||||
<td>{n,m}?</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>{n,}</td>
|
||||
<td>{n,}?</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>+</td>
|
||||
<td>+?</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>*</td>
|
||||
<td>*?</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>?</td>
|
||||
<td>??</td>
|
||||
</tr>
|
||||
</table>
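<p>Each row of the table can be tried on a small input. The snippet below is a minimal sketch; the text <code>aaaa</code> and the names <code>pairs</code> and <code>results</code> are chosen only for this example.</p>

```javascript
const text = "aaaa";

// Each pair holds a greedy quantifier and its lazy version from the table.
const pairs = [
  [/a{2,3}/, /a{2,3}?/],
  [/a{2,}/, /a{2,}?/],
  [/a+/, /a+?/],
  [/a*/, /a*?/],
  [/a?/, /a??/],
];

// Take the first match of each pattern (index 0 of the match array).
const results = pairs.map(([greedy, lazy]) => ({
  greedy: text.match(greedy)[0],
  lazy: text.match(lazy)[0],
}));

console.log(results);
// greedy: "aaa", "aaaa", "aaaa", "aaaa", "a"
// lazy:   "aa",  "aa",   "a",    "",     ""
```

<p>The lazy versions of <code>*</code> and <code>?</code> return an empty string here, because matching zero characters already satisfies them.</p>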
|
||||
|
||||
<p>So, we can now match HTML tags as shown below:</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqr" class="embed"></iframe>
|
||||
</div>
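<p>The same comparison can be reproduced in code. This is a small sketch, assuming a made-up HTML fragment and the patterns <code>&lt;.+&gt;</code> and <code>&lt;.+?&gt;</code>, which may differ from the ones used in the embedded demo.</p>

```javascript
// Made-up fragment with two pairs of tags.
const html = "<b>bold</b> and <i>italic</i>";

// Greedy: .+ runs to the last ">" in the text, so everything is one match.
const greedy = html.match(/<.+>/g);

// Lazy: .+? stops at the first ">" it can, so each tag is matched separately.
const lazy = html.match(/<.+?>/g);

console.log(greedy); // one match: the whole string
console.log(lazy);   // four matches: <b>, </b>, <i>, </i>
```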
|
||||
|
||||
<p><strong>Problems</strong></p>
|
||||
|
||||
<ol>
|
||||
<li>
|
||||
<p>Find an expression to match <code>href="url"</code> in an HTML file. Note that the url can be anything, like <code>https://xyz.com</code>, <code>http://abc.io/app</code>, <code>https://cde.org</code>.</p>
|
||||
|
||||
<p>Answer: <code>href=".*?"</code></p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu96" class="embed"></iframe>
|
||||
</div>
|
||||
</li>
|
||||
<li>
|
||||
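What will be the matches for the expression <code>\w+? \w+?</code> in <code>abc cde, 123 456</code>?
<br>Answer: <code>abc c</code> and <code>123 4</code>. The first <code>\w+?</code> has to grow until it reaches the space, while the second one stops after a single character.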
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vuek" class="embed"></iframe>
|
||||
</div>
|
||||
</li>
|
||||
</ol>
<p>We will see how to extract things (such as urls) from text using regex in the "group and capturing" concept.</p>
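<p>Both answers from the problems above can be verified with a few lines of code. The snippet is only a sketch; the sample page string is invented for the check.</p>

```javascript
// Invented HTML snippet with two links.
const page = '<a href="https://xyz.com">x</a> <a href="http://abc.io/app">y</a>';

// Lazy .*? stops at the first closing quote, so each href is matched on its own.
const hrefs = page.match(/href=".*?"/g);

// Lazy \w+? on both sides of the space.
const words = "abc cde, 123 456".match(/\w+? \w+?/g);

console.log(hrefs); // href="https://xyz.com" and href="http://abc.io/app"
console.log(words); // "abc c" and "123 4"
```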
|
||||
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|
88
Akash Articles/rendered/RegEx/named_groups.html
Normal file
@@ -0,0 +1,88 @@
|
||||
|
||||
<html>
|
||||
<head>
|
||||
<style type="text/css">
|
||||
.container {
|
||||
position: static;
|
||||
width: 800px;
|
||||
height: 350px;
|
||||
overflow: hidden;
|
||||
}
|
||||
.embed {
|
||||
height: 100%;
|
||||
width: 100%;
|
||||
min-width: 1000px;
|
||||
margin-left: -360px;
|
||||
margin-top: -57px;
|
||||
overflow: hidden;
|
||||
}
|
||||
body {
|
||||
width: 800px;
|
||||
margin: auto;
|
||||
padding: 1em;
|
||||
font-family: "Open Sans", sans-serif;
|
||||
line-height: 150%;
|
||||
letter-spacing: 0.1pt;
|
||||
}
|
||||
img {
|
||||
width: 90%;
|
||||
text-align: center;
|
||||
margin: auto;
|
||||
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
|
||||
}
|
||||
pre, code {
|
||||
padding: 1em;
|
||||
}
|
||||
</style>
|
||||
<script>
|
||||
document.addEventListener('readystatechange', event => {
|
||||
if (event.target.readyState === "complete")
|
||||
document.activeElement.blur();
|
||||
});
|
||||
</script>
|
||||
|
||||
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
|
||||
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
|
||||
</head>
|
||||
<body>
|
||||
|
||||
|
||||
|
||||
<h2 id="namedgroups">Named Groups</h2>
|
||||
|
||||
<p>Regular expressions with lots of groups and backreferences can be difficult to maintain, because adding or removing a capturing group in the middle of the regex changes the numbers of all the groups that follow it.</p>
|
||||
|
||||
<p>Regex provides named groups, which solve this issue. Let's look at them.</p>
|
||||
|
||||
<p>We can name a group by putting <code>?<name></code> just after the opening parenthesis of the group. For example, <code>(?<year>\d{4})</code> is a named group.</p>
|
||||
|
||||
<p>Below is the code we have already looked at in the <strong>capturing groups</strong> part, rewritten with named groups. You can see that the code is more readable now.</p>
|
||||
|
||||
<pre><code class="js language-js">var str = "2020-01-20";

// Pattern with three named groups: year, month and day
var pattern = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/g;

// Data replacement using $<group_name>
var ans = str.replace(pattern, '$<day>-$<month>-$<year>');

console.log(ans);
// Output will be: 20-01-2020
</code></pre>
|
||||
|
||||
<p>The backreference syntax for numbered groups works for named capture groups as well. In addition, <code>\k<name></code> matches the string that was previously matched by the named capture group <code>name</code>; this is the standard way to backreference a named group.</p>
|
||||
|
||||
<div class="container">
|
||||
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu8k" class="embed"></iframe>
|
||||
</div>
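<p>A named backreference can be tried out as follows. This is a minimal sketch, assuming an engine that supports named groups (for example a recent version of Node.js); the pattern for finding a repeated word and the sample sentence are made up for illustration.</p>

```javascript
// (?<word>\w+) captures a word; \k<word> requires the same word again
// right after a single space.
const repeated = /\b(?<word>\w+) \k<word>\b/;

const m = "this is is a test".match(repeated);

console.log(m[0]);          // the doubled word: "is is"
console.log(m.groups.word); // the text captured by the named group: "is"
```

<p>The captured text is also available by name through <code>groups</code>, so the code does not depend on the position of the group in the pattern.</p>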
|
||||
|
||||
|
||||
<script type="text/javascript">
|
||||
document.addEventListener('DOMContentLoaded', (event) => {
|
||||
document.querySelectorAll('pre code').forEach((block) => {
|
||||
hljs.highlightBlock(block);
|
||||
});
|
||||
});
|
||||
</script>
|
||||
</body>
|
||||
</html>
|