2020-03-07 23:36:24 +05:30
< html >
< head >
< style type = "text/css" >
.container {
position: static;
width: 800px;
height: 350px;
overflow: hidden;
}
.embed {
height: 100%;
width: 100%;
min-width: 1000px;
margin-left: -360px;
margin-top: -57px;
overflow: hidden;
}
body {
width: 800px;
margin: auto;
padding: 1em;
font-family: "Open Sans", sans-serif;
line-height: 150%;
letter-spacing: 0.1pt;
}
img {
width: 90%;
text-align: center;
margin: auto;
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
}
pre, code {
padding: 1em;
}
< / style >
< script >
document.addEventListener('readystatechange', event => {
if (event.target.readyState === "complete")
document.activeElement.blur();
});
< / script >
< link rel = "stylesheet" href = "https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css" >
< script src = "https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js" > < / script >
< / head >
< body >
2020-03-07 23:38:36 +05:30
< h2 id = "characterclasses" > Character Classes< / h2 >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtj8" class = "embed" > < / iframe >
< / div >
< p > What if you want to match both "soon" and "moon", that is, words that end in "oon" and start with either "s" or "m"?< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtjb" class = "embed" > < / iframe >
< / div >
< p > What did you observe? Adding < code > [sm]< / code > in front of "oon" matches both "soon" and "moon". Here < code > [sm]< / code > is called a character class, which is a list of characters, any one of which may appear at that position.< / p >
< p > More formally, < code > [abc]< / code > matches exactly one character: either a, b, or c.< / p >
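< p > As a minimal sketch of the idea above, the snippet below runs the same kind of pattern in JavaScript. The sample string and the variable names are made up for illustration.< / p >

```javascript
// [sm] is a character class: it matches exactly one character, "s" or "m".
const oonText = "soon moon noon"; // illustrative sample text
const oonMatches = oonText.match(/[sm]oon/g);

console.log(oonMatches); // [ 'soon', 'moon' ]  ("noon" starts with "n", so it is skipped)
```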
< p > Predict the output of the following:< / p >
< ol >
< li > < p > < strong > RegEx:< / strong > < code > [ABC][12]< / code > < br >
< strong > Text< / strong > : A1 grade is the best, but I scored A2.< / p >
< p > Answer:< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtk3" class = "embed" > < / iframe >
< / div > < / li >
< li > < p > < strong > RegEx:< / strong > < code > [0123456789][12345]:[abcdef][67890]:[0123456789][67890]:[1234589][abcdef]< / code > < br >
< strong > Text< / strong > : Let's match 14:f6:89:3c mac address type of pattern. Other patterns are 51:a6:90:c5, 44:t6:u9:3d, 72:c8:39:8e.< / p >
< p > Answer:< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtkf" class = "embed" > < / iframe >
< / div > < / li >
< / ol >
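< p > If you would rather check your predictions locally than in the embedded editor, a short JavaScript sketch like the following should do. The patterns and texts are copied from the two exercises above; the variable names are only illustrative.< / p >

```javascript
// Exercise 1: one of A, B, C followed by one of 1, 2.
const gradeText = "A1 grade is the best, but I scored A2.";
const gradeMatches = gradeText.match(/[ABC][12]/g);

// Exercise 2: every position has its own list of allowed characters.
const macText = "Let's match 14:f6:89:3c mac address type of pattern. " +
  "Other patterns are 51:a6:90:c5, 44:t6:u9:3d, 72:c8:39:8e.";
const macPattern =
  /[0123456789][12345]:[abcdef][67890]:[0123456789][67890]:[1234589][abcdef]/g;
const macMatches = macText.match(macPattern);

console.log(gradeMatches); // [ 'A1', 'A2' ]
console.log(macMatches);   // [ '14:f6:89:3c', '72:c8:39:8e' ]
```

< p > "51:a6:90:c5" is rejected because "c" is not in the list < code > [1234589]< / code > , and "44:t6:u9:3d" is rejected because "t" is not in < code > [abcdef]< / code > .< / p >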
< h3 id = "negation" > Negation< / h3 >
< p > Now, if we put < code > ^< / code > as the first character inside the brackets, the class is negated: it matches any single character other than the ones listed.< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtl1" class = "embed" > < / iframe >
< / div >
< p > Predict the output for the following:< / p >
< p > < strong > RegEx:< / strong > < code > [^13579]A[^abc]z3[590*-]< / code >
< br > < strong > Text< / strong > : 1Abz33 will match or 2Atz30 and 8Adz3*.< / p >
< p > Answer:< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtl7" class = "embed" > < / iframe >
< / div >
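< p > Negation can also be tried outside the embedded editor. The sketch below first negates a small class on a made-up sample string, then runs the exercise pattern on the exercise text; all variable names are illustrative.< / p >

```javascript
// [^sm] matches any single character EXCEPT "s" or "m".
const negText = "soon moon noon"; // illustrative sample text
const negMatches = negText.match(/[^sm]oon/g);

// Exercise pattern: inside a class, "*" and a trailing "-" are literal characters.
const exerciseText = "1Abz33 will match or 2Atz30 and 8Adz3*.";
const exerciseMatches = exerciseText.match(/[^13579]A[^abc]z3[590*-]/g);

console.log(negMatches);      // [ 'noon' ]
console.log(exerciseMatches); // [ '2Atz30', '8Adz3*' ]
```

< p > "1Abz33" fails at the very first position, because "1" is one of the characters excluded by < code > [^13579]< / code > .< / p >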
< p > Writing out every character (like < code > [0123456789]< / code > or < code > [abcd]< / code > ) is slow and error-prone. What is the shortcut?< / p >
< h2 id = "ranges" > Ranges< / h2 >
< p > Ranges make our work easier. Consecutive characters can be included in a character class by writing the first and last character with a hyphen between them. For example, the digits from 0 to 9 can be written as < code > 0-9< / code > . Similarly, < code > abcdef< / code > can be replaced by < code > a-f< / code > .< / p >
< p > Examples: < code > 456< / code > --> < code > 4-6< / code > , < code > abc3456< / code > --> < code > a-c3-6< / code > , < code > c367980< / code > --> < code > c36-90< / code > .< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtld" class = "embed" > < / iframe >
< / div >
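< p > To see that a range is only shorthand for the full list, the sketch below matches the same made-up sample string with the explicit class and with its range form from the examples above. Both should return the same characters.< / p >

```javascript
// The explicit list [c367980] and the range form [c36-90] describe the same set:
// "c", "3", the range 6-9, and "0".
const rangeSample = "abc0123456789"; // illustrative sample text
const explicitMatches = rangeSample.match(/[c367980]/g);
const rangedMatches = rangeSample.match(/[c36-90]/g);

console.log(explicitMatches); // [ 'c', '0', '3', '6', '7', '8', '9' ]
console.log(rangedMatches);   // same result
```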
< p > Predict the output of the following regex:< / p >
< ol >
< li > < p > < strong > RegEx:< / strong > < code > [a-d][^l-o][12][^5-7][l-p]< / code >
< br > < strong > Text< / strong > : co13i, ae14p, eo30p, ce33l, dd14l.< / p >
< p > Answer:< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtlj" class = "embed" > < / iframe >
< / div > < / li >
< / ol >
< p > < strong > Note:< / strong > If you write a range in reverse order (e.g. < code > [9-0]< / code > ), the regex is invalid and the engine reports an error.< / p >
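< p > In JavaScript, a reversed range is rejected when the pattern is compiled. The sketch below builds the pattern with the < code > RegExp< / code > constructor inside a < code > try< / code > block, so the error can be caught instead of stopping the whole script; the variable name is illustrative.< / p >

```javascript
// A reversed range such as 9-0 makes the whole pattern invalid.
let reversedRangeError = null;
try {
  new RegExp("[9-0]");
} catch (err) {
  reversedRangeError = err.name; // JavaScript reports a SyntaxError
}

console.log(reversedRangeError); // SyntaxError
```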
< ol >
< li > < p > < strong > RegEx:< / strong > < code > [a-zB-D934][A-Zab0-9]< / code > < br >
< strong > Text< / strong > : t9, da, A9, zZ, 99, 3D, aCvcC9.< / p >
< p > Answer:< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtlm" class = "embed" > < / iframe >
< / div > < / li >
< / ol >
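< p > The two range exercises above can be checked with a short JavaScript sketch. The patterns and texts are taken from the exercises; the variable names are illustrative.< / p >

```javascript
// First range exercise: ranges and negated ranges mixed in one pattern.
const rangeTextA = "co13i, ae14p, eo30p, ce33l, dd14l.";
const rangeMatchesA = rangeTextA.match(/[a-d][^l-o][12][^5-7][l-p]/g);

// Second range exercise: a class may combine several ranges and single characters.
const rangeTextB = "t9, da, A9, zZ, 99, 3D, aCvcC9.";
const rangeMatchesB = rangeTextB.match(/[a-zB-D934][A-Zab0-9]/g);

console.log(rangeMatchesA); // [ 'ae14p', 'dd14l' ]
console.log(rangeMatchesB); // [ 't9', 'da', 'zZ', '99', '3D', 'aC', 'cC' ]
```

< p > Note that "A9" is not matched: "A" is outside < code > [a-zB-D934]< / code > , and although "9" is inside it, the comma that follows is not in the second class.< / p >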
< h2 id = "predefinedcharacterclasses" > Predefined Character Classes< / h2 >
< ol >
< li > < p > < strong > < code > \w< / code > & < code > \W< / code > < / strong > : < code > \w< / code > is a short form of the character class < code > [A-Za-z0-9_]< / code > . < code > \w< / code > is called the word character class.< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtls" class = "embed" > < / iframe >
< / div >
< p > < code > \W< / code > is equivalent to < code > [^\w]< / code > . < code > \W< / code > matches everything other than word characters.< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtm2" class = "embed" > < / iframe >
< / div > < / li >
< li > < p > < strong > < code > \d< / code > & < code > \D< / code > < / strong > : < code > \d< / code > matches any digit character. It is equivalent to the character class < code > [0-9]< / code > .< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtm5" class = "embed" > < / iframe >
< / div >
< p > < code > \D< / code > is equivalent to < code > [^\d]< / code > . < code > \D< / code > matches everything other than digits.< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtmk" class = "embed" > < / iframe >
< / div > < / li >
< li > < p > < strong > < code > \s< / code > & < code > \S< / code > < / strong > : < code > \s< / code > matches whitespace characters, such as tab (< code > \t< / code > ), newline (< code > \n< / code > ), and space (< code > < / code > ). These characters leave no visible mark on the page, so they are often described as non-printable.< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtmn" class = "embed" > < / iframe >
< / div >
< p > Similarly, < code > \S< / code > is equivalent to < code > [^\s]< / code > . < code > \S< / code > matches everything other than whitespace characters.< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtmq" class = "embed" > < / iframe >
< / div > < / li >
< li > < p > < strong > dot(< code > .< / code > )< / strong > : Dot matches any character except < code > \n< / code > (line-break or new-line character) and < code > \r< / code > (carriage-return character). Dot(< code > .< / code > ) is known as a < strong > wildcard< / strong > .< / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtmt" class = "embed" > < / iframe >
< / div > < / li >
< / ol >
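< p > The predefined classes above can be compared side by side with a small JavaScript sketch. The sample strings and variable names are made up for illustration, and the behaviour shown is that of JavaScript's regex engine without extra flags.< / p >

```javascript
const classSample = "Room_42 is\tfree!"; // illustrative text with a space and a tab

const wordChars = classSample.match(/\w/g).join(""); // letters, digits, underscore
const nonWordChars = classSample.match(/\W/g);       // everything \w rejects
const digits = classSample.match(/\d/g);             // same as [0-9]
const whitespace = classSample.match(/\s/g);         // space and tab here

// The dot skips line terminators such as \n and \r.
const dotMatches = "a\nb\rc".match(/./g);

console.log(wordChars);    // Room_42isfree
console.log(nonWordChars); // [ ' ', '\t', '!' ]
console.log(digits);       // [ '4', '2' ]
console.log(whitespace);   // [ ' ', '\t' ]
console.log(dotMatches);   // [ 'a', 'b', 'c' ]
```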
< p > < strong > Note:< / strong > < code > \r< / code > is the carriage-return character. Windows-style line endings use the two-character sequence < code > \r\n< / code > .< / p >
< p > Predict the output of the following regex:< / p >
< ol >
< li > < p > < strong > RegEx:< / strong > < code > [01][01][0-1]\W\s\d< / code >
< br > < strong > Text< / strong > : Binary to decimal data: 001- 1, 010- 2, 011- 3, a01- 4, 100- 4.< / p >
< p > Answer: < / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtn0" class = "embed" > < / iframe >
< / div > < / li >
< / ol >
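< p > The exercise above can also be checked locally. In this sketch the pattern and text come from the exercise, and the variable names are illustrative.< / p >

```javascript
// Three binary digits, one non-word character, one whitespace character, one digit.
const binaryText = "Binary to decimal data: 001- 1, 010- 2, 011- 3, a01- 4, 100- 4.";
const binaryMatches = binaryText.match(/[01][01][0-1]\W\s\d/g);

console.log(binaryMatches); // [ '001- 1', '010- 2', '011- 3', '100- 4' ]
```

< p > "a01- 4" is skipped because "a" is not a binary digit, and "01-" supplies only two of the three digits the pattern needs.< / p >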
< h3 id = "problems" > Problems< / h3 >
< ol >
< li > < p > Write a regex to match 28th February of any year. Date is in dd-mm-yyyy format.< / p >
< p > Answer: < code > 28-02-\d\d\d\d< / code > < / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtn3" class = "embed" > < / iframe >
< / div > < / li >
< li > < p > Write a regex to match dates that are not in March. Assume that the dates are valid but the separator is not fixed, i.e. a date can be in dd.mm.yyyy, dd\mm\yyyy, or dd/mm/yyyy format.< / p >
< p > Answer: < code > \d\d\W[10][^3]\W\d\d\d\d< / code > < / p >
< div class = "container" >
< iframe scrolling = "no" style = "position: absolute; top: -9999em; visibility: hidden;" onload = "this.style.position='static'; this.style.visibility='visible';" src = "https://regexr.com/4vtn9" class = "embed" > < / iframe >
< / div >
< p > Note that the above regex will also match mixed separators such as dd-mm.yyyy or dd/mm\yyyy. This can be solved with backreferences, a regex feature that lets a later part of the pattern require the same text that an earlier group captured.< / p > < / li >
< / ol >
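< p > The two answers above can be tried in JavaScript as sketched below. The sample dates and variable names are made up for illustration. The last pattern is one possible way to apply a backreference, not necessarily the intended solution: it assumes the first separator is captured with < code > (\W)< / code > and repeated with < code > \1< / code > .< / p >

```javascript
// Problem 1: 28th February of any year, dd-mm-yyyy.
const febMatches = "28-02-2019, 28-03-2020, 28-02-2024".match(/28-02-\d\d\d\d/g);

// Problem 2: any date whose month is not 03; \W accepts any separator.
const sampleDates = "12.03.2020 25/12/2019 07\\01\\2021 09-11.2018"; // illustrative
const notMarchMatches = sampleDates.match(/\d\d\W[10][^3]\W\d\d\d\d/g);

// Assumed backreference variant: (\W) captures the first separator,
// and \1 demands exactly the same character as the second separator.
const sameSeparatorMatches = sampleDates.match(/\d\d(\W)[10][^3]\1\d\d\d\d/g);

console.log(febMatches);           // [ '28-02-2019', '28-02-2024' ]
console.log(notMarchMatches);      // [ '25/12/2019', '07\\01\\2021', '09-11.2018' ]
console.log(sameSeparatorMatches); // [ '25/12/2019', '07\\01\\2021' ]
```

< p > With the backreference in place, the mixed-separator date "09-11.2018" is no longer matched.< / p >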
< script type = "text/javascript" >
document.addEventListener('DOMContentLoaded', (event) => {
document.querySelectorAll('pre code').forEach((block) => {
hljs.highlightBlock(block);
});
});
< / script >
< / body >
< / html >