Lecture_Notes/Akash Articles/RegEx/lazy_matching.html
2020-03-08 12:57:24 +05:30
<html>
<head>
<style type="text/css">
.container {
position: static;
width: 800px;
height: 350px;
overflow: hidden;
}
.embed {
height: 100%;
width: 100%;
min-width: 1000px;
margin-left: -360px;
margin-top: -57px;
overflow: hidden;
}
body {
width: 800px;
margin: auto;
padding: 1em;
font-family: "Open Sans", sans-serif;
line-height: 150%;
letter-spacing: 0.1pt;
}
img {
width: 90%;
text-align: center;
margin: auto;
box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19);
}
pre, code {
padding: 1em;
}
</style>
<script>
document.addEventListener('readystatechange', event => {
if (event.target.readyState === "complete")
document.activeElement.blur();
});
</script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/styles/default.min.css">
<script src="https://cdnjs.cloudflare.com/ajax/libs/highlight.js/9.18.1/highlight.min.js"></script>
</head>
<body>
<h2 id="lazymatching">Lazy matching:</h2>
<p>As we have seen, quantifiers are greedy by default: they match as many characters as possible while still allowing the overall regex to match.</p>
<div class="container">
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqc" class="embed"></iframe>
</div>
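<p>A minimal JavaScript sketch of greedy matching. The sample string and variable names are illustrative assumptions and are not taken from the embedded demo:</p>

```javascript
// Hypothetical sample text, chosen only for illustration.
const quoted = 'say "hi" and "bye"';

// Greedy: .* consumes as much as it can, so the match runs from the
// first double quote to the LAST double quote in the string.
const greedyQuoted = quoted.match(/".*"/)[0];

console.log(greedyQuoted); // "hi" and "bye"
```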
<p>To make a quantifier lazy, we append <code>?</code> to it (for example <code>*?</code> or <code>+?</code>). The regex engine then matches as few characters as possible while still satisfying the regex.</p>
<div class="container">
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqf" class="embed"></iframe>
</div>
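<p>A short JavaScript sketch of lazy quantifiers. The sample strings and variable names are assumptions made for illustration:</p>

```javascript
// Hypothetical sample text, chosen only for illustration.
const sample = 'say "hi" and "bye"';

// Lazy: .*? stops at the first closing quote it can reach.
const lazyFirst = sample.match(/".*?"/)[0];

// With the g flag, each quoted piece becomes its own match.
const lazyAll = sample.match(/".*?"/g);

// A lazy + still needs at least one character, so it takes exactly one.
const lazyPlus = 'aaaa'.match(/a+?/)[0];

console.log(lazyFirst); // "hi"
console.log(lazyAll);   // [ '"hi"', '"bye"' ]
console.log(lazyPlus);  // a
```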
<p>So now we can match individual HTML tags, as shown below:</p>
<div class="container">
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vtqr" class="embed"></iframe>
</div>
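<p>As a rough sketch, tag matching with a lazy quantifier could look like this in JavaScript. The pattern <code>&lt;.*?&gt;</code> and the HTML snippet are assumptions for illustration and may differ from the embedded demo:</p>

```javascript
// Hypothetical HTML snippet, used only for illustration.
const markup = '<p>Hello <b>world</b></p>';

// Lazy: each match ends at the nearest '>', so every tag is matched
// separately instead of one match spanning the whole string.
const tags = markup.match(/<.*?>/g);

console.log(tags); // [ '<p>', '<b>', '</b>', '</p>' ]
```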
<p>Let's look at one more example:</p>
<div class="container">
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vthc" class="embed"></iframe>
</div>
<p><strong>Problem</strong></p>
<p>Find an expression that matches <code>href="url"</code> in an HTML file. Note that the URL can be anything, such as <code>https://xyz.com</code>, <code>http://abc.io/app</code>, or <code>https://cde.org</code>.</p>
<p>Answer: <code>href=".*?"</code></p>
<div class="container">
<iframe scrolling="no" style="position: absolute; top: -9999em; visibility: hidden;" onload="this.style.position='static'; this.style.visibility='visible';" src="https://regexr.com/4vu96" class="embed"></iframe>
</div>
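<p>To check the answer outside the embedded demo, here is a minimal JavaScript sketch. The HTML line and variable names are hypothetical; the URLs are the ones from the problem statement:</p>

```javascript
// Hypothetical line of HTML containing two links.
const html = '<a href="https://xyz.com">xyz</a> <a href="http://abc.io/app">abc</a>';

// Lazy version: each match stops at the first closing quote.
const hrefs = html.match(/href=".*?"/g);

// Greedy version for comparison: it overshoots to the last quote on the line.
const greedyHref = html.match(/href=".*"/)[0];

console.log(hrefs);      // [ 'href="https://xyz.com"', 'href="http://abc.io/app"' ]
console.log(greedyHref); // href="https://xyz.com">xyz</a> <a href="http://abc.io/app"
```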
<p>We will see how to extract pieces of the matched text (such as the URLs) using regex when we cover grouping and capturing.</p>
<script type="text/javascript">
document.addEventListener('DOMContentLoaded', (event) => {
document.querySelectorAll('pre code').forEach((block) => {
hljs.highlightBlock(block);
});
});
</script>
</body>
</html>