From 42924ba356a5bd296a834093911ef1ae9c5b0345 Mon Sep 17 00:00:00 2001
From: Aakash Panchal <51417248+Aakash-Panchal27@users.noreply.github.com>
Date: Sat, 7 Mar 2020 23:38:36 +0530
Subject: [PATCH] Update character_classes.html

---
 Akash Articles/RegEx/character_classes.html | 236 ++++++++++----------
 1 file changed, 118 insertions(+), 118 deletions(-)

diff --git a/Akash Articles/RegEx/character_classes.html b/Akash Articles/RegEx/character_classes.html
index 7d0ddfa..cbc0278 100644
--- a/Akash Articles/RegEx/character_classes.html
+++ b/Akash Articles/RegEx/character_classes.html
@@ -46,183 +46,183 @@
-

Character classes:

+

Character classes:

-
- -
+
+ +
-

What if you want to match both "soon" and "moon" or basically words ending with "oon"?

+

What if you want to match both "soon" and "moon", that is, several words that share the ending "oon"?

-
- -
+
+ +
-

What did you observe? You can see that, adding [sm] matches both $soon$ and $moon$. Here [sm] is called character class, which is basically a list of characters we want to match.

+

What did you observe? Adding [sm] matches both $soon$ and $moon$. Here [sm] is called a character class: a list of characters, any one of which may match at that position.

-

More formally, [abc] is basically 'either a or b or c'.

+

More formally, [abc] matches exactly one character: either a, b, or c.
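
To try the [sm]oon idea outside the interactive demo, here is a minimal sketch using Python's built-in re module. The sample text is made up for illustration and is not from the article.

```python
import re

# [sm] matches exactly one character, either "s" or "m",
# so the pattern matches "soon" and "moon" but not "noon".
pattern = re.compile(r"[sm]oon")

text = "soon moon noon"  # illustrative sample text
matches = pattern.findall(text)
print(matches)
```

Only the words whose first letter is listed in the class are reported; "noon" is skipped because "n" is not in [sm].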

-

Predict the output of the following:

+

Predict the output of the following:

-
    -
  1. RegEx: [ABC][12]
    - Text: A1 grade is the best, but I scored A2.

    +
      +
    1. RegEx: [ABC][12]
      + Text: A1 grade is the best, but I scored A2.

      -

      Answer:

      +

      Answer:

      -
      - -
    2. +
      + +
      -
    3. RegEx: [0123456789][12345]:[abcdef][67890]:[0123456789][67890]:[1234589][abcdef]
      - Text: Let's match 14:f6:89:3c mac address type of pattern. Other patterns are 51:a6:90:c5, 44:t6:u9:3d, 72:c8:39:8e.

      +
    4. RegEx: [0123456789][12345]:[abcdef][67890]:[0123456789][67890]:[1234589][abcdef]
      + Text: Let's match 14:f6:89:3c mac address type of pattern. Other patterns are 51:a6:90:c5, 44:t6:u9:3d, 72:c8:39:8e.

      -

      Answer:

      +

      Answer:

      -
      - -
    5. -
    +
    + +
  2. +
-

Negation

+

Negation

-

Now, if we put ^, then it will show a match for characters other than the ones in the bracket.

+

Now, if we put ^ as the first character inside the brackets, the class is negated: it matches any single character other than the ones listed.
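
A negated class can be checked the same way. The sketch below uses Python's re module with an invented sample text; a negated class still consumes one character, it just must not be one of the listed ones.

```python
import re

# [^sm] matches any single character except "s" and "m".
pattern = re.compile(r"[^sm]oon")

text = "soon moon noon boon"  # illustrative sample text
matches = pattern.findall(text)
print(matches)
```

Here "soon" and "moon" are rejected because their first letters are excluded, while "noon" and "boon" match.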

-
- -
+
+ +
-

Predict the output for the following:

+

Predict the output for the following:

-

RegEx: [^13579]A[^abc]z3[590*-] -
Text: 1Abz33 will match or 2Atz30 and 8Adz3*.

+

RegEx: [^13579]A[^abc]z3[590*-] +
Text: 1Abz33 will match or 2Atz30 and 8Adz3*.

-

Answer:

+

Answer:

-
- -
+
+ +
-

Writing every character (like [0123456789] or [abcd]) is somewhat slow and also erroneous, what is the short-cut?

+

Writing out every character (like [0123456789] or [abcd]) is slow and error-prone. What is the shortcut?

-

Ranges

+

Ranges

-

Ranges make our work easier. Consecutive characters can be included in a character class using the dash operator, for example, numbers from 0 to 9 can be simply written as 0-9. Similarly, abcdef can be replaced by a-f.

+

Ranges make our work easier. Consecutive characters can be included in a character class using the dash operator. For example, the digits 0 to 9 can be written simply as 0-9. Similarly, abcdef can be replaced by a-f.

-

Examples: 456 --> 4-6, abc3456 --> a-c3-6, c367980 --> c36-90.

+

Examples: 456 --> 4-6, abc3456 --> a-c3-6, c367980 --> c36-90.
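
One way to convince yourself that a range is only a shorthand is to compare the long and short forms on the same input. This is a small sketch in Python's re module; the sample string is an assumption chosen to cover letters and digits.

```python
import re

# The second example above: abc3456 written out, and as ranges.
long_form = re.compile(r"[abc3456]")
short_form = re.compile(r"[a-c3-6]")

sample = "abcdefg0123456789"  # illustrative sample text

# Both classes pick out exactly the same characters.
assert long_form.findall(sample) == short_form.findall(sample)

matches = short_form.findall(sample)
print(matches)
```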

-
- -
+
+ +
-

Predict the output of the following regex:

+

Predict the output of the following regex:

-
    -
  1. RegEx: [a-d][^l-o][12][^5-7][l-p] -
    Text: co13i, ae14p, eo30p, ce33l, dd14l.

    +
      +
    1. RegEx: [a-d][^l-o][12][^5-7][l-p] +
      Text: co13i, ae14p, eo30p, ce33l, dd14l.

      -

      Answer: -

      +

      Answer: +

      - + -

    2. -
    +

  2. +
-

Note: If you write the range in reverse order (ex. 9-0), then it is an error.

+

Note: If you write a range in reverse order (e.g. 9-0), it is an error.
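
How the error shows up depends on the regex engine. As one example, Python's re module refuses to compile a reversed range and raises re.error. The helper name is_valid_pattern below is made up for this sketch.

```python
import re

def is_valid_pattern(pattern: str) -> bool:
    """Return True if Python's re module can compile the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # A reversed range such as [9-0] is rejected at compile time.
        return False

print(is_valid_pattern("[0-9]"))
print(is_valid_pattern("[9-0]"))
```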

-
    -
  1. RegEx: [a-zB-D934][A-Zab0-9]
    - Text: t9, da, A9, zZ, 99, 3D, aCvcC9. - Answer: -
    - -
  2. -
+
    +
  1. RegEx: [a-zB-D934][A-Zab0-9]
    + Text: t9, da, A9, zZ, 99, 3D, aCvcC9. + Answer: +
    + +
  2. +
-

Predefined Character Classes

+

Predefined Character Classes

-
    -
  1. \w & \W: \w is just a short form of a character class [A-Za-Z0-9_]. \w is called word character class.

    +
      +
    1. \w & \W: \w is shorthand for the character class [A-Za-z0-9_]. \w is called the word character class.

      -
      - -
      +
      + +
      -

      \W is equivalent to [^\w]. \W matches everything other than word characters.

      +

      \W is equivalent to [^\w]. \W matches everything other than word characters.

      -
      - -
    2. +
      + +
      -
    3. \d & \D: \d matches any digit character. It is equivalent to character class [0-9].

      +
    4. \d & \D: \d matches any digit character. It is equivalent to character class [0-9].

      -
      - -
      +
      + +
      -

      \D is equivalent to [^\d]. \D matches everything other than digits.

      +

      \D is equivalent to [^\d]. \D matches everything other than digits.

      -
      - -
      +
      + +
      -
        -
      1. \s & \S: \s matches whitespace characters. Tab(\t), newline(\n) & space() are whitespace characters. These characters are called non-printable characters.
      +
        +
      1. \s & \S: \s matches whitespace characters. Tab (\t), newline (\n) and space are whitespace characters. They are often described as non-printable characters.
      -
      - -
      +
      + +
      -

      Similarly, \S is equivalent to [^\s]. \S matches everything other than whitespace characters.

      +

      Similarly, \S is equivalent to [^\s]. \S matches everything other than whitespace characters.

      -
      - -
    5. +
      + +
      -
    6. dot(.): Dot matches any character except \n(line-break or new-line character) and \r(carriage-return character). Dot(.) is known as a wildcard.

      +
    7. dot(.): Dot matches any character except line terminators. In most regex flavors that means the newline \n (line-break character); some, such as JavaScript, also exclude \r (carriage-return character). Dot(.) is known as a wildcard.

      -
      - -
    8. -
    +
    + +
  2. +
-

Note: \r is known as a windows style new-line character.

+

Note: \r is the carriage-return character. The Windows-style line ending is the two-character sequence \r\n.
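
The predefined classes can be explored with a short script. The sketch below uses Python's re module, which is an assumption about the reader's tooling: in Python 3, \w and \d also match non-ASCII letters and digits unless re.ASCII is passed, and the dot excludes only \n by default. The sample text is invented.

```python
import re

text = "Room_7 costs $42\tper night\n"  # illustrative sample text

# With re.ASCII, \w behaves like [A-Za-z0-9_] and \d like [0-9].
words = re.findall(r"\w+", text, flags=re.ASCII)
digits = re.findall(r"\d", text, flags=re.ASCII)

# \s picks up the spaces, the tab and the trailing newline.
spaces = re.findall(r"\s", text)

# In Python, the dot skips "\n" but does match "\r".
dot_hits = re.findall(r".", "a\r\nb")

print(words, digits, spaces, dot_hits)
```

Other engines may treat the dot and \r differently, so it is worth running the same check in whichever flavor you target.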

-

Predict the output of the following regex:

+

Predict the output of the following regex:

-
    -
  1. RegEx: [01][01][0-1]\W\s\d -
    Text: Binary to decimal data: 001- 1, 010- 2, 011- 3, a01- 4, 100- 4.

    +
      +
    1. RegEx: [01][01][0-1]\W\s\d +
      Text: Binary to decimal data: 001- 1, 010- 2, 011- 3, a01- 4, 100- 4.

      -

      Answer:

      +

      Answer:

      -
      - -
    2. -
    +
    + +
  2. +
-

Problems

+

Problems

-
    -
  1. Write a regex to match 28th February of any year. Date is in dd-mm-yyyy format.

    +
      +
    1. Write a regex to match 28th February of any year. Date is in dd-mm-yyyy format.

      -

      Answer: 28-02-\d\d\d\d

      +

      Answer: 28-02-\d\d\d\d

      -
      - -
    2. +
      + +
      -
    3. Write a regex to match dates that are not in March. Consider that, the dates are valid and no proper format is given, i.e. it can be in dd.mm.yyyy, dd\mm\yyyy, dd/mm/yyyy format.

      +
    4. Write a regex to match dates that are not in March. Assume that the dates are valid and that no fixed format is given, i.e. the separator may vary: dd.mm.yyyy, dd\mm\yyyy or dd/mm/yyyy.

      -

      Answer: \d\d\W[10][^3]\W\d\d\d\d

      +

      Answer: \d\d\W[10][^3]\W\d\d\d\d

      -
      - -
      +
      + +
      -

      Note that, the above regex will also match dd-mm.yyyy or dd/mm\yyyy kind of wrong format, this problem can be solved by using backreferencing, which is a regex concept.

    5. -
    +

    Note that the above regex will also match mixed formats such as dd-mm.yyyy or dd/mm\yyyy. This can be solved with backreferences, another regex concept.

  2. +
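
As a preview of the backreference idea mentioned in the last problem, here is a hedged sketch in Python's re module. The strict pattern is one possible way to require the same separator twice and is not the article's own answer; the sample dates are made up.

```python
import re

# The answer from the problem above: any non-word separator, month not 03.
loose = re.compile(r"\d\d\W[10][^3]\W\d\d\d\d")

# Assumed variant: capture the first separator in group 1 and
# require the identical character again with the backreference \1.
strict = re.compile(r"\d\d(\W)[10][^3]\1\d\d\d\d")

dates = ["12.04.2020", "12/04/2020", "12-04.2020", "12.03.2020"]

loose_hits = [d for d in dates if loose.fullmatch(d)]
strict_hits = [d for d in dates if strict.fullmatch(d)]

print(loose_hits)
print(strict_hits)
```

The loose pattern accepts the mixed-separator date 12-04.2020, while the strict one rejects it. Both reject the March date.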