Professor Martin Paul Eve, Birkbeck, University of London
. | Matches any character except a newline |
\n | Matches a newline character |
\t | Matches a tab |
\d | Matches a digit |
\w | Matches an alphanumeric character or an underscore |
\W | Matches any character that is not alphanumeric or an underscore |
\s | Matches a whitespace character |
\S | Matches a non-whitespace character |
\ | Escapes special characters |
^ | Matches the start of a string |
$ | Matches the end of a string |
* | Matches the preceding element 0 or more times |
+ | Matches the preceding element 1 or more times |
? | Matches the preceding element 0 or 1 times |
{x} | Matches the preceding element x times |
{x,y} | Matches the preceding element between x and y times |
{x,} | Matches the preceding element at least x times |
{,y} | Matches the preceding element between 0 and y times |
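A minimal sketch of the quantifiers in the table, assuming Python's built-in re module. The sample strings are invented for illustration.

```python
import re

# fullmatch succeeds only if the whole string fits the pattern.
exactly_two = bool(re.fullmatch(r"a{2}", "aa"))      # True: exactly 2
at_least_two = bool(re.fullmatch(r"a{2,}", "a"))     # False: needs 2 or more

# ? makes the preceding element optional (0 or 1 times).
optional_u = re.findall(r"colou?r", "color colour")  # ['color', 'colour']

# {2,3} takes 2 to 3 digits at a time, preferring 3.
runs = re.findall(r"\d{2,3}", "1 22 333 4444")       # ['22', '333', '444']
```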
Match the last word of a string that ends in a full stop.
\w+\.$
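To try this expression, here is a short sketch using Python's re module. The sample sentences are assumptions, not part of the exercise.

```python
import re

pattern = re.compile(r"\w+\.$")

# \w+ takes the final word, \. the literal full stop, $ anchors to the end.
m = pattern.search("The quick brown fox jumps.")
last = m.group(0) if m else None                # 'jumps.'

# Without a closing full stop there is nothing to match.
no_match = pattern.search("No full stop here")  # None
```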
Quantifiers in regular expressions are greedy by default: they match as much as they can.
<.+\/.+>
Because the regex is greedy, it matches the whole string.
We need to make the quantifiers lazy by adding ? after them:
<.+?>.*?<\/.+?>
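The difference can be checked with a small sketch in Python. The HTML fragment is an invented example; the two patterns are the ones shown above.

```python
import re

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ runs to the last "/" it can, so both elements come back as one match.
greedy = re.findall(r"<.+\/.+>", html)
# ['<b>bold</b> and <i>italic</i>']

# Lazy: each .+? and .*? stops as early as possible, so each element matches alone.
lazy = re.findall(r"<.+?>.*?<\/.+?>", html)
# ['<b>bold</b>', '<i>italic</i>']
```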
Match a set of characters: e.g. [abc] will match a, b, or c
[ | Begins a character group. |
] | Closes a character group. |
Any character except ^ - ] \ | Matches literally. |
\\ | A literal backslash. |
\ followed by any of ^ - ] \ | Escapes that character so it matches literally. |
^ | When it is the first character in the group, negates the group. That is, match everything NOT in the group. |
- | A character range. e.g. a-z. |
Write a single expression that will match both verbs:
\w+i[sz]e
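A quick sketch of character groups in Python. The verbs "organise" and "organize" are assumed stand-ins for the pair in the exercise, and the other strings are invented.

```python
import re

text = "We organise and they organize."

# [sz] accepts either spelling with one expression.
verbs = re.findall(r"\w+i[sz]e", text)        # ['organise', 'organize']

# A range inside a group: any one of a, b or c.
in_range = re.findall(r"[a-c]", "abcdef")     # ['a', 'b', 'c']

# A leading ^ negates the group: anything that is NOT a, b or c.
negated = re.findall(r"[^a-c]", "abcdef")     # ['d', 'e', 'f']
```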
(?P<group_name>.+)
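The (?P<name>...) form is Python's syntax for a named group. A minimal sketch follows; the group names year and month and the date string are illustrative assumptions.

```python
import re

# Each (?P<name>...) captures like ordinary brackets but can be read back by name.
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "Published 2017-03 online")

year = m.group("year")    # '2017'
month = m.group("month")  # '03'
named = m.groupdict()     # {'year': '2017', 'month': '03'}
```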
Work on this text: Is this equal to that? Is this equal to this?
Write an expression that captures the word "this" following the first "Is"
Capture the sentence where the second part reads "equal to this" without using the word "this"
Is (\w+) equal to \1\?
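Both exercises can be tried in Python. The first pattern below is one possible answer rather than the only one; the second is the expression given above.

```python
import re

text = "Is this equal to that? Is this equal to this?"

# Capture the word after the first "Is".
first = re.search(r"Is (\w+)", text).group(1)   # 'this'

# \1 must repeat whatever group 1 captured, so the first sentence
# ("this" ... "that") fails and only the second sentence matches.
m = re.search(r"Is (\w+) equal to \1\?", text)
sentence = m.group(0)                           # 'Is this equal to this?'
start = m.start()                               # 23
```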
Match characters ahead or behind without including them in the match
(?=zzz) | Lookahead. True if the next part of the string is “zzz”. |
(?<=zzz) | Lookbehind. True if the preceding part of the string is “zzz”. |
(?!zzz) | Negative lookahead. True if the next part of the string is not “zzz”. |
(?<!zzz) | Negative lookbehind. True if the preceding part of the string is not “zzz”. |
Use the text: In the year 210AD there were 394 chickens roaming the plains.
Match the date 210AD but not the number 394, extracting only the digits and without writing the digits literally in the pattern.
\d+(?=AD)
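A short check of the lookahead in Python, plus a lookbehind on the same sentence. The lookbehind pattern is an added illustration, not part of the exercise.

```python
import re

text = "In the year 210AD there were 394 chickens roaming the plains."

# Lookahead: digits that are followed by "AD"; "AD" itself is not in the match.
dates = re.findall(r"\d+(?=AD)", text)        # ['210']

# Lookbehind: digits that are preceded by "were "; the prefix is not in the match.
counts = re.findall(r"(?<=were )\d+", text)   # ['394']
```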
Thank you!
Presentation licensed under a CC BY-SA 3.0 license. All institutional images excluded from CC license. Available to view online at http://meve.io/Regex2017.