Regex Cheat Sheet
Basic matching
Each symbol matches a single character:
. | anything (except line breaks) |
\d | digit (0123456789) |
\D | non-digit |
\w | “word” characters (i.e. letters, digits and _) |
\W | non-word |
␣ | space |
\s | whitespace (␣,\t,\r,\n) |
\S | non-whitespace |
\t | tab |
\r | return |
\n | new line |
\r\n | line break (the line-break encoding may be any of these, depending on the system) |
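As a quick check of the symbols above, here is a minimal sketch using Python's built-in re module. The sample string and variable names are made up for illustration, and other regex engines may differ in details (for example, which characters count as digits outside ASCII).

```python
import re

# Hypothetical sample text: a space, a tab and a \r\n line break.
text = "Room 42,\tfloor_3\r\n"

digits = re.findall(r"\d", text)    # each single digit
words = re.findall(r"\w+", text)    # runs of word characters (letters, digits, _)
spaces = re.findall(r"\s", text)    # each whitespace character
dots = re.findall(r".", "a\nb")     # . matches anything except the line break

print(digits)  # ['4', '2', '3']
print(words)   # ['Room', '42', 'floor_3']
print(spaces)  # [' ', '\t', '\r', '\n']
print(dots)    # ['a', 'b']
```

Note that \r\n counts as two whitespace characters here, which is why line-break handling depends on the system that produced the text.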
Character classes
[…] | match any of the characters in the class: [aeiou] matches vowels |
[^…] | specifies complement set: [^aeiou] matches non-vowels (including non-letters!) |
[…-…] | specifies range: [a-e] matches abcde, [0-9a-f] matches 0123456789abcdef |
POSIX Classes | |
[[:alpha:]] | A-Z and a-z |
[[:alnum:]] | digits and letters A-Z and a-z |
[[:punct:]] | punctuation marks such as ?!.,:; (and other ASCII symbols) |
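The class syntax above can be tried out with a short Python sketch. One caveat: Python's built-in re module does not support POSIX classes such as [[:alpha:]], so the explicit range [A-Za-z] stands in for it here. The sample strings are illustrative only.

```python
import re

vowels = re.findall(r"[aeiou]", "Regex 101: a-f, 0x1F!")
non_vowels = re.findall(r"[^aeiou]", "a1 b")       # includes the digit and the space
hex_runs = re.findall(r"[0-9a-f]+", "beef 2 go")   # ranges can be combined in one class
alpha = re.findall(r"[A-Za-z]+", "ab1 cd")         # stand-in for [[:alpha:]]

print(vowels)      # ['e', 'e', 'a']
print(non_vowels)  # ['1', ' ', 'b']
print(hex_runs)    # ['beef', '2']
print(alpha)       # ['ab', 'cd']
```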
Boundaries
Boundary characters anchor the pattern to some edge but do not match any characters themselves.
\b | word boundaries (any edge between \w and \W) |
\B | non word boundaries |
^ | beginning of line/string |
$ | end of line/string |
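A small Python sketch of the boundaries above, with a made-up two-line sample string. In Python's re module, ^ and $ anchor to the whole string by default and only match at every line edge when the re.MULTILINE flag is set; other engines may use different defaults.

```python
import re

text = "cat concat cats\ncat"

whole_word = re.findall(r"\bcat\b", text)   # 'cat' as a complete word only
inside_word = re.findall(r"\Bcat", text)    # 'cat' not starting at a word edge (concat)

first_word = re.findall(r"^\w+", text)                        # start of string only
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)   # start of each line
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)     # end of each line

print(whole_word)   # ['cat', 'cat']
print(inside_word)  # ['cat']
print(first_word)   # ['cat']
print(line_starts)  # ['cat', 'cat']
print(line_ends)    # ['cats', 'cat']
```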
Disjunction
(X|Y) | X or Y: \b(cat|dog)s\b matches cats and dogs |
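The cats-and-dogs pattern above, sketched in Python with an illustrative sentence. Note that in Python, findall returns the contents of the group when the pattern contains one, so finditer is used to get the full matches.

```python
import re

pattern = re.compile(r"\b(cat|dog)s\b")
text = "cats chase dogs, but a cat naps near the hotdogs"

full_matches = [m.group(0) for m in pattern.finditer(text)]
group_only = pattern.findall(text)   # only what (cat|dog) captured

print(full_matches)  # ['cats', 'dogs']
print(group_only)    # ['cat', 'dog']
```

Neither "cat" (no trailing s) nor "hotdogs" (no word boundary before dog) is matched.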
Quantifiers
X* | 0 or more repetitions of X |
X+ | 1 or more repetitions of X |
X? | 0 or 1 instances of X |
X{m} | exactly m instances of X |
X{m,} | at least m instances of X |
X{m,n} | between m and n (inclusive) instances of X |
Quantifiers apply only to the single preceding character. Use (…) to specify the quantifier's scope. ab+ matches ab, abb, abbb, abbbb, … ; (ab)+ matches ab, abab, ababab, …
Quantifiers are by default greedy. Add ? after quantifier to make it lazy:
Greedy: ^.*b matches aabaab in aabaaba (up to the last b)
Lazy: ^.*?b matches aab in aabaaba (up to the first b)
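Quantifier scope and greedy versus lazy matching can be verified with a few lines of Python. This is only a sketch using the examples from this section; fullmatch is used so that the whole string has to fit the pattern.

```python
import re

# Scope: + applies to b alone, or to the whole group (ab).
single = re.fullmatch(r"ab+", "abbb") is not None       # True
grouped = re.fullmatch(r"(ab)+", "abab") is not None    # True
mismatch = re.fullmatch(r"(ab)+", "abbb") is not None   # False

# Counted repetition: between 2 and 4 digits, inclusive.
in_range = re.fullmatch(r"\d{2,4}", "123") is not None      # True
too_long = re.fullmatch(r"\d{2,4}", "12345") is not None    # False

# Greedy takes as much as possible, lazy as little as possible.
greedy = re.match(r"^.*b", "aabaaba").group(0)
lazy = re.match(r"^.*?b", "aabaaba").group(0)

print(greedy)  # aabaab
print(lazy)    # aab
```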
Special characters
The characters {}[]()^$.|*+?\ (and - inside […]) have special meaning and must be ‘escaped’ using \ to match them literally, e.g.:
\. matches period .
\\ matches the backslash \
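A short Python sketch of escaping, with made-up sample strings. Raw strings (r"…") keep Python itself from interpreting the backslashes. The re.escape helper from the standard library escapes a literal string for you; its exact output varies slightly between Python versions, so the sketch only checks that the escaped pattern matches the original text.

```python
import re

periods = re.findall(r"\.", "a.b.c")           # escaped: literal periods only
anything = re.findall(r".", "a.b")             # unescaped: any character
backslashes = re.findall(r"\\", r"C:\temp\x")  # literal backslashes
hyphen = re.findall(r"[a\-z]", "a-z m")        # escaped - inside a class is literal

literal = "1+1=2?"
escaped_ok = re.fullmatch(re.escape(literal), literal) is not None
unescaped_ok = re.fullmatch(literal, literal) is not None  # + and ? act as quantifiers

print(periods)       # ['.', '.']
print(anything)      # ['a', '.', 'b']
print(backslashes)   # ['\\', '\\']
print(hyphen)        # ['a', '-', 'z']
print(escaped_ok)    # True
print(unescaped_ok)  # False
```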