Regex Cheat Sheet

Basic matching

Each symbol matches a single character:

.	anything (except line breaks)
\d	digit (0123456789)
\D	non-digit
\w	“word”-characters (i.e. letters and digits and _ )
\W	non-word
␣	space
\s	whitespace (␣,\t,\r,\n)
\S	non-whitespace
\t	tab
\r	carriage return
\n	new line
\r\n	carriage return followed by new line
Depending on the system, a line break may be encoded as any of these.
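
As a rough illustration of the single-character symbols above, the following sketch uses Python's built-in re module (the choice of Python and the sample string are assumptions; other regex engines may differ in details):

```python
import re

# Hypothetical sample text: contains a space, a tab and a newline.
text = "Room 42,\tfloor_3\n"

digits = re.findall(r"\d", text)        # every single digit
words = re.findall(r"\w+", text)        # runs of letters, digits and _
whitespace = re.findall(r"\s", text)    # space, tab and newline
no_newline = re.findall(r".", "a\nb")   # . skips the line break

print(digits, words, whitespace, no_newline)
```

Note that \w+ already uses a quantifier (+) to collect whole runs; without it each match would be a single character.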

Character classes

[…]	match any of the characters in the class: [aeiou] matches vowels
[^…]	specifies complement set: [^aeiou] matches non-vowels (including non-letters!)
[…-…]	specifies range: [a-e] matches abcde, [0-9a-f] matches 0123456789abcdef
	POSIX Classes
[[:alpha:]]	A-Z and a-z
[[:alnum:]]	digits and letters A-Z and a-z
[[:punct:]]	punctuation marks, e.g. ?!.,:;
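
A minimal sketch of the bracket classes in Python's re module (an assumed engine; the sample strings are made up). It covers only […], [^…] and ranges, because Python's re module does not implement the POSIX classes:

```python
import re

sample = "hi, you"

vowels = re.findall(r"[aeiou]", sample)       # any character in the class
non_vowels = re.findall(r"[^aeiou]", sample)  # complement: includes , and space
hex_digits = re.findall(r"[0-9a-f]", "x1F-a9")  # ranges are case-sensitive

print(vowels, non_vowels, hex_digits)
```

The complement class picks up the comma and the space as well, which is the "including non-letters" caveat from the table.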

Boundaries

Boundary characters anchor a pattern to some edge, but do not match any characters themselves.

\b	word boundaries (any edge between \w and \W)
\B	non-boundaries (any position that is not a word boundary)
^	beginning of line/string
$	end of line/string
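
The anchors can be tried out with a short Python sketch (Python's re module and the sample text are assumptions). In Python, ^ and $ refer to the whole string unless the re.MULTILINE flag is set, in which case they also match at each line:

```python
import re

# Hypothetical two-line sample.
text = "cat concat cats\ncat"

whole_words = re.findall(r"\bcat\b", text)   # only standalone "cat"
inside_word = re.findall(r"\Bcat", text)     # "cat" preceded by a word character
string_start = re.findall(r"^cat", text)     # start of string only
line_starts = re.findall(r"^cat", text, flags=re.MULTILINE)  # start of each line

print(whole_words, inside_word, string_start, line_starts)
```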

Disjunction

(X|Y)

X or Y: \b(cat|dog)s\b matches cats and dogs
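
The cats-and-dogs pattern above, sketched in Python (the re module and the sample sentence are assumptions):

```python
import re

pattern = re.compile(r"\b(cat|dog)s\b")
text = "cats and dogs chase birds; a cat naps"

# Full text of each match.
matches = [m.group(0) for m in pattern.finditer(text)]
# With exactly one group, findall returns the group's content instead.
alternatives = pattern.findall(text)

print(matches, alternatives)
```

The singular "cat" is not matched because the pattern requires the trailing s.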

Quantifiers

X*	0 or more repetitions of X
X+	1 or more repetitions of X
X?	0 or 1 instances of X
X{m}	exactly m instances of X
X{m,}	at least m instances of X
X{m,n}	between m and n (inclusive) instances of X

A quantifier applies only to the single character (or character class) immediately before it. Use (…) to specify the quantifier's scope: ab+ matches ab, abb, abbb, abbbb, …; (ab)+ matches ab, abab, ababab, …
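
Quantifier scope can be checked with a small Python sketch (the re module and the candidate strings are assumptions). re.fullmatch succeeds only if the entire string fits the pattern:

```python
import re

candidates = ["ab", "abbb", "abab"]

# ab+ : the + applies to b only.
single_char_scope = [s for s in candidates if re.fullmatch(r"ab+", s)]
# (ab)+ : the + applies to the whole group.
group_scope = [s for s in candidates if re.fullmatch(r"(ab)+", s)]
# a{2,3} : between 2 and 3 (inclusive) instances of a.
counted = [s for s in ["a", "aa", "aaa", "aaaa"] if re.fullmatch(r"a{2,3}", s)]

print(single_char_scope, group_scope, counted)
```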

Quantifiers are greedy by default. Add ? after a quantifier to make it lazy:
Greedy: ^.*b applied to aabaaba matches aabaab (up to the last b)
Lazy: ^.*?b applied to aabaaba matches aab (up to the first b)
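
The greedy and lazy patterns above can be compared directly in Python (the re module is an assumed engine; the variable names are illustrative):

```python
import re

text = "aabaaba"

# Greedy: .* takes as much as possible, then backs up to the last b.
greedy = re.match(r"^.*b", text).group()
# Lazy: .*? takes as little as possible, stopping at the first b.
lazy = re.match(r"^.*?b", text).group()

print(greedy)
print(lazy)
```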

Special characters

The characters {}[]()^$.|*+?\ (and - inside […]) have special meaning and must be ‘escaped’ using \ to match them literally, e.g.:
\. matches period .
\\ matches the backslash \
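
A short Python sketch of escaping (the re module and sample strings are assumptions). Raw strings (r"…") keep Python from interpreting the backslashes itself, and re.escape is a standard-library helper that escapes a literal string for you:

```python
import re

# \$ and \. match a literal dollar sign and period.
price = re.search(r"\$\d+\.\d\d", "Total: $12.50 (approx.)").group()

# \\ matches a single literal backslash.
backslashes = re.findall(r"\\", r"C:\temp\file")

# \- inside a class matches a literal hyphen instead of forming a range.
hyphen_class = re.findall(r"[a\-z]", "a-z m")

# re.escape builds a pattern that matches the given text literally.
literal = re.search(re.escape("1+1=2?"), "is 1+1=2? yes").group()

print(price, backslashes, hyphen_class, literal)
```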