Syntax
- [1] char matches itself, unless it is a special character (metachar): . \ [ ] * + ^ $ and, in posix mode, ( ).
- [2] . matches any character.
- [3] \ matches the character following it, except:
  * \a, \b, \f, \n, \r, \t, \v match the corresponding C escape char, respectively BEL, BS, FF, LF, CR, TAB and VT. Note that \r and \n are never matched, because in Scintilla regular expression searches are made line by line (stripped of end-of-line chars);
  * if not in posix mode, when followed by a left or right round bracket (tagged forms, see [7]);
  * when followed by a digit 1 to 9 (back-references, see [8]);
  * when followed by a left or right angle bracket (word boundaries, see [9]);
  * when followed by d, D, s, S, w or W (character classes, see [10]);
  * when followed by x and two hexadecimal digits (see [11]).
  Backslash is used as an escape character for all other metacharacters, and for itself.
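The remark about \r and \n can be illustrated with a small sketch. It uses Python's re module as a stand-in, whose syntax is not identical to Scintilla's, and a hypothetical helper search_lines that only imitates the line-by-line search described above. Neither is part of Scintilla.

```python
import re

def search_lines(pattern, text):
    """Search each line separately, with end-of-line chars stripped.

    Hypothetical helper: it only imitates the line-by-line behaviour
    described above, it is not Scintilla's implementation.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = re.search(pattern, line)
        if match:
            hits.append((number, match.group()))
    return hits

text = "end.\nstart"

# A backslash makes the metachar . literal: only the real dot is found.
literal_dot = search_lines(r"\.", text)

# \n can never match, because no line handed to the matcher contains it.
across_lines = search_lines(r"\.\nstart", text)

# The same pattern does match when the whole text is searched at once.
whole_text = re.search(r"\.\nstart", text)
```

This is why a pattern that spans two lines finds nothing, even when the text clearly contains the sequence.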
- [4] [set]
matches one of the characters in the set. If the first character in the set is ^, it matches the characters NOT in the set, i.e. complements the set. A shorthand S-E (start dash end) is used to specify a set of characters S up to E, inclusive. The special characters ] and - have no special meaning if they appear as the first chars in the set. To include both, put - first: [-]A-Z] (or just backslash them).
    example     match
    [-]|]       matches these 3 chars
    []-|]       matches from ] to | chars
    [a-z]       any lowercase alpha
    [^-]]       any char except - and ]
    [^A-Z]      any char except uppercase alpha
    [a-zA-Z]    any alpha
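The set examples can be tried out with the sketch below. It assumes Python's re module as a stand-in for Scintilla's matcher. Python does not treat a leading -] pair as literal, so the backslashed spelling mentioned above is used. The helper chars_matching and the candidate list are made up for the illustration.

```python
import re

def chars_matching(set_pattern, candidates):
    """Return the candidate characters matched by a one-character set."""
    return [c for c in candidates if re.fullmatch(set_pattern, c)]

candidates = ["-", "]", "|", "a", "Z", "5"]

# Equivalent of [-]|]: Python needs the ] backslashed.
three_chars = chars_matching(r"[-\]|]", candidates)

# Equivalent of []-|]: the range from ] (0x5D) to | (0x7C).
from_bracket_to_bar = chars_matching(r"[\]-|]", candidates)

# Equivalent of [^-]]: anything except - and ].
complemented = chars_matching(r"[^-\]]", candidates)

# [a-zA-Z] is written the same way in both syntaxes.
alpha = chars_matching(r"[a-zA-Z]", candidates)
```

Note that the ] to | range covers the lowercase letters, which sit between those two codes, but not the uppercase ones.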
- [5] *
any regular expression of form [1] to [4] (except the round bracket, digit and angle bracket forms of [3], i.e. [7], [8] and [9]), followed by the closure char (*), matches zero or more occurrences of that form.
- [6] +
same as the * closure ([5]), except that it matches one or more occurrences. Both [5] and [6] are greedy (they match as much as possible).
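The difference between the two closures, and their greediness, can be seen in a short sketch. Python's re module is assumed here as a stand-in; its * and + behave in the way described above, but it is not Scintilla's engine.

```python
import re

# * accepts zero occurrences, so it succeeds with an empty match.
star_on_nothing = re.match(r"a*", "bbb").group()

# + requires at least one occurrence, so the same input fails.
plus_on_nothing = re.match(r"a+", "bbb")

# Both closures are greedy: they take every "a" available.
star_greedy = re.match(r"a*", "aaab").group()
plus_greedy = re.match(r"a+", "aaab").group()

# Greediness with . versus a complemented set:
# .+ runs to the last >, [^>]+ has to stop at the first one.
any_char = re.search(r"<.+>", "<b>bold</b>").group()
no_bracket = re.search(r"<[^>]+>", "<b>bold</b>").group()
```

A complemented set is the usual way to keep a greedy closure from running past a delimiter.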
- [7]
a regular expression in the form [1] to [12], enclosed as \(form\) (or (form) with posix flag), matches what form matches. The enclosure creates a set of tags, used for [8] and for pattern substitution. The tagged forms are numbered starting from 1.
- [8]
a \ followed by a digit 1 to 9 matches whatever a previously tagged regular expression ([7]) matched.
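Tags and back-references can be sketched as follows. The example assumes Python's re module, which always uses the (form) spelling of posix mode and never \(form\). The sample strings are invented.

```python
import re

# (form) tags what it matches; \1 must then match the same text again.
doubled = re.search(r"([a-z]+) \1", "it is is a test")
doubled_word = doubled.group(1)

# Tags are numbered from 1, in the order of their opening brackets.
pair = re.search(r"([a-z]+)=([0-9]+)", "width=80")
first_tag, second_tag = pair.group(1), pair.group(2)

# Tags can be reused in a substitution, here to swap the two sides.
swapped = re.sub(r"([a-z]+)=([0-9]+)", r"\2=\1", "width=80")
```

Without the posix flag, the same patterns would be written with \( and \) in Scintilla.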
- [9] \< \>
a regular expression starting with a \< construct and/or ending with a \> construct restricts the pattern matching to the beginning of a word and/or the end of a word. A word is defined to be a character string beginning and/or ending with the characters A-Z a-z 0-9 and _. Scintilla extends this definition by user setting. The word must also be preceded and/or followed by any character outside those mentioned.
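The word constructs can be approximated in other engines. The sketch below assumes Python's re module, which has no \< or \>. The names WORD_START and WORD_END are made-up stand-ins built from \b and a lookaround, using Python's own definition of a word character rather than Scintilla's user setting.

```python
import re

# Assumed Python equivalents of the two constructs (not Scintilla syntax):
WORD_START = r"\b(?=\w)"   # stands in for \<
WORD_END = r"\b(?<=\w)"    # stands in for \>

text = "cat concat catalog"

# \<cat : "cat" at the beginning of a word
at_start = [m.start() for m in re.finditer(WORD_START + "cat", text)]

# cat\> : "cat" at the end of a word
at_end = [m.start() for m in re.finditer("cat" + WORD_END, text)]

# \<cat\> : "cat" as a whole word
whole_word = [m.start() for m in re.finditer(WORD_START + "cat" + WORD_END, text)]
```

The three lists hold match offsets: only the first "cat" is both at the start and at the end of a word.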
- [10] \l
a backslash followed by d, D, s, S, w or W becomes a character class (both inside and outside sets []).
  * d: decimal digits
  * D: any char except decimal digits
  * s: whitespace (space, \t \n \r \f \v)
  * S: any char except whitespace (see above)
  * w: alphanumeric & underscore (changed by user setting)
  * W: any char except alphanumeric & underscore (see above)
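A quick way to get a feel for these classes is the sketch below. It assumes Python's re module with the re.ASCII flag, which keeps \d, \s and \w to the ASCII ranges listed above. The user setting that changes \w in Scintilla has no counterpart here, and the sample text is invented.

```python
import re

text = "x_1 = 42;\ty2=7"

# re.ASCII restricts \d, \s and \w to the ASCII ranges listed above.
digits = re.findall(r"\d+", text, re.ASCII)
words = re.findall(r"\w+", text, re.ASCII)
non_words = re.findall(r"\W+", text, re.ASCII)
blanks = re.findall(r"\s", text, re.ASCII)

# The classes also work inside a set: digits or whitespace.
digits_or_blanks = re.findall(r"[\d\s]+", text, re.ASCII)
```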
- [11] \xHH
a backslash followed by x and two hexadecimal digits becomes the character whose ASCII code is equal to these digits. If it is not followed by two hexadecimal digits, it is the 'x' char itself.
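The hexadecimal escape can be checked with the sketch below, which assumes Python's re module as a stand-in. One difference is worth noting: Python rejects an incomplete \x escape, whereas the text above says Scintilla reads it as a plain 'x'.

```python
import re

# \x41 is the character with code 0x41, i.e. "A".
letter = re.fullmatch(r"\x41", "A")

# \x2e is the character ".", taken literally rather than as "any char".
dot_matches_dot = re.fullmatch(r"\x2e", ".")
dot_matches_other = re.fullmatch(r"\x2e", "a")

# Difference: Python rejects an incomplete \x instead of reading it as "x".
try:
    re.compile(r"\xZ")
    incomplete_accepted = True
except re.error:
    incomplete_accepted = False
```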
- [12]
a composite regular expression xy, where x and y are in the form [1] to [10], matches the longest match of x followed by a match for y.
- [13] ^ $
a regular expression starting with a ^ character and/or ending with a $ character restricts the pattern matching to the beginning of the line and/or the end of the line (anchors). Elsewhere in the pattern, ^ and $ are treated as ordinary characters.
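The anchors can be illustrated with the sketch below, again assuming Python's re module as a stand-in and applying each pattern to one line at a time, as Scintilla does. The sample lines are invented.

```python
import re

lines = ["error: disk full", "no error here", "fatal error"]

# ^error : lines beginning with "error"
starts = [line for line in lines if re.search(r"^error", line)]

# error$ : lines ending with "error"
ends = [line for line in lines if re.search(r"error$", line)]

# Away from the ends of the pattern, ^ and $ are meant literally.
# Scintilla does this by itself; Python needs them backslashed.
literal = re.search(r"2\^8 costs \$5", "2^8 costs $5")
```

The last pattern shows a difference between the two syntaxes: the backslashes are a requirement of Python, not of Scintilla.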