Vim Regular Expressions

vi(1) predates both Perl and PCRE. The vi family of programs uses its own regular expression engine, which has some unique quirks.


Character Classes

The regular expression engine supports 'standard character classes'. As in...

There are also a number of special character classes.

Some are just short forms of commonly used classes.

Pattern   Short Form Of
\a        [a-zA-Z]
\A        [^a-zA-Z]
\d        [0-9]
\D        [^0-9]
\x        [0-9a-fA-F]
\X        [^0-9a-fA-F]
\w        [a-zA-Z0-9_]
\W        [^a-zA-Z0-9_]
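
For readers who think in PCRE-style syntax, the short forms can be cross-checked outside Vim. The following is only a sketch in Python (the use of Python's re module, and the names VIM_CLASSES and matches_class, are assumptions for illustration and not part of Vim). It maps each Vim short form to the bracket expression it abbreviates rather than to a Python escape, because Python gives \a and \A different meanings (bell character and start of string).

```python
import re

# Vim short form -> the bracket expression it abbreviates.
# Hypothetical lookup table for illustration only.
VIM_CLASSES = {
    r"\a": "[a-zA-Z]",
    r"\A": "[^a-zA-Z]",
    r"\d": "[0-9]",
    r"\D": "[^0-9]",
    r"\x": "[0-9a-fA-F]",
    r"\X": "[^0-9a-fA-F]",
    r"\w": "[a-zA-Z0-9_]",
    r"\W": "[^a-zA-Z0-9_]",
}


def matches_class(vim_class: str, char: str) -> bool:
    """Return True if a single character belongs to the given Vim class."""
    return re.fullmatch(VIM_CLASSES[vim_class], char) is not None


if __name__ == "__main__":
    for cls in (r"\a", r"\d", r"\x", r"\w"):
        print(cls, [c for c in "aZ9f_-" if matches_class(cls, c)])
```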

Others have key functionality.

Pattern   Use
\s        any white-space character (see :help whitespace)
\S        any character besides white-space
\t        a tab, in both search patterns and replacements
\r        a carriage return in a search pattern; a line break in a replacement (use \n to match a line break when searching)

Try :help character-classes for an exhaustive list.


Boundaries

\< and \> represent word start and word end, respectively.
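
As a rough comparison for readers used to PCRE-style syntax, the sketch below approximates the two anchors in Python. The constants WORD_START and WORD_END and the helper whole_word_spans are hypothetical names, and the equivalence is only approximate: Vim decides what counts as a word character from the 'iskeyword' option, while Python's \w and \b follow Unicode rules.

```python
import re

# Assumed Python approximations of the Vim anchors.
WORD_START = r"\b(?=\w)"   # like Vim \< : boundary followed by a word character
WORD_END = r"\b(?<=\w)"    # like Vim \> : boundary preceded by a word character


def whole_word_spans(word: str, text: str) -> list:
    """Return (start, end) spans where `word` occurs as a whole word."""
    pattern = WORD_START + re.escape(word) + WORD_END
    return [m.span() for m in re.finditer(pattern, text)]


if __name__ == "__main__":
    text = "parse and par and sparse"
    print(whole_word_spans("par", text))
    # Word start only: "par" in "parse" and the word "par", but not "sparse".
    print(len(re.findall(WORD_START + "par", text)))
```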


Lookaround

A lookaround asserts that the primary pattern must follow (for a lookbehind) or precede (for a lookahead) some other subpattern.

These subpatterns are not considered part of the match.

Note that the regular expression engine first searches for the primary pattern, and then for each match searches for the lookaround pattern. This process is computationally more complex than searching for a single pattern, and many searches can be more efficiently expressed using \zs and \ze.

Note also that due to the order of operations, these subpatterns cannot be referenced in the primary pattern.

Positive Lookbehind

A positive lookbehind looks for matches that follow a subpattern. The operator is \@<=, and the subpattern is whatever immediately precedes the operator. The subpattern and operator should be placed at the beginning of a pattern.

For example, _\@<=ice matches ice only if it follows _. Both _ice and (_ice) match this pattern, but ice and _(ice) will not.
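
The same four test strings can be checked with a PCRE-style lookbehind. This is a sketch in Python, not Vim syntax: the (?<=_) form and the helper name find_ice are assumptions for illustration. The returned span shows that the underscore is not part of the match.

```python
import re

# Python spelling of the Vim pattern _\@<=ice (assumed equivalent).
LOOKBEHIND_ICE = re.compile(r"(?<=_)ice")


def find_ice(text: str):
    """Return the (start, end) span of the first match, or None."""
    m = LOOKBEHIND_ICE.search(text)
    return m.span() if m else None


if __name__ == "__main__":
    for sample in ("_ice", "(_ice)", "ice", "_(ice)"):
        print(sample, find_ice(sample))
```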

Negative Lookbehind

A negative lookbehind looks for matches that do not follow a subpattern. The operator is \@<!, and the subpattern is whatever immediately precedes the operator. The subpattern and operator should be placed at the beginning of a pattern.

For example, \(cat.*\)\@<!dog matches dog only if cat is not present earlier in the line. fox,parrot,dog,cat matches this pattern, but fox,cat,dog,parrot will not.
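
Python's re module only accepts fixed-width lookbehinds, so this pattern has no direct translation there. The sketch below emulates the check by hand instead: for each dog, it looks at everything earlier on the line. The function name dogs_without_earlier_cat is hypothetical, and the input is assumed to be a single line.

```python
import re


def dogs_without_earlier_cat(line: str) -> list:
    """Return start offsets of each "dog" that has no "cat" before it.

    Emulates the Vim pattern \\(cat.*\\)\\@<!dog on a single line.
    """
    starts = []
    for m in re.finditer("dog", line):
        # The lookbehind fails if "cat" occurs anywhere before this "dog".
        if "cat" not in line[:m.start()]:
            starts.append(m.start())
    return starts


if __name__ == "__main__":
    print(dogs_without_earlier_cat("fox,parrot,dog,cat"))
    print(dogs_without_earlier_cat("fox,cat,dog,parrot"))
```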

Positive Lookahead

A positive lookahead looks for matches that precede a subpattern. The operator is \@=, and the subpattern is whatever immediately precedes the operator. The subpattern and operator should be placed at the end of a pattern.

For example, s/ice\d\@=/X/g replaces ice with X only if it immediately precedes a digit. ice ice_2 ice2 iced becomes ice ice_2 X2 iced.
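
The sample line can be reproduced with a PCRE-style lookahead. This is a Python sketch rather than Vim syntax; the function name replace_ice_before_digit is hypothetical, and [0-9] is used instead of Python's \d so that only ASCII digits count, as with Vim's \d.

```python
import re


def replace_ice_before_digit(text: str) -> str:
    """Replace "ice" with "X" only where a digit comes immediately after it."""
    # (?=[0-9]) is the assumed Python spelling of Vim's \d\@=
    return re.sub(r"ice(?=[0-9])", "X", text)


if __name__ == "__main__":
    print(replace_ice_before_digit("ice ice_2 ice2 iced"))
```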

Negative Lookahead

A negative lookahead looks for matches that do not precede a subpattern. The operator is \@!, and the subpattern is whatever immediately precedes the operator. The subpattern and operator should be placed at the end of a pattern.

For example, s/par\(.*\<par\>\)\@!/X/g replaces par with X only if the whole word par does not appear later in the line. parse and par and sparse becomes parse and X and sXse.
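
A PCRE-style negative lookahead gives the same result on the sample line. The following Python sketch is an approximation, not Vim syntax: the function name replace_last_par is hypothetical, and \b stands in for Vim's \< and \>.

```python
import re


def replace_last_par(text: str) -> str:
    """Replace "par" with "X" unless the whole word "par" appears later."""
    # (?!.*\bpar\b) is the assumed Python spelling of Vim's \(.*\<par\>\)\@!
    return re.sub(r"par(?!.*\bpar\b)", "X", text)


if __name__ == "__main__":
    print(replace_last_par("parse and par and sparse"))
```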

Limit Lookaround Scans

To limit how much of a line is searched for a subpattern, try:

syn match xmlTagName +\%(<\|</\)\@2<=[^ /!?<>"']\++ ...

The 2 in \@2<= restricts the lookbehind to just 2 bytes.



Vim/RegularExpressions (last edited 2022-09-15 18:43:49 by DominicRicottone)