= Stata Regular Expressions =

----

== Operators ==

Stata tries to support the POSIX.2 standard for regular expressions.

||'''Operator'''||'''Effect''' ||
||`*` ||match zero or more of the preceding expression ||
||`+` ||match one or more of the preceding expression ||
||`?` ||match either zero or one of the preceding expression ||
||`a-z` ||when between two characters (not operators), a dash means match a range of characters or numbers ||
||`.` ||match any character ||
||`\` ||escape a character to match the literal character that would otherwise be interpreted as an operator ||
||`^` ||when at the beginning of a pattern, a caret means match the beginning of the string ||
||`$` ||when at the end of a pattern, a dollar sign means match the end of the string ||
||`|` ||match either the preceding expression or the following expression ||
||`[` and `]` ||denote a set of characters that can be matched ||
||`(` and `)` ||denote a subexpression ||

----

== Functions ==

There are two sets of regular expression functions in Stata.

The first set is the original, byte-oriented functions.

[[Stata/StringFunctions#RegexM|regexm]] tests a string for a pattern.

[[Stata/StringFunctions#RegexR|regexr]] replaces the first matching substring in a string.

[[Stata/StringFunctions#RegexS|regexs]] extracts a matching substring (up to the 9th) from a string.

These functions all assume that the string is strict ASCII and does not contain null bytes (`char(0)`). They are also restricted in how many matching substrings can be accessed or manipulated.

The second set is the Unicode-aware functions.

[[Stata/StringFunctions#UstrRegexM|ustrregexm]] tests a string for a pattern.

[[Stata/StringFunctions#UstrRegexRf|ustrregexrf]] replaces the first matching substring in a string.

[[Stata/StringFunctions#UstrRegexRa|ustrregexra]] replaces all matching substrings in a string.

[[Stata/StringFunctions#UStrRegexS|ustrregexs]] extracts a matching substring from a string.

These functions are not subject to any of the restrictions above.

----
CategoryRicottone
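As a minimal sketch of how the functions described above might be combined, the following builds a small illustrative dataset and applies one function from each group. The variable names (`phone`, `has_number`, `last4`, `masked`, `masked_all`) and the sample values are hypothetical, and the `ustrregex` call assumes a Stata version that ships the Unicode string functions.

{{{
clear
input str20 phone
"555-1234"
"no number"
"555-9876 x12"
end

* regexm returns 1 when the pattern matches and 0 otherwise
generate byte has_number = regexm(phone, "[0-9]+-[0-9]+")

* regexs(1) returns the first parenthesized subexpression of the
* match made by regexm in the same expression
generate str20 last4 = regexs(1) if regexm(phone, "[0-9]+-([0-9]+)")

* regexr replaces only the first matching substring
generate str20 masked = regexr(phone, "[0-9]+", "XXX")

* ustrregexra replaces every matching substring
generate str20 masked_all = ustrregexra(phone, "[0-9]+", "X")

list
}}}

Pairing `regexs` with `regexm` in the `if` qualifier keeps the extraction tied to the observation that was actually matched, so rows without a match are left empty.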