UNIX Regular Expressions | ||
| UNIX regular expressions are defined as follows: | ||
| ^ | Matches beginning of line. | |
| $ | Matches end of line. | |
| . | Matches any character except newline. | |
| X+ | Maximal match of one or more occurrences of X. See "Minimal Versus Maximal Matching" for more information on minimal and maximal matching. | |
| X* | Maximal match of zero or more occurrences of X. | |
| X? | Maximal match of zero or one occurrences of X. | |
| X{n1} | Matches exactly n1 occurrences of X. | |
| X{n1,} | Maximal match of at least n1 occurrences of X. | |
| X{,n2} | Maximal match of at least 0 occurrences but not more than n2 occurrences of X. | |
| X{n1,n2} | Maximal match of at least n1 occurrences but not more than n2 occurrences of X. | |
| X+? | Minimal match of one or more occurrences of X. | |
| X*? | Minimal match of zero or more occurrences of X. | |
| X?? | Minimal match of zero or one occurrences of X. | |
| X{n1}? | Matches exactly n1 occurrences of X. | |
| X{n1,}? | Minimal match of at least n1 occurrences of X. | |
| X{,n2}? | Minimal match of at least 0 occurrences but not more than n2 occurrences of X. | |
| X{n1,n2}? | Minimal match of at least n1 occurrences but not more than n2 occurrences of X. | |
| (?!X) | Search fails if expression X is matched. The expression ^(?!if) matches the beginning of all lines that do not start with "if". | |
| (X) | Matches sub-expression X and specifies a new tagged expression. See "Tagged Search Expressions" for more information. No more tagged expressions are defined once an explicit tagged expression number is specified as shown below. | |
| (?dX) | Matches sub-expression X and specifies to use tagged expression number d where 0<=d<=9. No more tagged expressions are defined by the sub-expression syntax "(X)" once this sub-expression syntax is used. This is the best way to make sure you have enough tagged expressions. | |
| (?:X) | Matches sub-expression X but does not define a tagged expression. | |
| X|Y | Matches X or Y. | |
| [char-set] | Matches any one of the characters specified by char-set. A '-' character may be used to specify ranges. The expression [A-Z] matches any uppercase letter. '\' may be used inside the square brackets to define literal characters or to specify characters by ASCII code. For example, "\-" specifies a literal dash character. The expression [\d0-\d27] matches ASCII character codes 0..27. The expression []] matches a right bracket. In MicroEdge regular expressions, [] matches no characters. In both syntaxes, the expression [\]] matches a right bracket. The expression [^] matches a '^' character, but this does not work for MicroEdge regular expressions. In both syntaxes, [\^] matches a '^' character. | |
| [^char-set] | Matches any character not specified by char-set. A '-' character may be used to specify ranges. The expression [^A-Z] matches all characters except uppercase letters. | |
| \d | Defines a back reference to tagged expression number d where 0<=d<=9. For example, "(abc)def\1" matches the string "abcdefabc". If the tagged expression has not been set, the search path fails. | |
| \c | Specifies cursor position if match is found. If the expression xyz\c is found the cursor is placed after the z. | |
| \n | Matches newline character sequence. Useful for matching multi-line search strings. What this matches depends on whether the buffer is a DOS (ASCII 13,10 or just ASCII 10), UNIX (ASCII 10), Macintosh (ASCII 13), or user defined ASCII file. Use "\d10" if you want to match an ASCII 10 character. | |
| \r | Matches carriage return (ASCII 13). | |
| \t | Matches tab character. | |
| \f | Matches form feed character. | |
| \om | Turns on multi-line matching. This extends the character-set and match-any-character primitives so that they also match end-of-line characters. For example, "\om.+" matches the rest of the buffer. WARNING: Test your regular expression on a very small file before using it on a large file. This option may cause the editor to use A LOT OF MEMORY. | |
| \ol | Turns off multi-line matching (default). You can still use "\n" to create regular expressions which match one or more lines. However, expressions like ".+" will not match multiple lines. This is much safer and usually faster than using the "\om" option. | |
| \xhh | Matches hexadecimal character hh. | |
| \dddd | Matches decimal character ddd. | |
| \char | Declares character after slash to be literal. For example, '\*' represents the star character. | |
| \:char | Matches predefined expression corresponding to char. | |
| The predefined expressions are: | ||
| \:a [A-Za-z0-9] | Matches an alphanumeric character | |
| \:b ([ \t]+) | Matches blanks | |
| \:c [A-Za-z] | Matches an alphabetic character | |
| \:d [0-9] | Matches a digit | |
| \:f ([^\[\]\:\\/<>|=+;, \t"']+) | non-UNIX platforms: Matches a filename part | |
| \:f ([^/ \t"']+) | UNIX: Matches a filename part | |
| \:h ([0-9A-Fa-f]+) | Matches a hex number | |
| \:i ([0-9]+) | Matches an integer | |
| \:n (([0-9]+(\.[0-9]+|)|\.[0-9]+)([Ee](\+|-|)[0-9]+|)) | Matches a floating number | |
| \:p (([A-Za-z]:|)(\\|/|)(\:f(\\|/))*\:f) | non-UNIX platforms: Matches a path | |
| \:p ((/|)(\:f(/))*\:f) | UNIX: Matches a path | |
| \:q (\"[^\"]*\"|'[^']*') | Matches a quoted string | |
| \:v ([A-Za-z_$][A-Za-z0-9_$]*) | Matches a C variable | |
| \:w ([A-Za-z]+) | Matches a word | |
| NOTE: The \:f and \:p predefined expressions are not intended to support all operating system file names. Instead, they are intended to be useful in practical cases. For non-UNIX platforms, \:f is designed for FAT (DOS) file systems. The space character is not in \:f because it is more typically a file name separator. | ||
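Most of the predefined expressions above are written in syntax that other regular expression engines also accept, so they can be tried out outside the editor. The following is a minimal sketch using Python's `re` module, which is a different engine and has no `\:char` shorthand of its own. The names `PREDEFINED` and `matches_fully` are hypothetical, and only the platform-independent entries are transcribed.

```python
import re

# Hypothetical table: the expansions listed above, written as Python
# patterns. Python's re module does not understand \:char itself.
PREDEFINED = {
    "a": r"[A-Za-z0-9]",
    "c": r"[A-Za-z]",
    "d": r"[0-9]",
    "h": r"([0-9A-Fa-f]+)",
    "i": r"([0-9]+)",
    "n": r"(([0-9]+(\.[0-9]+|)|\.[0-9]+)([Ee](\+|-|)[0-9]+|))",
    "q": r"""("[^"]*"|'[^']*')""",
    "v": r"([A-Za-z_$][A-Za-z0-9_$]*)",
    "w": r"([A-Za-z]+)",
}


def matches_fully(name, text):
    """Return True if all of text matches the predefined expression name."""
    return re.fullmatch(PREDEFINED[name], text) is not None


if __name__ == "__main__":
    samples = [("n", "3.14e-2"), ("n", ".5"), ("h", "1aF"),
               ("q", "'it'"), ("v", "$tmp_1"), ("v", "1abc")]
    for name, text in samples:
        print(name, repr(text), matches_fully(name, text))
```

Only the last sample is rejected, because a C variable may not start with a digit. Results inside the editor may differ wherever its engine departs from Python's.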
| Precedence of operators, from highest to lowest: | ||
| +,*,?, {},+?,*?,??, {}? | These operators have the same precedence | |
| concatenation | ||
| | | ||
| Sample Regular Expressions: | ||
| ^defproc | Matches lines that begin with the word defproc. | |
| ^definit$ | Matches lines that only contain the word definit. | |
| ^\*name | Matches lines that begin with the string "*name". Notice that the backslash must prefix the special character '*'. | |
| [\t ] | Matches tab and space characters. | |
| [\d9\d32] | Matches tab and space characters. | |
| [\x9\x20] | Matches tab and space characters. | |
| p.t | Matches any three-character string starting with the letter 'p' and ending with the letter 't'. Two possible matches are "pot" and "pat". | |
| s.*?t | Matches the letter 's' followed by any number of characters followed by the nearest letter 't'. Two possible matches are "seat" and "st". | |
| for|while | Matches the strings "for" or "while". | |
| ^\:p | Matches lines beginning with a file name. | |
| xy+z | Matches x followed by one or more occurrences of y followed by z. | |
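Several of the sample expressions, along with the `(?!X)` and back reference entries from the operator table, use syntax that Python's `re` module also accepts. The sketch below runs them there as an approximation. It is not the editor's engine, the helper `matching_lines` and the test strings are hypothetical, and editor-specific features such as `\c`, `\om`, and `\:p` are left out because Python has no equivalent.

```python
import re


def matching_lines(pattern, lines):
    """Return the lines in which pattern is found (Python's re.search)."""
    return [s for s in lines if re.search(pattern, s)]


lines = ["defproc main", "definit", "*name = 1", "x defproc", "if x", "else"]

# Anchors and an escaped special character.
print(matching_lines(r"^defproc", lines))   # ['defproc main']
print(matching_lines(r"^definit$", lines))  # ['definit']
print(matching_lines(r"^\*name", lines))    # ['*name = 1']

# Negative lookahead: lines that do not start with "if".
print(matching_lines(r"^(?!if)", ["if x", "else", "iffy"]))  # ['else']

# Minimal versus maximal matching on the same text.
print(re.search(r"s.*?t", "seat at").group())  # seat
print(re.search(r"s.*t", "seat at").group())   # seat at

# Any single character, alternation, and one-or-more.
print(re.findall(r"p.t", "pot pat p_t pt"))        # ['pot', 'pat', 'p_t']
print(re.findall(r"for|while", "for x while y"))   # ['for', 'while']
print(re.search(r"xy+z", "axyyzb").group())        # xyyz

# Back reference to the first tagged expression.
print(re.fullmatch(r"(abc)def\1", "abcdefabc") is not None)  # True
```

The two results for "seat at" show the difference between `*?`, which stops at the nearest 't', and `*`, which extends to the last one.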