The Hyades Adapter Configuration Editor allows you to use regular expressions to describe how log files should be transformed into Common Base Event records. The following tables serve as a guide to regular expression usage.
| Expression | Matches |
|---|---|
| {n,m} | at least n but not more than m times |
| {n,} | at least n times |
| {n} | exactly n times |
| * | 0 or more times |
| + | 1 or more times |
| ? | 0 or 1 times |
| . | everything except \n |
| ^ | a null token matching the beginning of a string or line (that is, the position right after a newline or right before the beginning of a string) |
| $ | a null token matching the end of a string or line (that is, the position right before a newline or right after the end of a string) |
| \b | backspace inside a character class ([abcd]) |
| \b | null token matching a word boundary (\w on one side and \W on the other) |
| \B | null token matching a boundary that isn't a word boundary |
| \A | only at beginning of string |
| \Z | only at end of string (or before newline at the end) |
| \n | newline |
| \r | carriage return |
| \t | tab |
| \f | formfeed |
| \d | digit [0-9] |
| \D | non-digit [^0-9] |
| \w | word character [0-9a-z_A-Z] |
| \W | non-word character [^0-9a-z_A-Z] |
| \s | a whitespace character [ \t\n\r\f] |
| \S | a non-whitespace character [^ \t\n\r\f] |
| \xnn | the hexadecimal representation of character nn |
| \cD | the corresponding control character |
| \nn or \nnn | the octal representation of character nn unless a backreference. |
| \1, \2, \3 ... | whatever the first, second, third, and so on, parenthesized group matched. This is called a backreference. If there is no corresponding group, the number is interpreted as an octal representation of a character. |
| \0 | the null character. Any other backslashed character matches itself. |
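| *? | 0 or more times, matching as few times as possible |
| +? | 1 or more times, matching as few times as possible |
| ?? | 0 or 1 times, preferring 0 |
| {n}? | exactly n times (same as {n}) |
| {n,}? | at least n times, matching as few times as possible |
| {n,m}? | at least n but not more than m times, matching as few times as possible |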
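The difference between the plain quantifiers and their ?-suffixed forms is easiest to see side by side. The following is a minimal sketch that uses Python's `re` module as a stand-in for the adapter's regular expression engine; the sample log line and the variable names are hypothetical, and the Hyades engine may differ in detail.

```python
import re

# Hypothetical log line, used only for illustration.
line = "[2004-03-01 12:30:45] severity:2 <disk> almost <full>"

# Greedy: .* takes as much as it can, so the match runs to the last '>'.
greedy = re.search(r"<.*>", line).group(0)

# Minimal: .*? takes as little as it can, so the match stops at the first '>'.
minimal = re.search(r"<.*?>", line).group(0)

# \d with {n}: exactly four digits, then two groups of exactly two digits.
date = re.search(r"\d{4}-\d{2}-\d{2}", line).group(0)

# {n,m}: one or two digits before the first colon of the time stamp.
time = re.search(r"\d{1,2}:\d{2}:\d{2}", line).group(0)

print(greedy)   # <disk> almost <full>
print(minimal)  # <disk>
print(date)     # 2004-03-01
print(time)     # 12:30:45
```

When a field is delimited by a closing character, the minimal form is usually the safer choice, because the greedy form can run past the first delimiter.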
To group parts of an expression, use the metacharacters ( ). This allows the regular expression in the parentheses to be treated as a single unit. For example, the regular expression severity:(1|2) matches the pattern severity:1 or severity:2.
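The effect of the grouping can be checked with a short sketch. It assumes Python's `re` module as a stand-in for the adapter's engine, and the variable names are illustrative only.

```python
import re

# Parentheses group the alternatives 1|2 into a single unit after "severity:".
grouped = re.compile(r"severity:(1|2)")

matches_1 = grouped.search("severity:1") is not None
matches_2 = grouped.search("severity:2") is not None
matches_3 = grouped.search("severity:3") is not None  # 3 is not an alternative

# Without the parentheses, | splits the whole expression into
# "severity:1" OR just "2", so a bare "2" also matches.
ungrouped = re.compile(r"severity:1|2")
matches_bare_2 = ungrouped.search("2") is not None

print(matches_1, matches_2, matches_3, matches_bare_2)  # True True False True
```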
To extract parts of a string that have been matched using the grouping metacharacters, use the special variables $1, $2, etc.

    # Extract the name and URL from $pattern
    $pattern = '<a href="secure_logon.html">Logon form</a>';
    $pattern =~ /<a href="(.*)">(.*)<\/a>/;   # match using grouping
    $url = $1;        # $1 equals secure_logon.html
    $pagename = $2;   # $2 equals Logon form

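For comparison, here is a sketch of the same extraction in Python's `re` module, where `group(1)` and `group(2)` play the role of $1 and $2. This is an illustration under that assumption, not the adapter's own API. The last statement also shows a backreference (\1) on a hypothetical message string.

```python
import re

text = '<a href="secure_logon.html">Logon form</a>'

# Two parenthesized groups: the href value and the link text.
match = re.search(r'<a href="(.*)">(.*)</a>', text)

url = match.group(1)       # counterpart of $1
pagename = match.group(2)  # counterpart of $2

# Backreference: \1 must repeat whatever the first group matched.
repeated = re.search(r"\b(\w+) \1\b", "error error detected").group(1)

print(url)       # secure_logon.html
print(pagename)  # Logon form
print(repeated)  # error
```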
| Expression | Matches |
|---|---|
| (?#text) | An embedded comment causing text to be ignored. |
| (?:regexp) | Groups things like "()" but doesn't cause the group match to be saved. |
| (?=regexp) | A zero-width positive lookahead assertion. For example, \w+(?=\s) matches a word followed by whitespace, without including the whitespace in the MatchResult. |
| (?!regexp) | A zero-width negative lookahead assertion. For example foo(?!bar) matches any occurrence of foo that isn't followed by bar. This is a zero-width assertion, which means that a(?!b)d matches ad because a is followed by a character that is not b (the d) and d follows the zero-width assertion. |
| (?imsx) | One or more embedded pattern-match modifiers: i enables case insensitivity; m enables multiline treatment of the input; s enables single-line treatment of the input; x enables extended whitespace comments. |
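The extended expressions in this table can be tried out with a short sketch. It assumes Python's `re` module, which accepts the same constructs, as a stand-in for the adapter's engine; the input strings are hypothetical.

```python
import re

# (?=...) positive lookahead: a word followed by whitespace; the
# whitespace itself is not part of the match.
word = re.search(r"\w+(?=\s)", "disk full").group(0)

# (?!...) negative lookahead: only the foo that is not followed by bar.
foo_positions = [m.start() for m in re.finditer(r"foo(?!bar)", "foobar foobaz")]

# The assertion consumes nothing, so a(?!b)d matches "ad".
ad = re.search(r"a(?!b)d", "ad").group(0)

# (?:...) groups without saving, so the only saved group is the digit.
saved = re.search(r"(?:severity|level):(\d)", "level:3").groups()

# (?#...) is ignored; (?i) makes the match case-insensitive.
ci = re.search(r"(?i)error(?#any letter case)", "ERROR: disk full") is not None

print(word)           # disk
print(foo_positions)  # [7]
print(ad)             # ad
print(saved)          # ('3',)
print(ci)             # True
```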
Related concepts
Overview of the Hyades Generic Log Adapter
Common Base Event format specification
Related tasks
Creating a log parser
Creating a rules-based adapter
Creating a static adapter
Related references
Adapter Configuration File structure
Common Base Event format specification
Adapter Configuration Editor
Regular expression grammar
(C) Copyright IBM Corporation 2000, 2004. All Rights Reserved.