| 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
|
|---|
| 2 |
|
|---|
| 3 | <html>
|
|---|
| 4 | <head>
|
|---|
| 5 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
|
|---|
| 6 | <meta name="Author" content="Thomas Bretz">
|
|---|
| 7 | <title>MARS: Magic Analysis and Reconstruction Software</title>
|
|---|
| 8 | <link rel="stylesheet" type="text/css" href="../mars.css">
|
|---|
| 9 | </head>
|
|---|
| 10 |
|
|---|
| 11 | <body background="background.gif" text="#000000" bgcolor="#000099" link="#1122FF" vlink="#8888FF" alink="#FF0000">
|
|---|
| 12 |
|
|---|
| 13 |
|
|---|
| 14 | <center>
|
|---|
| 15 | <table class="Main" CELLPADDING=0>
|
|---|
| 16 |
|
|---|
| 17 | <tr>
|
|---|
| 18 | <td class="Edge"><img SRC="../ecke.gif" ALT=""></td>
|
|---|
| 19 | <td class="Header">
|
|---|
| 20 | <B>M A R S</B><BR><B>M</B>agic <B>A</B>nalysis and <B>R</B>econstruction <B>S</B>oftware
|
|---|
| 21 | </td>
|
|---|
| 22 | </tr>
|
|---|
| 23 |
|
|---|
| 24 | <tr>
|
|---|
| 25 | <td COLSPAN=2 BGCOLOR="#FFFFFF">
|
|---|
| 26 | <hr SIZE=1 NOSHADE WIDTH="80%">
|
|---|
| 27 | <center><table class="Inner" CELLPADDING=15>
|
|---|
| 28 |
|
|---|
| 29 | <tr class="Block">
|
|---|
| 30 | <td><b><u><A NAME="OVERVIEW">MySQL Regular Expressions</A>:</u></b>
|
|---|
| 31 | <P>
|
|---|
| 32 | A <B>regular expression (regex)</B> is a powerful way of specifying a complex search. <P>
|
|---|
| 33 |
|
|---|
| 34 | MySQL uses Henry Spencer's implementation of regular expressions, which is aimed at conformance with POSIX
|
|---|
| 35 | 1003.2. MySQL uses the extended version. <P>
|
|---|
| 36 |
|
|---|
| 37 | This is a simplistic reference that skips the details. To get more exact information, see
|
|---|
| 38 | Henry Spencer's <A HREF="#REGEX">regex(7)</A>.<P>
|
|---|
| 39 |
|
|---|
| 40 | A regular expression describes a set of strings. The simplest regexp is one that has no special characters in it. For
|
|---|
| 41 | example, the regexp <b>hello</B> matches <B>hello</B> and nothing else. <P>
|
|---|
| 42 |
|
|---|
| 43 | Non-trivial regular expressions use certain special constructs so that they can match more than one string. For
|
|---|
| 44 | example, the regexp hello|word matches either the string hello or the string word. <P>
|
|---|
| 45 |
|
|---|
| 46 | As a more complex example, the regexp B[an]*s matches any of the strings Bananas, Baaaaas, Bs, and any
|
|---|
| 47 | other string starting with a B, ending with an s, and containing any number of a or n characters in between. <P>
|
|---|
| 48 |
|
|---|
| 49 | A regular expression may use any of the following special characters/constructs: <P>
|
|---|
| 50 | <pre>
|
|---|
| 51 | ^ Match the beginning of a string.
|
|---|
| 52 | mysql> SELECT "fo\nfo" REGEXP "^fo$"; -> 0
|
|---|
| 53 | mysql> SELECT "fofo" REGEXP "^fo"; -> 1
|
|---|
| 54 |
|
|---|
| 55 | $ Match the end of a string.
|
|---|
| 56 | mysql> SELECT "fo\no" REGEXP "^fo\no$"; -> 1
|
|---|
| 57 | mysql> SELECT "fo\no" REGEXP "^fo$"; -> 0
|
|---|
| 58 |
|
|---|
| 59 | . Match any character (including newline).
|
|---|
| 60 | mysql> SELECT "fofo" REGEXP "^f.*"; -> 1
|
|---|
| 61 | mysql> SELECT "fo\nfo" REGEXP "^f.*"; -> 1
|
|---|
| 62 |
|
|---|
| 63 | a* Match any sequence of zero or more a characters.
|
|---|
| 64 | mysql> SELECT "Ban" REGEXP "^Ba*n"; -> 1
|
|---|
| 65 | mysql> SELECT "Baaan" REGEXP "^Ba*n"; -> 1
|
|---|
| 66 | mysql> SELECT "Bn" REGEXP "^Ba*n"; -> 1
|
|---|
| 67 |
|
|---|
| 68 | a+ Match any sequence of one or more a characters.
|
|---|
| 69 | mysql> SELECT "Ban" REGEXP "^Ba+n"; -> 1
|
|---|
| 70 | mysql> SELECT "Bn" REGEXP "^Ba+n"; -> 0
|
|---|
| 71 |
|
|---|
| 72 | a? Match either zero or one a character.
|
|---|
| 73 | mysql> SELECT "Bn" REGEXP "^Ba?n"; -> 1
|
|---|
| 74 | mysql> SELECT "Ban" REGEXP "^Ba?n"; -> 1
|
|---|
| 75 | mysql> SELECT "Baan" REGEXP "^Ba?n"; -> 0
|
|---|
| 76 |
|
|---|
| 77 | de|abc Match either of the sequences de or abc.
|
|---|
| 78 | mysql> SELECT "pi" REGEXP "pi|apa"; -> 1
|
|---|
| 79 | mysql> SELECT "axe" REGEXP "pi|apa"; -> 0
|
|---|
| 80 | mysql> SELECT "apa" REGEXP "pi|apa"; -> 1
|
|---|
| 81 | mysql> SELECT "apa" REGEXP "^(pi|apa)$"; -> 1
|
|---|
| 82 | mysql> SELECT "pi" REGEXP "^(pi|apa)$"; -> 1
|
|---|
| 83 | mysql> SELECT "pix" REGEXP "^(pi|apa)$"; -> 0
|
|---|
| 84 |
|
|---|
| 85 | (abc)* Match zero or more instances of the sequence abc.
|
|---|
| 86 | mysql> SELECT "pi" REGEXP "^(pi)*$"; -> 1
|
|---|
| 87 | mysql> SELECT "pip" REGEXP "^(pi)*$"; -> 0
|
|---|
| 88 | mysql> SELECT "pipi" REGEXP "^(pi)*$"; -> 1
|
|---|
| 89 |
|
|---|
| 90 | {1} This is a more general way of writing regexps that match many
|
|---|
| 91 | {2,3} occurrences of the previous atom.
|
|---|
| 92 | a* Can be written as a{0,}.
|
|---|
| 93 | a+ Can be written as a{1,}.
|
|---|
| 94 | a? Can be written as a{0,1}.
|
|---|
| 95 |
|
|---|
| 96 | To be more precise, an atom followed by a bound containing one
|
|---|
| 97 | integer i and no comma matches a sequence of exactly i matches
|
|---|
| 98 | of the atom. An atom followed by a bound containing one integer i
|
|---|
| 99 | and a comma matches a sequence of i or more matches of the atom.
|
|---|
| 100 | An atom followed by a bound containing two integers i and j matches
|
|---|
| 101 | a sequence of i through j (inclusive) matches of the atom.
|
|---|
| 102 |
|
|---|
| 103 | Both arguments must be in the range from 0 to RE_DUP_MAX (default 255),
|
|---|
| 104 | inclusive. If there are two arguments, the second must be greater
|
|---|
| 105 | than or equal to the first.
|
|---|
| 106 |
|
|---|
| 107 | [a-dX] Matches any character which is (or is not, if ^ is used) either a, b, c,
|
|---|
| 108 | [^a-dX] d or X. To include a literal ] character, it must immediately follow
|
|---|
| 109 | the opening bracket [. To include a literal - character, it must be
|
|---|
| 110 | written first or last. So [0-9] matches any decimal digit. Any character
|
|---|
| 111 | that does not have a defined meaning inside a [] pair has no special
|
|---|
| 112 | meaning and matches only itself.
|
|---|
| 113 | mysql> SELECT "aXbc" REGEXP "[a-dXYZ]"; -> 1
|
|---|
| 114 | mysql> SELECT "aXbc" REGEXP "^[a-dXYZ]$"; -> 0
|
|---|
| 115 | mysql> SELECT "aXbc" REGEXP "^[a-dXYZ]+$"; -> 1
|
|---|
| 116 | mysql> SELECT "aXbc" REGEXP "^[^a-dXYZ]+$"; -> 0
|
|---|
| 117 | mysql> SELECT "gheis" REGEXP "^[^a-dXYZ]+$"; -> 1
|
|---|
| 118 | mysql> SELECT "gheisa" REGEXP "^[^a-dXYZ]+$"; -> 0
|
|---|
| 119 |
|
|---|
| 120 | [[.characters.]]
|
|---|
| 121 | The sequence of characters of that collating element. characters is
|
|---|
| 122 | either a single character or a character name like newline. You can
|
|---|
| 123 | find the full list of character names in 'regexp/cname.h'.
|
|---|
| 124 |
|
|---|
| 125 | [=character_class=]
|
|---|
| 126 | An equivalence class, standing for the sequences of characters of all
|
|---|
| 127 | collating elements equivalent to that one, including itself.
|
|---|
| 128 |
|
|---|
| 129 | For example, if o and (+) are the members of an equivalence class,
|
|---|
| 130 | then [[=o=]], [[=(+)=]], and [o(+)] are all synonymous. An equivalence
|
|---|
| 131 | class may not be an endpoint of a range.
|
|---|
| 132 |
|
|---|
| 133 | [:character_class:]
|
|---|
| 134 | Within a bracket expression, the name of a character class enclosed
|
|---|
| 135 | in [: and :] stands for the list of all characters belonging to that
|
|---|
| 136 | class. Standard character class names are:
|
|---|
| 137 | alnum digit punct alpha graph space blank lower upper cntrl print xdigit
|
|---|
| 138 | These stand for the character classes defined in the ctype(3) manual
|
|---|
| 139 | page. A locale may provide others. A character class may not be used
|
|---|
| 140 | as an endpoint of a range.
|
|---|
| 141 | mysql> SELECT "justalnums" REGEXP "[[:alnum:]]+"; -> 1
|
|---|
| 142 | mysql> SELECT "!!" REGEXP "[[:alnum:]]+"; -> 0
|
|---|
| 143 |
|
|---|
| 144 | [[:<:]] These match the null string at the beginning and end of a word
|
|---|
| 145 | [[:>:]] respectively. A word is defined as a sequence of word characters
|
|---|
| 146 | which is neither preceded nor followed by word characters. A word
|
|---|
| 147 | character is an alnum character (as defined by ctype(3)) or an
|
|---|
| 148 | underscore (_).
|
|---|
| 149 | mysql> SELECT "a word a" REGEXP "[[:<:]]word[[:>:]]"; -> 1
|
|---|
| 150 | mysql> SELECT "a xword a" REGEXP "[[:<:]]word[[:>:]]"; -> 0
|
|---|
| 151 |
|
|---|
| 152 | mysql> SELECT "weeknights" REGEXP "^(wee|week)(knights|nights)$"; -> 1
|
|---|
| 153 | </pre>
|
|---|
| 154 | </td></tr>
|
|---|
| 155 | <tr class="Block">
|
|---|
| 156 | <td>
|
|---|
| 157 | <center><h3>--- <A NAME="REGEX"><U>REGEX</U></A>(7) ---</h3></center>
|
|---|
| 158 | <B>NAME</B><BR>
|
|---|
| 159 | regex - POSIX 1003.2 regular expressions<P>
|
|---|
| 160 |
|
|---|
| 161 | <B>DESCRIPTION</B><BR>
|
|---|
| 162 | Regular expressions (``RE''s), as defined in POSIX 1003.2,
|
|---|
| 163 | come in two forms: modern REs (roughly those of egrep;
|
|---|
| 164 | 1003.2 calls these ``extended'' REs) and obsolete REs
|
|---|
| 165 | (roughly those of ed; 1003.2 ``basic'' REs). Obsolete REs
|
|---|
| 166 | mostly exist for backward compatibility in some old pro-
|
|---|
| 167 | grams; they will be discussed at the end. 1003.2 leaves
|
|---|
| 168 | some aspects of RE syntax and semantics open; decisions
|
|---|
| 169 | on these aspects may not be fully portable
|
|---|
| 170 | to other 1003.2 implementations.<P>
|
|---|
| 171 |
|
|---|
| 172 | A (modern) RE is one or more non-empty branches, separated
|
|---|
| 173 | by `|'. It matches anything that matches one of the
|
|---|
| 174 | branches.<P>
|
|---|
| 175 |
|
|---|
| 176 | A branch is one or more pieces, concatenated. It matches
|
|---|
| 177 | a match for the first, followed by a match for the second,
|
|---|
| 178 | etc.<P>
|
|---|
| 179 |
|
|---|
| 180 | A piece is an atom possibly followed by a single `*', `+',
|
|---|
| 181 | `?', or bound. An atom followed by `*' matches a sequence
|
|---|
| 182 | of 0 or more matches of the atom. An atom followed by `+'
|
|---|
| 183 | matches a sequence of 1 or more matches of the atom. An
|
|---|
| 184 | atom followed by `?' matches a sequence of 0 or 1 matches
|
|---|
| 185 | of the atom.<P>
|
|---|
| 186 |
|
|---|
| 187 | A bound is `{' followed by an unsigned decimal integer,
|
|---|
| 188 | possibly followed by `,' possibly followed by another
|
|---|
| 189 | unsigned decimal integer, always followed by `}'. The
|
|---|
| 190 | integers must lie between 0 and RE_DUP_MAX (255) inclu-
|
|---|
| 191 | sive, and if there are two of them, the first may not
|
|---|
| 192 | exceed the second. An atom followed by a bound containing
|
|---|
| 193 | one integer i and no comma matches a sequence of exactly i
|
|---|
| 194 | matches of the atom. An atom followed by a bound contain-
|
|---|
| 195 | ing one integer i and a comma matches a sequence of i or
|
|---|
| 196 | more matches of the atom. An atom followed by a bound
|
|---|
| 197 | containing two integers i and j matches a sequence of i
|
|---|
| 198 | through j (inclusive) matches of the atom.<P>
|
|---|
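The bound rules above can be checked with a short sketch. It uses Python's `re` module as a stand-in, which is an assumption: `re` is not Henry Spencer's library, but it accepts the same `{i}`, `{i,}` and `{i,j}` bound syntax for these simple patterns. The helper names `full_match` and `agree` are hypothetical.

```python
import re

# Each shorthand repetition paired with the bound form given in the text.
EQUIVALENTS = [("a*", "a{0,}"), ("a+", "a{1,}"), ("a?", "a{0,1}")]


def full_match(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None


def agree(short, bound, samples):
    """Return True if both patterns accept exactly the same samples."""
    return all(full_match(short, s) == full_match(bound, s) for s in samples)


samples = ["", "a", "aa", "aaa", "b"]
for short, bound in EQUIVALENTS:
    print(short, bound, agree(short, bound, samples))

print(full_match("a{2}", "aa"))       # exactly two matches of the atom
print(full_match("a{2,}", "aaaa"))    # two or more
print(full_match("a{2,3}", "aaaa"))   # two through three, so four is rejected
```

The last three lines print True, True and False.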
| 199 |
|
|---|
| 200 | An atom is a regular expression enclosed in `()' (matching
|
|---|
| 201 | a match for the regular expression), an empty set of `()'
|
|---|
| 202 | (matching the null string), a bracket expression (see
|
|---|
| 203 | below), `.' (matching any single character), `^' (match-
|
|---|
| 204 | ing the null string at the beginning of a line), `$'
|
|---|
| 205 | (matching the null string at the end of a line), a `\'
|
|---|
| 206 | followed by one of the characters `^.[$()|*+?{\' (matching
|
|---|
| 207 | that character taken as an ordinary character), a `\' fol-
|
|---|
| 208 | lowed by any other character (matching that character
|
|---|
| 209 | taken as an ordinary character, as if the `\' had not been
|
|---|
| 210 | present), or a single character with no other significance
|
|---|
| 211 | (matching that character). A `{' followed by a character
|
|---|
| 212 | other than a digit is an ordinary character, not the
|
|---|
| 213 | beginning of a bound. It is illegal to end an RE with
|
|---|
| 214 | `\'.<P>
|
|---|
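As an illustration of the escaping rule, the following sketch puts a `\` in front of each of the characters listed above so that a string is matched literally. The function name `escape_ere` is hypothetical, and Python's `re` module is used only to check the result; it is assumed to treat these escapes the same way for such simple patterns.

```python
import re

# The characters that the text says are taken literally after a backslash.
SPECIALS = set("^.[$()|*+?{\\")


def escape_ere(text):
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIALS else ch for ch in text)


pattern = escape_ere("1+1=2?")
print(pattern)
print(re.fullmatch(pattern, "1+1=2?") is not None)
print(re.fullmatch(escape_ere("a.b"), "axb") is not None)
```

The escaped `.` no longer matches an arbitrary character, so the last check prints False.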
| 215 |
|
|---|
| 216 | A bracket expression is a list of characters enclosed in
|
|---|
| 217 | `[]'. It normally matches any single character from the
|
|---|
| 218 | list (but see below). If the list begins with `^', it
|
|---|
| 219 | matches any single character (but see below) not from the
|
|---|
| 220 | rest of the list. If two characters in the list are sepa-
|
|---|
| 221 | rated by `-', this is shorthand for the full range of
|
|---|
| 222 | characters between those two (inclusive) in the collating
|
|---|
| 223 | sequence, e.g. `[0-9]' in ASCII matches any decimal digit.
|
|---|
| 224 | It is illegal for two ranges to share an endpoint, e.g.
|
|---|
| 225 | `a-c-e'. Ranges are very collating-sequence-dependent,
|
|---|
| 226 | and portable programs should avoid relying on them.<P>
|
|---|
| 227 |
|
|---|
| 228 | To include a literal `]' in the list, make it the first
|
|---|
| 229 | character (following a possible `^'). To include a lit-
|
|---|
| 230 | eral `-', make it the first or last character, or the sec-
|
|---|
| 231 | ond endpoint of a range. To use a literal `-' as the
|
|---|
| 232 | first endpoint of a range, enclose it in `[.' and `.]' to
|
|---|
| 233 | make it a collating element (see below). With the excep-
|
|---|
| 234 | tion of these and some combinations using `[' (see next
|
|---|
| 235 | paragraphs), all other special characters, including `\',
|
|---|
| 236 | lose their special significance within a bracket expres-
|
|---|
| 237 | sion.<P>
|
|---|
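A small sketch of the bracket expression rules, again using Python's `re` module as an assumed stand-in for the POSIX library. One known difference is that `\` keeps its special meaning inside brackets in Python, so the sketch avoids it. The helper `matches` is hypothetical and returns 1 or 0 in the style of the MySQL examples.

```python
import re


def matches(pattern, text):
    """Return 1 if pattern matches somewhere in text, otherwise 0."""
    return 1 if re.search(pattern, text) else 0


# Ranges and negated lists.
print(matches("[a-dXYZ]", "aXbc"))
print(matches("^[a-dXYZ]$", "aXbc"))
print(matches("^[a-dXYZ]+$", "aXbc"))
print(matches("^[^a-dXYZ]+$", "gheis"))
print(matches("^[^a-dXYZ]+$", "gheisa"))

# A literal ']' must come first (after a possible '^').
print(matches("^[]a]+$", "a]]a"))
print(matches("^[^]a]+$", "xyz"))

# A literal '-' must come first or last.
print(matches("^[a-]+$", "a-a-"))
```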
| 238 |
|
|---|
| 239 | Within a bracket expression, a collating element (a char-
|
|---|
| 240 | acter, a multi-character sequence that collates as if it
|
|---|
| 241 | were a single character, or a collating-sequence name for
|
|---|
| 242 | either) enclosed in `[.' and `.]' stands for the sequence
|
|---|
| 243 | of characters of that collating element. The sequence is
|
|---|
| 244 | a single element of the bracket expression's list. A
|
|---|
| 245 | bracket expression containing a multi-character collating
|
|---|
| 246 | element can thus match more than one character, e.g. if
|
|---|
| 247 | the collating sequence includes a `ch' collating element,
|
|---|
| 248 | then the RE `[[.ch.]]*c' matches the first five characters
|
|---|
| 249 | of `chchcc'.<P>
|
|---|
| 250 |
|
|---|
| 251 | Within a bracket expression, a collating element enclosed
|
|---|
| 252 | in `[=' and `=]' is an equivalence class, standing for the
|
|---|
| 253 | sequences of characters of all collating elements equiva-
|
|---|
| 254 | lent to that one, including itself. (If there are no
|
|---|
| 255 | other equivalent collating elements, the treatment is as
|
|---|
| 256 | if the enclosing delimiters were `[.' and `.]'.) For
|
|---|
| 257 | example, if o and &ocirc; are the members of an equivalence
|
|---|
| 258 | class, then `[[=o=]]', `[[=&ocirc;=]]', and `[o&ocirc;]' are all syn-
|
|---|
| 259 | onymous. An equivalence class may not be an endpoint of a
|
|---|
| 260 | range.<P>
|
|---|
| 261 |
|
|---|
| 262 | Within a bracket expression, the name of a character class
|
|---|
| 263 | enclosed in `[:' and `:]' stands for the list of all char-
|
|---|
| 264 | acters belonging to that class. Standard character class
|
|---|
| 265 | names are:<P>
|
|---|
| 266 | <table>
|
|---|
| 267 | <tr><td>alnum</TD><td>digit</td><td>punct</td></tr>
|
|---|
| 268 | <tr><td>alpha</TD><td>graph</TD><td>space</td></tr>
|
|---|
| 269 | <tr><td>blank</TD><td>lower</TD><td>upper</td></tr>
|
|---|
| 270 | <tr><td>cntrl</TD><td>print</TD><td>xdigit</td></tr>
|
|---|
| 271 | </table>
|
|---|
| 272 | <P>
|
|---|
| 273 | These stand for the character classes defined in ctype(3).
|
|---|
| 274 | A locale may provide others. A character class may not be
|
|---|
| 275 | used as an endpoint of a range.<P>
|
|---|
| 276 |
|
|---|
| 277 | There are two special cases of bracket expressions: the
|
|---|
| 278 | bracket expressions `[[:<:]]' and `[[:>:]]' match the null
|
|---|
| 279 | string at the beginning and end of a word respectively. A
|
|---|
| 280 | word is defined as a sequence of word characters which is
|
|---|
| 281 | neither preceded nor followed by word characters. A word
|
|---|
| 282 | character is an alnum character (as defined by ctype(3))
|
|---|
| 283 | or an underscore. This is an extension, compatible with
|
|---|
| 284 | but not specified by POSIX 1003.2, and should be used with
|
|---|
| 285 | caution in software intended to be portable to other sys-
|
|---|
| 286 | tems.<P>
|
|---|
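The word definition above can be spelled out without any regex engine. This is only a sketch: the names `is_word_char` and `contains_word` are hypothetical, and Python's `str.isalnum` is assumed to be close enough to the ctype(3) alnum class for plain ASCII input.

```python
def is_word_char(ch):
    """A word character is an alnum character or an underscore."""
    return ch.isalnum() or ch == "_"


def contains_word(text, word):
    """Return True if word occurs in text with a word boundary on both sides."""
    start = text.find(word)
    while start != -1:
        end = start + len(word)
        # The boundary holds at either end of the text or next to a non-word character.
        before_ok = start == 0 or not is_word_char(text[start - 1])
        after_ok = end == len(text) or not is_word_char(text[end])
        if before_ok and after_ok:
            return True
        start = text.find(word, start + 1)
    return False


print(contains_word("a word a", "word"))
print(contains_word("a xword a", "word"))
```

The first call prints True and the second prints False, because `xword` puts a word character directly before `word`.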
| 287 |
|
|---|
| 288 | In the event that an RE could match more than one sub-
|
|---|
| 289 | string of a given string, the RE matches the one starting
|
|---|
| 290 | earliest in the string. If the RE could match more than
|
|---|
| 291 | one substring starting at that point, it matches the
|
|---|
| 292 | longest. Subexpressions also match the longest possible
|
|---|
| 293 | substrings, subject to the constraint that the whole match
|
|---|
| 294 | be as long as possible, with subexpressions starting ear-
|
|---|
| 295 | lier in the RE taking priority over ones starting later.
|
|---|
| 296 | Note that higher-level subexpressions thus take priority
|
|---|
| 297 | over their lower-level component subexpressions.<P>
|
|---|
| 298 |
|
|---|
| 299 | Match lengths are measured in characters, not collating
|
|---|
| 300 | elements. A null string is considered longer than no
|
|---|
| 301 | match at all. For example, `bb*' matches the three middle
|
|---|
| 302 | characters of `abbbc', `(wee|week)(knights|nights)'
|
|---|
| 303 | matches all ten characters of `weeknights', when `(.*).*'
|
|---|
| 304 | is matched against `abc' the parenthesized subexpression
|
|---|
| 305 | matches all three characters, and when `(a*)*' is matched
|
|---|
| 306 | against `bc' both the whole RE and the parenthesized
|
|---|
| 307 | subexpression match the null string.<P>
|
|---|
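The earliest-then-longest rule can be sketched by brute force: try every start position in order and, for each, every end position from longest to shortest. The function name `leftmost_longest` is hypothetical. Python's `re` module is assumed here only as a way to test whether a whole substring matches; its own `search` picks the first alternative that works rather than the longest match, which the last line shows.

```python
import re


def leftmost_longest(pattern, text):
    """Return (start, end) of the earliest and then longest match, or None.

    Only meant for patterns without anchors, since each candidate
    substring is tested on its own.
    """
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text[start:end]):
                return (start, end)
    return None


print(leftmost_longest("bb*", "abbbc"))
print(leftmost_longest("(wee|week)(knights|nights)", "weeknights"))
print(leftmost_longest("a*", "bc"))
print(leftmost_longest("wee|week", "weeknights"))
print(re.search("wee|week", "weeknights").group())
```

The first four calls print (1, 4), (0, 10), (0, 0) and (0, 4). The final line prints `wee`, where the rule described above would select `week`.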
| 308 |
|
|---|
| 309 | If case-independent matching is specified, the effect is
|
|---|
| 310 | much as if all case distinctions had vanished from the
|
|---|
| 311 | alphabet. When an alphabetic that exists in multiple
|
|---|
| 312 | cases appears as an ordinary character outside a bracket
|
|---|
| 313 | expression, it is effectively transformed into a bracket
|
|---|
| 314 | expression containing both cases, e.g. `x' becomes `[xX]'.
|
|---|
| 315 | When it appears inside a bracket expression, all case
|
|---|
| 316 | counterparts of it are added to the bracket expression, so
|
|---|
| 317 | that (e.g.) `[x]' becomes `[xX]' and `[^x]' becomes
|
|---|
| 318 | `[^xX]'.<P>
|
|---|
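The "one case implies all cases" rewriting can be sketched for a string of ordinary characters outside a bracket expression. The function name `fold_case` is hypothetical, the sketch assumes plain ASCII letters, and Python's `re` module is used only to check the rewritten pattern.

```python
import re


def fold_case(literal):
    """Turn each cased letter into a bracket expression holding both cases."""
    parts = []
    for ch in literal:
        if ch.lower() != ch.upper():
            parts.append("[" + ch.lower() + ch.upper() + "]")
        else:
            # Characters without case are kept, escaped if they are special.
            parts.append(re.escape(ch))
    return "".join(parts)


print(fold_case("x"))
print(fold_case("Ban1"))
print(re.fullmatch(fold_case("Ban"), "bAN") is not None)
```

This prints `[xX]`, `[bB][aA][nN]1` and True.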
| 319 |
|
|---|
| 320 | No particular limit is imposed on the length of REs. Pro-
|
|---|
| 321 | grams intended to be portable should not employ REs longer
|
|---|
| 322 | than 256 bytes, as an implementation can refuse to accept
|
|---|
| 323 | such REs and remain POSIX-compliant.<P>
|
|---|
| 324 |
|
|---|
| 325 | Obsolete (``basic'') regular expressions differ in several
|
|---|
| 326 | respects. `|', `+', and `?' are ordinary characters and
|
|---|
| 327 | there is no equivalent for their functionality. The
|
|---|
| 328 | delimiters for bounds are `\{' and `\}', with `{' and `}'
|
|---|
| 329 | by themselves ordinary characters. The parentheses for
|
|---|
| 330 | nested subexpressions are `\(' and `\)', with `(' and `)'
|
|---|
| 331 | by themselves ordinary characters. `^' is an ordinary
|
|---|
| 332 | character except at the beginning of the RE or the begin-
|
|---|
| 333 | ning of a parenthesized subexpression, `$' is an ordinary
|
|---|
| 334 | character except at the end of the RE or the end of a
|
|---|
| 335 | parenthesized subexpression, and `*' is an ordinary char-
|
|---|
| 336 | acter if it appears at the beginning of the RE or the
|
|---|
| 337 | beginning of a parenthesized subexpression (after a possi-
|
|---|
| 338 | ble leading `^'). Finally, there is one new type of atom,
|
|---|
| 339 | a back reference: `\' followed by a non-zero decimal digit
|
|---|
| 340 | d matches the same sequence of characters matched by the
|
|---|
| 341 | dth parenthesized subexpression (numbering subexpressions
|
|---|
| 342 | by the positions of their opening parentheses, left to
|
|---|
| 343 | right), so that (e.g.) `\([bc]\)\1' matches `bb' or `cc'
|
|---|
| 344 | but not `bc'.<P>
|
|---|
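The back reference example can be tried in Python's `re` module, with one difference that is worth stating: Python writes the subexpression with plain parentheses, so the basic RE `\([bc]\)\1` from the text becomes `([bc])\1`. The name `DOUBLED` is hypothetical, and Python's engine is assumed to agree with the description for this simple case.

```python
import re

# Subexpression 1 matches 'b' or 'c'; \1 must repeat the same character.
DOUBLED = re.compile(r"([bc])\1")

for candidate in ("bb", "cc", "bc"):
    print(candidate, DOUBLED.fullmatch(candidate) is not None)
```

Only `bb` and `cc` are accepted; `bc` is rejected because `\1` must repeat what the subexpression actually matched.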
| 345 |
|
|---|
| 346 | <B>SEE ALSO</B><BR>
|
|---|
| 347 | POSIX 1003.2, section 2.8 (Regular Expression Notation).<P>
|
|---|
| 348 |
|
|---|
| 349 | <B>BUGS</B><BR>
|
|---|
| 350 | Having two kinds of REs is a botch.<P>
|
|---|
| 351 |
|
|---|
| 352 | The current 1003.2 spec says that `)' is an ordinary char-
|
|---|
| 353 | acter in the absence of an unmatched `('; this was an
|
|---|
| 354 | unintentional result of a wording error, and change is
|
|---|
| 355 | likely. Avoid relying on it.<P>
|
|---|
| 356 |
|
|---|
| 357 | Back references are a dreadful botch, posing major prob-
|
|---|
| 358 | lems for efficient implementations. They are also some-
|
|---|
| 359 | what vaguely defined (does `a\(\(b\)*\2\)*d' match
|
|---|
| 360 | `abbbd'?). Avoid using them.<P>
|
|---|
| 361 |
|
|---|
| 362 | 1003.2's specification of case-independent matching is
|
|---|
| 363 | vague. The ``one case implies all cases'' definition
|
|---|
| 364 | given above is current consensus among implementors as to
|
|---|
| 365 | the right interpretation.<P>
|
|---|
| 366 |
|
|---|
| 367 | The syntax for word boundaries is incredibly ugly.<P>
|
|---|
| 368 |
|
|---|
| 369 | <B>AUTHOR</B><BR>
|
|---|
| 370 | This page was taken from Henry Spencer's regex package.
|
|---|
| 371 | </td>
|
|---|
| 372 | </tr>
|
|---|
| 373 |
|
|---|
| 374 | </table></center>
|
|---|
| 375 |
|
|---|
| 376 | <center>
|
|---|
| 377 | <hr NOSHADE WIDTH="80%"><i><font color="#000099"><font size=-1>This Web Site is
|
|---|
| 378 | hosted by Apache for OS/2 and done by <a href="mailto:tbretz@astro.uni-wuerzburg.de">Thomas Bretz</a>.</font></font></i><BR>
|
|---|
| 379 | <BR>
|
|---|
| 380 | <a href="http://validator.w3.org/check/referer"><img border="0"
|
|---|
| 381 | src="../../valid-html40.png" alt="Valid HTML 4.0!" height="20" width="66"></a>
|
|---|
| 382 | </center>
|
|---|
| 383 | </tr>
|
|---|
| 384 | </table>
|
|---|
| 385 |
|
|---|
| 386 | </center>
|
|---|
| 387 |
|
|---|
| 388 | </body>
|
|---|
| 389 | </html>
|
|---|