GNUtools/emacs/info/regex - annotate

Return to regex CVS log
Up to [Apple XNU] / GNUtools / emacs / info
Annotation of GNUtools/emacs/info/regex, revision 1.1

1.1     ! root        1: Info file regex, produced by Makeinfo, -*- Text -*- from input file
        !             2: regex.texinfo.
        !             3: 
        !             4: 
        !             5: 
        !             6: File: regex,  Node: top,  Next: syntax,  Up: (dir)
        !             7: 
        !             8: "regex" regular expression matching library.
        !             9: ********************************************
        !            10: 
        !            11: Overview
        !            12: ========
        !            13: 
        !            14: Regular expression matching allows you to test whether a string fits
        !            15: into a specific syntactic shape.  You can also search a string for a
        !            16: substring that fits a pattern.
        !            17: 
        !            18: A regular expression describes a set of strings.  The simplest case
        !            19: is one that describes a particular string; for example, the string
        !            20: `foo' when regarded as a regular expression matches `foo' and nothing
        !            21: else.  Nontrivial regular expressions use certain special constructs
        !            22: so that they can match more than one string.  For example, the
        !            23: regular expression `foo\|bar' matches either the string `foo' or the
        !            24: string `bar'; the regular expression `c[ad]*r' matches any of the
        !            25: strings `cr', `car', `cdr', `caar', `cadddar' and all other such
        !            26: strings with any number of `a''s and `d''s.
        !            27: 
        !            28: The first step in matching a regular expression is to compile it. 
        !            29: You must supply the pattern string and also a pattern buffer to hold
        !            30: the compiled result.  That result contains the pattern in an internal
        !            31: format that is easier to use in matching.
        !            32: 
        !            33: Having compiled a pattern, you can match it against strings.  You can
        !            34: match the compiled pattern any number of times against different
        !            35: strings.
        !            36: 
        !            37: * Menu:
        !            38: 
        !            39: * syntax::     Syntax of regular expressions
        !            40: * directives:: Meaning of characters as regex string directives.
        !            41: * emacs::      Additional character directives available
        !            42:                  only for use within Emacs.
        !            43: * programming:: Using the regex library from C programs
        !            44: * unix::       Unix-compatible entry-points to regex library
        !            45: 
        !            46: 
        !            47: 
        !            48: File: regex,  Node: syntax,  Next: directives,  Prev: top,  Up: top
        !            49: 
        !            50: Syntax of Regular Expressions
        !            51: =============================
        !            52: 
        !            53: Regular expressions have a syntax in which a few characters are
        !            54: special constructs and the rest are "ordinary".  An ordinary
        !            55: character is a simple regular expression which matches that character
        !            56: and nothing else.  The special characters are `$', `^', `.', `*',
        !            57: `+', `?', `[', `]' and `\'.  Any other character appearing in a
        !            58: regular expression is ordinary, unless a `\' precedes it.
        !            59: 
        !            60: For example, `f' is not a special character, so it is ordinary, and
        !            61: therefore `f' is a regular expression that matches the string `f' and
        !            62: no other string.  (It does *not* match the string `ff'.)  Likewise,
        !            63: `o' is a regular expression that matches only `o'.
        !            64: 
        !            65: Any two regular expressions A and B can be concatenated.  The result
        !            66: is a regular expression which matches a string if A matches some
        !            67: amount of the beginning of that string and B matches the rest of the
        !            68: string.
        !            69: 
        !            70: As a simple example, we can concatenate the regular expressions `f'
        !            71: and `o' to get the regular expression `fo', which matches only the
        !            72: string `fo'.  Still trivial.
        !            73: 
        !            74: Note: for Unix compatibility, special characters are treated as
        !            75: ordinary ones if they are in contexts where their special meanings
        !            76: make no sense.  For example, `*foo' treats `*' as ordinary since
        !            77: there is no preceding expression on which the `*' can act.  It is
        !            78: poor practice to depend on this behavior; better to quote the special
        !            79: character anyway, regardless of where is appears.
        !            80: 
        !            81: 
        !            82: 
        !            83: File: regex,  Node: directives,  Next: emacs,  Prev: syntax,  Up: top
        !            84: 
        !            85: The following are the characters and character sequences which have
        !            86: special meaning within regular expressions.  Any character not
        !            87: mentioned here is not special; it stands for exactly itself for the
        !            88: purposes of searching and matching.  *Note syntax::.
        !            89: 
        !            90: `.'
        !            91:      is a special character that matches anything except a newline. 
        !            92:      Using concatenation, we can make regular expressions like `a.b'
        !            93:      which matches any three-character string which begins with `a'
        !            94:      and ends with `b'.
        !            95: 
        !            96: `*'
        !            97:      is not a construct by itself; it is a suffix, which means the
        !            98:      preceding regular expression is to be repeated as many times as
        !            99:      possible.  In `fo*', the `*' applies to the `o', so `fo*'
        !           100:      matches `f' followed by any number of `o''s.
        !           101: 
        !           102:      The case of zero `o''s is allowed: `fo*' does match `f'.
        !           103: 
        !           104:      `*' always applies to the *smallest* possible preceding
        !           105:      expression.  Thus, `fo*' has a repeating `o', not a repeating
        !           106:      `fo'.
        !           107: 
        !           108:      The matcher processes a `*' construct by matching, immediately,
        !           109:      as many repetitions as can be found.  Then it continues with the
        !           110:      rest of the pattern.  If that fails, backtracking occurs,
        !           111:      discarding some of the matches of the `*''d construct in case
        !           112:      that makes it possible to match the rest of the pattern.  For
        !           113:      example, matching `c[ad]*ar' against the string `caddaar', the
        !           114:      `[ad]*' first matches `addaa', but this does not allow the next
        !           115:      `a' in the pattern to match.  So the last of the matches of
        !           116:      `[ad]' is undone and the following `a' is tried again.  Now it
        !           117:      succeeds.
        !           118: 
        !           119: `+'
        !           120:      `+' is like `*' except that at least one match for the preceding
        !           121:      pattern is required for `+'.  Thus, `c[ad]+r' does not match
        !           122:      `cr' but does match anything else that `c[ad]*r' would match.
        !           123: 
        !           124: `?'
        !           125:      `?' is like `*' except that it allows either zero or one match
        !           126:      for the preceding pattern.  Thus, `c[ad]?r' matches `cr' or
        !           127:      `car' or `cdr', and nothing else.
        !           128: 
        !           129: `[ ... ]'
        !           130:      `[' begins a "character set", which is terminated by a `]'.  In
        !           131:      the simplest case, the characters between the two form the set. 
        !           132:      Thus, `[ad]' matches either `a' or `d', and `[ad]*' matches any
        !           133:      string of `a''s and `d''s (including the empty string), from
        !           134:      which it follows that `c[ad]*r' matches `car', etc.
        !           135: 
        !           136:      Character ranges can also be included in a character set, by
        !           137:      writing two characters with a `-' between them.  Thus, `[a-z]'
        !           138:      matches any lower-case letter.  Ranges may be intermixed freely
        !           139:      with individual characters, as in `[a-z$%.]', which matches any
        !           140:      lower case letter or `$', `%' or period.
        !           141: 
        !           142:      Note that the usual special characters are not special any more
        !           143:      inside a character set.  A completely different set of special
        !           144:      characters exists inside character sets: `]', `-' and `^'.
        !           145: 
        !           146:      To include a `]' in a character set, you must make it the first
        !           147:      character.  For example, `[]a]' matches `]' or `a'.  To include
        !           148:      a `-', you must use it in a context where it cannot possibly
        !           149:      indicate a range: that is, as the first character, or
        !           150:      immediately after a range.
        !           151: 
        !           152: `[^ ... ]'
        !           153:      `[^' begins a "complement character set", which matches any
        !           154:      character except the ones specified.  Thus, `[^a-z0-9A-Z]'
        !           155:      matches all characters *except* letters and digits.
        !           156: 
        !           157:      `^' is not special in a character set unless it is the first
        !           158:      character.  The character following the `^' is treated as if it
        !           159:      were first (it may be a `-' or a `]').
        !           160: 
        !           161: `^'
        !           162:      is a special character that matches the empty string -- but only
        !           163:      if at the beginning of a line in the text being matched. 
        !           164:      Otherwise it fails to match anything.  Thus, `^foo' matches a
        !           165:      `foo' which occurs at the beginning of a line.
        !           166: 
        !           167: `$'
        !           168:      is similar to `^' but matches only at the end of a line.  Thus,
        !           169:      `xx*$' matches a string of one or more `x''s at the end of a line.
        !           170: 
        !           171: `\'
        !           172:      has two functions: it quotes the above special characters
        !           173:      (including `\'), and it introduces additional special constructs.
        !           174: 
        !           175:      Because `\' quotes special characters, `\$' is a regular
        !           176:      expression which matches only `$', and `\[' is a regular
        !           177:      expression which matches only `[', and so on.
        !           178: 
        !           179:      For the most part, `\' followed by any character matches only
        !           180:      that character.  However, there are several exceptions:
        !           181:      characters which, when preceded by `\', are special constructs. 
        !           182:      Such characters are always ordinary when encountered on their own.
        !           183: 
        !           184:      No new special characters will ever be defined.  All extensions
        !           185:      to the regular expression syntax are made by defining new
        !           186:      two-character constructs that begin with `\'.
        !           187: 
        !           188: `\|'
        !           189:      specifies an alternative.  Two regular expressions A and B with
        !           190:      `\|' in between form an expression that matches anything that
        !           191:      either A or B will match.
        !           192: 
        !           193:      Thus, `foo\|bar' matches either `foo' or `bar' but no other
        !           194:      string.
        !           195: 
        !           196:      `\|' applies to the largest possible surrounding expressions. 
        !           197:      Only a surrounding `\( ... \)' grouping can limit the grouping
        !           198:      power of `\|'.
        !           199: 
        !           200:      Full backtracking capability exists when multiple `\|''s are used.
        !           201: 
        !           202: `\( ... \)'
        !           203:      is a grouping construct that serves three purposes:
        !           204: 
        !           205:        1. To enclose a set of `\|' alternatives for other operations.
        !           206:           Thus, `\(foo\|bar\)x' matches either `foox' or `barx'.
        !           207: 
        !           208:        2. To enclose a complicated expression for the postfix `*' to
        !           209:           operate on.  Thus, `ba\(na\)*' matches `bananana', etc.,
        !           210:           with any (zero or more) number of `na''s.
        !           211: 
        !           212:        3. To mark a matched substring for future reference.
        !           213: 
        !           214:      This last application is not a consequence of the idea of a
        !           215:      parenthetical grouping; it is a separate feature which happens
        !           216:      to be assigned as a second meaning to the same `\( ... \)'
        !           217:      construct because there is no conflict in practice between the
        !           218:      two meanings.  Here is an explanation of this feature:
        !           219: 
        !           220: `\DIGIT'
        !           221:      After the end of a `\( ... \)' construct, the matcher remembers
        !           222:      the beginning and end of the text matched by that construct. 
        !           223:      Then, later on in the regular expression, you can use `\'
        !           224:      followed by DIGIT to mean "match the same text matched the
        !           225:      DIGIT'th time by the `\( ... \)' construct."  The `\( ... \)'
        !           226:      constructs are numbered in order of commencement in the regexp.
        !           227: 
        !           228:      The strings matching the first nine `\( ... \)' constructs
        !           229:      appearing in a regular expression are assigned numbers 1 through
        !           230:      9 in order of their beginnings.  `\1' through `\9' may be used
        !           231:      to refer to the text matched by the corresponding `\( ... \)'
        !           232:      construct.
        !           233: 
        !           234:      For example, `\(.*\)\1' matches any string that is composed of
        !           235:      two identical halves.  The `\(.*\)' matches the first half,
        !           236:      which may be anything, but the `\1' that follows must match the
        !           237:      same exact text.
        !           238: 
        !           239: `\b'
        !           240:      matches the empty string, but only if it is at the beginning or
        !           241:      end of a word.  Thus, `\bfoo\b' matches any occurrence of `foo'
        !           242:      as a separate word.  `\bball\(s\|\)\b' matches `ball' or `balls'
        !           243:      as a separate word.
        !           244: 
        !           245: `\B'
        !           246:      matches the empty string, provided it is *not* at the beginning
        !           247:      or end of a word.
        !           248: 
        !           249: `\<'
        !           250:      matches the empty string, but only if it is at the beginning of
        !           251:      a word.
        !           252: 
        !           253: `\>'
        !           254:      matches the empty string, but only if it is at the end of a word.
        !           255: 
        !           256: `\w'
        !           257:      matches any word-constituent character.
        !           258: 
        !           259: `\W'
        !           260:      matches any character that is not a word-constituent.
        !           261: 
        !           262: There are a number of additional `\' regexp directives available for
        !           263: use within Emacs only.
        !           264: 
        !           265: (*note emacs::.).
        !           266: 
        !           267: 
        !           268: 
        !           269: File: regex,  Node: emacs,  Next: programming,  Prev: directives,  Up: top
        !           270: 
        !           271: Constructs Available in Emacs Only
        !           272: ----------------------------------
        !           273: 
        !           274: `\`'
        !           275:      matches the empty string, but only if it is at the beginning of
        !           276:      the buffer.
        !           277: 
        !           278: `\''
        !           279:      matches the empty string, but only if it is at the end of the
        !           280:      buffer.
        !           281: 
        !           282: `\sCODE'
        !           283:      matches any character whose syntax is CODE.  CODE is a letter
        !           284:      which represents a syntax code: thus, `w' for word constituent,
        !           285:      `-' for whitespace, `(' for open-parenthesis, etc.  See the
        !           286:      documentation for the Emacs function `modify-syntax-entry' for
        !           287:      further details.
        !           288: 
        !           289:      Thus, `\s(' matches any character with open-parenthesis syntax.
        !           290: 
        !           291: `\SCODE'
        !           292:      matches any character whose syntax is not CODE.
        !           293: 
        !           294: 
        !           295: 
        !           296: File: regex,  Node: programming,  Next: compiling,  Prev: emacs,  Up: top
        !           297: 
        !           298: Programming using the `regex' library
        !           299: =====================================
        !           300: 
        !           301: The subnodes accessible from this menu give information on entry
        !           302: points and data structures which C programs need to interface to the
        !           303: `regex' library.
        !           304: 
        !           305: * Menu:
        !           306: 
        !           307: * compiling::  How to compile regular expressions
        !           308: * matching::   Matching compiled regular expressions
        !           309: * searching::  Searching for compiled regular expressions
        !           310: * translation::        Translating characters into other characters
        !           311:                  (for both compilation and matching)
        !           312: * registers::  determining what was matched
        !           313: * split::      matching data which is split into two pieces
        !           314: * unix::       Unix-compatible entry-points to regex library
        !           315: 
        !           316: 
        !           317: 
        !           318: File: regex,  Node: compiling,  Next: matching,  Prev: programming,  Up: programming
        !           319: 
        !           320: Compiling a Regular Expression
        !           321: ------------------------------
        !           322: 
        !           323: To compile a regular expression, you must supply a pattern buffer. 
        !           324: This is a structure defined, in the include file `regex.h', as
        !           325: follows:
        !           326: 
        !           327:      struct re_pattern_buffer
        !           328:        {
        !           329:          char *buffer   /* Space holding the compiled pattern commands. */
        !           330:          int allocated  /* Size of space that  buffer  points to */
        !           331:          int used       /* Length of portion of buffer actually occupied */
        !           332:          char *fastmap; /* Pointer to fastmap, if any, or zero if none. */
        !           333:                         /* re_search uses the fastmap, if there is one,
        !           334:                            to skip quickly over totally implausible
        !           335:                            characters */
        !           336:          char *translate;
        !           337:                         /* Translate table to apply to characters before
        !           338:                            comparing, or zero for no translation.
        !           339:                            The translation is applied to a pattern when
        !           340:                            it is compiled and to data when it is matched. */
        !           341:          char fastmap_accurate;
        !           342:                         /* Set to zero when a new pattern is stored,
        !           343:                            set to one when the fastmap is updated from it. */
        !           344:        };
        !           345: 
        !           346: Before compiling a pattern, you must initialize the `buffer' field to
        !           347: point to a block of memory obtained with `malloc', and the
        !           348: `allocated' field to the size of that block, in bytes.  The pattern
        !           349: compiler will replace this block with a larger one if necessary.
        !           350: 
        !           351: You must also initialize the `translate' field to point to the
        !           352: translate table that you will use when you match the compiled
        !           353: pattern, or to zero if you will use no translate table when you
        !           354: match.  *Note translation::.
        !           355: 
        !           356: Then call `re_compile_pattern' to compile a regular expression into
        !           357: the buffer:
        !           358: 
        !           359:      re_compile_pattern (REGEX, REGEX_SIZE, BUF)
        !           360: 
        !           361: REGEX is the address of the regular expression (`char *'), REGEX_SIZE
        !           362: is its length (`int'), BUF is the address of the buffer (`struct
        !           363: re_pattern_buffer *').
        !           364: 
        !           365: `re_compile_pattern' returns zero if it succeeds in compiling the
        !           366: regular expression.  In that case, `*buf' now contains the results. 
        !           367: Otherwise, `re_compile_pattern' returns a string which serves as an
        !           368: error message.
        !           369: 
        !           370: After compiling, if you wish to search for the pattern, you must
        !           371: initialize the `fastmap' component of the pattern buffer.  *Note
        !           372: searching::.
        !           373: 
        !           374: 
        !           375: 
        !           376: File: regex,  Node: matching,  Next: searching,  Prev: compiling,  Up: programming
        !           377: 
        !           378: Matching a Compiled Pattern
        !           379: ---------------------------
        !           380: 
        !           381: Once a regular expression has been compiled into a pattern buffer,
        !           382: you can match the pattern buffer against a string with `re_match'.
        !           383: 
        !           384:      re_match (BUF, STRING, SIZE, POS, REGS)
        !           385: 
        !           386: BUF is, once again, the address of the buffer (`struct
        !           387: re_pattern_buffer *').  STRING is the string to be matched (`char *').
        !           388: sIZE is the length of that string (`int').  POS is the position
        !           389: within the string at which to begin matching (`int').  The beginning
        !           390: of the string is position 0.  REGS is described below.  Normally it
        !           391: is zero.  *Note registers::.
        !           392: 
        !           393: `re_match' returns `-1' if the pattern does not match; otherwise, it
        !           394: returns the length of the portion of `string' which was matched.
        !           395: 
        !           396: For example, suppose that BUF points to a buffer containing the
        !           397: result of compiling `x*', STRING points to `xxxxxy', and SIZE is `6'.
        !           398: Suppose that POS is `2'.  Then the last three `x''s will be matched,
        !           399: so `re_match' will return `3'.  If POS is zero, the value will be `5'.
        !           400: If POS is `5' or `6', the value will be zero, meaning that the null
        !           401: string was successfully matched.  Note that since `x*' matches the
        !           402: empty string, it will never entirely fail.
        !           403: 
        !           404: It is up to the caller to avoid passing a value of POS that results
        !           405: in matching outside the specified string.  POS must not be negative
        !           406: and must not be greater than SIZE.
        !           407: 
        !           408: 
        !           409: 
        !           410: File: regex,  Node: searching,  Next: translation,  Prev: matching,  Up: programming
        !           411: 
        !           412: Searching for a Match
        !           413: ---------------------
        !           414: 
        !           415: Searching means trying successive starting positions for a match
        !           416: until a match is found.  To search, you supply a compiled pattern
        !           417: buffer.  Before searching you must initialize the `fastmap' field of
        !           418: the pattern buffer (see below).
        !           419: 
        !           420:      re_search (BUF, STRING, SIZE, STARTPOS, RANGE, REGS)
        !           421: 
        !           422: is called like `re_match' except that the POS argument is replaced by
        !           423: two arguments STARTPOS and RANGE.  `re_search' tests for a match
        !           424: starting at index STARTPOS, then at `STARTPOS + 1', and so on.  It
        !           425: tries RANGE consecutive positions before giving up and returning
        !           426: `-1'.  If a match is found, `re_search' returns the index at which
        !           427: the match was found.
        !           428: 
        !           429: If RANGE is negative, RE_SEARCH tries starting positions STARTPOS,
        !           430: `STARTPOS - 1', ... in that order.  `|RANGE|' is the number of tries
        !           431: made.
        !           432: 
        !           433: It is up to the caller to avoid passing value of STARTPOS and RANGE
        !           434: that result in matching outside the specified string.  STARTPOS must
        !           435: be between zero and SIZE, inclusive, and so must `STARTPOS + RANGE -
        !           436: 1' (if RANGE is positive) or `STARTPOS + RANGE + 1' (if RANGE is
        !           437: negative).
        !           438: 
        !           439: If you may be searching over a long distance (that is, trying many
        !           440: different match starting points) with a compiled pattern, you should
        !           441: use a "fastmap" in it.  This is a block of 256 bytes, whose address
        !           442: is placed in the `fastmap' component of the pattern buffer.  The
        !           443: first time you search for a particular compiled pattern, the fastmap
        !           444: is set so that `FASTMAP[CH]' is nonzero if the character CH might
        !           445: possibly start a match for this pattern.  `re_search' checks each
        !           446: character against the fastmap so that it can skip more quickly over
        !           447: non-matches.
        !           448: 
        !           449: If you do not want a fastmap, store zero in the `fastmap' component
        !           450: of the pattern buffer before calling `re_search'.
        !           451: 
        !           452: In either case, you must initialize this component in a pattern
        !           453: buffer before you can use that buffer in a search; but you can choose
        !           454: as an initial value either zero or the address of a suitable block of
        !           455: memory.
        !           456: 
        !           457: If you compile a new pattern in an existing pattern buffer, it is not
        !           458: necessary to reinitialize the `fastmap' component (unless you wish to
        !           459: override your previous choice).
        !           460: 
        !           461: 
        !           462: 
        !           463: File: regex,  Node: translation,  Next: registers,  Prev: searching,  Up: programming
        !           464: 
        !           465: Translate Tables
        !           466: ----------------
        !           467: 
        !           468: With a translate table, you can apply a transformation to all
        !           469: characters before they are compared.  For example, a table that maps
        !           470: lower case letters into upper case (or vice versa) causes differences
        !           471: in case to be ignored by matching.
        !           472: 
        !           473: A translate table is a block of 256 bytes.  Each character of raw
        !           474: data is used as an index in the translate table.  The value found
        !           475: there is used instead of the original character.  Each character in a
        !           476: regular expression, except for the syntactic constructs, is
        !           477: translated when the expression is compiled.  Each character of a
        !           478: string being matched is translated whenever it is compared or tested.
        !           479: 
        !           480: A suitable translate table to ignore differences in case maps all
        !           481: characters into themselves, except for lower case letters, which are
        !           482: mapped into the corresponding upper case letters.  It could be
        !           483: initialized by:
        !           484: 
        !           485:      for (i = 0; i < 0400; i++)
        !           486:        table[i] = i;
        !           487:      for (i = 'a'; i <= 'z'; i++)
        !           488:        table[i] = i - 040;
        !           489: 
        !           490: You specify the use of a translate table by putting its address in
        !           491: the TRANSLATE component of the compiled pattern buffer.  If this
        !           492: component is zero, no translation is done.  Since both compilation
        !           493: and matching use the translate table, you must use the same table
        !           494: contents for both operations or confusing things will happen.
        !           495: 
        !           496: 
        !           497: 
        !           498: File: regex,  Node: registers,  Next: split,  Prev: translation,  Up: programming
        !           499: 
        !           500: Registers: or "What Did the `\( ... \)' Groupings Actually Match?"
        !           501: ------------------------------------------------------------------
        !           502: 
        !           503: If you want to find out, after the match, what each of the first nine
        !           504: `\( ... \)' groupings actually matched, you can pass the REGS
        !           505: argument to the match or search function.  Pass the address of a
        !           506: structure of this type:
        !           507: 
        !           508:      struct re_registers
        !           509:        {
        !           510:          int start[RE_NREGS];
        !           511:          int end[RE_NREGS];
        !           512:        };
        !           513: 
        !           514:   `re_match' and `re_search' will store into this structure the data
        !           515: you want.  `REGS->start[REG]' will be the index in STRING of the
        !           516: beginning of the data matched by the REG'th `\( ... \)' grouping, and
        !           517: `REGS->end[REG]' will be the index of the end of that data (the index
        !           518: of the first character beyond those matched).  The values in the
        !           519: start and end arrays at indexes greater than the number of `\( ...
        !           520: \)' groupings present in the regular expression will be set to the
        !           521: value -1.  Register numbers start at 1 and run to `RE_NREGS - 1'
        !           522: (normally `9').  `REGS->start[0]' and `REGS->end[0]' are similar but
        !           523: describe the extent of the substring matched by the entire pattern.
        !           524: 
        !           525:   Both `struct re_registers' and `RE_NREGS' are defined in `regex.h'.
        !           526: 
        !           527: 
        !           528: 
        !           529: File: regex,  Node: split,  Next: unix,  Prev: registers,  Up: programming
        !           530: 
        !           531: Matching against Split Data
        !           532: ---------------------------
        !           533: 
        !           534: The functions `re_match_2' and `re_search_2' allow one to match in or
        !           535: search data which is divided into two strings.
        !           536: 
        !           537: `re_match_2' works like `re_match' except that two data strings and
        !           538: sizes must be given.
        !           539: 
        !           540:      re_match_2 (BUF, STRING1, SIZE1, STRING2, SIZE2, POS, REGS)
        !           541: 
        !           542: The matcher regards the contents of STRING1 as effectively followed
        !           543: by the contents of STRING2, and matches the combined string against
        !           544: the pattern in BUF.
        !           545: 
        !           546: `re_search_2' is likewise similar to `re_search':
        !           547: 
        !           548:      re_search_2 (BUF, STRING1, SIZE1, STRING2, SIZE2, STARTPOS, RANGE, REGS)
        !           549: 
        !           550: The value returned by RE_SEARCH_2 is an index into the combined data
        !           551: made up of STRING1 and STRING2.  It never exceeds `SIZE1 + SIZE2'. 
        !           552: The values returned in the REGS structure (if there is one) are
        !           553: likewise indices in the combined data.
        !           554: 
        !           555: 
        !           556: 
        !           557: File: regex,  Node: unix,  Prev: split,  Up: programming
        !           558: 
        !           559: Unix-Compatible Entry Points
        !           560: ----------------------------
        !           561: 
        !           562: The standard Berkeley Unix way to compile a regular expression is to
        !           563: call `re_comp'.  This function takes a single argument, the address
        !           564: of the regular expression, which is assumed to be terminated by a
        !           565: null character.
        !           566: 
        !           567: `re_comp' does not ask you to specify a pattern buffer because it has
        !           568: its own pattern buffer -- just one.  Using `re_comp', one may match
        !           569: only the most recently compiled regular expression.
        !           570: 
        !           571: The value of `re_comp' is zero for success or else an error message
        !           572: string, as for `re_compile_pattern'.
        !           573: 
        !           574: Calling `re_comp' with the null string as argument it has no effect;
        !           575: the contents of the buffer remain unchanged.
        !           576: 
        !           577: The standard Berkeley Unix way to match the last regular expression
        !           578: compiled is to call `re_exec'.  This takes a single argument, the
        !           579: address of the string to be matched.  This string is assumed to be
        !           580: terminated by a null character.  Matching is tried starting at each
        !           581: position in the string.  `re_exec' returns `1' for success or `0' for
        !           582: failure.  One cannot find out how long a substring was matched, nor
        !           583: what the `\( ... \)' groupings matched.
        !           584: 
        !           585: 
        !           586: 
        !           587: Tag Table:
        !           588: Node: top85
        !           589: Node: syntax1706
        !           590: Node: directives3241
        !           591: Node: emacs10891
        !           592: Node: programming11653
        !           593: Node: compiling12381
        !           594: Node: matching14827
        !           595: Node: searching16269
        !           596: Node: translation18534
        !           597: Node: registers19952
        !           598: Node: split21245
        !           599: Node: unix22183
        !           600: 
        !           601: End Tag Table
unix.superglobalmegacorp.com
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.