/* Definitions for data structures callers pass the regex library.
   Copyright (C) 1985 Free Software Foundation, Inc.

NO WARRANTY

BECAUSE THIS PROGRAM IS LICENSED FREE OF CHARGE, WE PROVIDE ABSOLUTELY
NO WARRANTY, TO THE EXTENT PERMITTED BY APPLICABLE STATE LAW. EXCEPT
WHEN OTHERWISE STATED IN WRITING, FREE SOFTWARE FOUNDATION, INC,
RICHARD M. STALLMAN AND/OR OTHER PARTIES PROVIDE THIS PROGRAM "AS IS"
WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING,
BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY
AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE
DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR
CORRECTION.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW WILL RICHARD M.
STALLMAN, THE FREE SOFTWARE FOUNDATION, INC., AND/OR ANY OTHER PARTY
WHO MAY MODIFY AND REDISTRIBUTE THIS PROGRAM AS PERMITTED BELOW, BE
LIABLE TO YOU FOR DAMAGES, INCLUDING ANY LOST PROFITS, LOST MONIES, OR
OTHER SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR
DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY THIRD PARTIES OR
A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS) THIS
PROGRAM, EVEN IF YOU HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH
DAMAGES, OR FOR ANY CLAIM BY ANY OTHER PARTY.

GENERAL PUBLIC LICENSE TO COPY

1. You may copy and distribute verbatim copies of this source file
as you receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy a valid copyright notice "Copyright
(C) 1985 Free Software Foundation, Inc."; and include following the
copyright notice a verbatim copy of the above disclaimer of warranty
and of this License. You may charge a distribution fee for the
physical act of transferring a copy.

2. You may modify your copy or copies of this source file or
any portion of it, and copy and distribute such modifications under
the terms of Paragraph 1 above, provided that you also do the following:

    a) cause the modified files to carry prominent notices stating
    that you changed the files and the date of any change; and

    b) cause the whole of any work that you distribute or publish,
    that in whole or in part contains or is a derivative of this
    program or any part thereof, to be licensed at no charge to all
    third parties on terms identical to those contained in this
    License Agreement (except that you may choose to grant more extensive
    warranty protection to some or all third parties, at your option).

    c) You may charge a distribution fee for the physical act of
    transferring a copy, and you may at your option offer warranty
    protection in exchange for a fee.

Mere aggregation of another unrelated program with this program (or its
derivative) on a volume of a storage or distribution medium does not bring
the other program under the scope of these terms.

3. You may copy and distribute this program (or a portion or derivative
of it, under Paragraph 2) in object code or executable form under the terms
of Paragraphs 1 and 2 above provided that you also do one of the following:

    a) accompany it with the complete corresponding machine-readable
    source code, which must be distributed under the terms of
    Paragraphs 1 and 2 above; or,

    b) accompany it with a written offer, valid for at least three
    years, to give any third party free (except for a nominal
    shipping charge) a complete machine-readable copy of the
    corresponding source code, to be distributed under the terms of
    Paragraphs 1 and 2 above; or,

    c) accompany it with the information you received as to where the
    corresponding source code may be obtained. (This alternative is
    allowed only for noncommercial distribution and only if you
    received the program in object code or executable form alone.)

For an executable file, complete source code means all the source code for
all modules it contains; but, as a special exception, it need not include
source code for modules which are standard libraries that accompany the
operating system on which the executable file runs.

4. You may not copy, sublicense, distribute or transfer this program
except as expressly provided under this License Agreement. Any attempt
otherwise to copy, sublicense, distribute or transfer this program is void and
your rights to use the program under this License agreement shall be
automatically terminated. However, parties who have received computer
software programs from you with this License Agreement will not have
their licenses terminated so long as such parties remain in full compliance.

5. If you wish to incorporate parts of this program into other free
programs whose distribution conditions are different, write to the Free
Software Foundation at 675 Mass Ave, Cambridge, MA 02139. We have not yet
worked out a simple rule that can be stated here, but we will often permit
this. We will be guided by the two goals of preserving the free status of
all derivatives of our free software and of promoting the sharing and reuse of
software.


In other words, you are welcome to use, share and improve this program.
You are forbidden to forbid anyone else to use, share and improve
what you give them. Help stamp out software-hoarding! */


/* Define number of parens for which we record the beginnings and ends.
   This affects how much space the `struct re_registers' type takes up. */
#ifndef RE_NREGS
#define RE_NREGS 10
#endif

/* These bits are used in the obscure_syntax variable to choose among
   alternative regexp syntaxes. */

/* 1 means plain parentheses serve as grouping, and backslash
     parentheses are needed for literal searching.
   0 means backslash-parentheses are grouping, and plain parentheses
     are for literal searching. */
#define RE_NO_BK_PARENS 1

/* 1 means plain | serves as the "or"-operator, and \| is a literal.
   0 means \| serves as the "or"-operator, and | is a literal. */
#define RE_NO_BK_VBAR 2

/* 0 means plain + or ? serves as an operator, and \+, \? are literals.
   1 means \+, \? are operators and plain +, ? are literals. */
#define RE_BK_PLUS_QM 4

/* 1 means | binds tighter than ^ or $.
   0 means the contrary. */
#define RE_TIGHT_VBAR 8

/* 1 means treat \n as an "or"-operator.
   0 means treat it as a normal character. */
#define RE_NEWLINE_OR 16

/* 1 means that special characters (such as *, ^, and $) always have
     their special meaning regardless of the surrounding context.
   0 means that special characters may act as normal characters in some
     contexts. Specifically, this applies to:
       ^ - only special at the beginning, or after ( or |
       $ - only special at the end, or before ) or |
       *, +, ? - only special when not after the beginning, (, or | */
#define RE_CONTEXT_INDEP_OPS 32

/* Now define combinations of bits for the standard possibilities. */
#define RE_SYNTAX_AWK (RE_NO_BK_PARENS | RE_NO_BK_VBAR | RE_CONTEXT_INDEP_OPS)
#define RE_SYNTAX_EGREP (RE_SYNTAX_AWK | RE_NEWLINE_OR)
#define RE_SYNTAX_GREP (RE_BK_PLUS_QM | RE_NEWLINE_OR)
#define RE_SYNTAX_EMACS 0
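
/* The combinations above can be checked with a few lines of C.  This is
   a minimal sketch, not part of the regex library: the macro values are
   copied from this header, while plain_parens_group and
   check_syntax_combinations are hypothetical helper names introduced
   only for illustration. */

```c
#include <assert.h>

/* Bit values copied from the header. */
#define RE_NO_BK_PARENS 1
#define RE_NO_BK_VBAR 2
#define RE_BK_PLUS_QM 4
#define RE_TIGHT_VBAR 8
#define RE_NEWLINE_OR 16
#define RE_CONTEXT_INDEP_OPS 32

#define RE_SYNTAX_AWK (RE_NO_BK_PARENS | RE_NO_BK_VBAR | RE_CONTEXT_INDEP_OPS)
#define RE_SYNTAX_EGREP (RE_SYNTAX_AWK | RE_NEWLINE_OR)
#define RE_SYNTAX_GREP (RE_BK_PLUS_QM | RE_NEWLINE_OR)
#define RE_SYNTAX_EMACS 0

/* Hypothetical helper: nonzero if plain parentheses act as grouping
   under the given syntax word. */
static int
plain_parens_group (int syntax)
{
  return (syntax & RE_NO_BK_PARENS) != 0;
}

/* Hypothetical helper: verify the numeric value of each combination. */
static void
check_syntax_combinations (void)
{
  assert (RE_SYNTAX_AWK == 35);   /* 1 | 2 | 32 */
  assert (RE_SYNTAX_EGREP == 51); /* 35 | 16 */
  assert (RE_SYNTAX_GREP == 20);  /* 4 | 16 */
  assert (RE_SYNTAX_EMACS == 0);  /* no bits set */
}
```

/* For example, plain_parens_group (RE_SYNTAX_EGREP) is nonzero, while
   plain_parens_group (RE_SYNTAX_EMACS) is zero because that syntax sets
   no bits. */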

/* This data structure is used to represent a compiled pattern. */

struct re_pattern_buffer
  {
    char *buffer;       /* Space holding the compiled pattern commands. */
    int allocated;      /* Size of space that buffer points to. */
    int used;           /* Length of portion of buffer actually occupied. */
    char *fastmap;      /* Pointer to fastmap, if any, or zero if none. */
                        /* re_search uses the fastmap, if there is one,
                           to skip quickly over totally implausible characters. */
    char *translate;    /* Translate table to apply to all characters before
                           comparing, or zero for no translation.
                           The translation is applied to a pattern when it is
                           compiled and to data when it is matched. */
    char fastmap_accurate;
                        /* Set to zero when a new pattern is stored,
                           set to one when the fastmap is updated from it. */
    char can_be_null;   /* Set to one by compiling the fastmap
                           if this pattern might match the null string.
                           It does not necessarily match the null string
                           in that case, but if this is zero, it cannot.
                           A value of 2 means it can match the null string,
                           but only at the end of the range or before a
                           character listed in the fastmap. */
  };
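
/* A minimal sketch of preparing a pattern buffer before its first use.
   It is not taken from the library.  The assumptions are: the fastmap
   and translate tables have 256 one-byte entries indexed by character
   code, and an empty buffer is represented by a zero pointer with
   allocated and used both zero.  init_fold_table and
   init_pattern_buffer are hypothetical helper names. */

```c
#include <assert.h>
#include <ctype.h>

/* Structure layout copied from the header. */
struct re_pattern_buffer
  {
    char *buffer;
    int allocated;
    int used;
    char *fastmap;
    char *translate;
    char fastmap_accurate;
    char can_be_null;
  };

/* Hypothetical helper: build a translate table that folds upper case
   to lower case, so that matching ignores case.  Assumes 256 entries
   indexed by unsigned character code. */
static void
init_fold_table (char table[256])
{
  int i;
  for (i = 0; i < 256; i++)
    table[i] = (char) tolower (i);
}

/* Hypothetical helper: reset BUF to an empty state and attach the
   caller's fastmap and translate tables (either may be zero). */
static void
init_pattern_buffer (struct re_pattern_buffer *buf,
                     char *fastmap, char *translate)
{
  assert (buf != 0);
  buf->buffer = 0;
  buf->allocated = 0;
  buf->used = 0;
  buf->fastmap = fastmap;
  buf->translate = translate;
  buf->fastmap_accurate = 0;    /* no fastmap has been computed yet */
  buf->can_be_null = 0;
}
```

/* Passing zero for both tables gives a buffer with no fastmap and no
   translation, which the field comments above describe as valid. */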

/* Structure to store "register" contents data in.

   Pass the address of such a structure as an argument to re_match, etc.,
   if you want this information back.

   start[i] and end[i] record the string matched by \( ... \) grouping i,
   for i from 1 to RE_NREGS - 1.
   start[0] and end[0] record the entire string matched. */

struct re_registers
  {
    int start[RE_NREGS];
    int end[RE_NREGS];
  };
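
/* A minimal sketch of reading a group's text out of the registers
   described above.  It does not call the library; the registers are
   expected to have been filled in already.  The assumptions are that
   end[i] is the offset just past the last matched character and that a
   negative start[i] marks a group that did not take part in the match.
   copy_group is a hypothetical helper name. */

```c
#include <assert.h>
#include <string.h>

#ifndef RE_NREGS
#define RE_NREGS 10
#endif

/* Structure layout copied from the header. */
struct re_registers
  {
    int start[RE_NREGS];
    int end[RE_NREGS];
  };

/* Hypothetical helper: copy the text of group I of SUBJECT into OUT,
   which has room for OUTSIZE bytes including the terminating null.
   Returns the number of characters copied, or -1 if the group index is
   out of range, the group is unset, or the text does not fit. */
static int
copy_group (const char *subject, const struct re_registers *regs,
            int i, char *out, size_t outsize)
{
  int len;

  assert (subject != 0 && regs != 0 && out != 0);
  if (i < 0 || i >= RE_NREGS)
    return -1;
  if (regs->start[i] < 0 || regs->end[i] < regs->start[i])
    return -1;
  len = regs->end[i] - regs->start[i];
  if ((size_t) len + 1 > outsize)
    return -1;
  memcpy (out, subject + regs->start[i], (size_t) len);
  out[len] = '\0';
  return len;
}
```

/* Index 0 yields the whole match; indexes 1 through RE_NREGS - 1 yield
   the individual groupings. */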

/* These are the command codes that appear in compiled regular expressions,
   one per byte.
   Some command codes are followed by argument bytes.
   A command code can specify any interpretation whatever for its arguments.
   Zero-bytes may appear in the compiled regular expression. */

enum regexpcode
  {
    unused,
    exactn,    /* followed by one byte giving n, and then by n literal bytes */
    begline,   /* fails unless at beginning of line */
    endline,   /* fails unless at end of line */
    jump,      /* followed by two bytes giving relative address to jump to */
    on_failure_jump,     /* followed by two bytes giving relative address of
                            place to resume at in case of failure. */
    finalize_jump,       /* Throw away latest failure point and then jump to
                            address. */
    maybe_finalize_jump, /* Like jump but finalize if safe to do so.
                            This is used to jump back to the beginning
                            of a repeat. If the command that follows
                            this jump is clearly incompatible with the
                            one at the beginning of the repeat, such that
                            we can be sure that there is no use backtracking
                            out of repetitions already completed,
                            then we finalize. */
    dummy_failure_jump,  /* jump, and push a dummy failure point.
                            This failure point will be thrown away
                            if an attempt is made to use it for a failure.
                            A + construct makes this before the first repeat. */
    anychar,     /* matches any one character */
    charset,     /* matches any one char belonging to specified set.
                    First following byte is # bitmap bytes.
                    Then come bytes for a bit-map saying which chars are in.
                    Bits in each byte are ordered low-bit-first.
                    A character is in the set if its bit is 1.
                    A character too large to have a bit in the map
                    is automatically not in the set. */
    charset_not, /* similar but match any character that is NOT one of those
                    specified */
    start_memory, /* starts remembering the text that is matched
                    and stores it in a memory register.
                    followed by one byte containing the register number.
                    Register numbers must be in the range 0 through
                    RE_NREGS - 1. */
    stop_memory, /* stops remembering the text that is matched
                    and stores it in a memory register.
                    followed by one byte containing the register number.
                    Register numbers must be in the range 0 through
                    RE_NREGS - 1. */
    duplicate,   /* match a duplicate of something remembered.
                    Followed by one byte containing the index of the memory
                    register. */
    before_dot,  /* Succeeds if before dot */
    at_dot,      /* Succeeds if at dot */
    after_dot,   /* Succeeds if after dot */
    begbuf,      /* Succeeds if at beginning of buffer */
    endbuf,      /* Succeeds if at end of buffer */
    wordchar,    /* Matches any word-constituent character */
    notwordchar, /* Matches any char that is not a word-constituent */
    wordbeg,     /* Succeeds if at word beginning */
    wordend,     /* Succeeds if at word end */
    wordbound,   /* Succeeds if at a word boundary */
    notwordbound, /* Succeeds if not at a word boundary */
    syntaxspec,  /* Matches any character whose syntax is specified.
                    followed by a byte which contains a syntax code,
                    Sword or such like */
    notsyntaxspec /* Matches any character whose syntax differs from the
                    specified. */
  };
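
/* A minimal sketch of the charset argument layout described above: one
   count byte, then a bitmap whose bits are ordered low-bit-first.  It
   assumes 8-bit bytes and is not the library's own matcher.
   charset_contains, charset_not_contains and charset_add are
   hypothetical helper names. */

```c
#include <assert.h>

/* ARGS points at the bytes that follow a charset command: ARGS[0] is
   the number of bitmap bytes and ARGS[1] onward is the bitmap.
   Returns 1 if C is in the set, 0 otherwise.  A character whose bit
   would lie beyond the bitmap is not in the set. */
static int
charset_contains (const unsigned char *args, unsigned char c)
{
  unsigned nbytes = args[0];
  unsigned idx = c / 8u;

  if (idx >= nbytes)
    return 0;
  return (args[1 + idx] >> (c % 8u)) & 1;
}

/* The charset_not command matches exactly the characters that the
   bitmap does not list. */
static int
charset_not_contains (const unsigned char *args, unsigned char c)
{
  return !charset_contains (args, c);
}

/* Set the bit for C.  The bitmap must already be long enough. */
static void
charset_add (unsigned char *args, unsigned char c)
{
  unsigned idx = c / 8u;

  assert (idx < args[0]);
  args[1 + idx] |= (unsigned char) (1u << (c % 8u));
}
```

/* With a 13-byte bitmap, 'a' (code 97) uses bit 1 of bitmap byte 12,
   while 'z' (code 122) would need byte 15 and so is never in the set. */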

extern char *re_compile_pattern ();
/* Is this really advertised? */
extern void re_compile_fastmap ();
extern int re_search (), re_search_2 ();
extern int re_match (), re_match_2 ();

/* 4.2 bsd compatibility (yuck) */
extern char *re_comp ();
extern int re_exec ();

#ifdef SYNTAX_TABLE
extern char *re_syntax_table;
#endif