comparison lispref/searching.texi @ 8427:bc548090f760

*** empty log message ***
author Richard M. Stallman <rms@gnu.org>
date Wed, 03 Aug 1994 00:12:07 +0000
parents 7db892210924
children 9e44c96dd99d
8426:3abe02e03dc8 8427:bc548090f760
78 @code{nil} nor @code{t}, then @code{search-forward} moves point to the 78 @code{nil} nor @code{t}, then @code{search-forward} moves point to the
79 upper bound and returns @code{nil}. (It would be more consistent now 79 upper bound and returns @code{nil}. (It would be more consistent now
80 to return the new position of point in that case, but some programs 80 to return the new position of point in that case, but some programs
81 may depend on a value of @code{nil}.) 81 may depend on a value of @code{nil}.)
82 82
83 If @var{repeat} is non-@code{nil}, then the search is repeated that 83 If @var{repeat} is supplied (it must be a positive number), then the
84 many times. Point is positioned at the end of the last match. 84 search is repeated that many times (each time starting at the end of the
85 previous time's match). If these successive searches succeed, the
86 function succeeds, moving point and returning its new value. Otherwise
87 the search fails.
85 @end deffn 88 @end deffn
86 89
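The revised wording for @var{repeat} can be checked with a small sketch. The helper name `example-nth-match-end` is hypothetical (it is not part of Emacs), and the sketch assumes an Emacs recent enough to provide `with-temp-buffer`. It passes `t` for @var{noerror}, so a failed search returns nil instead of signaling an error.

```elisp
(defun example-nth-match-end (string text count)
  "Return point after the COUNTth occurrence of STRING in TEXT, or nil.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; LIMIT is nil (search to end of buffer), NOERROR is t,
    ;; and COUNT is the REPEAT argument: each search starts
    ;; at the end of the previous match.
    (search-forward string nil t count)))

;; The second "foo" ends just before buffer position 12.
(example-nth-match-end "foo" "foo bar foo baz foo" 2)   ; → 12
;; There is no fourth "foo", so the whole search fails.
(example-nth-match-end "foo" "foo bar foo baz foo" 4)   ; → nil
```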
87 @deffn Command search-backward string &optional limit noerror repeat 90 @deffn Command search-backward string &optional limit noerror repeat
88 This function searches backward from point for @var{string}. It is 91 This function searches backward from point for @var{string}. It is
89 just like @code{search-forward} except that it searches backwards and 92 just like @code{search-forward} except that it searches backwards and
163 @end menu 166 @end menu
164 167
165 @node Syntax of Regexps 168 @node Syntax of Regexps
166 @subsection Syntax of Regular Expressions 169 @subsection Syntax of Regular Expressions
167 170
168 Regular expressions have a syntax in which a few characters are special 171 Regular expressions have a syntax in which a few characters are
169 constructs and the rest are @dfn{ordinary}. An ordinary character is a 172 special constructs and the rest are @dfn{ordinary}. An ordinary
170 simple regular expression which matches that character and nothing else. 173 character is a simple regular expression that matches that character and
171 The special characters are @samp{$}, @samp{^}, @samp{.}, @samp{*}, 174 nothing else. The special characters are @samp{.}, @samp{*}, @samp{+},
172 @samp{+}, @samp{?}, @samp{[}, @samp{]} and @samp{\}; no new special 175 @samp{?}, @samp{[}, @samp{]}, @samp{^}, @samp{$}, and @samp{\}; no new
173 characters will be defined in the future. Any other character appearing 176 special characters will be defined in the future. Any other character
174 in a regular expression is ordinary, unless a @samp{\} precedes it. 177 appearing in a regular expression is ordinary, unless a @samp{\}
178 precedes it.
175 179
176 For example, @samp{f} is not a special character, so it is ordinary, and 180 For example, @samp{f} is not a special character, so it is ordinary, and
177 therefore @samp{f} is a regular expression that matches the string 181 therefore @samp{f} is a regular expression that matches the string
178 @samp{f} and no other string. (It does @emph{not} match the string 182 @samp{f} and no other string. (It does @emph{not} match the string
179 @samp{ff}.) Likewise, @samp{o} is a regular expression that matches 183 @samp{ff}.) Likewise, @samp{o} is a regular expression that matches
180 only @samp{o}.@refill 184 only @samp{o}.@refill
181 185
182 Any two regular expressions @var{a} and @var{b} can be concatenated. The 186 Any two regular expressions @var{a} and @var{b} can be concatenated. The
183 result is a regular expression which matches a string if @var{a} matches 187 result is a regular expression that matches a string if @var{a} matches
184 some amount of the beginning of that string and @var{b} matches the rest of 188 some amount of the beginning of that string and @var{b} matches the rest of
185 the string.@refill 189 the string.@refill
186 190
187 As a simple example, we can concatenate the regular expressions @samp{f} 191 As a simple example, we can concatenate the regular expressions @samp{f}
188 and @samp{o} to get the regular expression @samp{fo}, which matches only 192 and @samp{o} to get the regular expression @samp{fo}, which matches only
253 257
254 @samp{-} is used for ranges of characters. To write a range, write two 258 @samp{-} is used for ranges of characters. To write a range, write two
255 characters with a @samp{-} between them. Thus, @samp{[a-z]} matches any 259 characters with a @samp{-} between them. Thus, @samp{[a-z]} matches any
256 lower case letter. Ranges may be intermixed freely with individual 260 lower case letter. Ranges may be intermixed freely with individual
257 characters, as in @samp{[a-z$%.]}, which matches any lower case letter 261 characters, as in @samp{[a-z$%.]}, which matches any lower case letter
258 or @samp{$}, @samp{%} or a period.@refill 262 or @samp{$}, @samp{%}, or a period.@refill
259 263
260 To include a @samp{]} in a character set, make it the first character. 264 To include a @samp{]} in a character set, make it the first character.
261 For example, @samp{[]a]} matches @samp{]} or @samp{a}. To include a 265 For example, @samp{[]a]} matches @samp{]} or @samp{a}. To include a
262 @samp{-}, write @samp{-} as the first character in the set, or put 266 @samp{-}, write @samp{-} as the first character in the set, or put it
263 immediately after a range. (You can replace one individual character 267 immediately after a range. (You can replace one individual character
264 @var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the 268 @var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the
265 @samp{-}). There is no way to write a set containing just @samp{-} and 269 @samp{-}.) There is no way to write a set containing just @samp{-} and
266 @samp{]}. 270 @samp{]}.
267 271
268 To include @samp{^} in a set, put it anywhere but at the beginning of 272 To include @samp{^} in a set, put it anywhere but at the beginning of
269 the set. 273 the set.
270 274
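The character-set rules in this hunk can be tried out with `string-match`, which returns the index where the first match starts. This is only a sketch: the helper `example-first-match-index` is a hypothetical name, and it binds `case-fold-search` to nil so that @samp{[a-z]} really means lower case letters only (with the default setting, matching ignores case).

```elisp
(defun example-first-match-index (regexp string)
  "Return the index of the first match for REGEXP in STRING, or nil.
Hypothetical helper; matching is case-sensitive."
  (let ((case-fold-search nil))
    (string-match regexp string)))

;; A range mixed with individual characters.
(example-first-match-index "[a-z$%.]" "123%")   ; → 3
;; `]' as the first character of the set is taken literally.
(example-first-match-index "[]a]" "x]y")        ; → 1
;; `-' placed immediately after a range is taken literally.
(example-first-match-index "[a-z-]" "12-3")     ; → 2
;; `^' anywhere but at the beginning is taken literally.
(example-first-match-index "[a^]" "12^")        ; → 2
```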
282 newline is mentioned as one of the characters not to match. 286 newline is mentioned as one of the characters not to match.
283 287
284 @item ^ 288 @item ^
285 @cindex @samp{^} in regexp 289 @cindex @samp{^} in regexp
286 @cindex beginning of line in regexp 290 @cindex beginning of line in regexp
287 is a special character that matches the empty string, but only at 291 is a special character that matches the empty string, but only at the
288 the beginning of a line in the text being matched. Otherwise it fails 292 beginning of a line in the text being matched. Otherwise it fails to
289 to match anything. Thus, @samp{^foo} matches a @samp{foo} which occurs 293 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
290 at the beginning of a line. 294 the beginning of a line.
291 295
292 When matching a string, @samp{^} matches at the beginning of the string 296 When matching a string instead of a buffer, @samp{^} matches at the
293 or after a newline character @samp{\n}. 297 beginning of the string or after a newline character @samp{\n}.
294 298
295 @item $ 299 @item $
296 @cindex @samp{$} in regexp 300 @cindex @samp{$} in regexp
297 is similar to @samp{^} but matches only at the end of a line. Thus, 301 is similar to @samp{^} but matches only at the end of a line. Thus,
298 @samp{x+$} matches a string of one @samp{x} or more at the end of a line. 302 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
299 303
300 When matching a string, @samp{$} matches at the end of the string 304 When matching a string instead of a buffer, @samp{$} matches at the end
301 or before a newline character @samp{\n}. 305 of the string or before a newline character @samp{\n}.
302 306
303 @item \ 307 @item \
304 @cindex @samp{\} in regexp 308 @cindex @samp{\} in regexp
305 has two functions: it quotes the special characters (including 309 has two functions: it quotes the special characters (including
306 @samp{\}), and it introduces additional special constructs. 310 @samp{\}), and it introduces additional special constructs.
307 311
308 Because @samp{\} quotes special characters, @samp{\$} is a regular 312 Because @samp{\} quotes special characters, @samp{\$} is a regular
309 expression which matches only @samp{$}, and @samp{\[} is a regular 313 expression that matches only @samp{$}, and @samp{\[} is a regular
310 expression which matches only @samp{[}, and so on. 314 expression that matches only @samp{[}, and so on.
311 315
312 Note that @samp{\} also has special meaning in the read syntax of Lisp 316 Note that @samp{\} also has special meaning in the read syntax of Lisp
313 strings (@pxref{String Type}), and must be quoted with @samp{\}. For 317 strings (@pxref{String Type}), and must be quoted with @samp{\}. For
314 example, the regular expression that matches the @samp{\} character is 318 example, the regular expression that matches the @samp{\} character is
315 @samp{\\}. To write a Lisp string that contains the characters 319 @samp{\\}. To write a Lisp string that contains the characters
320 324
321 @strong{Please note:} For historical compatibility, special characters 325 @strong{Please note:} For historical compatibility, special characters
322 are treated as ordinary ones if they are in contexts where their special 326 are treated as ordinary ones if they are in contexts where their special
323 meanings make no sense. For example, @samp{*foo} treats @samp{*} as 327 meanings make no sense. For example, @samp{*foo} treats @samp{*} as
324 ordinary since there is no preceding expression on which the @samp{*} 328 ordinary since there is no preceding expression on which the @samp{*}
325 can act. It is poor practice to depend on this behavior; better to 329 can act. It is poor practice to depend on this behavior; quote the
326 quote the special character anyway, regardless of where it 330 special character anyway, regardless of where it appears.@refill
327 appears.@refill
328 331
329 For the most part, @samp{\} followed by any character matches only 332 For the most part, @samp{\} followed by any character matches only
330 that character. However, there are several exceptions: characters 333 that character. However, there are several exceptions: characters
331 which, when preceded by @samp{\}, are special constructs. Such 334 that, when preceded by @samp{\}, are special constructs. Such
332 characters are always ordinary when encountered on their own. Here 335 characters are always ordinary when encountered on their own. Here
333 is a table of @samp{\} constructs: 336 is a table of @samp{\} constructs:
334 337
335 @table @kbd 338 @table @kbd
336 @item \| 339 @item \|
369 @item 372 @item
370 To record a matched substring for future reference. 373 To record a matched substring for future reference.
371 @end enumerate 374 @end enumerate
372 375
373 This last application is not a consequence of the idea of a 376 This last application is not a consequence of the idea of a
374 parenthetical grouping; it is a separate feature which happens to be 377 parenthetical grouping; it is a separate feature that happens to be
375 assigned as a second meaning to the same @samp{\( @dots{} \)} construct 378 assigned as a second meaning to the same @samp{\( @dots{} \)} construct
376 because there is no conflict in practice between the two meanings. 379 because there is no conflict in practice between the two meanings.
377 Here is an explanation of this feature: 380 Here is an explanation of this feature:
378 381
379 @item \@var{digit} 382 @item \@var{digit}
380 matches the same text which matched the @var{digit}th occurrence of a 383 matches the same text that matched the @var{digit}th occurrence of a
381 @samp{\( @dots{} \)} construct. 384 @samp{\( @dots{} \)} construct.
382 385
383 In other words, after the end of a @samp{\( @dots{} \)} construct, the 386 In other words, after the end of a @samp{\( @dots{} \)} construct, the
384 matcher remembers the beginning and end of the text matched by that 387 matcher remembers the beginning and end of the text matched by that
385 construct. Then, later on in the regular expression, you can use 388 construct. Then, later on in the regular expression, you can use
402 matches any word-constituent character. The editor syntax table 405 matches any word-constituent character. The editor syntax table
403 determines which characters these are. @xref{Syntax Tables}. 406 determines which characters these are. @xref{Syntax Tables}.
404 407
405 @item \W 408 @item \W
406 @cindex @samp{\W} in regexp 409 @cindex @samp{\W} in regexp
407 matches any character that is not a word-constituent. 410 matches any character that is not a word constituent.
408 411
409 @item \s@var{code} 412 @item \s@var{code}
410 @cindex @samp{\s} in regexp 413 @cindex @samp{\s} in regexp
411 matches any character whose syntax is @var{code}. Here @var{code} is a 414 matches any character whose syntax is @var{code}. Here @var{code} is a
412 character which represents a syntax code: thus, @samp{w} for word 415 character that represents a syntax code: thus, @samp{w} for word
413 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis, 416 constituent, @samp{-} for whitespace, @samp{(} for open parenthesis,
414 etc. @xref{Syntax Tables}, for a list of syntax codes and the 417 etc. @xref{Syntax Tables}, for a list of syntax codes and the
415 characters that stand for them. 418 characters that stand for them.
416 419
417 @item \S@var{code} 420 @item \S@var{code}
418 @cindex @samp{\S} in regexp 421 @cindex @samp{\S} in regexp
419 matches any character whose syntax is not @var{code}. 422 matches any character whose syntax is not @var{code}.
420 @end table 423 @end table
421 424
422 These regular expression constructs match the empty string---that is, 425 The following regular expression constructs match the empty string---that is,
423 they don't use up any characters---but whether they match depends on the 426 they don't use up any characters---but whether they match depends on the
424 context. 427 context.
425 428
426 @table @kbd 429 @table @kbd
427 @item \` 430 @item \`
461 @end table 464 @end table
462 465
463 @kindex invalid-regexp 466 @kindex invalid-regexp
464 Not every string is a valid regular expression. For example, a string 467 Not every string is a valid regular expression. For example, a string
465 with unbalanced square brackets is invalid (with a few exceptions, such 468 with unbalanced square brackets is invalid (with a few exceptions, such
466 as @samp{[]]}, and so is a string that ends with a single @samp{\}. If 469 as @samp{[]]}), and so is a string that ends with a single @samp{\}. If
467 an invalid regular expression is passed to any of the search functions, 470 an invalid regular expression is passed to any of the search functions,
468 an @code{invalid-regexp} error is signaled. 471 an @code{invalid-regexp} error is signaled.
469 472
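A minimal sketch of how a program might guard against the `invalid-regexp` error described here. The predicate name `example-valid-regexp-p` is hypothetical. It uses `string-match` against an empty string purely to force the regexp to be compiled, on the assumption that `string-match` signals the same error as the buffer search functions.

```elisp
(defun example-valid-regexp-p (regexp)
  "Return t if REGEXP is a valid regular expression, nil otherwise.
Hypothetical helper, for illustration only."
  (condition-case nil
      (progn
        ;; Matching against "" is enough to trigger compilation.
        (string-match regexp "")
        t)
    (invalid-regexp nil)))

(example-valid-regexp-p "[]]")     ; → t    (one of the exceptions)
(example-valid-regexp-p "[a")      ; → nil  (unbalanced bracket)
(example-valid-regexp-p "foo\\")   ; → nil  (ends with a single backslash)
```

Note that the Lisp string "foo\\" contains only one backslash, because @samp{\} must itself be quoted in the read syntax of strings.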
470 @defun regexp-quote string 473 @defun regexp-quote string
471 This function returns a regular expression string that matches exactly 474 This function returns a regular expression string that matches exactly
479 @end group 482 @end group
480 @end example 483 @end example
481 484
482 One use of @code{regexp-quote} is to combine an exact string match with 485 One use of @code{regexp-quote} is to combine an exact string match with
483 context described as a regular expression. For example, this searches 486 context described as a regular expression. For example, this searches
484 for the string which is the value of @code{string}, surrounded by 487 for the string that is the value of @code{string}, surrounded by
485 whitespace: 488 whitespace:
486 489
487 @example 490 @example
488 @group 491 @group
489 (re-search-forward 492 (re-search-forward
490 (concat "\\s " (regexp-quote string) "\\s ")) 493 (concat "\\s-" (regexp-quote string) "\\s-"))
491 @end group 494 @end group
492 @end example 495 @end example
493 @end defun 496 @end defun
494 497
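The corrected example (using @samp{\s-}, where @samp{-} is the syntax code for whitespace) can be wrapped in a small function to see why `regexp-quote` matters. This is a sketch only: the name `example-find-surrounded` is hypothetical, and it assumes the temporary buffer uses the standard syntax table, in which a space has whitespace syntax.

```elisp
(defun example-find-surrounded (string text)
  "Return point after STRING surrounded by whitespace in TEXT, or nil.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (re-search-forward
     (concat "\\s-" (regexp-quote string) "\\s-")
     nil t)))

;; `regexp-quote' makes the period literal.
(regexp-quote "a.b")                           ; → "a\\.b"
;; So " axb " is skipped and " a.b " is the first match.
(example-find-surrounded "a.b" "x axb a.b y")  ; → 11
```

Without `regexp-quote`, the @samp{.} in the search string would match any character, and the search would stop after @samp{axb} instead.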
495 @node Regexp Example 498 @node Regexp Example