comparison man/search.texi @ 49984:632746dc04e4

(Regexps): Convert the main table into @table @asis.
author Richard M. Stallman <rms@gnu.org>
date Wed, 26 Feb 2003 09:55:45 +0000
parents 2eca4c95c2bf
children f37984f93151
49983:2a8850f484eb 49984:632746dc04e4
411 As a simple example, we can concatenate the regular expressions @samp{f} 411 As a simple example, we can concatenate the regular expressions @samp{f}
412 and @samp{o} to get the regular expression @samp{fo}, which matches only 412 and @samp{o} to get the regular expression @samp{fo}, which matches only
413 the string @samp{fo}. Still trivial. To do something nontrivial, you 413 the string @samp{fo}. Still trivial. To do something nontrivial, you
414 need to use one of the special characters. Here is a list of them. 414 need to use one of the special characters. Here is a list of them.
415 415
416 @table @kbd 416 @table @asis
417 @item .@: @r{(Period)} 417 @item @kbd{.}@: @r{(Period)}
418 is a special character that matches any single character except a newline. 418 is a special character that matches any single character except a newline.
419 Using concatenation, we can make regular expressions like @samp{a.b}, which 419 Using concatenation, we can make regular expressions like @samp{a.b}, which
420 matches any three-character string that begins with @samp{a} and ends with 420 matches any three-character string that begins with @samp{a} and ends with
421 @samp{b}.@refill 421 @samp{b}.@refill
422 422
423 @item * 423 @item @kbd{*}
424 is not a construct by itself; it is a postfix operator that means to 424 is not a construct by itself; it is a postfix operator that means to
425 match the preceding regular expression repetitively as many times as 425 match the preceding regular expression repetitively as many times as
426 possible. Thus, @samp{o*} matches any number of @samp{o}s (including no 426 possible. Thus, @samp{o*} matches any number of @samp{o}s (including no
427 @samp{o}s). 427 @samp{o}s).
428 428
439 tries to match all three @samp{a}s; but the rest of the pattern is 439 tries to match all three @samp{a}s; but the rest of the pattern is
440 @samp{ar} and there is only @samp{r} left to match, so this try fails. 440 @samp{ar} and there is only @samp{r} left to match, so this try fails.
441 The next alternative is for @samp{a*} to match only two @samp{a}s. 441 The next alternative is for @samp{a*} to match only two @samp{a}s.
442 With this choice, the rest of the regexp matches successfully.@refill 442 With this choice, the rest of the regexp matches successfully.@refill
443 443
444 @item + 444 @item @kbd{+}
445 is a postfix operator, similar to @samp{*} except that it must match 445 is a postfix operator, similar to @samp{*} except that it must match
446 the preceding expression at least once. So, for example, @samp{ca+r} 446 the preceding expression at least once. So, for example, @samp{ca+r}
447 matches the strings @samp{car} and @samp{caaaar} but not the string 447 matches the strings @samp{car} and @samp{caaaar} but not the string
448 @samp{cr}, whereas @samp{ca*r} matches all three strings. 448 @samp{cr}, whereas @samp{ca*r} matches all three strings.
449 449
450 @item ? 450 @item @kbd{?}
451 is a postfix operator, similar to @samp{*} except that it can match the 451 is a postfix operator, similar to @samp{*} except that it can match the
452 preceding expression either once or not at all. For example, 452 preceding expression either once or not at all. For example,
453 @samp{ca?r} matches @samp{car} or @samp{cr}; nothing else. 453 @samp{ca?r} matches @samp{car} or @samp{cr}; nothing else.
454 454
455 @item *?, +?, ?? 455 @item @kbd{*?}, @kbd{+?}, @kbd{??}
456 @cindex non-greedy regexp matching 456 @cindex non-greedy regexp matching
457 are non-greedy variants of the operators above. The normal operators 457 are non-greedy variants of the operators above. The normal operators
458 @samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as 458 @samp{*}, @samp{+}, @samp{?} are @dfn{greedy} in that they match as
459 much as they can, as long as the overall regexp can still match. With 459 much as they can, as long as the overall regexp can still match. With
460 a following @samp{?}, they are non-greedy: they will match as little 460 a following @samp{?}, they are non-greedy: they will match as little
471 possible starting point for match is always the one chosen. Thus, if 471 possible starting point for match is always the one chosen. Thus, if
472 you search for @samp{a.*?$} against the text @samp{abbab} followed by 472 you search for @samp{a.*?$} against the text @samp{abbab} followed by
473 a newline, it matches the whole string. Since it @emph{can} match 473 a newline, it matches the whole string. Since it @emph{can} match
474 starting at the first @samp{a}, it does. 474 starting at the first @samp{a}, it does.
475 475
476 @item \@{@var{n}\@} 476 @item @kbd{\@{@var{n}\@}}
477 is a postfix operator that specifies repetition @var{n} times---that 477 is a postfix operator that specifies repetition @var{n} times---that
478 is, the preceding regular expression must match exactly @var{n} times 478 is, the preceding regular expression must match exactly @var{n} times
479 in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx} 479 in a row. For example, @samp{x\@{4\@}} matches the string @samp{xxxx}
480 and nothing else. 480 and nothing else.
481 481
482 @item \@{@var{n},@var{m}\@} 482 @item @kbd{\@{@var{n},@var{m}\@}}
483 is a postfix operator that specifies repetition between @var{n} and 483 is a postfix operator that specifies repetition between @var{n} and
484 @var{m} times---that is, the preceding regular expression must match 484 @var{m} times---that is, the preceding regular expression must match
485 at least @var{n} times, but no more than @var{m} times. If @var{m} is 485 at least @var{n} times, but no more than @var{m} times. If @var{m} is
486 omitted, then there is no upper limit, but the preceding regular 486 omitted, then there is no upper limit, but the preceding regular
487 expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is 487 expression must match at least @var{n} times.@* @samp{\@{0,1\@}} is
488 equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to 488 equivalent to @samp{?}. @* @samp{\@{0,\@}} is equivalent to
489 @samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}. 489 @samp{*}. @* @samp{\@{1,\@}} is equivalent to @samp{+}.
490 490
491 @item [ @dots{} ] 491 @item @kbd{[ @dots{} ]}
492 is a @dfn{character set}, which begins with @samp{[} and is terminated 492 is a @dfn{character set}, which begins with @samp{[} and is terminated
493 by @samp{]}. In the simplest case, the characters between the two 493 by @samp{]}. In the simplest case, the characters between the two
494 brackets are what this set can match. 494 brackets are what this set can match.
495 495
496 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and 496 Thus, @samp{[ad]} matches either one @samp{a} or one @samp{d}, and
521 When you use a range in case-insensitive search, you should write both 521 When you use a range in case-insensitive search, you should write both
522 ends of the range in upper case, or both in lower case, or both should 522 ends of the range in upper case, or both in lower case, or both should
523 be non-letters. The behavior of a mixed-case range such as @samp{A-z} 523 be non-letters. The behavior of a mixed-case range such as @samp{A-z}
524 is somewhat ill-defined, and it may change in future Emacs versions. 524 is somewhat ill-defined, and it may change in future Emacs versions.
525 525
526 @item [^ @dots{} ] 526 @item @kbd{[^ @dots{} ]}
527 @samp{[^} begins a @dfn{complemented character set}, which matches any 527 @samp{[^} begins a @dfn{complemented character set}, which matches any
528 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches 528 character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} matches
529 all characters @emph{except} ASCII letters and digits. 529 all characters @emph{except} ASCII letters and digits.
530 530
531 @samp{^} is not special in a character set unless it is the first 531 @samp{^} is not special in a character set unless it is the first
534 534
535 A complemented character set can match a newline, unless newline is 535 A complemented character set can match a newline, unless newline is
536 mentioned as one of the characters not to match. This is in contrast to 536 mentioned as one of the characters not to match. This is in contrast to
537 the handling of regexps in programs such as @code{grep}. 537 the handling of regexps in programs such as @code{grep}.
538 538
539 @item ^ 539 @item @kbd{^}
540 is a special character that matches the empty string, but only at the 540 is a special character that matches the empty string, but only at the
541 beginning of a line in the text being matched. Otherwise it fails to 541 beginning of a line in the text being matched. Otherwise it fails to
542 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at 542 match anything. Thus, @samp{^foo} matches a @samp{foo} that occurs at
543 the beginning of a line. 543 the beginning of a line.
544 544
545 @item $ 545 @item @kbd{$}
546 is similar to @samp{^} but matches only at the end of a line. Thus, 546 is similar to @samp{^} but matches only at the end of a line. Thus,
547 @samp{x+$} matches a string of one @samp{x} or more at the end of a line. 547 @samp{x+$} matches a string of one @samp{x} or more at the end of a line.
548 548
549 @item \ 549 @item @kbd{\}
550 has two functions: it quotes the special characters (including 550 has two functions: it quotes the special characters (including
551 @samp{\}), and it introduces additional special constructs. 551 @samp{\}), and it introduces additional special constructs.
552 552
553 Because @samp{\} quotes special characters, @samp{\$} is a regular 553 Because @samp{\} quotes special characters, @samp{\$} is a regular
554 expression that matches only @samp{$}, and @samp{\[} is a regular 554 expression that matches only @samp{$}, and @samp{\[} is a regular