annotate lisp/emacs-lisp/sregex.el @ 103284:5090c7bf0a02

Eli has checked the msdog chapter.
author Chong Yidong <cyd@stupidchicken.com>
date Sun, 24 May 2009 20:08:12 +0000
parents a9dc0e7c3f2b
children 1d1d5d9bd884
rev   line source
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
1 ;;; sregex.el --- symbolic regular expressions
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
2
74466
1d4b1a32fd66 Update copyright years.
Glenn Morris <rgm@gnu.org>
parents: 68648
diff changeset
3 ;; Copyright (C) 1997, 1998, 2000, 2001, 2002, 2003, 2004,
100908
a9dc0e7c3f2b Add 2009 to copyright years.
Glenn Morris <rgm@gnu.org>
parents: 94655
diff changeset
4 ;; 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
5
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
6 ;; Author: Bob Glickstein <bobg+sregex@zanshin.com>
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
7 ;; Maintainer: Bob Glickstein <bobg+sregex@zanshin.com>
29212
f35b1d67aa8f Add finder keywords.
Dave Love <fx@gnu.org>
parents: 29069
diff changeset
8 ;; Keywords: extensions
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
9
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
10 ;; This file is part of GNU Emacs.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
11
94655
90a2847062be Switch to recommended form of GPLv3 permissions notice.
Glenn Morris <rgm@gnu.org>
parents: 93975
diff changeset
12 ;; GNU Emacs is free software: you can redistribute it and/or modify
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
13 ;; it under the terms of the GNU General Public License as published by
94655
90a2847062be Switch to recommended form of GPLv3 permissions notice.
Glenn Morris <rgm@gnu.org>
parents: 93975
diff changeset
14 ;; the Free Software Foundation, either version 3 of the License, or
90a2847062be Switch to recommended form of GPLv3 permissions notice.
Glenn Morris <rgm@gnu.org>
parents: 93975
diff changeset
15 ;; (at your option) any later version.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
16
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
17 ;; GNU Emacs is distributed in the hope that it will be useful,
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
18 ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
19 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
20 ;; GNU General Public License for more details.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
21
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
22 ;; You should have received a copy of the GNU General Public License
94655
90a2847062be Switch to recommended form of GPLv3 permissions notice.
Glenn Morris <rgm@gnu.org>
parents: 93975
diff changeset
23 ;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
24
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
25 ;;; Commentary:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
26
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
27 ;; This package allows you to write regular expressions using a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
28 ;; totally new, Lisp-like syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
29
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
30 ;; A "symbolic regular expression" (sregex for short) is a Lisp form
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
31 ;; that, when evaluated, produces the string form of the specified
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
32 ;; regular expression. Here's a simple example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
33
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
34 ;; (sregexq (or "Bob" "Robert")) => "Bob\\|Robert"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
35
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
36 ;; As you can see, an sregex is specified by placing one or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
37 ;; special clauses in a call to `sregexq'. The clause in this case is
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
38 ;; the `or' of two strings (not to be confused with the Lisp function
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
39 ;; `or'). The list of allowable clauses appears below.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
40
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
41 ;; With sregex, it is never necessary to "escape" magic characters
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
42 ;; that are meant to be taken literally; that happens automatically.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
43 ;; For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
44
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
45 ;; (sregexq "M*A*S*H") => "M\\*A\\*S\\*H"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
46
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
47 ;; It is also unnecessary to "group" parts of the expression together
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
48 ;; to overcome operator precedence; that also happens automatically.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
49 ;; For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
50
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
51 ;; (sregexq (opt (or "Bob" "Robert"))) => "\\(?:Bob\\|Robert\\)?"
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
52
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
53 ;; It *is* possible to group parts of the expression in order to refer
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
54 ;; to them with numbered backreferences:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
55
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
56 ;; (sregexq (group (or "Go" "Run"))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
57 ;; ", Spot, "
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
58 ;; (backref 1)) => "\\(Go\\|Run\\), Spot, \\1"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
59
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
60 ;; `sregexq' is a macro. Each time it is used, it constructs a simple
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
61 ;; Lisp expression that then invokes a moderately complex engine to
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
62 ;; interpret the sregex and render the string form. Because of this,
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
63 ;; I don't recommend sprinkling calls to `sregexq' throughout your
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
64 ;; code, the way one normally does with string regexes (which are
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
65 ;; cheap to evaluate). It's wiser to precompute the regexes
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
66 ;; you need wherever possible instead of repeatedly constructing the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
67 ;; same ones over and over. Example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
68
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
69 ;; (let ((field-regex (sregexq (opt "resent-")
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
70 ;; (or "to" "cc" "bcc"))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
71 ;; ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
72 ;; (while ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
73 ;; ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
74 ;; (re-search-forward field-regex ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
75 ;; ...))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
76
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
77 ;; The arguments to `sregexq' are automatically quoted, but the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
78 ;; flip side of this is that it is not straightforward to include
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
79 ;; computed (i.e., non-constant) values in `sregexq' expressions. So
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
80 ;; `sregex' is a function that is like `sregexq' but which does not
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
81 ;; automatically quote its values. Literal sregex clauses must be
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
82 ;; explicitly quoted like so:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
83
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
84 ;; (sregex '(or "Bob" "Robert")) => "Bob\\|Robert"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
85
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
86 ;; but computed clauses can be included easily, allowing for the reuse
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
87 ;; of common clauses:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
88
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
89 ;; (let ((dotstar '(0+ any))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
90 ;; (whitespace '(1+ (syntax ?-)))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
91 ;; (digits '(1+ (char (?0 . ?9)))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
92 ;; (sregex 'bol dotstar ":" whitespace digits)) => "^.*:\\s-+[0-9]+"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
93
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
94 ;; To use this package in a Lisp program, simply (require 'sregex).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
95
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
96 ;; Here are the clauses allowed in an `sregex' or `sregexq'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
97 ;; expression:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
98
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
99 ;; - a string
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
100 ;; This stands for the literal string. If it contains
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
101 ;; metacharacters, they will be escaped in the resulting regex
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
102 ;; (using `regexp-quote').
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
103
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
104 ;; - the symbol `any'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
105 ;; This stands for ".", a regex matching any character except
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
106 ;; newline.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
107
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
108 ;; - the symbol `bol'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
109 ;; Stands for "^", matching the empty string at the beginning of a line.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
110
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
111 ;; - the symbol `eol'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
112 ;; Stands for "$", matching the empty string at the end of a line.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
113
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
114 ;; - (group CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
115 ;; Groups the given CLAUSEs using "\\(" and "\\)".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
116
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
117 ;; - (sequence CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
118
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
119 ;; Groups the given CLAUSEs; may or may not use "\\(?:" and "\\)".
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
120 ;; Clauses grouped by `sequence' do not count for purposes of
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
121 ;; numbering backreferences. Use `sequence' in situations like
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
122 ;; this:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
123
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
124 ;; (sregexq (or "dog" "cat"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
125 ;; (sequence (opt "sea ") "monkey")))
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
126 ;; => "dog\\|cat\\|\\(?:sea \\)?monkey"
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
127
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
128 ;; where a single `or' alternate needs to contain multiple
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
129 ;; subclauses.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
130
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
131 ;; - (backref N)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
132 ;; Matches the same string previously matched by the Nth "group" in
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
133 ;; the same sregex. N is a positive integer.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
134
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
135 ;; - (or CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
136 ;; Matches any one of the CLAUSEs by separating them with "\\|".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
137
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
138 ;; - (0+ CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
139 ;; Concatenates the given CLAUSEs and matches zero or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
140 ;; occurrences by appending "*".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
141
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
142 ;; - (1+ CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
143 ;; Concatenates the given CLAUSEs and matches one or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
144 ;; occurrences by appending "+".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
145
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
146 ;; - (opt CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
147 ;; Concatenates the given CLAUSEs and matches zero or one occurrence
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
148 ;; by appending "?".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
149
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
150 ;; - (repeat MIN MAX CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
151 ;; Concatenates the given CLAUSEs and constructs a regex matching at
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
152 ;; least MIN occurrences and at most MAX occurrences. MIN must be a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
153 ;; non-negative integer. MAX must be a non-negative integer greater
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
154 ;; than or equal to MIN; or MAX can be nil to mean "infinity."
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
155
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
156 ;; - (char CHAR-CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
157 ;; Creates a "character class" matching one character from the given
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
158 ;; set. See below for how to construct a CHAR-CLAUSE.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
159
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
160 ;; - (not-char CHAR-CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
161 ;; Creates a "character class" matching any one character not in the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
162 ;; given set. See below for how to construct a CHAR-CLAUSE.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
163
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
164 ;; - the symbol `bot'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
165 ;; Stands for "\\`", matching the empty string at the beginning of
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
166 ;; text (beginning of a string or of a buffer).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
167
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
168 ;; - the symbol `eot'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
169 ;; Stands for "\\'", matching the empty string at the end of text.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
170
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
171 ;; - the symbol `point'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
172 ;; Stands for "\\=", matching the empty string at point.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
173
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
174 ;; - the symbol `word-boundary'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
175 ;; Stands for "\\b", matching the empty string at the beginning or
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
176 ;; end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
177
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
178 ;; - the symbol `not-word-boundary'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
179 ;; Stands for "\\B", matching the empty string not at the beginning
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
180 ;; or end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
181
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
182 ;; - the symbol `bow'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
183 ;; Stands for "\\<", matching the empty string at the beginning of a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
184 ;; word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
185
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
186 ;; - the symbol `eow'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
187 ;; Stands for "\\>", matching the empty string at the end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
188
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
189 ;; - the symbol `wordchar'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
190 ;; Stands for the regex "\\w", matching a word-constituent character
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
191 ;; (as determined by the current syntax table).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
192
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
193 ;; - the symbol `not-wordchar'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
194 ;; Stands for the regex "\\W", matching a non-word-constituent
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
195 ;; character.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
196
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
197 ;; - (syntax CODE)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
198 ;; Stands for the regex "\\sCODE", where CODE is a syntax table code
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
199 ;; (a single character). Matches any character with the requested
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
200 ;; syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
201
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
202 ;; - (not-syntax CODE)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
203 ;; Stands for the regex "\\SCODE", where CODE is a syntax table code
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
204 ;; (a single character). Matches any character without the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
205 ;; requested syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
206
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
207 ;; - (regex REGEX)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
208 ;; This is a "trapdoor" for including ordinary regular expression
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
209 ;; strings in the result. Some regular expressions are clearer when
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
210 ;; written the old way: "[a-z]" vs. (sregexq (char (?a . ?z))), for
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
211 ;; instance. However, see the note under "Bugs," below.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
212
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
213 ;; Each CHAR-CLAUSE that is passed to (char ...) and (not-char ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
214 ;; has one of the following forms:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
215
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
216 ;; - a character
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
217 ;; Adds that character to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
218
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
219 ;; - a string
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
220 ;; Adds all the characters in the string to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
221
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
222 ;; - A pair (MIN . MAX)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
223 ;; Where MIN and MAX are characters, adds the range of characters
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
224 ;; from MIN through MAX to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
225
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
226 ;;; To do:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
227
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
228 ;; An earlier version of this package could optionally translate the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
229 ;; symbolic regex into other languages' syntaxes, e.g. Perl. For
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
230 ;; instance, with Perl syntax selected, (sregexq (or "ab" "cd")) would
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
231 ;; yield "ab|cd" instead of "ab\\|cd". It might be useful to restore
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
232 ;; such a facility.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
233
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
234 ;; - handle multibyte chars in sregex--char-aux
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
235 ;; - add support for character classes ([:blank:], ...)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
236 ;; - add support for non-greedy operators *? and +?
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
237 ;; - bug: (sregexq (opt (opt ?a))) returns "a??" which is a non-greedy "a?"
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
238
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
239 ;;; Bugs:
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
240
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
241 ;;; Code:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
242
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
243 (eval-when-compile (require 'cl))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
244
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
245 ;; Compatibility code for when we didn't have shy-groups
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
246 (defvar sregex--current-sregex nil)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
247 (defun sregex-info () nil)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
248 (defmacro sregex-save-match-data (&rest forms) (cons 'save-match-data forms))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
249 (defun sregex-replace-match (r &optional f l str subexp x)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
250 (replace-match r f l str subexp))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
251 (defun sregex-match-string (c &optional i x) (match-string c i))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
252 (defun sregex-match-string-no-properties (count &optional in-string sregex)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
253 (match-string-no-properties count in-string))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
254 (defun sregex-match-beginning (count &optional sregex) (match-beginning count))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
255 (defun sregex-match-end (count &optional sregex) (match-end count))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
256 (defun sregex-match-data (&optional sregex) (match-data))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
257 (defun sregex-backref-num (n &optional sregex) n)
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
258
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
259
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
260 (defun sregex (&rest exps)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
261 "Symbolic regular expression interpreter.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
262 This is exactly like `sregexq' (q.v.) except that it evaluates all its
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
263 arguments, so literal sregex clauses must be quoted. For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
264
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
265 (sregex '(or \"Bob\" \"Robert\")) => \"Bob\\\\|Robert\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
266
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
267 An argument-evaluating sregex interpreter lets you reuse sregex
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
268 subexpressions:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
269
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
270 (let ((dotstar '(0+ any))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
271 (whitespace '(1+ (syntax ?-)))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
272 (digits '(1+ (char (?0 . ?9)))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
273 (sregex 'bol dotstar \":\" whitespace digits)) => \"^.*:\\\\s-+[0-9]+\""
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
274 (sregex--sequence exps nil))
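;; A quick check of the string shown in the docstring above.  This sketch
;; assumes only built-in functions; the input line "total:   42 items" is a
;; made-up sample, and the standard syntax table is selected explicitly
;; because "\\s-" depends on the syntax table in effect.

```elisp
(with-syntax-table (standard-syntax-table)
  (let ((re "^.*:\\s-+[0-9]+")          ; the string the docstring documents
        (line "total:   42 items"))     ; hypothetical input
    (and (string-match re line)
         (match-string 0 line))))
;; => "total:   42"
```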
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
275
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
276 (defmacro sregexq (&rest exps)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
277 "Symbolic regular expression interpreter.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
278 This macro allows you to specify a regular expression (regexp) in
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
279 symbolic form, and converts it into the string form required by Emacs's
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
280 regex functions such as `re-search-forward' and `looking-at'. Here is
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
281 a simple example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
282
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
283 (sregexq (or \"Bob\" \"Robert\")) => \"Bob\\\\|Robert\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
284
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
285 As you can see, an sregex is specified by placing one or more special
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
286 clauses in a call to `sregexq'. The clause in this case is the `or'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
287 of two strings (not to be confused with the Lisp function `or'). The
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
288 list of allowable clauses appears below.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
289
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
290 With `sregex', it is never necessary to \"escape\" magic characters
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
291 that are meant to be taken literally; that happens automatically.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
292 For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
293
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
294 (sregexq \"M*A*S*H\") => \"M\\\\*A\\\\*S\\\\*H\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
295
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
296 It is also unnecessary to \"group\" parts of the expression together
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
297 to overcome operator precedence; that also happens automatically.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
298 For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
299
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
300 (sregexq (opt (or \"Bob\" \"Robert\"))) => \"\\\\(Bob\\\\|Robert\\\\)?\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
301
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
302 It *is* possible to group parts of the expression in order to refer
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
303 to them with numbered backreferences:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
304
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
305 (sregexq (group (or \"Go\" \"Run\"))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
306 \", Spot, \"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
307 (backref 1)) => \"\\\\(Go\\\\|Run\\\\), Spot, \\\\1\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
308
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
309 If `sregexq' needs to introduce its own grouping parentheses, it will
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
310 automatically renumber your backreferences:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
311
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
312 (sregexq (opt \"resent-\")
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
313 (group (or \"to\" \"cc\" \"bcc\"))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
314 \": \"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
315 (backref 1)) => \"\\\\(resent-\\\\)?\\\\(to\\\\|cc\\\\|bcc\\\\): \\\\2\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
316
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
317 `sregexq' is a macro. Each time it is used, it constructs a simple
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
318 Lisp expression that then invokes a moderately complex engine to
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
319 interpret the sregex and render the string form. Because of this, I
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
320 don't recommend sprinkling calls to `sregexq' throughout your code,
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
321 the way one normally does with string regexes (which are cheap to
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
322 evaluate). Instead, it's wiser to precompute the regexes you need
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
323 wherever possible instead of repeatedly constructing the same ones
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
324 over and over. Example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
325
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
326 (let ((field-regex (sregexq (opt \"resent-\")
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
327 (or \"to\" \"cc\" \"bcc\"))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
328 ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
329 (while ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
330 ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
331 (re-search-forward field-regex ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
332 ...))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
333
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
334 The arguments to `sregexq' are automatically quoted, but the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
335 flip side of this is that it is not straightforward to include
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
336 computed (i.e., non-constant) values in `sregexq' expressions. So
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
337 `sregex' is a function that is like `sregexq' but which does not
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
338 automatically quote its values. Literal sregex clauses must be
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
339 explicitly quoted like so:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
340
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
341 (sregex '(or \"Bob\" \"Robert\")) => \"Bob\\\\|Robert\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
342
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
343 but computed clauses can be included easily, allowing for the reuse
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
344 of common clauses:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
345
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
346 (let ((dotstar '(0+ any))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
347 (whitespace '(1+ (syntax ?-)))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
348 (digits '(1+ (char (?0 . ?9)))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
349 (sregex 'bol dotstar \":\" whitespace digits)) => \"^.*:\\\\s-+[0-9]+\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
350
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
351 Here are the clauses allowed in an `sregex' or `sregexq' expression:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
352
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
353 - a string
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
354 This stands for the literal string. If it contains
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
355 metacharacters, they will be escaped in the resulting regex
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
356 (using `regexp-quote').
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
357
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
358 - the symbol `any'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
359 This stands for \".\", a regex matching any character except
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
360 newline.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
361
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
362 - the symbol `bol'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
363 Stands for \"^\", matching the empty string at the beginning of a line.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
364
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
365 - the symbol `eol'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
366 Stands for \"$\", matching the empty string at the end of a line.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
367
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
368 - (group CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
369 Groups the given CLAUSEs using \"\\\\(\" and \"\\\\)\".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
370
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
371 - (sequence CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
372
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
373 Groups the given CLAUSEs; may or may not use \"\\\\(\" and \"\\\\)\".
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
374 Clauses grouped by `sequence' do not count for purposes of
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
375 numbering backreferences. Use `sequence' in situations like
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
376 this:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
377
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
378 (sregexq (or \"dog\" \"cat\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
379 (sequence (opt \"sea \") \"monkey\")))
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
380 => \"dog\\\\|cat\\\\|\\\\(?:sea \\\\)?monkey\"
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
381
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
382 where a single `or' alternate needs to contain multiple
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
383 subclauses.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
384
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
385 - (backref N)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
386 Matches the same string previously matched by the Nth \"group\" in
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
387 the same sregex. N is a positive integer.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
388
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
389 - (or CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
390 Matches any one of the CLAUSEs by separating them with \"\\\\|\".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
391
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
392 - (0+ CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
393 Concatenates the given CLAUSEs and matches zero or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
394 occurrences by appending \"*\".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
395
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
396 - (1+ CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
397 Concatenates the given CLAUSEs and matches one or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
398 occurrences by appending \"+\".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
399
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
400 - (opt CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
401 Concatenates the given CLAUSEs and matches zero or one occurrence
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
402 by appending \"?\".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
403
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
404 - (repeat MIN MAX CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
405 Concatenates the given CLAUSEs and constructs a regex matching at
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
406 least MIN occurrences and at most MAX occurrences. MIN must be a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
407 non-negative integer. MAX must be a non-negative integer greater
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
408 than or equal to MIN; or MAX can be nil to mean \"infinity.\"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
409
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
410 - (char CHAR-CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
411 Creates a \"character class\" matching one character from the given
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
412 set. See below for how to construct a CHAR-CLAUSE.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
413
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
414 - (not-char CHAR-CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
415 Creates a \"character class\" matching any one character not in the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
416 given set. See below for how to construct a CHAR-CLAUSE.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
417
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
418 - the symbol `bot'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
419 Stands for \"\\\\`\", matching the empty string at the beginning of
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
420 text (beginning of a string or of a buffer).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
421
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
422 - the symbol `eot'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
423 Stands for \"\\\\'\", matching the empty string at the end of text.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
424
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
425 - the symbol `point'
76829
0ca7455cc45c (sregexq): Doc fix.
Eli Zaretskii <eliz@gnu.org>
parents: 75346
diff changeset
426 Stands for \"\\\\=\\=\", matching the empty string at point.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
427
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
428 - the symbol `word-boundary'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
429 Stands for \"\\\\b\", matching the empty string at the beginning or
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
430 end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
431
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
432 - the symbol `not-word-boundary'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
433 Stands for \"\\\\B\", matching the empty string not at the beginning
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
434 or end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
435
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
436 - the symbol `bow'
77520
8dd3e56c2212 (sregexq): Fix doc string quoting.
Andreas Schwab <schwab@suse.de>
parents: 76829
diff changeset
437 Stands for \"\\\\=\\<\", matching the empty string at the beginning of a
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
438 word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
439
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
440 - the symbol `eow'
77520
8dd3e56c2212 (sregexq): Fix doc string quoting.
Andreas Schwab <schwab@suse.de>
parents: 76829
diff changeset
441 Stands for \"\\\\=\\>\", matching the empty string at the end of a word.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
442
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
443 - the symbol `wordchar'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
444 Stands for the regex \"\\\\w\", matching a word-constituent character
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
445 (as determined by the current syntax table).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
446
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
447 - the symbol `not-wordchar'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
448 Stands for the regex \"\\\\W\", matching a non-word-constituent
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
449 character.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
450
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
451 - (syntax CODE)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
452 Stands for the regex \"\\\\sCODE\", where CODE is a syntax table code
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
453 (a single character). Matches any character with the requested
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
454 syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
455
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
456 - (not-syntax CODE)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
457 Stands for the regex \"\\\\SCODE\", where CODE is a syntax table code
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
458 (a single character). Matches any character without the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
459 requested syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
460
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
461 - (regex REGEX)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
462 This is a \"trapdoor\" for including ordinary regular expression
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
463 strings in the result. Some regular expressions are clearer when
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
464 written the old way: \"[a-z]\" vs. (sregexq (char (?a . ?z))), for
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
465 instance.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
466
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
467 Each CHAR-CLAUSE that is passed to (char ...) and (not-char ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
468 has one of the following forms:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
469
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
470 - a character
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
471 Adds that character to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
472
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
473 - a string
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
474 Adds all the characters in the string to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
475
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
476 - a pair (MIN . MAX)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
477 Where MIN and MAX are characters, adds the range of characters
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
478 from MIN through MAX to the set."
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
479 `(apply 'sregex ',exps))
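;; Two behaviours described in the docstring above can be tried with
;; built-in functions alone.  This is a sketch, not part of sregex: it
;; shows the escaping that `regexp-quote' performs on a literal string, and
;; how the documented backreference string matches.  The test strings
;; "Go, Spot, Go" and "Go, Spot, Run" are illustrative inputs.

```elisp
(list
 ;; Metacharacters in a literal string are escaped.
 (regexp-quote "M*A*S*H")
 ;; "\\1" must repeat whatever the first group matched.
 (let ((re "\\(Go\\|Run\\), Spot, \\1"))
   (list (string-match re "Go, Spot, Go")
         (string-match re "Go, Spot, Run"))))
;; => ("M\\*A\\*S\\*H" (0 nil))
```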
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
480
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
481 (defun sregex--engine (exp combine)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
482 (cond
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
483 ((stringp exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
484 (if (and combine
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
485 (eq combine 'suffix)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
486 (/= (length exp) 1))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
487 (concat "\\(?:" (regexp-quote exp) "\\)")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
488 (regexp-quote exp)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
489 ((symbolp exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
490 (ecase exp
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
491 (any ".")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
492 (bol "^")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
493 (eol "$")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
494 (wordchar "\\w")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
495 (not-wordchar "\\W")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
496 (bot "\\`")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
497 (eot "\\'")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
498 (point "\\=")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
499 (word-boundary "\\b")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
500 (not-word-boundary "\\B")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
501 (bow "\\<")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
502 (eow "\\>")))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
503 ((consp exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
504 (funcall (intern (concat "sregex--"
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
505 (symbol-name (car exp))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
506 (cdr exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
507 combine))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
508 (t (error "Invalid expression: %s" exp))))
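The string branch of the engine quotes literal text and wraps it in a shy group only when a suffix operator will follow a multi-character string. The sketch below reproduces just that branch under a hypothetical name, `my-quote-literal`, so it can be tried without loading sregex.el.

```elisp
;; Hypothetical stand-alone version of the string branch above.
;; SUFFIX-FOLLOWS plays the role of (eq combine 'suffix).
(defun my-quote-literal (str suffix-follows)
  (let ((quoted (regexp-quote str)))
    (if (and suffix-follows (/= (length str) 1))
        (concat "\\(?:" quoted "\\)")
      quoted)))

(my-quote-literal "a.b" nil)  ;; => "a\\.b"
(my-quote-literal "ab" t)     ;; => "\\(?:ab\\)"
(my-quote-literal "a" t)      ;; => "a"

;; With the shy group, "*" applies to the whole literal:
(string-match-p (concat "\\`" (my-quote-literal "ab" t) "*\\'") "abab") ;; => 0
;; Without it, "*" would bind to "b" only:
(string-match-p "\\`ab*\\'" "abab")                                     ;; => nil
```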
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
509
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
510 (defun sregex--sequence (exps combine)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
511 (if (= (length exps) 1) (sregex--engine (car exps) combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
512 (let ((re (mapconcat
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
513 (lambda (e) (sregex--engine e 'concat))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
514 exps "")))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
515 (if (eq combine 'suffix)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
516 (concat "\\(?:" re "\\)")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
517 re))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
518
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
519 (defun sregex--or (exps combine)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
520 (if (= (length exps) 1) (sregex--engine (car exps) combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
521 (let ((re (mapconcat
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
522 (lambda (e) (sregex--engine e 'or))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
523 exps "\\|")))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
524 (if (not (eq combine 'or))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
525 (concat "\\(?:" re "\\)")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
526 re))))
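The reason `sregex--or` wraps its result in a shy group unless it is already inside another alternation can be seen with plain regexps. This sketch uses only built-in functions; the example strings are arbitrary.

```elisp
;; "\\|" has the lowest precedence, so a bare alternation swallows
;; whatever is concatenated after it.
(let ((bare    "foo\\|bar")
      (grouped "\\(?:foo\\|bar\\)"))
  (list
   ;; Bare: parsed as "\`foo" OR "barbaz\'", so "foo" alone matches.
   (string-match-p (concat "\\`" bare "baz\\'") "foo")
   ;; Grouped: "baz" is required after either alternative.
   (string-match-p (concat "\\`" grouped "baz\\'") "foo")
   (string-match-p (concat "\\`" grouped "baz\\'") "barbaz")))
;; => (0 nil 0)
```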
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
527
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
528 (defun sregex--group (exps combine) (concat "\\(" (sregex--sequence exps nil) "\\)"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
529
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
530 (defun sregex--backref (exps combine) (concat "\\" (int-to-string (car exps))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
531 (defun sregex--opt (exps combine) (concat (sregex--sequence exps 'suffix) "?"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
532 (defun sregex--0+ (exps combine) (concat (sregex--sequence exps 'suffix) "*"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
533 (defun sregex--1+ (exps combine) (concat (sregex--sequence exps 'suffix) "+"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
534
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
535 (defun sregex--char (exps combine) (sregex--char-aux nil exps))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
536 (defun sregex--not-char (exps combine) (sregex--char-aux t exps))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
537
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
538 (defun sregex--syntax (exps combine) (format "\\s%c" (car exps)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
539 (defun sregex--not-syntax (exps combine) (format "\\S%c" (car exps)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
540
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
541 (defun sregex--regex (exps combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
542 (if combine (concat "\\(?:" (car exps) "\\)") (car exps)))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
543
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
544 (defun sregex--repeat (exps combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
545 (let* ((min (or (pop exps) 0))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
546 (minstr (number-to-string min))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
547 (max (pop exps)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
548 (concat (sregex--sequence exps 'suffix)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
549 (concat "\\{" minstr ","
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
550 (when max (number-to-string max)) "\\}"))))
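The interval suffix built by `sregex--repeat` can be tried in isolation. `my-interval` below is a hypothetical helper that mirrors only the `\{MIN,MAX\}` construction, with a missing MIN treated as 0 and a missing MAX left open.

```elisp
;; Hypothetical helper (not in sregex.el): build the "\{MIN,MAX\}" suffix.
(defun my-interval (min max)
  (concat "\\{" (number-to-string (or min 0)) ","
          (when max (number-to-string max))   ; nil concatenates as ""
          "\\}"))

(my-interval 2 4)    ;; => "\\{2,4\\}"
(my-interval 3 nil)  ;; => "\\{3,\\}"

(string-match-p (concat "\\`a" (my-interval 2 4) "\\'") "aaa")    ;; => 0
(string-match-p (concat "\\`a" (my-interval 2 4) "\\'") "aaaaa")  ;; => nil
```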
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
551
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
552 (defun sregex--char-range (start end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
553 (let ((startc (char-to-string start))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
554 (endc (char-to-string end)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
555 (cond
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
556 ((> end (+ start 2)) (concat startc "-" endc))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
557 ((> end (+ start 1)) (concat startc (char-to-string (1+ start)) endc))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
558 ((> end start) (concat startc endc))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
559 (t startc))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
560
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
561 (defun sregex--char-aux (complement args)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
562 ;; regexp-opt does the same; we should join efforts.
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
563 (let ((chars (make-bool-vector 256 nil))) ; Yeah, right!  Only covers chars 0-255.
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
564 (dolist (arg args)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
565 (cond ((integerp arg) (aset chars arg t))
84904
e2e245301b8c (sregex--char-aux): Use `mapc' rather than `mapcar'.
Juanma Barranquero <lekktu@gmail.com>
parents: 78217
diff changeset
566 ((stringp arg) (mapc (lambda (c) (aset chars c t)) arg))
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
567 ((consp arg)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
568 (let ((start (car arg))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
569 (end (cdr arg)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
570 (when (> start end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
571 (let ((tmp start)) (setq start end) (setq end tmp)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
572 ;; now start <= end
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
573 (let ((i start))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
574 (while (<= i end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
575 (aset chars i t)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
576 (setq i (1+ i))))))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
577 ;; now chars is a map of the characters in the class
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
578 (let ((caret (aref chars ?^))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
579 (dash (aref chars ?-))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
580 (class (if (aref chars ?\]) "]" "")))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
581 (aset chars ?^ nil)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
582 (aset chars ?- nil)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
583 (aset chars ?\] nil)
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
584
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
585 (let (start end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
586 (dotimes (i 256)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
587 (if (aref chars i)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
588 (progn
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
589 (unless start (setq start i))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
590 (setq end i)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
591 (aset chars i nil))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
592 (when start
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
593 (setq class (concat class (sregex--char-range start end)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
594 (setq start nil))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
595 (if start
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
596 (setq class (concat class (sregex--char-range start end)))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
597
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
598 (if (> (length class) 0)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
599 (setq class (concat class (if caret "^") (if dash "-")))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
600 (setq class (concat class (if dash "-") (if caret "^"))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
601 (if (and (not complement) (= (length class) 1))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
602 (regexp-quote class)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
603 (concat "[" (if complement "^") class "]")))))
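`sregex--char-aux` pulls `]`, `^` and `-` out of the map and re-adds them in fixed positions because a bracket expression treats them specially: `]` is literal only when first, `^` only when not first, and `-` when last. The sketch below checks that placement with built-in matching, and also shows the consequence of the 256-entry bool-vector. The sample strings and the code point 955 are arbitrary choices for illustration.

```elisp
;; "]" first, "^" not first, "-" last: all three are literal members.
(mapcar (lambda (s) (string-match-p "[]a^-]" s))
        '("]" "^" "-" "a" "b"))
;; => (0 0 0 0 nil)

;; The same layout after a leading "^" gives the complemented set.
(list (string-match-p "[^]a^-]" "b")
      (string-match-p "[^]a^-]" "]"))
;; => (0 nil)

;; A 256-entry bool-vector cannot record characters above 255.
(condition-case nil
    (aset (make-bool-vector 256 nil) 955 t)
  (args-out-of-range 'out-of-range))
;; => out-of-range
```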
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
604
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
605 (provide 'sregex)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
606
93975
1e3a407766b9 Fix up comment convention on the arch-tag lines.
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 87649
diff changeset
607 ;; arch-tag: 460c1f5a-eb6e-42ec-a451-ffac78bdf492
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
608 ;;; sregex.el ends here