annotate lisp/emacs-lisp/sregex.el @ 90203:187d6a1f84f7

Revision: miles@gnu.org--gnu-2005/emacs--unicode--0--patch-71 Merge from emacs--cvs-trunk--0 Patches applied: * emacs--cvs-trunk--0 (patch 485-492) - Update from CVS - Merge from gnus--rel--5.10 * gnus--rel--5.10 (patch 92-94) - Merge from emacs--cvs-trunk--0 - Update from CVS
author Miles Bader <miles@gnu.org>
date Fri, 22 Jul 2005 08:27:27 +0000
parents f9a65d7ebd29
children 2d92f5c9d6ae
rev   line source
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
1 ;;; sregex.el --- symbolic regular expressions
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
2
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
3 ;; Copyright (C) 1997, 1998, 2000 Free Software Foundation, Inc.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
4
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
5 ;; Author: Bob Glickstein <bobg+sregex@zanshin.com>
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
6 ;; Maintainer: Bob Glickstein <bobg+sregex@zanshin.com>
29212
f35b1d67aa8f Add finder keywords.
Dave Love <fx@gnu.org>
parents: 29069
diff changeset
7 ;; Keywords: extensions
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
8
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
9 ;; This file is part of GNU Emacs.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
10
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
11 ;; GNU Emacs is free software; you can redistribute it and/or modify
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
12 ;; it under the terms of the GNU General Public License as published by
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
13 ;; the Free Software Foundation; either version 2, or (at your option)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
14 ;; any later version.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
15
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
16 ;; GNU Emacs is distributed in the hope that it will be useful,
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
17 ;; but WITHOUT ANY WARRANTY; without even the implied warranty of
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
18 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
19 ;; GNU General Public License for more details.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
20
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
21 ;; You should have received a copy of the GNU General Public License
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
22 ;; along with GNU Emacs; see the file COPYING. If not, write to the
64085
18a818a2ee7c Update FSF's address.
Lute Kamstra <lute@gnu.org>
parents: 52401
diff changeset
23 ;; Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
18a818a2ee7c Update FSF's address.
Lute Kamstra <lute@gnu.org>
parents: 52401
diff changeset
24 ;; Boston, MA 02110-1301, USA.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
25
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
26 ;;; Commentary:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
27
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
28 ;; This package allows you to write regular expressions using a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
29 ;; totally new, Lisp-like syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
30
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
31 ;; A "symbolic regular expression" (sregex for short) is a Lisp form
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
32 ;; that, when evaluated, produces the string form of the specified
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
33 ;; regular expression. Here's a simple example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
34
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
35 ;; (sregexq (or "Bob" "Robert")) => "Bob\\|Robert"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
36
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
37 ;; As you can see, an sregex is specified by placing one or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
38 ;; special clauses in a call to `sregexq'. The clause in this case is
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
39 ;; the `or' of two strings (not to be confused with the Lisp function
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
40 ;; `or'). The list of allowable clauses appears below.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
41
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
42 ;; With sregex, it is never necessary to "escape" magic characters
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
43 ;; that are meant to be taken literally; that happens automatically.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
44 ;; For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
45
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
46 ;; (sregexq "M*A*S*H") => "M\\*A\\*S\\*H"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
47
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
48 ;; It is also unnecessary to "group" parts of the expression together
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
49 ;; to overcome operator precedence; that also happens automatically.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
50 ;; For example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
51
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
52 ;; (sregexq (opt (or "Bob" "Robert"))) => "\\(?:Bob\\|Robert\\)?"
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
53
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
54 ;; It *is* possible to group parts of the expression in order to refer
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
55 ;; to them with numbered backreferences:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
56
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
57 ;; (sregexq (group (or "Go" "Run"))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
58 ;;          ", Spot, "
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
59 ;;          (backref 1)) => "\\(Go\\|Run\\), Spot, \\1"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
60
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
61 ;; `sregexq' is a macro. Each time it is used, it constructs a simple
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
62 ;; Lisp expression that then invokes a moderately complex engine to
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
63 ;; interpret the sregex and render the string form. Because of this,
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
64 ;; I don't recommend sprinkling calls to `sregexq' throughout your
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
65 ;; code, the way one normally does with string regexes (which are
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
66 ;; cheap to evaluate). Instead, it's wiser to precompute the regexes
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
67 ;; you need wherever possible instead of repeatedly constructing the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
68 ;; same ones over and over. Example:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
69
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
70 ;; (let ((field-regex (sregexq (opt "resent-")
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
71 ;;                             (or "to" "cc" "bcc"))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
72 ;;   ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
73 ;;   (while ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
74 ;;     ...
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
75 ;;     (re-search-forward field-regex ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
76 ;;     ...))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
77
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
78 ;; The arguments to `sregexq' are automatically quoted, but the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
79 ;; flip side of this is that it is not straightforward to include
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
80 ;; computed (i.e., non-constant) values in `sregexq' expressions. So
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
81 ;; `sregex' is a function that is like `sregexq' but which does not
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
82 ;; automatically quote its values. Literal sregex clauses must be
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
83 ;; explicitly quoted like so:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
84
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
85 ;; (sregex '(or "Bob" "Robert")) => "Bob\\|Robert"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
86
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
87 ;; but computed clauses can be included easily, allowing for the reuse
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
88 ;; of common clauses:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
89
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
90 ;; (let ((dotstar '(0+ any))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
91 ;;       (whitespace '(1+ (syntax ?-)))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
92 ;;       (digits '(1+ (char (?0 . ?9)))))
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
93 ;;   (sregex 'bol dotstar ":" whitespace digits)) => "^.*:\\s-+[0-9]+"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
94
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
95 ;; To use this package in a Lisp program, simply (require 'sregex).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
96
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
97 ;; Here are the clauses allowed in an `sregex' or `sregexq'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
98 ;; expression:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
99
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
100 ;; - a string
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
101 ;; This stands for the literal string. If it contains
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
102 ;; metacharacters, they will be escaped in the resulting regex
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
103 ;; (using `regexp-quote').
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
104
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
105 ;; - the symbol `any'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
106 ;; This stands for ".", a regex matching any character except
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
107 ;; newline.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
108
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
109 ;; - the symbol `bol'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
110 ;; Stands for "^", matching the empty string at the beginning of a line.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
111
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
112 ;; - the symbol `eol'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
113 ;; Stands for "$", matching the empty string at the end of a line.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
114
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
115 ;; - (group CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
116 ;; Groups the given CLAUSEs using "\\(" and "\\)".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
117
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
118 ;; - (sequence CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
119
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
120 ;; Groups the given CLAUSEs; may or may not use "\\(?:" and "\\)".
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
121 ;; Clauses grouped by `sequence' do not count for purposes of
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
122 ;; numbering backreferences. Use `sequence' in situations like
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
123 ;; this:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
124
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
125 ;; (sregexq (or "dog" "cat"
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
126 ;;              (sequence (opt "sea ") "monkey")))
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
127 ;; => "dog\\|cat\\|\\(?:sea \\)?monkey"
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
128
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
129 ;; where a single `or' alternate needs to contain multiple
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
130 ;; subclauses.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
131
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
132 ;; - (backref N)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
133 ;; Matches the same string previously matched by the Nth "group" in
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
134 ;; the same sregex. N is a positive integer.
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
135
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
136 ;; - (or CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
137 ;; Matches any one of the CLAUSEs by separating them with "\\|".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
138
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
139 ;; - (0+ CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
140 ;; Concatenates the given CLAUSEs and matches zero or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
141 ;; occurrences by appending "*".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
142
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
143 ;; - (1+ CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
144 ;; Concatenates the given CLAUSEs and matches one or more
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
145 ;; occurrences by appending "+".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
146
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
147 ;; - (opt CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
148 ;; Concatenates the given CLAUSEs and matches zero or one occurrence
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
149 ;; by appending "?".
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
150
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
151 ;; - (repeat MIN MAX CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
152 ;; Concatenates the given CLAUSEs and constructs a regex matching at
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
153 ;; least MIN occurrences and at most MAX occurrences. MIN must be a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
154 ;; non-negative integer. MAX must be a non-negative integer greater
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
155 ;; than or equal to MIN; or MAX can be nil to mean "infinity."
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
156
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
157 ;; - (char CHAR-CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
158 ;; Creates a "character class" matching one character from the given
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
159 ;; set. See below for how to construct a CHAR-CLAUSE.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
160
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
161 ;; - (not-char CHAR-CLAUSE ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
162 ;; Creates a "character class" matching any one character not in the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
163 ;; given set. See below for how to construct a CHAR-CLAUSE.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
164
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
165 ;; - the symbol `bot'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
166 ;; Stands for "\\`", matching the empty string at the beginning of
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
167 ;; text (beginning of a string or of a buffer).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
168
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
169 ;; - the symbol `eot'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
170 ;; Stands for "\\'", matching the empty string at the end of text.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
171
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
172 ;; - the symbol `point'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
173 ;; Stands for "\\=", matching the empty string at point.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
174
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
175 ;; - the symbol `word-boundary'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
176 ;; Stands for "\\b", matching the empty string at the beginning or
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
177 ;; end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
178
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
179 ;; - the symbol `not-word-boundary'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
180 ;; Stands for "\\B", matching the empty string not at the beginning
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
181 ;; or end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
182
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
183 ;; - the symbol `bow'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
184 ;; Stands for "\\<", matching the empty string at the beginning of a
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
185 ;; word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
186
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
187 ;; - the symbol `eow'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
188 ;; Stands for "\\>", matching the empty string at the end of a word.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
189
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
190 ;; - the symbol `wordchar'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
191 ;; Stands for the regex "\\w", matching a word-constituent character
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
192 ;; (as determined by the current syntax table).
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
193
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
194 ;; - the symbol `not-wordchar'
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
195 ;; Stands for the regex "\\W", matching a non-word-constituent
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
196 ;; character.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
197
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
198 ;; - (syntax CODE)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
199 ;; Stands for the regex "\\sCODE", where CODE is a syntax table code
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
200 ;; (a single character). Matches any character with the requested
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
201 ;; syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
202
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
203 ;; - (not-syntax CODE)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
204 ;; Stands for the regex "\\SCODE", where CODE is a syntax table code
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
205 ;; (a single character). Matches any character without the
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
206 ;; requested syntax.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
207
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
208 ;; - (regex REGEX)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
209 ;; This is a "trapdoor" for including ordinary regular expression
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
210 ;; strings in the result. Some regular expressions are clearer when
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
211 ;; written the old way: "[a-z]" vs. (sregexq (char (?a . ?z))), for
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
212 ;; instance. However, see the note under "Bugs," below.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
213
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
214 ;; Each CHAR-CLAUSE that is passed to (char ...) and (not-char ...)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
215 ;; has one of the following forms:
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
216
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
217 ;; - a character
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
218 ;; Adds that character to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
219
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
220 ;; - a string
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
221 ;; Adds all the characters in the string to the set.
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
222
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
223 ;; - a pair (MIN . MAX)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
224 ;; Where MIN and MAX are characters, adds the range of characters
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
225 ;; from MIN through MAX to the set.

;;; To do:

;; An earlier version of this package could optionally translate the
;; symbolic regex into other languages' syntaxes, e.g. Perl. For
;; instance, with Perl syntax selected, (sregexq (or "ab" "cd")) would
;; yield "ab|cd" instead of "ab\\|cd". It might be useful to restore
;; such a facility.

;; - handle multibyte chars in sregex--char-aux
;; - add support for character classes ([:blank:], ...)
;; - add support for non-greedy operators *? and +?
;; - bug: (sregexq (opt (opt ?a))) returns "a??" which is a non-greedy "a?"

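;; The bug in the last to-do item can be observed without sregex at all.
;; The sketch below uses only the built-in `string-match' and `match-end'
;; to show why a doubled "?" matters: "a??" is the non-greedy form of
;; "a?", so it prefers the empty match.  The regex strings are written by
;; hand here rather than produced by `sregexq', and the function name is
;; a hypothetical one chosen for this illustration.

```elisp
;; `sregex-demo-optional-ends' is hypothetical; it is not part of sregex.el.
(defun sregex-demo-optional-ends ()
  "Return where \"a?\" and \"a??\" stop when matched against \"a\"."
  (let ((case-fold-search nil))
    (list
     ;; Greedy "a?" consumes the "a": the match ends at 1.
     (progn (string-match "a?" "a") (match-end 0))
     ;; Non-greedy "a??" prefers zero occurrences: the match ends at 0.
     (progn (string-match "a??" "a") (match-end 0)))))

(sregex-demo-optional-ends)             ; => (1 0)
```

;; A fix would presumably have to wrap the inner clause in a shy group
;; before appending the second "?"; that is a guess, not the author's plan.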
;;; Bugs:

;;; Code:

(eval-when-compile (require 'cl))

;; Compatibility code for when we didn't have shy-groups
(defvar sregex--current-sregex nil)
(defun sregex-info () nil)
(defmacro sregex-save-match-data (&rest forms) (cons 'save-match-data forms))
(defun sregex-replace-match (r &optional f l str subexp x)
  (replace-match r f l str subexp))
(defun sregex-match-string (c &optional i x) (match-string c i))
(defun sregex-match-string-no-properties (count &optional in-string sregex)
  (match-string-no-properties count in-string))
(defun sregex-match-beginning (count &optional sregex) (match-beginning count))
(defun sregex-match-end (count &optional sregex) (match-end count))
(defun sregex-match-data (&optional sregex) (match-data))
(defun sregex-backref-num (n &optional sregex) n)


(defun sregex (&rest exps)
  "Symbolic regular expression interpreter.
This is exactly like `sregexq' (q.v.) except that it evaluates all its
arguments, so literal sregex clauses must be quoted. For example:

  (sregex '(or \"Bob\" \"Robert\")) => \"Bob\\\\|Robert\"

An argument-evaluating sregex interpreter lets you reuse sregex
subexpressions:

  (let ((dotstar '(0+ any))
        (whitespace '(1+ (syntax ?-)))
        (digits '(1+ (char (?0 . ?9)))))
    (sregex 'bol dotstar \":\" whitespace digits)) => \"^.*:\\\\s-+[0-9]+\""
  (sregex--sequence exps nil))

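;; The value documented for the last `sregex' example is an ordinary
;; Emacs regexp string, so it can be handed straight to the built-in
;; matching functions.  The sketch below copies that string from the
;; docstring (it does not call `sregex') and applies it with
;; `string-match'.  The helper name is hypothetical, and the standard
;; syntax table is selected so that "\\s-" means the usual whitespace.

```elisp
;; `sregex-demo-numbered-field' is hypothetical; it is not part of sregex.el.
(defun sregex-demo-numbered-field (line)
  "Return the part of LINE matched by \"^.*:\\\\s-+[0-9]+\", or nil."
  (with-syntax-table (standard-syntax-table)
    (let ((case-fold-search nil))
      (when (string-match "^.*:\\s-+[0-9]+" line)
        (match-string 0 line)))))

(sregex-demo-numbered-field "Total: 42 items")  ; => "Total: 42"
(sregex-demo-numbered-field "Total:42")         ; => nil, no whitespace after ":"
```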
(defmacro sregexq (&rest exps)
  "Symbolic regular expression interpreter.
This macro allows you to specify a regular expression (regexp) in
symbolic form, and converts it into the string form required by Emacs's
regex functions such as `re-search-forward' and `looking-at'. Here is
a simple example:

  (sregexq (or \"Bob\" \"Robert\")) => \"Bob\\\\|Robert\"

As you can see, an sregex is specified by placing one or more special
clauses in a call to `sregexq'. The clause in this case is the `or'
of two strings (not to be confused with the Lisp function `or'). The
list of allowable clauses appears below.

With `sregex', it is never necessary to \"escape\" magic characters
that are meant to be taken literally; that happens automatically.
For example:

  (sregexq \"M*A*S*H\") => \"M\\\\*A\\\\*S\\\\*H\"

It is also unnecessary to \"group\" parts of the expression together
to overcome operator precedence; that also happens automatically.
For example:

  (sregexq (opt (or \"Bob\" \"Robert\"))) => \"\\\\(Bob\\\\|Robert\\\\)?\"

It *is* possible to group parts of the expression in order to refer
to them with numbered backreferences:

  (sregexq (group (or \"Go\" \"Run\"))
           \", Spot, \"
           (backref 1)) => \"\\\\(Go\\\\|Run\\\\), Spot, \\\\1\"

If `sregexq' needs to introduce its own grouping parentheses, it will
automatically renumber your backreferences:

  (sregexq (opt \"resent-\")
           (group (or \"to\" \"cc\" \"bcc\"))
           \": \"
           (backref 1)) => \"\\\\(resent-\\\\)?\\\\(to\\\\|cc\\\\|bcc\\\\): \\\\2\"

`sregexq' is a macro. Each time it is used, it constructs a simple
Lisp expression that then invokes a moderately complex engine to
interpret the sregex and render the string form. Because of this, I
don't recommend sprinkling calls to `sregexq' throughout your code,
the way one normally does with string regexes (which are cheap to
evaluate). It's wiser to precompute the regexes you need wherever
possible rather than constructing the same ones over and over.
Example:

  (let ((field-regex (sregexq (opt \"resent-\")
                              (or \"to\" \"cc\" \"bcc\"))))
    ...
    (while ...
      ...
      (re-search-forward field-regex ...)
      ...))

The arguments to `sregexq' are automatically quoted, but the
flip side of this is that it is not straightforward to include
computed (i.e., non-constant) values in `sregexq' expressions. So
`sregex' is a function that is like `sregexq' but which does not
automatically quote its values. Literal sregex clauses must be
explicitly quoted like so:

  (sregex '(or \"Bob\" \"Robert\")) => \"Bob\\\\|Robert\"

but computed clauses can be included easily, allowing for the reuse
of common clauses:

  (let ((dotstar '(0+ any))
        (whitespace '(1+ (syntax ?-)))
        (digits '(1+ (char (?0 . ?9)))))
    (sregex 'bol dotstar \":\" whitespace digits)) => \"^.*:\\\\s-+[0-9]+\"

Here are the clauses allowed in an `sregex' or `sregexq' expression:

- a string
    This stands for the literal string. If it contains
    metacharacters, they will be escaped in the resulting regex
    (using `regexp-quote').

- the symbol `any'
    This stands for \".\", a regex matching any character except
    newline.

- the symbol `bol'
    Stands for \"^\", matching the empty string at the beginning of a line.

- the symbol `eol'
    Stands for \"$\", matching the empty string at the end of a line.

- (group CLAUSE ...)
    Groups the given CLAUSEs using \"\\\\(\" and \"\\\\)\".

- (sequence CLAUSE ...)
    Groups the given CLAUSEs; may or may not use \"\\\\(\" and \"\\\\)\".
    Clauses grouped by `sequence' do not count for purposes of
    numbering backreferences. Use `sequence' in situations like
    this:

      (sregexq (or \"dog\" \"cat\"
                   (sequence (opt \"sea \") \"monkey\")))
        => \"dog\\\\|cat\\\\|\\\\(?:sea \\\\)?monkey\"

    where a single `or' alternative needs to contain multiple
    subclauses.

- (backref N)
    Matches the same string previously matched by the Nth \"group\" in
    the same sregex. N is a positive integer.

- (or CLAUSE ...)
    Matches any one of the CLAUSEs by separating them with \"\\\\|\".

- (0+ CLAUSE ...)
    Concatenates the given CLAUSEs and matches zero or more
    occurrences by appending \"*\".

- (1+ CLAUSE ...)
    Concatenates the given CLAUSEs and matches one or more
    occurrences by appending \"+\".

- (opt CLAUSE ...)
    Concatenates the given CLAUSEs and matches zero or one occurrence
    by appending \"?\".

- (repeat MIN MAX CLAUSE ...)
    Concatenates the given CLAUSEs and constructs a regex matching at
    least MIN occurrences and at most MAX occurrences. MIN must be a
    non-negative integer. MAX must be a non-negative integer greater
    than or equal to MIN; or MAX can be nil to mean \"infinity.\"

- (char CHAR-CLAUSE ...)
    Creates a \"character class\" matching one character from the given
    set. See below for how to construct a CHAR-CLAUSE.

- (not-char CHAR-CLAUSE ...)
    Creates a \"character class\" matching any one character not in the
    given set. See below for how to construct a CHAR-CLAUSE.

- the symbol `bot'
    Stands for \"\\\\`\", matching the empty string at the beginning of
    text (beginning of a string or of a buffer).

- the symbol `eot'
    Stands for \"\\\\'\", matching the empty string at the end of text.

- the symbol `point'
    Stands for \"\\\\=\", matching the empty string at point.

- the symbol `word-boundary'
    Stands for \"\\\\b\", matching the empty string at the beginning or
    end of a word.

- the symbol `not-word-boundary'
    Stands for \"\\\\B\", matching the empty string not at the beginning
    or end of a word.

- the symbol `bow'
    Stands for \"\\\\\\=<\", matching the empty string at the beginning of a
    word.

- the symbol `eow'
    Stands for \"\\\\\\=>\", matching the empty string at the end of a word.

- the symbol `wordchar'
    Stands for the regex \"\\\\w\", matching a word-constituent character
    (as determined by the current syntax table).

- the symbol `not-wordchar'
    Stands for the regex \"\\\\W\", matching a non-word-constituent
    character.

- (syntax CODE)
    Stands for the regex \"\\\\sCODE\", where CODE is a syntax table code
    (a single character). Matches any character with the requested
    syntax.

- (not-syntax CODE)
    Stands for the regex \"\\\\SCODE\", where CODE is a syntax table code
    (a single character). Matches any character without the
    requested syntax.

- (regex REGEX)
    This is a \"trapdoor\" for including ordinary regular expression
    strings in the result. Some regular expressions are clearer when
    written the old way: \"[a-z]\" vs. (sregexq (char (?a . ?z))), for
    instance.

Each CHAR-CLAUSE that is passed to (char ...) and (not-char ...)
has one of the following forms:

- a character
    Adds that character to the set.

- a string
    Adds all the characters in the string to the set.

- a pair (MIN . MAX)
    Where MIN and MAX are characters, adds the range of characters
    from MIN through MAX to the set."
  `(apply 'sregex ',exps))

7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
482 (defun sregex--engine (exp combine)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
483 (cond
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
484 ((stringp exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
485 (if (and combine
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
486 (eq combine 'suffix)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
487 (/= (length exp) 1))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
488 (concat "\\(?:" (regexp-quote exp) "\\)")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
489 (regexp-quote exp)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
490 ((symbolp exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
491 (ecase exp
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
492 (any ".")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
493 (bol "^")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
494 (eol "$")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
495 (wordchar "\\w")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
496 (not-wordchar "\\W")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
497 (bot "\\`")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
498 (eot "\\'")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
499 (point "\\=")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
500 (word-boundary "\\b")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
501 (not-word-boundary "\\B")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
502 (bow "\\<")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
503 (eow "\\>")))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
504 ((consp exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
505 (funcall (intern (concat "sregex--"
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
506 (symbol-name (car exp))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
507 (cdr exp)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
508 combine))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
509 (t (error "Invalid expression: %s" exp))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
510
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
511 (defun sregex--sequence (exps combine)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
512 (if (= (length exps) 1) (sregex--engine (car exps) combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
513 (let ((re (mapconcat
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
514 (lambda (e) (sregex--engine e 'concat))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
515 exps "")))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
516 (if (eq combine 'suffix)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
517 (concat "\\(?:" re "\\)")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
518 re))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
519
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
520 (defun sregex--or (exps combine)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
521 (if (= (length exps) 1) (sregex--engine (car exps) combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
522 (let ((re (mapconcat
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
523 (lambda (e) (sregex--engine e 'or))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
524 exps "\\|")))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
525 (if (not (eq combine 'or))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
526 (concat "\\(?:" re "\\)")
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
527 re))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
528
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
529 (defun sregex--group (exps combine) (concat "\\(" (sregex--sequence exps nil) "\\)"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
530
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
531 (defun sregex--backref (exps combine) (concat "\\" (int-to-string (car exps))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
532 (defun sregex--opt (exps combine) (concat (sregex--sequence exps 'suffix) "?"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
533 (defun sregex--0+ (exps combine) (concat (sregex--sequence exps 'suffix) "*"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
534 (defun sregex--1+ (exps combine) (concat (sregex--sequence exps 'suffix) "+"))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
535
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
536 (defun sregex--char (exps combine) (sregex--char-aux nil exps))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
537 (defun sregex--not-char (exps combine) (sregex--char-aux t exps))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
538
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
539 (defun sregex--syntax (exps combine) (format "\\s%c" (car exps)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
540 (defun sregex--not-syntax (exps combine) (format "\\S%c" (car exps)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
541
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
542 (defun sregex--regex (exps combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
543 (if combine (concat "\\(?:" (car exps) "\\)") (car exps)))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
544
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
545 (defun sregex--repeat (exps combine)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
546 (let* ((min (or (pop exps) 0))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
547 (minstr (number-to-string min))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
548 (max (pop exps)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
549 (concat (sregex--sequence exps 'suffix)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
550 (concat "\\{" minstr ","
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
551 (when max (number-to-string max)) "\\}"))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
552
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
553 (defun sregex--char-range (start end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
554 (let ((startc (char-to-string start))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
555 (endc (char-to-string end)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
556 (cond
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
557 ((> end (+ start 2)) (concat startc "-" endc))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
558 ((> end (+ start 1)) (concat startc (char-to-string (1+ start)) endc))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
559 ((> end start) (concat startc endc))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
560 (t startc))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
561
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
562 (defun sregex--char-aux (complement args)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
563 ;; regexp-opt does the same; we should join efforts.
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
564 (let ((chars (make-bool-vector 256 nil))) ; Yeah, right!
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
565 (dolist (arg args)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
566 (cond ((integerp arg) (aset chars arg t))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
567 ((stringp arg) (mapc (lambda (c) (aset chars c t)) arg))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
568 ((consp arg)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
569 (let ((start (car arg))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
570 (end (cdr arg)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
571 (when (> start end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
572 (let ((tmp start)) (setq start end) (setq end tmp)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
573 ;; now start <= end
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
574 (let ((i start))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
575 (while (<= i end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
576 (aset chars i t)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
577 (setq i (1+ i))))))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
578 ;; now chars is a map of the characters in the class
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
579 (let ((caret (aref chars ?^))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
580 (dash (aref chars ?-))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
581 (class (if (aref chars ?\]) "]" "")))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
582 (aset chars ?^ nil)
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
583 (aset chars ?- nil)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
584 (aset chars ?\] nil)
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
585
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
586 (let (start end)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
587 (dotimes (i 256)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
588 (if (aref chars i)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
589 (progn
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
590 (unless start (setq start i))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
591 (setq end i)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
592 (aset chars i nil))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
593 (when start
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
594 (setq class (concat class (sregex--char-range start end)))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
595 (setq start nil))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
596 (if start
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
597 (setq class (concat class (sregex--char-range start end)))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
598
29069
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
599 (if (> (length class) 0)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
600 (setq class (concat class (if caret "^") (if dash "-")))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
601 (setq class (concat class (if dash "-") (if caret "^"))))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
602 (if (and (not complement) (= (length class) 1))
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
603 (regexp-quote class)
cb028b1d6345 Rewritten to take advantage of shy-groups and
Stefan Monnier <monnier@iro.umontreal.ca>
parents: 22974
diff changeset
604 (concat "[" (if complement "^") class "]")))))
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
605
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
606 (provide 'sregex)
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
607
52401
695cf19ef79e Add arch taglines
Miles Bader <miles@gnu.org>
parents: 38436
diff changeset
608 ;;; arch-tag: 460c1f5a-eb6e-42ec-a451-ffac78bdf492
22537
7947a4ea28a8 Initial revision
Dan Nicolaescu <done@ece.arizona.edu>
parents:
diff changeset
609 ;;; sregex.el ends here
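
The listing above wraps multi-character operands in a shy group before appending a postfix operator (the `suffix` case), and sregex--char-aux places "]" first and "^" and "-" last inside a bracket expression. The sketch below illustrates why, using only built-in regexp functions. The helper `sregex-demo-full-match-p` is a hypothetical name introduced for this illustration and is not part of sregex.el.

```elisp
;; Hypothetical helper for this illustration only (not part of sregex.el):
;; return t when REGEXP matches the whole of STRING, nil otherwise.
(defun sregex-demo-full-match-p (regexp string)
  (let ((case-fold-search nil))
    (and (string-match (concat "\\`\\(?:" regexp "\\)\\'") string) t)))

;; A postfix operator binds only to the preceding character, so a
;; multi-character literal needs a shy group before "*" is appended.
(sregex-demo-full-match-p "ab*" "abab")          ; => nil
(sregex-demo-full-match-p "\\(?:ab\\)*" "abab")  ; => t

;; Inside a bracket expression, "]" is literal when it comes first,
;; "^" when it is not first, and "-" when it comes last (as here).
(sregex-demo-full-match-p "[]^-]" "]")           ; => t
(sregex-demo-full-match-p "[]^-]" "^")           ; => t
(sregex-demo-full-match-p "[]^-]" "-")           ; => t
(sregex-demo-full-match-p "[]^-]" "a")           ; => nil
```

The shy group `\(?: ... \)` groups without creating a numbered submatch, so wrapping an operand this way should not shift the numbering that explicit groups and back-references rely on.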