annotate lisp/emacs-lisp/regexp-opt.el @ 42829:07bd6e693cb6
(easy-mmode-defmap): Enable "Up Stack", "Down Stack", and
"Finish Function" menu map entries for jdb mode.
(gud-jdb-use-classpath): New customization variable.
(gud-jdb-command-name): Add customization.
(gud-jdb-classpath, gud-marker-acc-max-length): New variables.
(gud-jdb-classpath-string): New variable.
(gud-jdb-source-files, gud-jdb-class-source-alist): Add doc strings.
(gud-jdb-build-source-files-list): Likewise.
(gud-jdb-massage-args): Record any command argument classpath
string in `gud-jdb-classpath-string'.
(gud-jdb-lowest-stack-level): New function, finds bottom of current
java call stack in jdb output.
(gud-jdb-find-source-using-classpath, gud-jdb-find-source)
(gud-jdb-parse-classpath-string): New functions.
(gud-jdb-marker-filter): Search/detect classpath information in
jdb's output. marker regexp updated to match oldjdb and jdb output
formats. Expand search for source files to include new/old methods
using new functions above. Do not allow `gud-marker-acc' to grow
without bound.
(jdb): Set classpath information (if available) as jdb is started.
Change `gud-break' and `gud-remove'
to use new %c ("class") escape in format strings. Add
`gud-finish', `gud-up', `gud-down' command string functions, and
add them to the local menu map. Update `comint-prompt-regexp' for
jdb and oldjdb. If attaching to an already running java VM and
configured to use classpath, send command to query for classpath,
else use previous method for finding and parsing java
sources. Set `gud-jdb-find-source' function accordingly.
(gud-mode): Doc fix.
(gud-format-command): Add support for new %c ("class") escape.
(gud-find-class): New function in support of %c escape.
author | Richard M. Stallman <rms@gnu.org> |
---|---|
date | Fri, 18 Jan 2002 18:57:20 +0000 |
parents | ad0233037e24 |
children | 20781c152651 8eba780a3a36 |
rev | line source |
---|---|
38412
253f761ad37b
Some fixes to follow coding conventions in files maintained by FSF.
Pavel Janík <Pavel@Janik.cz>
parents:
33209
diff
changeset
|
1 ;;; regexp-opt.el --- generate efficient regexps to match strings |
18014 | 2 |
28067
e09db52da018
Update copyright and leading comment.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
27589
diff
changeset
|
3 ;; Copyright (C) 1994,95,96,97,98,99,2000 Free Software Foundation, Inc. |
18014 | 4 |
25278 | 5 ;; Author: Simon Marshall <simon@gnu.org> |
27589 | 6 ;; Maintainer: FSF |
28420 | 7 ;; Keywords: strings, regexps, extensions |
18014 | 8 |
9 ;; This file is part of GNU Emacs. | |
10 | |
11 ;; GNU Emacs is free software; you can redistribute it and/or modify | |
12 ;; it under the terms of the GNU General Public License as published by | |
13 ;; the Free Software Foundation; either version 2, or (at your option) | |
14 ;; any later version. | |
15 | |
16 ;; GNU Emacs is distributed in the hope that it will be useful, | |
17 ;; but WITHOUT ANY WARRANTY; without even the implied warranty of | |
18 ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | |
19 ;; GNU General Public License for more details. | |
20 | |
21 ;; You should have received a copy of the GNU General Public License | |
22 ;; along with GNU Emacs; see the file COPYING. If not, write to the | |
23 ;; Free Software Foundation, Inc., 59 Temple Place - Suite 330, | |
24 ;; Boston, MA 02111-1307, USA. | |
25 | |
26 ;;; Commentary: | |
27 | |
25938
6f591e2d9c0d
(regexp-opt-try-suffix): New function.
Gerd Moellmann <gerd@gnu.org>
parents:
25278
diff
changeset
|
28 ;; The "opt" in "regexp-opt" stands for "optim\\(al\\|i[sz]e\\)". |
18014 | 29 ;; |
18149
2fec7f622b82
emit charsets after strings so that the final regexp finds the longest match.
Simon Marshall <simon@gnu.org>
parents:
18014
diff
changeset
|
30 ;; This package generates a regexp from a given list of strings (which matches |
31 ;; one of those strings) so that the regexp generated by: |
18014 | 32 ;; |
18149 | 33 ;; (regexp-opt strings) |
34 ;; |
35 ;; is equivalent to, but more efficient than, the regexp generated by: |
36 ;; |
37 ;; (mapconcat 'regexp-quote strings "\\|") |
18014 | 38 ;; |
39 ;; For example: | |
40 ;; | |
41 ;; (let ((strings '("cond" "if" "when" "unless" "while" | |
42 ;; "let" "let*" "progn" "prog1" "prog2" | |
43 ;; "save-restriction" "save-excursion" "save-window-excursion" | |
44 ;; "save-current-buffer" "save-match-data" | |
45 ;; "catch" "throw" "unwind-protect" "condition-case"))) | |
46 ;; (concat "(" (regexp-opt strings t) "\\>")) | |
47 ;; => "(\\(c\\(atch\\|ond\\(ition-case\\)?\\)\\|if\\|let\\*?\\|prog[12n]\\|save-\\(current-buffer\\|excursion\\|match-data\\|restriction\\|window-excursion\\)\\|throw\\|un\\(less\\|wind-protect\\)\\|wh\\(en\\|ile\\)\\)\\>" | |
48 ;; | |
18149 | 49 ;; Searching using the above example `regexp-opt' regexp takes approximately |
50 ;; two-thirds of the time taken using the equivalent `mapconcat' regexp. |
51 |
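The equivalence between the `regexp-opt' regexp and the `mapconcat' regexp can be checked by hand. What follows is a minimal sketch, assuming an Emacs recent enough to provide `string-match-p'; `my-regexp-opt-agrees-p' is a hypothetical helper that is not part of regexp-opt.el. Both regexps are grouped and anchored so that only whole-string matches count.

```elisp
;; Hypothetical helper for illustration; not part of regexp-opt.el.
(require 'regexp-opt)

(defun my-regexp-opt-agrees-p (strings candidates)
  "Return non-nil if both regexps built from STRINGS agree on all CANDIDATES.
One regexp comes from `regexp-opt', the other from `mapconcat'."
  (let ((optimised (concat "\\`" (regexp-opt strings t) "\\'"))
        (naive (concat "\\`\\("
                       (mapconcat #'regexp-quote strings "\\|")
                       "\\)\\'"))
        (case-fold-search nil)
        (agree t))
    (dolist (c candidates agree)
      ;; Normalise each match result to t or nil before comparing.
      (unless (eq (and (string-match-p optimised c) t)
                  (and (string-match-p naive c) t))
        (setq agree nil)))))

(my-regexp-opt-agrees-p '("cond" "if" "let" "let*" "prog1" "prog2" "progn")
                        '("cond" "let*" "prog3" "le" "iff" "progn"))
;; => t
```

The check only compares which strings are accepted; it says nothing about speed, which would need a separate measurement.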
18014 | 52 ;; Since this package was written to produce efficient regexps, not regexps |
53 ;; efficiently, it is probably not a good idea to in-line too many calls in | |
54 ;; your code, unless you use the following trick with `eval-when-compile': | |
55 ;; | |
56 ;; (defvar definition-regexp | |
57 ;; (eval-when-compile | |
58 ;; (concat "^(" | |
59 ;; (regexp-opt '("defun" "defsubst" "defmacro" "defalias" | |
60 ;; "defvar" "defconst") t) | |
61 ;; "\\>"))) | |
62 ;; | |
63 ;; The `byte-compile' code will be as if you had defined the variable thus: | |
64 ;; | |
65 ;; (defvar definition-regexp | |
66 ;; "^(\\(def\\(alias\\|const\\|macro\\|subst\\|un\\|var\\)\\)\\>") | |
67 ;; | |
18149 | 68 ;; Note that if you use this trick for all instances of `regexp-opt' and |
69 ;; `regexp-opt-depth' in your code, regexp-opt.el would only have to be loaded |
70 ;; at compile time. But note also that using this trick means that should |
71 ;; regexp-opt.el be changed, perhaps to fix a bug or to add a feature to |
72 ;; improve the efficiency of `regexp-opt' regexps, you would have to recompile |
73 ;; your code for such changes to take effect. |
74 |
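The compile-time trick extends naturally to `regexp-opt-depth'. The sketch below assumes hypothetical names (`my-definition-regexp', `my-definition-regexp-depth', `my-definition-keyword') and a shorter keyword list than the commentary uses; it is an illustration, not code from this file.

```elisp
;; Hypothetical names; regexp-opt.el is only needed while compiling.
(eval-when-compile (require 'regexp-opt))

(defconst my-definition-regexp
  (eval-when-compile
    (concat "^(" (regexp-opt '("defun" "defmacro" "defvar") t) "\\>"))
  "Regexp matching the start of a definition; group 1 is the keyword.")

(defconst my-definition-regexp-depth
  (eval-when-compile
    (regexp-opt-depth
     (concat "^(" (regexp-opt '("defun" "defmacro" "defvar") t) "\\>")))
  "Number of capturing groups in `my-definition-regexp'.")

(defun my-definition-keyword (line)
  "Return the defining keyword that starts LINE, or nil if there is none."
  (let ((case-fold-search nil))
    (when (string-match my-definition-regexp line)
      (match-string 1 line))))

(my-definition-keyword "(defun foo ()")   ; => "defun"
(my-definition-keyword "(setq x 1)")      ; => nil
```

Because both constants are computed inside `eval-when-compile', the byte-compiled file holds only the resulting string and number, with the recompilation caveat described in the commentary.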
75 ;; Originally written for font-lock.el, from an idea from Stig's hl319.el, with |
25938 | 76 ;; thanks for ideas also to Michael Ernst, Bob Glickstein, Dan Nicolaescu and |
77 ;; Stefan Monnier. |
78 ;; No doubt `regexp-opt' doesn't always produce optimal regexps, so code, ideas |
79 ;; or any other information to improve things are welcome. |
28067 | 80 ;; |
81 ;; One possible improvement would be to compile '("aa" "ab" "ba" "bb") |
82 ;; into "[ab][ab]" rather than "a[ab]\\|b[ab]". I'm not sure it's worth |
83 ;; it but if someone knows how to do it without going through too many |
84 ;; contortions, I'm all ears. |
18014 | 85 |
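The "[ab][ab]" rewrite suggested in the commentary is only valid when the string list is exactly the product of its per-position character sets. One conceivable precondition check is sketched below; `my-char-product-p' is a hypothetical function, and nothing like it exists in regexp-opt.el.

```elisp
;; Hypothetical precondition check; regexp-opt.el does nothing of the sort.
(require 'cl-lib)

(defun my-char-product-p (strings)
  "Return non-nil if STRINGS equals the product of its per-position chars.
STRINGS must be a list of distinct, non-empty strings."
  (let ((len (length (car strings)))
        (product 1))
    (and (cl-every (lambda (s) (= (length s) len)) strings)
         (progn
           ;; Multiply the number of distinct characters at each position.
           (dotimes (i len)
             (setq product
                   (* product
                      (length (delete-dups
                               (mapcar (lambda (s) (aref s i)) strings))))))
           ;; Distinct strings fill the whole product only if the counts agree.
           (= product (length strings))))))

(my-char-product-p '("aa" "ab" "ba" "bb"))   ; => t, so "[ab][ab]" is safe
(my-char-product-p '("aa" "ab" "ba"))        ; => nil, "[ab][ab]" would also match "bb"
```

The counting argument relies on the strings being distinct, which `regexp-opt' already asks of its callers.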
28067 | 86 ;;; Code: |
18014 | 87 |
88 ;;;###autoload | |
89 (defun regexp-opt (strings &optional paren) | |
90 "Return a regexp to match a string in STRINGS. | |
19782 | 91 Each string should be unique in STRINGS and should not contain any regexps, |
92 quoted or not. If optional PAREN is non-nil, ensure that the returned regexp | |
93 is enclosed by at least one regexp grouping construct. | |
18014 | 94 The returned regexp is typically more efficient than the equivalent regexp: |
95 | |
32304
e8fca08bb4cc
(regexp-opt): Add \< and \> if PAREN=`words'.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
32041
diff
changeset
|
96 (let ((open (if PAREN \"\\\\(\" \"\")) (close (if PAREN \"\\\\)\" \"\"))) |
97 (concat open (mapconcat 'regexp-quote STRINGS \"\\\\|\") close)) |
98 |
99 If PAREN is `words', then the resulting regexp is additionally surrounded |
100 by \\=\\< and \\>." |
18014 | 101 (save-match-data |
102 ;; Recurse on the sorted list. | |
32304 | 103 (let* ((max-lisp-eval-depth (* 1024 1024)) |
42625
ad0233037e24
(regexp-opt): Bind max-specpdl-size.
Richard M. Stallman <rms@gnu.org>
parents:
41754
diff
changeset
|
104 (max-specpdl-size (* 1024 1024)) |
32304 | 105 (completion-ignore-case nil) |
41754
e78fbcf9b878
(regexp-opt): Bind completion-regexp-list to nil.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
41617
diff
changeset
|
106 (completion-regexp-list nil) |
32304 | 107 (words (eq paren 'words)) |
108 (open (cond ((stringp paren) paren) (paren "\\("))) |
109 (sorted-strings (sort (copy-sequence strings) 'string-lessp)) |
110 (re (regexp-opt-group sorted-strings open))) |
111 (if words (concat "\\<" re "\\>") re)))) |
18014 | 112 |
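The effect of the PAREN argument is easiest to see by matching against a few strings. The following is a small sketch, assuming the standard syntax table (so that letters are word constituents) and an Emacs that provides `string-match-p'; `my-keywords' and `my-keyword-position' are hypothetical names.

```elisp
(require 'regexp-opt)

(defvar my-keywords '("if" "when" "while" "unless")
  "Hypothetical keyword list used only for this illustration.")

(defun my-keyword-position (text paren)
  "Return the index where a keyword from `my-keywords' matches in TEXT, or nil.
PAREN is passed unchanged to `regexp-opt'."
  (let ((case-fold-search nil))
    (string-match-p (regexp-opt my-keywords paren) text)))

(my-keyword-position "iffy" t)          ; => 0, "if" matches inside "iffy"
(my-keyword-position "iffy" 'words)     ; => nil, "\\>" rejects the partial word
(my-keyword-position "x if y" 'words)   ; => 2
```

Passing `words' saves the caller from concatenating "\\<" and "\\>" by hand, as the commentary example does with "\\>".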
113 ;;;###autoload | |
114 (defun regexp-opt-depth (regexp) | |
115 "Return the depth of REGEXP. | |
116 This means the number of regexp grouping constructs (parenthesised expressions) | |
117 in REGEXP." | |
118 (save-match-data | |
119 ;; Hack to signal an error if REGEXP does not have balanced parentheses. | |
120 (string-match regexp "") | |
121 ;; Count the number of open parentheses in REGEXP. | |
122 (let ((count 0) start) | |
28864
71337b429a49
(regexp-opt-depth): Fix regexp.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
28420
diff
changeset
|
123 (while (string-match "\\(\\`\\|[^\\]\\)\\\\\\(\\\\\\\\\\)*([^?]" |
124 regexp start) |
41616
8ba7e2fecead
(regexp-opt-depth): Fix off-by-two error.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
38412
diff
changeset
|
125 (setq count (1+ count) |
126 ;; Go back 2 chars (one for [^?] and one for [^\\]). |
41617 | 127 start (- (match-end 0) 2))) |
18014 | 128 count))) |
129 | |
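A few calls illustrate what `regexp-opt-depth' counts. This is a sketch rather than a specification; `my-safe-regexp-depth' is a hypothetical wrapper that turns the invalid-regexp error into nil.

```elisp
(require 'regexp-opt)

(regexp-opt-depth "\\(a\\)\\(b\\)")   ; => 2, two capturing groups
(regexp-opt-depth "\\(?:a\\|b\\)")    ; => 0, shy groups are not counted
(regexp-opt-depth "a\\\\(b")          ; => 0, "(" follows an escaped backslash

(defun my-safe-regexp-depth (regexp)
  "Return the depth of REGEXP, or nil if REGEXP is not a valid regexp."
  (condition-case nil
      (regexp-opt-depth regexp)
    (invalid-regexp nil)))

(my-safe-regexp-depth "\\(a")         ; => nil, unbalanced parenthesis
```

The last call relies on the `string-match' check at the top of the function, which signals an error for unbalanced input before any counting happens.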
130 ;;; Workhorse functions. | |
131 | |
132 (eval-when-compile | |
133 (require 'cl)) | |
134 | |
135 (defun regexp-opt-group (strings &optional paren lax) | |
28067 | 136 "Return a regexp to match a string in STRINGS. |
137 If PAREN non-nil, output regexp parentheses around returned regexp. |
138 If LAX non-nil, don't output parentheses if it doesn't require them. |
139 Merges keywords to avoid backtracking in Emacs' regexp matcher. |
140 |
141 The basic idea is to find the longest common prefix or suffix, remove it |
142 and recurse. If there is no prefix, we divide the list into two so that |
143 \(at least) one half will have at least a one-character common prefix. |
144 |
145 Also we delay the addition of grouping parentheses as long as possible |
146 until we're sure we need them, and try to remove one-character sequences |
147 so we can use character sets rather than grouping parentheses." |
148 (let* ((open-group (cond ((stringp paren) paren) (paren "\\(?:") (t ""))) |
18014 | 149 (close-group (if paren "\\)" "")) |
150 (open-charset (if lax "" open-group)) | |
28067 | 151 (close-charset (if lax "" close-group))) |
18014 | 152 (cond |
25938 | 153 ;; |
154 ;; If there are no strings, just return the empty string. |
155 ((= (length strings) 0) |
156 "") |
157 ;; |
18014 | 158 ;; If there is only one string, just return it. |
159 ((= (length strings) 1) | |
160 (if (= (length (car strings)) 1) | |
161 (concat open-charset (regexp-quote (car strings)) close-charset) | |
162 (concat open-group (regexp-quote (car strings)) close-group))) | |
163 ;; | |
164 ;; If there is an empty string, remove it and recurse on the rest. | |
165 ((= (length (car strings)) 0) | |
166 (concat open-charset | |
167 (regexp-opt-group (cdr strings) t t) "?" | |
168 close-charset)) | |
169 ;; | |
28067 | 170 ;; If there are several one-char strings, use charsets |
171 ((and (= (length (car strings)) 1) |
172 (let ((strs (cdr strings))) |
173 (while (and strs (/= (length (car strs)) 1)) |
174 (pop strs)) |
175 strs)) |
176 (let (letters rest) |
177 ;; Collect one-char strings |
178 (dolist (s strings) |
30719
fd7db1cf7adf
(make-bool-vector): Remove.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
28864
diff
changeset
|
179 (if (= (length s) 1) (push (string-to-char s) letters) (push s rest))) |
28067 | 180 |
181 (if rest |
182 ;; several one-char strings: take them and recurse |
183 ;; on the rest (first so as to match the longest). |
184 (concat open-group |
185 (regexp-opt-group (nreverse rest)) |
186 "\\|" (regexp-opt-charset letters) |
187 close-group) |
188 ;; all are one-char strings: just return a character set. |
189 (concat open-charset |
190 (regexp-opt-charset letters) |
191 close-charset)))) |
18014 | 192 ;; |
193 ;; We have a list of different length strings. | |
194 (t | |
28067 | 195 (let ((prefix (try-completion "" (mapcar 'list strings)))) |
196 (if (> (length prefix) 0) |
197 ;; common prefix: take it and recurse on the suffixes. |
198 (let* ((n (length prefix)) |
199 (suffixes (mapcar (lambda (s) (substring s n)) strings))) |
32041
a055173cadf8
(regexp-opt-group): Put more parenthesis.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
30719
diff
changeset
|
200 (concat open-group |
28067 | 201 (regexp-quote prefix) |
202 (regexp-opt-group suffixes t t) |
32041 | 203 close-group)) |
28067 | 204 |
205 (let* ((sgnirts (mapcar (lambda (s) |
206 (concat (nreverse (string-to-list s)))) |
207 strings)) |
208 (xiffus (try-completion "" (mapcar 'list sgnirts)))) |
209 (if (> (length xiffus) 0) |
210 ;; common suffix: take it and recurse on the prefixes. |
211 (let* ((n (- (length xiffus))) |
33209
f94f82069336
(regexp-opt-group): Sort the strings when extracting a suffix.
Stefan Monnier <monnier@iro.umontreal.ca>
parents:
32304
diff
changeset
|
212 (prefixes |
213 ;; Sorting is necessary in cases such as ("ad" "d"). |
214 (sort (mapcar (lambda (s) (substring s 0 n)) strings) |
215 'string-lessp))) |
32041 | 216 (concat open-group |
28067 | 217 (regexp-opt-group prefixes t t) |
218 (regexp-quote |
219 (concat (nreverse (string-to-list xiffus)))) |
32041 | 220 close-group)) |
28067 | 221 |
222 ;; Otherwise, divide the list into those that start with a |
223 ;; particular letter and those that do not, and recurse on them. |
224 (let* ((char (char-to-string (string-to-char (car strings)))) |
225 (half1 (all-completions char (mapcar 'list strings))) |
226 (half2 (nthcdr (length half1) strings))) |
227 (concat open-group |
228 (regexp-opt-group half1) |
229 "\\|" (regexp-opt-group half2) |
230 close-group)))))))))) |
231 |
18014 | 232 |
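The three building blocks of `regexp-opt-group' (common prefix, common suffix, and the split on the first character) can be tried in isolation. The sketch below is a simplified illustration, assuming an Emacs in which `try-completion' and `all-completions' accept a plain list of strings; all `my-' names are hypothetical.

```elisp
;; Hypothetical helpers; simplified from the ideas in `regexp-opt-group'.

(defun my-common-prefix (strings)
  "Return the longest common prefix of STRINGS, possibly the empty string.
STRINGS should hold at least two distinct strings."
  (let ((completion-ignore-case nil)
        (completion-regexp-list nil))
    (try-completion "" strings)))

(defun my-reverse-string (s)
  "Return a reversed copy of the string S."
  (concat (reverse (string-to-list s))))

(defun my-common-suffix (strings)
  "Return the longest common suffix of STRINGS, possibly the empty string."
  (my-reverse-string
   (my-common-prefix (mapcar #'my-reverse-string strings))))

(defun my-split-on-first-char (sorted-strings)
  "Split SORTED-STRINGS into a list (HALF1 HALF2).
HALF1 holds the strings that share the first string's initial character."
  (let* ((completion-ignore-case nil)
         (completion-regexp-list nil)
         (char (substring (car sorted-strings) 0 1))
         (half1 (all-completions char sorted-strings))
         (half2 (nthcdr (length half1) sorted-strings)))
    (list half1 half2)))

(my-common-prefix '("prog1" "prog2" "progn"))                    ; => "prog"
(my-common-suffix '("save-excursion" "save-window-excursion"))   ; => "-excursion"
(my-common-prefix '("catch" "if"))                               ; => ""
;; HALF1 holds "catch" and "cond"; HALF2 is ("if" "let").
(my-split-on-first-char '("catch" "cond" "if" "let"))
```

The split depends on the input being sorted, since HALF2 is taken by skipping as many elements as HALF1 contains.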
233 (defun regexp-opt-charset (chars) | |
234 ;; | |
235 ;; Return a regexp to match a character in CHARS. | |
236 ;; | |
237 ;; The basic idea is to find character ranges. Also we take care in the | |
238 ;; position of character set meta characters in the character set regexp. | |
239 ;; | |
30719 | 240 (let* ((charmap (make-char-table 'case-table)) |
241 (start -1) (end -2) |
18014 | 242 (charset "") |
243 (bracket "") (dash "") (caret "")) | |
244 ;; | |
245 ;; Make a character map but extract character set meta characters. | |
30719 | 246 (dolist (char chars) |
18149 | 247 (case char |
248 (?\] |
249 (setq bracket "]")) |
250 (?^ |
251 (setq caret "^")) |
252 (?- |
253 (setq dash "-")) |
254 (otherwise |
255 (aset charmap char t)))) |
18014 | 256 ;; |
257 ;; Make a character set from the map using ranges where applicable. | |
30719 | 258 (map-char-table |
259 (lambda (c v) |
260 (when v |
261 (if (= (1- c) end) (setq end c) |
262 (if (> end (+ start 2)) |
263 (setq charset (format "%s%c-%c" charset start end)) |
264 (while (>= end start) |
265 (setq charset (format "%s%c" charset start)) |
266 (incf start))) |
267 (setq start c end c)))) |
268 charmap) |
269 (when (>= end start) |
270 (if (> end (+ start 2)) |
271 (setq charset (format "%s%c-%c" charset start end)) |
272 (while (>= end start) |
273 (setq charset (format "%s%c" charset start)) |
274 (incf start)))) |
18014 | 275 ;; |
276 ;; Make sure a caret is not first and a dash is first or last. | |
277 (if (and (string-equal charset "") (string-equal bracket "")) | |
278 (concat "[" dash caret "]") | |
279 (concat "[" bracket charset caret dash "]")))) | |
280 | |
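The ordering rule in `regexp-opt-charset' (bracket first, caret not first, dash last) matters because those characters are special inside a character set. The sketch below checks one such layout by hand; `my-charset-accepts-p' is a hypothetical helper, and the regexp is written out literally rather than taken from the function's output.

```elisp
(defun my-charset-accepts-p (charset-regexp chars)
  "Return non-nil if CHARSET-REGEXP matches each character in CHARS on its own."
  (let ((case-fold-search nil)
        (anchored (concat "\\`" charset-regexp "\\'"))
        (ok t))
    (dolist (c chars ok)
      (unless (string-match-p anchored (char-to-string c))
        (setq ok nil)))))

;; "]" comes first, then ordinary characters, then "^", then "-" last.
(my-charset-accepts-p "[]abc^-]" '(?\] ?a ?b ?c ?^ ?-))   ; => t
(my-charset-accepts-p "[]abc^-]" '(?d))                   ; => nil
```

Moving the caret to the front would negate the set, and a dash between two characters would be read as a range, which is why the function collects these three characters separately.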
281 (provide 'regexp-opt) | |
282 | |
283 ;;; regexp-opt.el ends here |