view lisp/emacs-lisp/regexp-opt.el @ 110523:a5ad4f188e19
Synch Semantic to CEDET 1.0.
Move CEDET ChangeLog entries to new file lisp/cedet/ChangeLog.
* semantic.el (semantic-version): Update to 2.0.
(semantic-mode-map): Add "," and "m" bindings.
(navigate-menu): Update.
* semantic/symref.el (semantic-symref-calculate-rootdir):
New function.
(semantic-symref-detect-symref-tool): Use it.
* semantic/symref/grep.el (semantic-symref-grep-shell): New var.
(semantic-symref-perform-search): Use it. Calculate root dir with
semantic-symref-calculate-rootdir.
(semantic-symref-derive-find-filepatterns): Improve error message.
* semantic/symref/list.el
(semantic-symref-results-mode-map): New bindings.
(semantic-symref-auto-expand-results): New option.
(semantic-symref-results-dump): Obey auto-expand.
(semantic-symref-list-expand-all, semantic-symref-regexp)
(semantic-symref-list-contract-all)
(semantic-symref-list-map-open-hits)
(semantic-symref-list-update-open-hits)
(semantic-symref-list-create-macro-on-open-hit)
(semantic-symref-list-call-macro-on-open-hits): New functions.
(semantic-symref-list-menu-entries)
(semantic-symref-list-menu): New vars.
(semantic-symref-list-map-open-hits): Move cursor to beginning of
match before calling the mapped function.
* semantic/doc.el
(semantic-documentation-comment-preceeding-tag): Do nothing if the
mode doesn't provide comment-start-skip.
* semantic/scope.el
(semantic-analyze-scope-nested-tags-default): Strip duplicates.
(semantic-analyze-scoped-inherited-tag-map): Take the tag we are
looking for as part of the scoped tags list.
* semantic/html.el (semantic-default-html-setup): Add
senator-step-at-tag-classes.
* semantic/decorate/include.el
(semantic-decoration-on-unknown-includes): Change light bgcolor.
(semantic-decoration-on-includes-highlight-default): Check that
the include tag has a position.
* semantic/complete.el (semantic-collector-local-members):
(semantic-complete-read-tag-local-members)
(semantic-complete-jump-local-members): New class and functions.
(semantic-complete-self-insert): Save excursion before completing.
* semantic/analyze/complete.el
(semantic-analyze-possible-completions-default): If no completions
are found, return the raw by-name-only completion list. Add FLAGS
arguments. Add support for 'no-tc (type constraint) and
'no-unique, or no stripping duplicates.
(semantic-analyze-possible-completions-default): Add FLAGS arg.
* semantic/util-modes.el
(semantic-stickyfunc-show-only-functions-p): New option.
(semantic-stickyfunc-fetch-stickyline): Don't show stickytext for
the very first line in a buffer.
* semantic/util.el (semantic-hack-search)
(semantic-recursive-find-nonterminal-by-name)
(semantic-current-tag-interactive): Deleted.
(semantic-describe-buffer): Fix expand-nonterminal. Add
lex-syntax-mods, type relation separator char, and command
separation char.
(semantic-sanity-check): Only message if called interactively.
* semantic/tag.el (semantic-tag-deep-copy-one-tag): Copy the
:filename property and the tag position.
* semantic/lex-spp.el (semantic-lex-spp-lex-text-string):
Add recursion limit.
* semantic/imenu.el (semantic-imenu-bucketize-type-members):
Make this buffer local, not the obsoleted variable.
* semantic/idle.el: Add breadcrumbs support.
(semantic-idle-summary-current-symbol-info-default)
(semantic-idle-tag-highlight)
(semantic-idle-completion-list-default): Use
semanticdb-without-unloaded-file-searches for speed, and to
conform to the controls that specify if the idle timer is supposed
to be parsing unparsed includes.
(semantic-idle-symbol-highlight-face)
(semantic-idle-symbol-maybe-highlight): Rename from *-summary-*.
Callers changed.
(semantic-idle-work-parse-neighboring-files-flag): Default to nil.
(semantic-idle-work-update-headers-flag): New var.
(semantic-idle-work-for-one-buffer): Use it.
(semantic-idle-local-symbol-highlight): Rename from
semantic-idle-tag-highlight.
(semantic-idle-truncate-long-summaries): New option.
* semantic/ia.el (semantic-ia-cache)
(semantic-ia-get-completions): Deleted. Callers changed.
(semantic-ia-show-variants): New command.
(semantic-ia-show-doc): If doc is empty, don't make a temp buffer.
(semantic-ia-show-summary): If there isn't anything to show, say so.
* semantic/grammar.el (semantic-grammar-create-package):
Save the buffer even in batch mode.
* semantic/fw.el
(semanticdb-without-unloaded-file-searches): New macro.
* semantic/dep.el (semantic-dependency-find-file-on-path):
Fix case dereferencing ede-object when it is a list.
* semantic/db-typecache.el (semanticdb-expand-nested-tag)
(semanticdb-typecache-faux-namespace): New functions.
(semanticdb-typecache-file-tags)
(semanticdb-typecache-merge-streams): Use them.
(semanticdb-typecache-file-tags): When deriving tags from a file,
give the mode a chance to monkey with the tag copy.
(semanticdb-typecache-find-default): Wrap find in save-excursion.
(semanticdb-typecache-find-by-name-helper): Merge found names down.
* semantic/db-global.el
(semanticdb-enable-gnu-global-in-buffer): Don't show messages if
GNU Global is not available and we don't want to throw an error.
* semantic/db-find.el (semanticdb-find-result-nth-in-buffer):
When trying to normalize the tag to a buffer, don't error if
set-buffer method doesn't exist.
* semantic/db-file.el (semanticdb-save-db): Simplify msg.
* semantic/db.el (semanticdb-refresh-table): If forcing a
refresh on a file not in a buffer, use semantic-find-file-noselect
and delete the buffer after use.
(semanticdb-current-database-list): When calculating root via
hooks, force it through true-filename and skip the list of
possible roots.
* semantic/ctxt.el (semantic-ctxt-imported-packages): New.
* semantic/analyze/debug.el
(semantic-analyzer-debug-insert-tag): Reset standard output to
current buffer.
(semantic-analyzer-debug-global-symbol)
(semantic-analyzer-debug-missing-innertype): Change "prefix" to
"symbol" in messages.
* semantic/analyze/refs.el: (semantic-analyze-refs-impl)
(semantic-analyze-refs-proto): When calculating value, make sure
the found tag is 'similar' to the originating tag.
(semantic--analyze-refs-find-tags-with-parent): Attempt to
identify matches via imported symbols of parents.
(semantic--analyze-refs-full-lookup-with-parents): Do a deep
search during the brute search.
* semantic/analyze.el
(semantic-analyze-find-tag-sequence-default): Be robust to
calculated scopes being nil.
* semantic/bovine/c.el (semantic-c-describe-environment): Add
project macro symbol array.
(semantic-c-parse-lexical-token): Add recursion limit.
(semantic-ctxt-imported-packages, semanticdb-expand-nested-tag):
New overrides.
(semantic-expand-c-tag-namelist): Split a full type from a typedef
out to its own tag.
(semantic-expand-c-tag-namelist): Do not split out a typedef'd
inline type if it is an anonymous type.
(semantic-c-reconstitute-token): Use the optional initializers as
a clue that some function is probably a constructor. When
defining the type of these constructors, split the parent name,
and use only the class part, if applicable.
* semantic/bovine/c-by.el:
* semantic/wisent/python-wy.el: Regenerate.
author | Chong Yidong <cyd@stupidchicken.com> |
---|---|
date | Sat, 18 Sep 2010 22:49:54 -0400 |
parents | 1d1d5d9bd884 |
children | b10051866f51 2e8109ba205d |
;;; regexp-opt.el --- generate efficient regexps to match strings

;; Copyright (C) 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001, 2002,
;;   2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010  Free Software Foundation, Inc.

;; Author: Simon Marshall <simon@gnu.org>
;; Maintainer: FSF
;; Keywords: strings, regexps, extensions

;; This file is part of GNU Emacs.

;; GNU Emacs is free software: you can redistribute it and/or modify
;; it under the terms of the GNU General Public License as published by
;; the Free Software Foundation, either version 3 of the License, or
;; (at your option) any later version.

;; GNU Emacs is distributed in the hope that it will be useful,
;; but WITHOUT ANY WARRANTY; without even the implied warranty of
;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
;; GNU General Public License for more details.

;; You should have received a copy of the GNU General Public License
;; along with GNU Emacs.  If not, see <http://www.gnu.org/licenses/>.

;;; Commentary:

;; The "opt" in "regexp-opt" stands for "optim\\(al\\|i[sz]e\\)".
;;
;; This package generates a regexp from a given list of strings (which matches
;; one of those strings) so that the regexp generated by:
;;
;; (regexp-opt strings)
;;
;; is equivalent to, but more efficient than, the regexp generated by:
;;
;; (mapconcat 'regexp-quote strings "\\|")
;;
;; For example:
;;
;; (let ((strings '("cond" "if" "when" "unless" "while"
;;                  "let" "let*" "progn" "prog1" "prog2"
;;                  "save-restriction" "save-excursion" "save-window-excursion"
;;                  "save-current-buffer" "save-match-data"
;;                  "catch" "throw" "unwind-protect" "condition-case")))
;;   (concat "(" (regexp-opt strings t) "\\>"))
;;  => "(\\(c\\(atch\\|ond\\(ition-case\\)?\\)\\|if\\|let\\*?\\|prog[12n]\\|save-\\(current-buffer\\|excursion\\|match-data\\|restriction\\|window-excursion\\)\\|throw\\|un\\(less\\|wind-protect\\)\\|wh\\(en\\|ile\\)\\)\\>"
;;
;; Searching using the above example `regexp-opt' regexp takes approximately
;; two-thirds of the time taken using the equivalent `mapconcat' regexp.

;; Since this package was written to produce efficient regexps, not regexps
;; efficiently, it is probably not a good idea to in-line too many calls in
;; your code, unless you use the following trick with `eval-when-compile':
;;
;; (defvar definition-regexp
;;   (eval-when-compile
;;     (concat "^("
;;             (regexp-opt '("defun" "defsubst" "defmacro" "defalias"
;;                           "defvar" "defconst") t)
;;             "\\>")))
;;
;; The `byte-compile' code will be as if you had defined the variable thus:
;;
;; (defvar definition-regexp
;;   "^(\\(def\\(alias\\|const\\|macro\\|subst\\|un\\|var\\)\\)\\>")
;;
;; Note that if you use this trick for all instances of `regexp-opt' and
;; `regexp-opt-depth' in your code, regexp-opt.el would only have to be loaded
;; at compile time.  But note also that using this trick means that should
;; regexp-opt.el be changed, perhaps to fix a bug or to add a feature to
;; improve the efficiency of `regexp-opt' regexps, you would have to recompile
;; your code for such changes to have effect in your code.

;; Originally written for font-lock.el, from an idea from Stig's hl319.el, with
;; thanks for ideas also to Michael Ernst, Bob Glickstein, Dan Nicolaescu and
;; Stefan Monnier.
;; No doubt `regexp-opt' doesn't always produce optimal regexps, so code, ideas
;; or any other information to improve things are welcome.
;;
;; One possible improvement would be to compile '("aa" "ab" "ba" "bb")
;; into "[ab][ab]" rather than "a[ab]\\|b[ab]".  I'm not sure it's worth
;; it but if someone knows how to do it without going through too many
;; contortions, I'm all ears.

;;; Code:

;;;###autoload
(defun regexp-opt (strings &optional paren)
  "Return a regexp to match a string in the list STRINGS.
Each string should be unique in STRINGS and should not contain any regexps,
quoted or not.  If optional PAREN is non-nil, ensure that the returned regexp
is enclosed by at least one regexp grouping construct.
The returned regexp is typically more efficient than the equivalent regexp:

 (let ((open (if PAREN \"\\\\(\" \"\")) (close (if PAREN \"\\\\)\" \"\")))
   (concat open (mapconcat 'regexp-quote STRINGS \"\\\\|\") close))

If PAREN is `words', then the resulting regexp is additionally surrounded
by \\=\\< and \\>."
  (save-match-data
    ;; Recurse on the sorted list.
    (let* ((max-lisp-eval-depth 10000)
           (max-specpdl-size 10000)
           (completion-ignore-case nil)
           (completion-regexp-list nil)
           (words (eq paren 'words))
           (open (cond ((stringp paren) paren) (paren "\\(")))
           (sorted-strings (delete-dups
                            (sort (copy-sequence strings) 'string-lessp)))
           (re (regexp-opt-group sorted-strings (or open t) (not open))))
      (if words (concat "\\<" re "\\>") re))))

;;;###autoload
(defun regexp-opt-depth (regexp)
  "Return the depth of REGEXP.
This means the number of non-shy regexp grouping constructs
\(parenthesized expressions) in REGEXP."
  (save-match-data
    ;; Hack to signal an error if REGEXP does not have balanced parentheses.
    (string-match regexp "")
    ;; Count the number of open parentheses in REGEXP.
    (let ((count 0) start last)
      (while (string-match "\\\\(\\(\\?:\\)?" regexp start)
        (setq start (match-end 0))        ; Start of next search.
        (when (and (not (match-beginning 1))
                   (subregexp-context-p regexp (match-beginning 0) last))
          ;; It's not a shy group and it's not inside brackets or after
          ;; a backslash: it's really a group-open marker.
          (setq last start)     ; Speed up next regexp-opt-re-context-p.
          (setq count (1+ count))))
      count)))

;;; Workhorse functions.

(eval-when-compile
  (require 'cl))

(defun regexp-opt-group (strings &optional paren lax)
  ;; Return a regexp to match a string in the sorted list STRINGS.
  ;; If PAREN non-nil, output regexp parentheses around returned regexp.
  ;; If LAX non-nil, don't output parentheses if it doesn't require them.
  ;; Merges keywords to avoid backtracking in Emacs' regexp matcher.

  ;; The basic idea is to find the shortest common prefix or suffix, remove it
  ;; and recurse.  If there is no prefix, we divide the list into two so that
  ;; \(at least) one half will have at least a one-character common prefix.

  ;; Also we delay the addition of grouping parenthesis as long as possible
  ;; until we're sure we need them, and try to remove one-character sequences
  ;; so we can use character sets rather than grouping parenthesis.
  (let* ((open-group (cond ((stringp paren) paren) (paren "\\(?:") (t "")))
         (close-group (if paren "\\)" ""))
         (open-charset (if lax "" open-group))
         (close-charset (if lax "" close-group)))
    (cond
     ;;
     ;; If there are no strings, just return the empty string.
     ((= (length strings) 0)
      "")
     ;;
     ;; If there is only one string, just return it.
     ((= (length strings) 1)
      (if (= (length (car strings)) 1)
          (concat open-charset (regexp-quote (car strings)) close-charset)
        (concat open-group (regexp-quote (car strings)) close-group)))
     ;;
     ;; If there is an empty string, remove it and recurse on the rest.
     ((= (length (car strings)) 0)
      (concat open-charset
              (regexp-opt-group (cdr strings) t t) "?"
              close-charset))
     ;;
     ;; If there are several one-char strings, use charsets
     ((and (= (length (car strings)) 1)
           (let ((strs (cdr strings)))
             (while (and strs (/= (length (car strs)) 1))
               (pop strs))
             strs))
      (let (letters rest)
        ;; Collect one-char strings
        (dolist (s strings)
          (if (= (length s) 1) (push (string-to-char s) letters) (push s rest)))

        (if rest
            ;; several one-char strings: take them and recurse
            ;; on the rest (first so as to match the longest).
            (concat open-group
                    (regexp-opt-group (nreverse rest))
                    "\\|" (regexp-opt-charset letters)
                    close-group)
          ;; all are one-char strings: just return a character set.
          (concat open-charset
                  (regexp-opt-charset letters)
                  close-charset))))
     ;;
     ;; We have a list of different length strings.
     (t
      (let ((prefix (try-completion "" strings)))
        (if (> (length prefix) 0)
            ;; common prefix: take it and recurse on the suffixes.
            (let* ((n (length prefix))
                   (suffixes (mapcar (lambda (s) (substring s n)) strings)))
              (concat open-group
                      (regexp-quote prefix)
                      (regexp-opt-group suffixes t t)
                      close-group))

          (let* ((sgnirts (mapcar (lambda (s)
                                    (concat (nreverse (string-to-list s))))
                                  strings))
                 (xiffus (try-completion "" sgnirts)))
            (if (> (length xiffus) 0)
                ;; common suffix: take it and recurse on the prefixes.
                (let* ((n (- (length xiffus)))
                       (prefixes
                        ;; Sorting is necessary in cases such as ("ad" "d").
                        (sort (mapcar (lambda (s) (substring s 0 n)) strings)
                              'string-lessp)))
                  (concat open-group
                          (regexp-opt-group prefixes t t)
                          (regexp-quote
                           (concat (nreverse (string-to-list xiffus))))
                          close-group))

              ;; Otherwise, divide the list into those that start with a
              ;; particular letter and those that do not, and recurse on them.
              (let* ((char (substring-no-properties (car strings) 0 1))
                     (half1 (all-completions char strings))
                     (half2 (nthcdr (length half1) strings)))
                (concat open-group
                        (regexp-opt-group half1)
                        "\\|" (regexp-opt-group half2)
                        close-group))))))))))

(defun regexp-opt-charset (chars)
  ;;
  ;; Return a regexp to match a character in CHARS.
  ;;
  ;; The basic idea is to find character ranges.  Also we take care in the
  ;; position of character set meta characters in the character set regexp.
  ;;
  (let* ((charmap (make-char-table 'case-table))
         (start -1) (end -2)
         (charset "")
         (bracket "") (dash "") (caret ""))
    ;;
    ;; Make a character map but extract character set meta characters.
    (dolist (char chars)
      (case char
        (?\]
         (setq bracket "]"))
        (?^
         (setq caret "^"))
        (?-
         (setq dash "-"))
        (otherwise
         (aset charmap char t))))
    ;;
    ;; Make a character set from the map using ranges where applicable.
    (map-char-table
     (lambda (c v)
       (when v
         (if (consp c)
             (if (= (1- (car c)) end) (setq end (cdr c))
               (if (> end (+ start 2))
                   (setq charset (format "%s%c-%c" charset start end))
                 (while (>= end start)
                   (setq charset (format "%s%c" charset start))
                   (incf start)))
               (setq start (car c) end (cdr c)))
           (if (= (1- c) end) (setq end c)
             (if (> end (+ start 2))
                 (setq charset (format "%s%c-%c" charset start end))
               (while (>= end start)
                 (setq charset (format "%s%c" charset start))
                 (incf start)))
             (setq start c end c)))))
     charmap)
    (when (>= end start)
      (if (> end (+ start 2))
          (setq charset (format "%s%c-%c" charset start end))
        (while (>= end start)
          (setq charset (format "%s%c" charset start))
          (incf start))))
    ;;
    ;; Make sure a caret is not first and a dash is first or last.
    (if (and (string-equal charset "") (string-equal bracket ""))
        (concat "[" dash caret "]")
      (concat "[" bracket charset caret dash "]"))))

(provide 'regexp-opt)

;; arch-tag: 6c5a66f4-29af-4fd6-8c3b-4b554d5b4370
;;; regexp-opt.el ends here
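
As a rough usage sketch (not part of regexp-opt.el itself), the following shows one way a caller might combine `regexp-opt` and `regexp-opt-depth`. The keyword list is borrowed from the commentary's `definition-regexp` example; every `my-` name is hypothetical and exists only for this illustration.

```elisp
;; Hypothetical example: none of the `my-' names exist in Emacs.
(require 'regexp-opt)

(defconst my-definition-keywords
  '("defun" "defsubst" "defmacro" "defalias" "defvar" "defconst")
  "Illustrative keyword list, borrowed from the file's commentary.")

(defconst my-definition-regexp
  ;; PAREN = t asks for an enclosing capturing group; \\` and \\'
  ;; anchor the match to the whole string.
  (concat "\\`" (regexp-opt my-definition-keywords t) "\\'")
  "Regexp matching exactly one of `my-definition-keywords'.")

(defun my-definition-keyword-p (string)
  "Return STRING if it is one of `my-definition-keywords', else nil."
  (save-match-data
    (and (string-match my-definition-regexp string)
         (match-string 1 string))))

;; The count of capturing groups tells a caller which `match-string'
;; index comes next when the generated regexp is embedded in a larger one.
(defconst my-definition-regexp-depth
  (regexp-opt-depth my-definition-regexp))
```

Anchoring with `\\`` and `\\'` is a choice made here for whole-string tests; when searching buffer text, a word-boundary suffix such as the commentary's `"\\>"` is the more usual form.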