# HG changeset patch
# User Francesco Potortì
# Date 1023966946 0
# Node ID d11816fe2c5970e1a8bb203e953c4feb59d296d1
# Parent 147a637372eb89efc9b0205f72470e1e2ff0de47
New multi-line regexp and new regexp syntax.

diff -r 147a637372eb -r d11816fe2c59 etc/NEWS
--- a/etc/NEWS Thu Jun 13 10:57:55 2002 +0000
+++ b/etc/NEWS Thu Jun 13 11:15:46 2002 +0000
@@ -569,6 +569,23 @@
 
 ** Etags changes.
 
+*** New syntax for regular expressions, multi-line regular expressions.
+The syntax --ignore-case-regexp=/REGEX/NAME/ is now undocumented and
+retained only for backward compatibility. The new equivalent syntax is
+--regex=/REGEX/NAME/i. More generally, it is --regex=/REGEX/NAME/MODS,
+where `/NAME' is optional, as usual, and MODS is a string of 0 or more
+characters among `i' (ignore case), `m' (multi-line) and `s'
+(single-line). The `m' and `s' modifiers behave as in Perl regular
+expressions: `m' allows regexps to match more than one line, while `s'
+(which implies `m') means that `.' matches newlines. The ability to
+span newlines allows writing of much more powerful regular expressions
+and rapid prototyping for tagging new languages.
+
+*** Regular expressions can use char escape sequences as in Gcc
+The escaped character sequence \a, \b, \d, \e, \f, \n, \r, \t, \v,
+respectively, stand for the ASCII characters BEL, BS, DEL, ESC, FF, NL,
+CR, TAB, VT,
+
 *** In Prolog, etags creates tags for rules in addition to predicates.
 
 *** In Perl, packages are tags.
@@ -596,9 +613,6 @@
 will read from standard input and mark the produced tags as belonging
 to the file FILE.
 
-*** Regular expressions can use char escape sequences as in Gcc
-These are the escapes \a, \b, \d, \e, \f, \n, \r, \t, \v.
-
 +++
 ** The command line option --no-windows has been changed to
 --no-window-system. The old one still works, but is deprecated.
diff -r 147a637372eb -r d11816fe2c59 etc/etags.1
--- a/etc/etags.1 Thu Jun 13 10:57:55 2002 +0000
+++ b/etc/etags.1 Thu Jun 13 11:15:46 2002 +0000
@@ -22,7 +22,6 @@
 [\|\-\-ignore\-indentation\|] [\|\-\-language=\fIlanguage\fP\|]
 [\|\-\-members\|] [\|\-\-output=\fItagfile\fP\|]
 [\|\-\-regex=\fIregexp\fP\|] [\|\-\-no\-regex\|]
-[\|\-\-ignore\-case\-regex=\fIregexp\fP\|]
 [\|\-\-help\|] [\|\-\-version\|]
 \fIfile\fP .\|.\|.
 
@@ -36,7 +35,6 @@
 [\|\-\-globals\|] [\|\-\-ignore\-indentation\|]
 [\|\-\-language=\fIlanguage\fP\|] [\|\-\-members\|]
 [\|\-\-output=\fItagfile\fP\|] [\|\-\-regex=\fIregexp\fP\|]
-[\|\-\-ignore\-case\-regex=\fIregexp\fP\|]
 [\|\-\-typedefs\|] [\|\-\-typedefs\-and\-c++\|]
 [\|\-\-update\|] [\|\-\-no\-warn\|]
 [\|\-\-help\|] [\|\-\-version\|]
@@ -149,27 +147,32 @@
 \fBtags\fP. (But ignored with \fB\-v\fP or \fB\-x\fP.)
 .TP
 \fB\-r\fP \fIregexp\fP, \fB\-\-regex=\fIregexp\fP
-.TP
-\fB\-\-ignore\-case\-regex=\fIregexp\fP
-Make tags based on regexp matching for each line of the files
-following this option, in addition to the tags made with the standard
-parsing based on language. When using \fB\-\-regex\fP, case is
-significant, while it is not with \fB\-\-ignore\-case\-regex\fP. May
-be freely intermixed with filenames and the \fB\-R\fP option. The
-regexps are cumulative, i.e. each option will add to the previous
-ones. The regexps are of the form:
+
+Make tags based on regexp matching for the files following this option,
+in addition to the tags made with the standard parsing based on
+language. May be freely intermixed with filenames and the \fB\-R\fP
+option. The regexps are cumulative, i.e. each such option will add to
+the previous ones. The regexps are of the form:
 .br
- \fB/\fP\fItagregexp\fP[\fB/\fP\fInameregexp\fP]\fB/\fP
+ \fB/\fP\fItagregexp/\fP[\fInameregexp\fP\fB/\fP]\fImodifiers\fP
 .br
-where \fItagregexp\fP is used to match the lines that must be tagged.
-It should not match useless characters. If the match is
-such that more characters than needed are unavoidably matched by
-\fItagregexp\fP, it may be useful to add a \fInameregexp\fP, to
-narrow down the tag scope. \fBctags\fP ignores regexps without a
-\fInameregexp\fP. The syntax of regexps is the same as in emacs.
-The following character escape sequences are supported:
-\\a, \\b, \\d, \\e, \\f, \\n, \\r, \\t, \\v.
+where \fItagregexp\fP is used to match the tag. It should not match
+useless characters. If the match is such that more characters than
+needed are unavoidably matched by \fItagregexp\fP, it may be useful to
+add a \fInameregexp\fP, to narrow down the tag scope. \fBctags\fP
+ignores regexps without a \fInameregexp\fP. The syntax of regexps is
+the same as in emacs. The following character escape sequences are
+supported: \\a, \\b, \\d, \\e, \\f, \\n, \\r, \\t, \\v, which
+respectively stand for the ASCII characters BEL, BS, DEL, ESC, FF, NL,
+CR, TAB, VT.
+.br
+The \fImodifiers\fP are a sequence of 0 or more characters among
+\fIi\fP, which means to ignore case when matching; \fIm\fP, which means
+that the \fItagregexp\fP will be matched against the whole file contents
+at once, rather than line by line, and the matching sequence can match
+multiple lines; and \fIs\fP, which implies \fIm\fP and means that the
+dot character in \fItagregexp\fP matches the newline char as well.
 .br
 Here are some examples. All the regexps are quoted to protect them
diff -r 147a637372eb -r d11816fe2c59 lib-src/ChangeLog
--- a/lib-src/ChangeLog Thu Jun 13 10:57:55 2002 +0000
+++ b/lib-src/ChangeLog Thu Jun 13 11:15:46 2002 +0000
@@ -1,3 +1,31 @@
+2002-06-12 Francesco Potorti`
+
+ * etags.c: New multi-line regexp and new regexp syntax.
+ (arg_type): at_icregexp label removed (obsolete).
+ (pattern): New member multi_line for multi-line regexps.
+ (filebuf): A global buffer containing the whole file as a string
+ for multi-line regexp matching.
+ (need_filebuf): Global flag raised if multi-line regexps used.
+ (print_help): Document new regexp modifiers, remove references to
+ obsolete option --ignore-case-regexp.
+ (main): Do not set regexp syntax and translation table here.
+ (main): Treat -c option as a backward compatibility hack.
+ (main, find_entries): Init and free filebuf.
+ (find_entries): Call regex_tag_multiline after the regular parser.
+ (scan_separators): Check for unterminated regexp and return NULL.
+ (analyse_regex, add_regex): Remove the ignore_case argument, which
+ is now a modifier to the regexp. All callers changed.
+ (add_regex): Manage the regexp modifiers.
+ (regex_tag_multiline): New function. Reads from filebuf.
+ (readline_internal): If necessary, copy the whole file into filebuf.
+ (readline): Skip multi-line regexps, leave them to regex_tag_multiline.
+
+2002-06-11 Francesco Potorti`
+
+ * etags.c (add_regex): Better check for null regexps.
+ (readline): Check for regex matching null string.
+ (find_entries): Reorganisation.
+
 2002-06-07 Francesco Potorti`
 
 * etags.c (scan_separators): Support all character escape