# HG changeset patch # User Richard M. Stallman # Date 764833265 0 # Node ID 3b84ed22f747cba614f83c0f63b9d1da6ac16765 # Parent 99ca8123a3ca690eb71ee1e8d870c76514f3546f Initial revision diff -r 99ca8123a3ca -r 3b84ed22f747 lispref/abbrevs.texi --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lispref/abbrevs.texi Mon Mar 28 05:41:05 1994 +0000 @@ -0,0 +1,331 @@ +@c -*-texinfo-*- +@c This is part of the GNU Emacs Lisp Reference Manual. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. +@c See the file elisp.texi for copying conditions. +@setfilename ../info/abbrevs +@node Abbrevs, Processes, Syntax Tables, Top +@chapter Abbrevs And Abbrev Expansion +@cindex abbrev +@cindex abbrev table + + An abbreviation or @dfn{abbrev} is a string of characters that may be +expanded to a longer string. The user can insert the abbrev string and +find it replaced automatically with the expansion of the abbrev. This +saves typing. + + The set of abbrevs currently in effect is recorded in an @dfn{abbrev +table}. Each buffer has a local abbrev table, but normally all buffers +in the same major mode share one abbrev table. There is also a global +abbrev table. Normally both are used. + + An abbrev table is represented as an obarray containing a symbol for +each abbreviation. The symbol's name is the abbreviation. Its value is +the expansion; its function definition is the hook function to do the +expansion (if any); its property list cell contains the use count, the +number of times the abbreviation has been expanded. Because these +symbols are not interned in the usual obarray, they will never appear as +the result of reading a Lisp expression; in fact, normally they are +never used except by the code that handles abbrevs. Therefore, it is +safe to use them in an extremely nonstandard way. @xref{Creating +Symbols}. + + For the user-level commands for abbrevs, see @ref{Abbrevs,, Abbrev +Mode, emacs, The GNU Emacs Manual}. 
+ +@menu +* Abbrev Mode:: Setting up Emacs for abbreviation. +* Tables: Abbrev Tables. Creating and working with abbrev tables. +* Defining Abbrevs:: Specifying abbreviations and their expansions. +* Files: Abbrev Files. Saving abbrevs in files. +* Expansion: Abbrev Expansion. Controlling expansion; expansion subroutines. +* Standard Abbrev Tables:: Abbrev tables used by various major modes. +@end menu + +@node Abbrev Mode, Abbrev Tables, Abbrevs, Abbrevs +@comment node-name, next, previous, up +@section Setting Up Abbrev Mode + + Abbrev mode is a minor mode controlled by the value of the variable +@code{abbrev-mode}. + +@defvar abbrev-mode +A non-@code{nil} value of this variable turns on the automatic expansion +of abbrevs when their abbreviations are inserted into a buffer. +If the value is @code{nil}, abbrevs may be defined, but they are not +expanded automatically. + +This variable automatically becomes local when set in any fashion. +@end defvar + +@defvar default-abbrev-mode +This is the value of @code{abbrev-mode} for buffers that do not override it. +This is the same as @code{(default-value 'abbrev-mode)}. +@end defvar + +@node Abbrev Tables, Defining Abbrevs, Abbrev Mode, Abbrevs +@section Abbrev Tables + + This section describes how to create and manipulate abbrev tables. + +@defun make-abbrev-table +This function creates and returns a new, empty abbrev table---an obarray +containing no symbols. It is a vector filled with zeros. +@end defun + +@defun clear-abbrev-table table +This function undefines all the abbrevs in abbrev table @var{table}, +leaving it empty. The function returns @code{nil}. +@end defun + +@defun define-abbrev-table tabname definitions +This function defines @var{tabname} (a symbol) as an abbrev table name, +i.e., as a variable whose value is an abbrev table. It defines abbrevs +in the table according to @var{definitions}, a list of elements of the +form @code{(@var{abbrevname} @var{expansion} @var{hook} +@var{usecount})}. 
The value is always @code{nil}. +@end defun + +@defvar abbrev-table-name-list +This is a list of symbols whose values are abbrev tables. +@code{define-abbrev-table} adds the new abbrev table name to this list. +@end defvar + +@defun insert-abbrev-table-description name &optional human +This function inserts before point a description of the abbrev table +named @var{name}. The argument @var{name} is a symbol whose value is an +abbrev table. The value is always @code{nil}. + +If @var{human} is non-@code{nil}, the description is human-oriented. +Otherwise the description is a Lisp expression---a call to +@code{define-abbrev-table} which would define @var{name} exactly as it +is currently defined. +@end defun + +@node Defining Abbrevs, Abbrev Files, Abbrev Tables, Abbrevs +@comment node-name, next, previous, up +@section Defining Abbrevs + + These functions define an abbrev in a specified abbrev table. +@code{define-abbrev} is the low-level basic function, while +@code{add-abbrev} is used by commands that ask for information from the +user. + +@defun add-abbrev table type arg +This function adds an abbreviation to abbrev table @var{table}. The +argument @var{type} is a string describing in English the kind of abbrev +this will be (typically, @code{"global"} or @code{"mode-specific"}); +this is used in prompting the user. The argument @var{arg} is the +number of words in the expansion. + +The return value is the symbol which internally represents the new +abbrev, or @code{nil} if the user declines to confirm redefining an +existing abbrev. +@end defun + +@defun define-abbrev table name expansion hook +This function defines an abbrev in @var{table} named @var{name}, to +expand to @var{expansion}, and call @var{hook}. The return value is an +uninterned symbol which represents the abbrev inside Emacs; its name is +@var{name}. + +The argument @var{name} should be a string. The argument +@var{expansion} should be a string, or @code{nil}, to undefine the +abbrev. 
+ +The argument @var{hook} is a function or @code{nil}. If @var{hook} is +non-@code{nil}, then it is called with no arguments after the abbrev is +replaced with @var{expansion}; point is located at the end of +@var{expansion}. + +The use count of the abbrev is initialized to zero. +@end defun + +@defopt only-global-abbrevs +If this variable is non-@code{nil}, it means that the user plans to use +global abbrevs only. This tells the commands that define mode-specific +abbrevs to define global ones instead. This variable does not alter the +behavior of the functions in this section; it is examined by their +callers. +@end defopt + +@node Abbrev Files, Abbrev Expansion, Defining Abbrevs, Abbrevs +@section Saving Abbrevs in Files + + A file of saved abbrev definitions is actually a file of Lisp code. +The abbrevs are saved in the form of a Lisp program to define the same +abbrev tables with the same contents. Therefore, you can load the file +with @code{load} (@pxref{How Programs Do Loading}). However, the +function @code{quietly-read-abbrev-file} is provided as a more +convenient interface. + + User-level facilities such as @code{save-some-buffers} can save +abbrevs in a file automatically, under the control of variables +described here. + +@defopt abbrev-file-name +This is the default file name for reading and saving abbrevs. +@end defopt + +@defun quietly-read-abbrev-file filename +This function reads abbrev definitions from a file named @var{filename}, +previously written with @code{write-abbrev-file}. If @var{filename} is +@code{nil}, the file specified in @code{abbrev-file-name} is used. +@code{save-abbrevs} is set to @code{t} so that changes will be saved. + +This function does not display any messages. It returns @code{nil}. +@end defun + +@defopt save-abbrevs +A non-@code{nil} value for @code{save-abbrevs} means that Emacs should +save abbrevs when files are saved. @code{abbrev-file-name} specifies +the file to save the abbrevs in. 
+@end defopt + +@defvar abbrevs-changed +This variable is set non-@code{nil} by defining or altering any +abbrevs. This serves as a flag for various Emacs commands to offer to +save your abbrevs. +@end defvar + +@deffn Command write-abbrev-file filename +Save all abbrev definitions, in all abbrev tables, in the file +@var{filename}, in the form of a Lisp program which when loaded will +define the same abbrevs. This function returns @code{nil}. +@end deffn + +@node Abbrev Expansion, Standard Abbrev Tables, Abbrev Files, Abbrevs +@comment node-name, next, previous, up +@section Looking Up and Expanding Abbreviations + + Abbrevs are usually expanded by commands for interactive use, +including @code{self-insert-command}. This section describes the +subroutines used in writing such functions, as well as the variables +they use for communication. + +@defun abbrev-symbol abbrev &optional table +This function returns the symbol representing the abbrev named +@var{abbrev}. The value returned is @code{nil} if that abbrev is not +defined. The optional second argument @var{table} is the abbrev table +to look it up in. If @var{table} is @code{nil}, this function tries +first the current buffer's local abbrev table, and second the global +abbrev table. +@end defun + +@defopt abbrev-all-caps +When this is set non-@code{nil}, an abbrev entered entirely in upper +case is expanded using all upper case. Otherwise, an abbrev entered +entirely in upper case is expanded by capitalizing each word of the +expansion. +@end defopt + +@defun abbrev-expansion abbrev &optional table +This function returns the string that @var{abbrev} would expand into (as +defined by the abbrev tables used for the current buffer). The optional +argument @var{table} specifies the abbrev table to use; if it is +specified, the abbrev is looked up in that table only. 
+@end defun + +@defvar abbrev-start-location +This is the buffer position for @code{expand-abbrev} to use as the start +of the next abbrev to be expanded. (@code{nil} means use the word +before point instead.) @code{abbrev-start-location} is set to +@code{nil} each time @code{expand-abbrev} is called. This variable is +also set by @code{abbrev-prefix-mark}. +@end defvar + +@defvar abbrev-start-location-buffer +The value of this variable is the buffer for which +@code{abbrev-start-location} has been set. Trying to expand an abbrev +in any other buffer clears @code{abbrev-start-location}. This variable +is set by @code{abbrev-prefix-mark}. +@end defvar + +@defvar last-abbrev +This is the @code{abbrev-symbol} of the last abbrev expanded. This +information is left by @code{expand-abbrev} for the sake of the +@code{unexpand-abbrev} command. +@end defvar + +@defvar last-abbrev-location +This is the location of the last abbrev expanded. This contains +information left by @code{expand-abbrev} for the sake of the +@code{unexpand-abbrev} command. +@end defvar + +@defvar last-abbrev-text +This is the exact expansion text of the last abbrev expanded, as +results from case conversion. Its value is +@code{nil} if the abbrev has already been unexpanded. This +contains information left by @code{expand-abbrev} for the sake of the +@code{unexpand-abbrev} command. +@end defvar + +@c Emacs 19 feature +@defvar pre-abbrev-expand-hook +This is a normal hook whose functions are executed, in sequence, just +before any expansion of an abbrev. @xref{Hooks}. Since it is a normal +hook, the hook functions receive no arguments. However, they can find +the abbrev to be expanded by looking in the buffer before point. +@end defvar + + The following sample code shows a simple use of +@code{pre-abbrev-expand-hook}. If the user terminates an abbrev with a +punctuation character, the hook function asks for confirmation. 
Thus, +this hook allows the user to decide whether to expand the abbrev, and +aborts expansion if it is not confirmed. + +@smallexample +(add-hook 'pre-abbrev-expand-hook 'query-if-not-space) + +;; @r{This is the function invoked by @code{pre-abbrev-expand-hook}.} + +;; @r{If the user terminated the abbrev with a space, the function does} +;; @r{nothing (that is, it returns so that the abbrev can expand). If the} +;; @r{user entered some other character, this function asks whether} +;; @r{expansion should continue.} + +;; @r{If the user answers the prompt with @kbd{y}, the function returns} +;; @r{@code{nil} (because of the @code{not} function), but that is} +;; @r{acceptable; the return value has no effect on expansion.} + +(defun query-if-not-space () + (if (/= ?\ (preceding-char)) + (if (not (y-or-n-p "Do you want to expand this abbrev? ")) + (error "Not expanding this abbrev")))) +@end smallexample + +@node Standard Abbrev Tables, , Abbrev Expansion, Abbrevs +@comment node-name, next, previous, up +@section Standard Abbrev Tables + + Here we list the variables that hold the abbrev tables for the +preloaded major modes of Emacs. + +@defvar global-abbrev-table +This is the abbrev table for mode-independent abbrevs. The abbrevs +defined in it apply to all buffers. Each buffer may also have a local +abbrev table, whose abbrev definitions take precedence over those in the +global table. +@end defvar + +@defvar local-abbrev-table +The value of this buffer-local variable is the (mode-specific) +abbreviation table of the current buffer. +@end defvar + +@defvar fundamental-mode-abbrev-table +This is the local abbrev table used in Fundamental mode. It is the +local abbrev table in all buffers in Fundamental mode. +@end defvar + +@defvar text-mode-abbrev-table +This is the local abbrev table used in Text mode. +@end defvar + +@defvar c-mode-abbrev-table +This is the local abbrev table used in C mode. 
+@end defvar + +@defvar lisp-mode-abbrev-table +This is the local abbrev table used in Lisp mode and Emacs Lisp mode. +@end defvar diff -r 99ca8123a3ca -r 3b84ed22f747 lispref/positions.texi --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lispref/positions.texi Mon Mar 28 05:41:05 1994 +0000 @@ -0,0 +1,896 @@ +@c -*-texinfo-*- +@c This is part of the GNU Emacs Lisp Reference Manual. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. +@c See the file elisp.texi for copying conditions. +@setfilename ../info/positions +@node Positions, Markers, Frames, Top +@chapter Positions +@cindex position (in buffer) + + A @dfn{position} is the index of a character in the text of a buffer. +More precisely, a position identifies the place between two characters +(or before the first character, or after the last character), so we can +speak of the character before or after a given position. However, +we often speak of the character ``at'' a position, meaning the character +after that position. + + Positions are usually represented as integers starting from 1, but can +also be represented as @dfn{markers}---special objects which relocate +automatically when text is inserted or deleted so they stay with the +surrounding characters. @xref{Markers}. + +@menu +* Point:: The special position where editing takes place. +* Motion:: Changing point. +* Excursions:: Temporary motion and buffer changes. +* Narrowing:: Restricting editing to a portion of the buffer. +@end menu + +@node Point +@section Point +@cindex point + + @dfn{Point} is a special buffer position used by many editing +commands, including the self-inserting typed characters and text +insertion functions. Other commands move point through the text +to allow editing and insertion at different places. + + Like other positions, point designates a place between two characters +(or before the first character, or after the last character), rather +than a particular character. 
Many terminals display the cursor over the +character that immediately follows point; on such terminals, point is +actually before the character on which the cursor sits. + +@cindex point with narrowing + The value of point is a number between 1 and the buffer size plus 1. +If narrowing is in effect (@pxref{Narrowing}), then point is constrained +to fall within the accessible portion of the buffer (possibly at one end +of it). + + Each buffer has its own value of point, which is independent of the +value of point in other buffers. Each window also has a value of point, +which is independent of the value of point in other windows on the same +buffer. This is why point can have different values in various windows +that display the same buffer. When a buffer appears in only one window, +the buffer's point and the window's point normally have the same value, +so the distinction is rarely important. @xref{Window Point}, for more +details. + +@defun point +@cindex current buffer position + This function returns the position of point in the current buffer, +as an integer. + +@need 700 +@example +@group +(point) + @result{} 175 +@end group +@end example +@end defun + +@defun point-min + This function returns the minimum accessible value of point in the +current buffer. This is 1, unless narrowing is in effect, in +which case it is the position of the start of the region that you +narrowed to. (@xref{Narrowing}.) +@end defun + +@defun point-max + This function returns the maximum accessible value of point in the +current buffer. This is @code{(1+ (buffer-size))}, unless narrowing is +in effect, in which case it is the position of the end of the region +that you narrowed to. (@xref{Narrowing}). +@end defun + +@defun buffer-end flag + This function returns @code{(point-min)} if @var{flag} is less than 1, +@code{(point-max)} otherwise. The argument @var{flag} must be a number. 
+@end defun + +@defun buffer-size + This function returns the total number of characters in the current +buffer. In the absence of any narrowing (@pxref{Narrowing}), +@code{point-max} returns a value one larger than this. + +@example +@group +(buffer-size) + @result{} 35 +@end group +@group +(point-max) + @result{} 36 +@end group +@end example +@end defun + +@defvar buffer-saved-size + The value of this buffer-local variable is the former length of the +current buffer, as of the last time it was read in, saved or auto-saved. +@end defvar + +@node Motion +@section Motion + + Motion functions change the value of point, either relative to the +current value of point, relative to the beginning or end of the buffer, +or relative to the edges of the selected window. @xref{Point}. + +@menu +* Character Motion:: Moving in terms of characters. +* Word Motion:: Moving in terms of words. +* Buffer End Motion:: Moving to the beginning or end of the buffer. +* Text Lines:: Moving in terms of lines of text. +* Screen Lines:: Moving in terms of lines as displayed. +* Vertical Motion:: Implementation of @code{next-line} and + @code{previous-line}. +* List Motion:: Moving by parsing lists and sexps. +* Skipping Characters:: Skipping characters belonging to a certain set. +@end menu + +@node Character Motion +@subsection Motion by Characters + + These functions move point based on a count of characters. +@code{goto-char} is the fundamental primitive; the other functions use +it. + +@deffn Command goto-char position +This function sets point in the current buffer to the value +@var{position}. If @var{position} is less than 1, it moves point to the +beginning of the buffer. If @var{position} is greater than the length +of the buffer, it moves point to the end. + +If narrowing is in effect, @var{position} still counts from the +beginning of the buffer, but point cannot go outside the accessible +portion. 
If @var{position} is out of range, @code{goto-char} moves +point to the beginning or the end of the accessible portion. + +When this function is called interactively, @var{position} is the +numeric prefix argument, if provided; otherwise it is read from the +minibuffer. + +@code{goto-char} returns @var{position}. +@end deffn + +@deffn Command forward-char &optional count +@c @kindex beginning-of-buffer +@c @kindex end-of-buffer +This function moves point @var{count} characters forward, towards the +end of the buffer (or backward, towards the beginning of the buffer, if +@var{count} is negative). If the function attempts to move point past +the beginning or end of the buffer (or the limits of the accessible +portion, when narrowing is in effect), an error is signaled with error +code @code{beginning-of-buffer} or @code{end-of-buffer}. + +In an interactive call, @var{count} is the numeric prefix argument. +@end deffn + +@deffn Command backward-char &optional count +This function moves point @var{count} characters backward, towards the +beginning of the buffer (or forward, towards the end of the buffer, if +@var{count} is negative). If the function attempts to move point past +the beginning or end of the buffer (or the limits of the accessible +portion, when narrowing is in effect), an error is signaled with error +code @code{beginning-of-buffer} or @code{end-of-buffer}. + +In an interactive call, @var{count} is the numeric prefix argument. +@end deffn + +@node Word Motion +@subsection Motion by Words + + These functions for parsing words use the syntax table to decide +whether a given character is part of a word. @xref{Syntax Tables}. + +@deffn Command forward-word count +This function moves point forward @var{count} words (or backward if +@var{count} is negative). Normally it returns @code{t}. 
If this motion +encounters the beginning or end of the buffer, or the limits of the +accessible portion when narrowing is in effect, point stops there +and the value is @code{nil}. + +In an interactive call, @var{count} is set to the numeric prefix +argument. +@end deffn + +@deffn Command backward-word count +This function is just like @code{forward-word}, except that it moves +backward until encountering the front of a word, rather than forward. + +In an interactive call, @var{count} is set to the numeric prefix +argument. + +This function is rarely used in programs, as it is more efficient to +call @code{forward-word} with a negative argument. +@end deffn + +@defvar words-include-escapes +@c Emacs 19 feature +This variable affects the behavior of @code{forward-word} and everything +that uses it. If it is non-@code{nil}, then characters in the +``escape'' and ``character quote'' syntax classes count as part of +words. Otherwise, they do not. +@end defvar + +@node Buffer End Motion +@subsection Motion to an End of the Buffer + + To move point to the beginning of the buffer, write: + +@example +@group +(goto-char (point-min)) +@end group +@end example + +@noindent +Likewise, to move to the end of the buffer, use: + +@example +@group +(goto-char (point-max)) +@end group +@end example + + Here are two commands which users use to do these things. They are +documented here to warn you not to use them in Lisp programs, because +they set the mark and display messages in the echo area. + +@deffn Command beginning-of-buffer &optional n +This function moves point to the beginning of the buffer (or the limits +of the accessible portion, when narrowing is in effect), setting the +mark at the previous position. If @var{n} is non-@code{nil}, then it +puts point @var{n} tenths of the way from the beginning of the buffer. + +In an interactive call, @var{n} is the numeric prefix argument, +if provided; otherwise @var{n} defaults to @code{nil}. 
+ +Don't use this function in Lisp programs! +@end deffn + +@deffn Command end-of-buffer &optional n +This function moves point to the end of the buffer (or the limits of +the accessible portion, when narrowing is in effect), setting the mark +at the previous position. If @var{n} is non-@code{nil}, then it puts +point @var{n} tenths of the way from the end. + +In an interactive call, @var{n} is the numeric prefix argument, +if provided; otherwise @var{n} defaults to @code{nil}. + +Don't use this function in Lisp programs! +@end deffn + +@node Text Lines +@subsection Motion by Text Lines +@cindex lines + + Text lines are portions of the buffer delimited by newline characters, +which are regarded as part of the previous line. The first text line +begins at the beginning of the buffer, and the last text line ends at +the end of the buffer whether or not the last character is a newline. +The division of the buffer into text lines is not affected by the width +of the window, by line continuation in display, or by how tabs and +control characters are displayed. + +@deffn Command goto-line line +This function moves point to the front of the @var{line}th line, +counting from line 1 at the beginning of the buffer. If @var{line} is less than +1, it moves point to the beginning of the buffer. If @var{line} is +greater than the number of lines in the buffer, it moves point to the +@emph{end of the last line} of the buffer. + +If narrowing is in effect, then @var{line} still counts from the +beginning of the buffer, but point cannot go outside the accessible +portion. So @code{goto-line} moves point to the beginning or end of the +accessible portion, if the line number specifies an inaccessible +position. + +The return value of @code{goto-line} is the difference between +@var{line} and the line number of the line to which point actually was +able to move (in the full buffer, disregarding any narrowing). Thus, the +value is positive if the scan encounters the real end of the buffer. 
+ +In an interactive call, @var{line} is the numeric prefix argument if +one has been provided. Otherwise @var{line} is read in the minibuffer. +@end deffn + +@deffn Command beginning-of-line &optional count +This function moves point to the beginning of the current line. With an +argument @var{count} not @code{nil} or 1, it moves forward +@var{count}@minus{}1 lines and then to the beginning of the line. + +If this function reaches the end of the buffer (or of the accessible +portion, if narrowing is in effect), it positions point at the end of +the buffer. No error is signaled. +@end deffn + +@deffn Command end-of-line &optional count +This function moves point to the end of the current line. With an +argument @var{count} not @code{nil} or 1, it moves forward +@var{count}@minus{}1 lines and then to the end of the line. + +If this function reaches the end of the buffer (or of the accessible +portion, if narrowing is in effect), it positions point at the end of +the buffer. No error is signaled. +@end deffn + +@deffn Command forward-line &optional count +@cindex beginning of line +This function moves point forward @var{count} lines, to the beginning of +the line. If @var{count} is negative, it moves point +@minus{}@var{count} lines backward, to the beginning of the line. + +If @code{forward-line} encounters the beginning or end of the buffer (or +of the accessible portion) before finding that many lines, it sets point +there. No error is signaled. + +@code{forward-line} returns the difference between @var{count} and the +number of lines actually moved. If you attempt to move down five lines +from the beginning of a buffer that has only three lines, point stops at +the end of the last line, and the value will be 2. + +In an interactive call, @var{count} is the numeric prefix argument. +@end deffn + +@defun count-lines start end +@cindex lines in region +This function returns the number of lines between the positions +@var{start} and @var{end} in the current buffer. 
If @var{start} and +@var{end} are equal, then it returns 0. Otherwise it returns at least +1, even if @var{start} and @var{end} are on the same line. This is +because the text between them, considered in isolation, must contain at +least one line unless it is empty. + +Here is an example of using @code{count-lines}: + +@example +@group +(defun current-line () + "Return the vertical position of point@dots{}" + (+ (count-lines (window-start) (point)) + (if (= (current-column) 0) 1 0) + -1)) +@end group +@end example +@end defun + +@ignore +@c ================ +The @code{previous-line} and @code{next-line} commands are functions +that should not be used in programs. They are for users and are +mentioned here only for completeness. + +@deffn Command previous-line count +@cindex goal column +This function moves point up @var{count} lines (down if @var{count} +is negative). In moving, it attempts to keep point in the ``goal column'' +(normally the same column that it was at the beginning of the move). + +If there is no character in the target line exactly under the current +column, point is positioned after the character in that line which +spans this column, or at the end of the line if it is not long enough. + +If it attempts to move beyond the top or bottom of the buffer (or clipped +region), then point is positioned in the goal column in the top or +bottom line. No error is signaled. + +In an interactive call, @var{count} will be the numeric +prefix argument. + +The command @code{set-goal-column} can be used to create a semipermanent +goal column to which this command always moves. Then it does not try to +move vertically. + +If you are thinking of using this in a Lisp program, consider using +@code{forward-line} with a negative argument instead. It is usually easier +to use and more reliable (no dependence on goal column, etc.). +@end deffn + +@deffn Command next-line count +This function moves point down @var{count} lines (up if @var{count} +is negative). 
In moving, it attempts to keep point in the ``goal column'' +(normally the same column that it was at the beginning of the move). + +If there is no character in the target line exactly under the current +column, point is positioned after the character in that line which +spans this column, or at the end of the line if it is not long enough. + +If it attempts to move beyond the top or bottom of the buffer (or clipped +region), then point is positioned in the goal column in the top or +bottom line. No error is signaled. + +In the case where the @var{count} is 1, and point is on the last +line of the buffer (or clipped region), a new empty line is inserted at the +end of the buffer (or clipped region) and point moved there. + +In an interactive call, @var{count} will be the numeric +prefix argument. + +The command @code{set-goal-column} can be used to create a semipermanent +goal column to which this command always moves. Then it does not try to +move vertically. + +If you are thinking of using this in a Lisp program, consider using +@code{forward-line} instead. It is usually easier +to use and more reliable (no dependence on goal column, etc.). +@end deffn + +@c ================ +@end ignore + + Also see the functions @code{bolp} and @code{eolp} in @ref{Near Point}. +These functions do not move point, but test whether it is already at the +beginning or end of a line. + +@node Screen Lines +@subsection Motion by Screen Lines + + The line functions in the previous section count text lines, delimited +only by newline characters. By contrast, these functions count screen +lines, which are defined by the way the text appears on the screen. A +text line is a single screen line if it is short enough to fit the width +of the selected window, but otherwise it may occupy several screen +lines. + + In some cases, text lines are truncated on the screen rather than +continued onto additional screen lines. 
In these cases, +@code{vertical-motion} moves point much like @code{forward-line}. +@xref{Truncation}. + + Because the width of a given string depends on the flags which control +the appearance of certain characters, @code{vertical-motion} behaves +differently, for a given piece of text, depending on the buffer it is +in, and even on the selected window (because the width, the truncation +flag, and display table may vary between windows). @xref{Usual +Display}. + +@defun vertical-motion count +This function moves point to the start of the screen line @var{count} +screen lines down from the screen line containing point. If @var{count} +is negative, it moves up instead. + +This function returns the number of lines moved. The value may be less +in absolute value than @var{count} if the beginning or end of the buffer +was reached. +@end defun + +@deffn Command move-to-window-line count +This function moves point with respect to the text currently displayed +in the selected window. It moves point to the beginning of the screen +line @var{count} screen lines from the top of the window. If +@var{count} is negative, that specifies a position +@w{@minus{}@var{count}} lines from the bottom---or else the last line of +the buffer, if the buffer ends above the specified screen position. + +If @var{count} is @code{nil}, then point moves to the beginning of the +line in the middle of the window. If the absolute value of @var{count} +is greater than the size of the window, then point moves to the place +which would appear on that screen line if the window were tall enough. +This will probably cause the next redisplay to scroll to bring that +location onto the screen. + +In an interactive call, @var{count} is the numeric prefix argument. + +The value returned is the window line number, with the top line in the +window numbered 0. +@end deffn + +@defun compute-motion from frompos to topos width offsets +This function scans through the current buffer, calculating screen +position. 
It scans the current buffer forward from position @var{from}, +assuming that is at screen coordinates @var{frompos}, to position +@var{to} or coordinates @var{topos}, whichever comes first. It returns +the ending buffer position and screen coordinates. + +The coordinate arguments @var{frompos} and @var{topos} are cons cells of +the form @code{(@var{hpos} . @var{vpos})}. + +The argument @var{width} is the number of columns available to display +text; this affects handling of continuation lines. Use the value +returned by @code{window-width} for the window of your choice. + +The argument @var{offsets} is either @code{nil} or a cons cell of the +form @code{(@var{hscroll} . @var{tab-offset})}. Here @var{hscroll} is +the number of columns not being displayed at the left margin; in most +calls, this comes from @code{window-hscroll}. Meanwhile, +@var{tab-offset} is the number of columns of an initial tab character +(at @var{from}) that aren't included in the display, perhaps because the +line was continued within that character. + +The return value is a list of five elements: + +@example +(@var{pos} @var{vpos} @var{hpos} @var{prevhpos} @var{contin}) +@end example + +@noindent +Here @var{pos} is the buffer position where the scan stopped, @var{vpos} +is the vertical position, and @var{hpos} is the horizontal position. + +The result @var{prevhpos} is the horizontal position one character back +from @var{pos}. The result @var{contin} is @code{t} if a line was +continued after (or within) the previous character. + +For example, to find the buffer position of column @var{col} of line +@var{line} of a certain window, pass the window's display start location +as @var{from} and the window's upper-left coordinates as @var{frompos}. +Pass the buffer's @code{(point-max)} as @var{to}, to limit the scan to +the end of the visible section of the buffer, and pass @var{line} and +@var{col} as @var{topos}. 
Here's a function that does this: + +@example +(defun coordinates-of-position (col line) + (car (compute-motion (window-start) + '(0 . 0) + (point-max) + (cons col line) + (window-width) + (cons (window-hscroll) 0)))) +@end example +@end defun + +@node Vertical Motion +@comment node-name, next, previous, up +@subsection The User-Level Vertical Motion Commands +@cindex goal column +@cindex vertical text line motion +@findex next-line +@findex previous-line + + A goal column is useful if you want to edit text such as a table in +which you want to move point to a certain column on each line. The goal +column affects the vertical text line motion commands, @code{next-line} +and @code{previous-line}. @xref{Basic,, Basic Editing Commands, emacs, +The GNU Emacs Manual}. + +@defopt goal-column +This variable holds an explicitly specified goal column for vertical +line motion commands. If it is an integer, it specifies a column, and +these commands try to move to that column on each line. If it is +@code{nil}, then the commands set their own goal columns. Any other +value is invalid. +@end defopt + +@defvar temporary-goal-column +This variable holds the temporary goal column during a sequence of +consecutive vertical line motion commands. It is overridden by +@code{goal-column} if that is non-@code{nil}. It is set each time a +vertical motion command is invoked, unless the previous command was also +a vertical motion command. +@end defvar + +@defopt track-eol +This variable controls how the vertical line motion commands operate +when starting at the end of a line. If @code{track-eol} is +non-@code{nil}, then vertical motion starting at the end of a line will +keep to the ends of lines. This means moving to the end of each line +moved onto. The value of @code{track-eol} has no effect if point is not +at the end of a line when the first vertical motion command is given.
+ +@code{track-eol} has its effect by telling line motion commands to set +@code{temporary-goal-column} to 9999 instead of to the current column. +@end defopt + +@node List Motion +@comment node-name, next, previous, up +@subsection Moving over Balanced Expressions +@cindex sexp motion +@cindex Lisp expression motion +@cindex list motion + + Here are several functions concerned with balanced-parenthesis +expressions (also called @dfn{sexps} in connection with moving across +them in Emacs). The syntax table controls how these functions interpret +various characters; see @ref{Syntax Tables}. @xref{Parsing +Expressions}, for lower-level primitives for scanning sexps or parts of +sexps. For user-level commands, see @ref{Lists and Sexps,,, emacs, GNU +Emacs Manual}. + +@deffn Command forward-list arg +Move forward across @var{arg} balanced groups of parentheses. +(Other syntactic entities such as words or paired string quotes +are ignored.) +@end deffn + +@deffn Command backward-list arg +Move backward across @var{arg} balanced groups of parentheses. +(Other syntactic entities such as words or paired string quotes +are ignored.) +@end deffn + +@deffn Command up-list arg +Move forward out of @var{arg} levels of parentheses. +A negative argument means move backward but still to a less deep spot. +@end deffn + +@deffn Command down-list arg +Move forward down @var{arg} levels of parentheses. A negative argument +means move backward but still go down @var{arg} levels. +@end deffn + +@deffn Command forward-sexp arg +Move forward across @var{arg} balanced expressions. +Balanced expressions include both those delimited by parentheses +and other kinds, such as words and string constants. 
For example, + +@example +@group +---------- Buffer: foo ---------- +(concat@point{} "foo " (car x) y z) +---------- Buffer: foo ---------- +@end group + +@group +(forward-sexp 3) + @result{} nil + +---------- Buffer: foo ---------- +(concat "foo " (car x) y@point{} z) +---------- Buffer: foo ---------- +@end group +@end example +@end deffn + +@deffn Command backward-sexp arg +Move backward across @var{arg} balanced expressions. +@end deffn + +@node Skipping Characters +@comment node-name, next, previous, up +@subsection Skipping Characters +@cindex skipping characters + + The following two functions move point over a specified set of +characters. For example, they are often used to skip whitespace. For +related functions, see @ref{Motion and Syntax}. + +@defun skip-chars-forward character-set &optional limit +This function moves point in the current buffer forward, skipping over a +given set of characters. It examines the character following point, +then advances point if the character matches @var{character-set}. This +continues until it reaches a character that does not match. The +function returns @code{nil}. + +The argument @var{character-set} is like the inside of a +@samp{[@dots{}]} in a regular expression except that @samp{]} is never +special and @samp{\} quotes @samp{^}, @samp{-} or @samp{\}. Thus, +@code{"a-zA-Z"} skips over all letters, stopping before the first +nonletter, and @code{"^a-zA-Z"} skips nonletters, stopping before the +first letter. @xref{Regular Expressions}. + +If @var{limit} is supplied (it must be a number or a marker), it +specifies the maximum position in the buffer that point can be skipped +to. Point will stop at or before @var{limit}. + +In the following example, point is initially located directly before the +@samp{T}. After the form is evaluated, point is located at the end of +that line (between the @samp{t} of @samp{hat} and the newline). The +function skips all letters and spaces, but not newlines.
+ +@example +@group +---------- Buffer: foo ---------- +I read "@point{}The cat in the hat +comes back" twice. +---------- Buffer: foo ---------- +@end group + +@group +(skip-chars-forward "a-zA-Z ") + @result{} nil + +---------- Buffer: foo ---------- +I read "The cat in the hat@point{} +comes back" twice. +---------- Buffer: foo ---------- +@end group +@end example +@end defun + +@defun skip-chars-backward character-set &optional limit +This function moves point backward, skipping characters that match +@var{character-set}, until @var{limit}. It is just like +@code{skip-chars-forward} except for the direction of motion. +@end defun + +@node Excursions +@section Excursions +@cindex excursion + + It is often useful to move point ``temporarily'' within a localized +portion of the program, or to switch buffers temporarily. This is +called an @dfn{excursion}, and it is done with the @code{save-excursion} +special form. This construct saves the current buffer and its values of +point and the mark so they can be restored after the completion of the +excursion. + + The forms for saving and restoring the configuration of windows are +described elsewhere (see @ref{Window Configurations}, and @pxref{Frame +Configurations}). + +@defspec save-excursion forms@dots{} +@cindex mark excursion +@cindex point excursion +@cindex current buffer excursion +The @code{save-excursion} special form saves the identity of the current +buffer and the values of point and the mark in it, evaluates @var{forms}, +and finally restores the buffer and its saved values of point and the mark. +All three saved values are restored even in case of an abnormal exit +via throw or error (@pxref{Nonlocal Exits}). + +The @code{save-excursion} special form is the standard way to switch +buffers or move point within one part of a program and avoid affecting +the rest of the program. It is used more than 500 times in the Lisp +sources of Emacs.
+ +@code{save-excursion} does not save the values of point and the mark for +other buffers, so changes in other buffers remain in effect after +@code{save-excursion} exits. + +@cindex window excursions +Likewise, @code{save-excursion} does not restore window-buffer +correspondences altered by functions such as @code{switch-to-buffer}. +One way to restore these correspondences, and the selected window, is to +use @code{save-window-excursion} inside @code{save-excursion} +(@pxref{Window Configurations}). + +The value returned by @code{save-excursion} is the result of the last of +@var{forms}, or @code{nil} if no @var{forms} are given. + +@example +@group +(save-excursion + @var{forms}) +@equiv{} +(let ((old-buf (current-buffer)) + (old-pnt (point-marker)) + (old-mark (copy-marker (mark-marker)))) + (unwind-protect + (progn @var{forms}) + (set-buffer old-buf) + (goto-char old-pnt) + (set-marker (mark-marker) old-mark))) +@end group +@end example +@end defspec + +@node Narrowing +@section Narrowing +@cindex narrowing +@cindex restriction (in a buffer) +@cindex accessible portion (of a buffer) + + @dfn{Narrowing} means limiting the text addressable by Emacs editing +commands to a limited range of characters in a buffer. The text that +remains addressable is called the @dfn{accessible portion} of the +buffer. + + Narrowing is specified with two buffer positions which become the +beginning and end of the accessible portion. For most editing commands +and most Emacs primitives, these positions replace the values of the +beginning and end of the buffer. While narrowing is in effect, no text +outside the accessible portion is displayed, and point cannot move +outside the accessible portion. + + Values such as positions or line numbers, that usually count from the +beginning of the buffer, do so despite narrowing, but the functions +which use them refuse to operate on text that is inaccessible. 
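+
+  As a minimal sketch of this behavior, the following narrows a small
+scratch buffer and reports its bounds.  The function name
+@code{narrowing-sketch} and the buffer name are hypothetical, chosen
+only for illustration.  Positions still count from the true beginning
+of the buffer, but point cannot leave the accessible portion:
+
+@example
+@group
+(defun narrowing-sketch ()
+  ;; Hypothetical helper: narrow a scratch buffer and report positions.
+  (let ((buf (get-buffer-create " *narrowing-sketch*")))
+    (unwind-protect
+        (save-excursion
+          (set-buffer buf)
+          (erase-buffer)
+          (insert "one\ntwo\nthree\n")
+          ;; Make only "two" (positions 5 through 7) accessible.
+          (narrow-to-region 5 8)
+          ;; Position 1 is now inaccessible, so point stops at 5.
+          (goto-char 1)
+          (list (point-min) (point-max) (point) (buffer-string)))
+      (kill-buffer buf))))
+@end group
+
+@group
+(narrowing-sketch)
+     @result{} (5 8 5 "two")
+@end group
+@end example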
+ + The commands for saving buffers are unaffected by narrowing; they save +the entire buffer regardless of any narrowing. + +@deffn Command narrow-to-region start end +This function sets the accessible portion of the current buffer to start +at @var{start} and end at @var{end}. Both arguments should be character +positions. + +In an interactive call, @var{start} and @var{end} are set to the bounds +of the current region (point and the mark, with the smallest first). +@end deffn + +@deffn Command narrow-to-page move-count +This function sets the accessible portion of the current buffer to +include just the current page. An optional first argument +@var{move-count} non-@code{nil} means to move forward or backward by +@var{move-count} pages and then narrow. The variable +@code{page-delimiter} specifies where pages start and end +(@pxref{Standard Regexps}). + +In an interactive call, @var{move-count} is set to the numeric prefix +argument. +@end deffn + +@deffn Command widen +@cindex widening +This function cancels any narrowing in the current buffer, so that the +entire contents are accessible. This is called @dfn{widening}. +It is equivalent to the following expression: + +@example +(narrow-to-region 1 (1+ (buffer-size))) +@end example +@end deffn + +@defspec save-restriction body@dots{} +This special form saves the current bounds of the accessible portion, +evaluates the @var{body} forms, and finally restores the saved bounds, +thus restoring the same state of narrowing (or absence thereof) formerly +in effect. The state of narrowing is restored even in the event of an +abnormal exit via throw or error (@pxref{Nonlocal Exits}). Therefore, +this construct is a clean way to narrow a buffer temporarily. + +The value returned by @code{save-restriction} is that returned by the +last form in @var{body}, or @code{nil} if no body forms were given. + +@c Wordy to avoid overfull hbox.
--rjc 16mar92 +@strong{Caution:} it is easy to make a mistake when using the +@code{save-restriction} construct. Read the entire description here +before you try it. + +If @var{body} changes the current buffer, @code{save-restriction} still +restores the restrictions on the original buffer (the buffer whose +restrictions it saved), but it does not restore the identity of the +current buffer. + +@code{save-restriction} does @emph{not} restore point and the mark; use +@code{save-excursion} for that. If you use both @code{save-restriction} +and @code{save-excursion} together, @code{save-excursion} should come +first (on the outside). Otherwise, the old point value would be +restored with temporary narrowing still in effect. If the old point +value were outside the limits of the temporary narrowing, this would +fail to restore it accurately. + +The @code{save-restriction} special form records the values of the +beginning and end of the accessible portion as distances from the +beginning and end of the buffer. In other words, it records the amount +of inaccessible text before and after the accessible portion. + +This method yields correct results if @var{body} does further narrowing. +However, @code{save-restriction} can become confused if the body widens +and then makes changes outside the range of the saved narrowing. When +this is what you want to do, @code{save-restriction} is not the right +tool for the job.
Here is what you must use instead: + +@example +@group +(let ((beg (point-min-marker)) + (end (point-max-marker))) + (unwind-protect + (progn @var{body}) + (save-excursion + (set-buffer (marker-buffer beg)) + (narrow-to-region beg end)))) +@end group +@end example + +Here is a simple example of correct use of @code{save-restriction}: + +@example +@group +---------- Buffer: foo ---------- +This is the contents of foo +This is the contents of foo +This is the contents of foo@point{} +---------- Buffer: foo ---------- +@end group + +@group +(save-excursion + (save-restriction + (goto-char 1) + (forward-line 2) + (narrow-to-region 1 (point)) + (goto-char (point-min)) + (replace-string "foo" "bar"))) + +---------- Buffer: foo ---------- +This is the contents of bar +This is the contents of bar +This is the contents of foo@point{} +---------- Buffer: foo ---------- +@end group +@end example +@end defspec diff -r 99ca8123a3ca -r 3b84ed22f747 lispref/searching.texi --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lispref/searching.texi Mon Mar 28 05:41:05 1994 +0000 @@ -0,0 +1,1254 @@ +@c -*-texinfo-*- +@c This is part of the GNU Emacs Lisp Reference Manual. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. +@c See the file elisp.texi for copying conditions. +@setfilename ../info/searching +@node Searching and Matching, Syntax Tables, Text, Top +@chapter Searching and Matching +@cindex searching + + GNU Emacs provides two ways to search through a buffer for specified +text: exact string searches and regular expression searches. After a +regular expression search, you can examine the @dfn{match data} to +determine which text matched the whole regular expression or various +portions of it. + +@menu +* String Search:: Search for an exact match. +* Regular Expressions:: Describing classes of strings. +* Regexp Search:: Searching for a match for a regexp. +* Search and Replace:: Internals of @code{query-replace}. 
+* Match Data:: Finding out which part of the text matched + various parts of a regexp, after regexp search. +* Searching and Case:: Case-independent or case-significant searching. +* Standard Regexps:: Useful regexps for finding sentences, pages,... +@end menu + + The @samp{skip-chars@dots{}} functions also perform a kind of searching. +@xref{Skipping Characters}. + +@node String Search +@section Searching for Strings +@cindex string search + + These are the primitive functions for searching through the text in a +buffer. They are meant for use in programs, but you may call them +interactively. If you do so, they prompt for the search string; +@var{limit} and @var{noerror} are set to @code{nil}, and @var{repeat} +is set to 1. + +@deffn Command search-forward string &optional limit noerror repeat + This function searches forward from point for an exact match for +@var{string}. If successful, it sets point to the end of the occurrence +found, and returns the new value of point. If no match is found, the +value and side effects depend on @var{noerror} (see below). +@c Emacs 19 feature + + In the following example, point is initially at the beginning of the +line. Then @code{(search-forward "fox")} moves point after the last +letter of @samp{fox}: + +@example +@group +---------- Buffer: foo ---------- +@point{}The quick brown fox jumped over the lazy dog. +---------- Buffer: foo ---------- +@end group + +@group +(search-forward "fox") + @result{} 20 + +---------- Buffer: foo ---------- +The quick brown fox@point{} jumped over the lazy dog. +---------- Buffer: foo ---------- +@end group +@end example + + The argument @var{limit} specifies the upper bound to the search. (It +must be a position in the current buffer.) No match extending after +that position is accepted. If @var{limit} is omitted or @code{nil}, it +defaults to the end of the accessible portion of the buffer. 
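+
+  As a minimal sketch of how @var{limit} bounds the search, the
+following runs the same search twice with different limits.  The
+function name @code{limit-sketch} and the buffer name are hypothetical,
+chosen only for illustration; @var{noerror} is passed as @code{t} so
+that a failed search returns @code{nil}, as described below.  The match
+for @samp{fox} ends at position 20, so a limit of 10 rejects it and a
+limit of 20 accepts it:
+
+@example
+@group
+(defun limit-sketch ()
+  ;; Hypothetical helper: search a scratch buffer with two limits.
+  (let ((buf (get-buffer-create " *limit-sketch*")))
+    (unwind-protect
+        (save-excursion
+          (set-buffer buf)
+          (erase-buffer)
+          (insert "The quick brown fox jumped over the lazy dog.")
+          (goto-char (point-min))
+          ;; Collect each search result and where point ends up.
+          (list (search-forward "fox" 10 t)
+                (point)
+                (search-forward "fox" 20 t)
+                (point)))
+      (kill-buffer buf))))
+@end group
+
+@group
+(limit-sketch)
+     @result{} (nil 1 20 20)
+@end group
+@end example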
+ +@kindex search-failed + What happens when the search fails depends on the value of +@var{noerror}. If @var{noerror} is @code{nil}, a @code{search-failed} +error is signaled. If @var{noerror} is @code{t}, @code{search-forward} +returns @code{nil} and does nothing. If @var{noerror} is neither +@code{nil} nor @code{t}, then @code{search-forward} moves point to the +upper bound and returns @code{nil}. (It would be more consistent now +to return the new position of point in that case, but some programs +may depend on a value of @code{nil}.) + + If @var{repeat} is non-@code{nil}, then the search is repeated that +many times. Point is positioned at the end of the last match. +@end deffn + +@deffn Command search-backward string &optional limit noerror repeat +This function searches backward from point for @var{string}. It is +just like @code{search-forward} except that it searches backwards and +leaves point at the beginning of the match. +@end deffn + +@deffn Command word-search-forward string &optional limit noerror repeat +@cindex word search +This function searches forward from point for a ``word'' match for +@var{string}. If it finds a match, it sets point to the end of the +match found, and returns the new value of point. +@c Emacs 19 feature + +Word matching regards @var{string} as a sequence of words, disregarding +punctuation that separates them. It searches the buffer for the same +sequence of words. Each word must be distinct in the buffer (searching +for the word @samp{ball} does not match the word @samp{balls}), but the +details of punctuation and spacing are ignored (searching for @samp{ball +boy} does match @samp{ball. Boy!}). + +In this example, point is initially at the beginning of the buffer; the +search leaves it between the @samp{y} and the @samp{!}. + +@example +@group +---------- Buffer: foo ---------- +@point{}He said "Please! Find +the ball boy!" 
+---------- Buffer: foo ---------- +@end group + +@group +(word-search-forward "Please find the ball, boy.") + @result{} 35 + +---------- Buffer: foo ---------- +He said "Please! Find +the ball boy@point{}!" +---------- Buffer: foo ---------- +@end group +@end example + +If @var{limit} is non-@code{nil} (it must be a position in the current +buffer), then it is the upper bound to the search. The match found must +not extend after that position. + +If @var{noerror} is @code{nil}, then @code{word-search-forward} signals +an error if the search fails. If @var{noerror} is @code{t}, then it +returns @code{nil} instead of signaling an error. If @var{noerror} is +neither @code{nil} nor @code{t}, it moves point to @var{limit} (or the +end of the buffer) and returns @code{nil}. + +If @var{repeat} is non-@code{nil}, then the search is repeated that many +times. Point is positioned at the end of the last match. +@end deffn + +@deffn Command word-search-backward string &optional limit noerror repeat +This function searches backward from point for a word match to +@var{string}. This function is just like @code{word-search-forward} +except that it searches backward and normally leaves point at the +beginning of the match. +@end deffn + +@node Regular Expressions +@section Regular Expressions +@cindex regular expression +@cindex regexp + + A @dfn{regular expression} (@dfn{regexp}, for short) is a pattern that +denotes a (possibly infinite) set of strings. Searching for matches for +a regexp is a very powerful operation. This section explains how to write +regexps; the following section says how to search for them. + +@menu +* Syntax of Regexps:: Rules for writing regular expressions. +* Regexp Example:: Illustrates regular expression syntax. +@end menu + +@node Syntax of Regexps +@subsection Syntax of Regular Expressions + + Regular expressions have a syntax in which a few characters are special +constructs and the rest are @dfn{ordinary}. 
An ordinary character is a +simple regular expression which matches that character and nothing else. +The special characters are @samp{$}, @samp{^}, @samp{.}, @samp{*}, +@samp{+}, @samp{?}, @samp{[}, @samp{]} and @samp{\}; no new special +characters will be defined in the future. Any other character appearing +in a regular expression is ordinary, unless a @samp{\} precedes it. + +For example, @samp{f} is not a special character, so it is ordinary, and +therefore @samp{f} is a regular expression that matches the string +@samp{f} and no other string. (It does @emph{not} match the string +@samp{ff}.) Likewise, @samp{o} is a regular expression that matches +only @samp{o}.@refill + +Any two regular expressions @var{a} and @var{b} can be concatenated. The +result is a regular expression which matches a string if @var{a} matches +some amount of the beginning of that string and @var{b} matches the rest of +the string.@refill + +As a simple example, we can concatenate the regular expressions @samp{f} +and @samp{o} to get the regular expression @samp{fo}, which matches only +the string @samp{fo}. Still trivial. To do something more powerful, you +need to use one of the special characters. Here is a list of them: + +@need 1200 +@table @kbd +@item .@: @r{(Period)} +@cindex @samp{.} in regexp +is a special character that matches any single character except a newline. +Using concatenation, we can make regular expressions like @samp{a.b}, which +matches any three-character string that begins with @samp{a} and ends with +@samp{b}.@refill + +@item * +@cindex @samp{*} in regexp +is not a construct by itself; it is a suffix operator that means to +repeat the preceding regular expression as many times as possible. In +@samp{fo*}, the @samp{*} applies to the @samp{o}, so @samp{fo*} matches +one @samp{f} followed by any number of @samp{o}s. 
The case of zero +@samp{o}s is allowed: @samp{fo*} does match @samp{f}.@refill + +@samp{*} always applies to the @emph{smallest} possible preceding +expression. Thus, @samp{fo*} has a repeating @samp{o}, not a +repeating @samp{fo}.@refill + +The matcher processes a @samp{*} construct by matching, immediately, +as many repetitions as can be found. Then it continues with the rest +of the pattern. If that fails, backtracking occurs, discarding some +of the matches of the @samp{*}-modified construct in case that makes +it possible to match the rest of the pattern. For example, in matching +@samp{ca*ar} against the string @samp{caaar}, the @samp{a*} first +tries to match all three @samp{a}s; but the rest of the pattern is +@samp{ar} and there is only @samp{r} left to match, so this try fails. +The next alternative is for @samp{a*} to match only two @samp{a}s. +With this choice, the rest of the regexp matches successfully.@refill + +@item + +@cindex @samp{+} in regexp +is a suffix operator similar to @samp{*} except that the preceding +expression must match at least once. So, for example, @samp{ca+r} +matches the strings @samp{car} and @samp{caaaar} but not the string +@samp{cr}, whereas @samp{ca*r} matches all three strings. + +@item ? +@cindex @samp{?} in regexp +is a suffix operator similar to @samp{*} except that the preceding +expression can match either once or not at all. For example, +@samp{ca?r} matches @samp{car} or @samp{cr}, but does not match anything +else. + +@item [ @dots{} ] +@cindex character set (in regexp) +@cindex @samp{[} in regexp +@cindex @samp{]} in regexp +@samp{[} begins a @dfn{character set}, which is terminated by a +@samp{]}. In the simplest case, the characters between the two brackets +form the set.
Thus, @samp{[ad]} matches either one @samp{a} or one +@samp{d}, and @samp{[ad]*} matches any string composed of just @samp{a}s +and @samp{d}s (including the empty string), from which it follows that +@samp{c[ad]*r} matches @samp{cr}, @samp{car}, @samp{cdr}, +@samp{caddaar}, etc.@refill + +The usual regular expression special characters are not special inside a +character set. A completely different set of special characters exists +inside character sets: @samp{]}, @samp{-} and @samp{^}.@refill + +@samp{-} is used for ranges of characters. To write a range, write two +characters with a @samp{-} between them. Thus, @samp{[a-z]} matches any +lower case letter. Ranges may be intermixed freely with individual +characters, as in @samp{[a-z$%.]}, which matches any lower case letter +or @samp{$}, @samp{%} or a period.@refill + +To include a @samp{]} in a character set, make it the first character. +For example, @samp{[]a]} matches @samp{]} or @samp{a}. To include a +@samp{-}, write @samp{-} as the first character in the set, or put it +immediately after a range. (You can replace one individual character +@var{c} with the range @samp{@var{c}-@var{c}} to make a place to put the +@samp{-}). There is no way to write a set containing just @samp{-} and +@samp{]}. + +To include @samp{^} in a set, put it anywhere but at the beginning of +the set. + +@item [^ @dots{} ] +@cindex @samp{^} in regexp +@samp{[^} begins a @dfn{complement character set}, which matches any +character except the ones specified. Thus, @samp{[^a-z0-9A-Z]} +matches all characters @emph{except} letters and digits.@refill + +@samp{^} is not special in a character set unless it is the first +character. The character following the @samp{^} is treated as if it +were first (thus, @samp{-} and @samp{]} are not special there). + +Note that a complement character set can match a newline, unless +newline is mentioned as one of the characters not to match.
+ +@item ^ +@cindex @samp{^} in regexp +@cindex beginning of line in regexp +is a special character that matches the empty string, but only at +the beginning of a line in the text being matched. Otherwise it fails +to match anything. Thus, @samp{^foo} matches a @samp{foo} which occurs +at the beginning of a line. + +When matching a string, @samp{^} matches at the beginning of the string +or after a newline character @samp{\n}. + +@item $ +@cindex @samp{$} in regexp +is similar to @samp{^} but matches only at the end of a line. Thus, +@samp{x+$} matches a string of one @samp{x} or more at the end of a line. + +When matching a string, @samp{$} matches at the end of the string +or before a newline character @samp{\n}. + +@item \ +@cindex @samp{\} in regexp +has two functions: it quotes the special characters (including +@samp{\}), and it introduces additional special constructs. + +Because @samp{\} quotes special characters, @samp{\$} is a regular +expression which matches only @samp{$}, and @samp{\[} is a regular +expression which matches only @samp{[}, and so on. + +Note that @samp{\} also has special meaning in the read syntax of Lisp +strings (@pxref{String Type}), and must be quoted with @samp{\}. For +example, the regular expression that matches the @samp{\} character is +@samp{\\}. To write a Lisp string that contains the characters +@samp{\\}, Lisp syntax requires you to quote each @samp{\} with another +@samp{\}. Therefore, the read syntax for a regular expression matching +@samp{\} is @code{"\\\\"}.@refill +@end table + +@strong{Please note:} for historical compatibility, special characters +are treated as ordinary ones if they are in contexts where their special +meanings make no sense. For example, @samp{*foo} treats @samp{*} as +ordinary since there is no preceding expression on which the @samp{*} +can act. 
It is poor practice to depend on this behavior; better to +quote the special character anyway, regardless of where it +appears.@refill + +For the most part, @samp{\} followed by any character matches only +that character. However, there are several exceptions: characters +which, when preceded by @samp{\}, are special constructs. Such +characters are always ordinary when encountered on their own. Here +is a table of @samp{\} constructs: + +@table @kbd +@item \| +@cindex @samp{|} in regexp +@cindex regexp alternative +specifies an alternative. +Two regular expressions @var{a} and @var{b} with @samp{\|} in +between form an expression that matches anything that either @var{a} or +@var{b} matches.@refill + +Thus, @samp{foo\|bar} matches either @samp{foo} or @samp{bar} +but no other string.@refill + +@samp{\|} applies to the largest possible surrounding expressions. Only a +surrounding @samp{\( @dots{} \)} grouping can limit the grouping power of +@samp{\|}.@refill + +Full backtracking capability exists to handle multiple uses of @samp{\|}. + +@item \( @dots{} \) +@cindex @samp{(} in regexp +@cindex @samp{)} in regexp +@cindex regexp grouping +is a grouping construct that serves three purposes: + +@enumerate +@item +To enclose a set of @samp{\|} alternatives for other operations. +Thus, @samp{\(foo\|bar\)x} matches either @samp{foox} or @samp{barx}. + +@item +To enclose an expression for a suffix operator such as @samp{*} to act +on. Thus, @samp{ba\(na\)*} matches @samp{bananana}, etc., with any +(zero or more) number of @samp{na} strings.@refill + +@item +To record a matched substring for future reference. +@end enumerate + +This last application is not a consequence of the idea of a +parenthetical grouping; it is a separate feature which happens to be +assigned as a second meaning to the same @samp{\( @dots{} \)} construct +because there is no conflict in practice between the two meanings. 
+Here is an explanation of this feature: + +@item \@var{digit} +matches the same text which matched the @var{digit}th occurrence of a +@samp{\( @dots{} \)} construct. + +In other words, after the end of a @samp{\( @dots{} \)} construct, the +matcher remembers the beginning and end of the text matched by that +construct. Then, later on in the regular expression, you can use +@samp{\} followed by @var{digit} to match that same text, whatever it +may have been. + +The strings matching the first nine @samp{\( @dots{} \)} constructs +appearing in a regular expression are assigned numbers 1 through 9 in +the order that the open parentheses appear in the regular expression. +So you can use @samp{\1} through @samp{\9} to refer to the text matched +by the corresponding @samp{\( @dots{} \)} constructs. + +For example, @samp{\(.*\)\1} matches any newline-free string that is +composed of two identical halves. The @samp{\(.*\)} matches the first +half, which may be anything, but the @samp{\1} that follows must match +the same exact text. + +@item \w +@cindex @samp{\w} in regexp +matches any word-constituent character. The editor syntax table +determines which characters these are. @xref{Syntax Tables}. + +@item \W +@cindex @samp{\W} in regexp +matches any character that is not a word-constituent. + +@item \s@var{code} +@cindex @samp{\s} in regexp +matches any character whose syntax is @var{code}. Here @var{code} is a +character which represents a syntax code: thus, @samp{w} for word +constituent, @samp{-} for whitespace, @samp{(} for open parenthesis, +etc. @xref{Syntax Tables}, for a list of syntax codes and the +characters that stand for them. + +@item \S@var{code} +@cindex @samp{\S} in regexp +matches any character whose syntax is not @var{code}. +@end table + + These regular expression constructs match the empty string---that is, +they don't use up any characters---but whether they match depends on the +context.
+ +@table @kbd +@item \` +@cindex @samp{\`} in regexp +matches the empty string, but only at the beginning +of the buffer or string being matched against. + +@item \' +@cindex @samp{\'} in regexp +matches the empty string, but only at the end of +the buffer or string being matched against. + +@item \= +@cindex @samp{\=} in regexp +matches the empty string, but only at point. +(This construct is not defined when matching against a string.) + +@item \b +@cindex @samp{\b} in regexp +matches the empty string, but only at the beginning or +end of a word. Thus, @samp{\bfoo\b} matches any occurrence of +@samp{foo} as a separate word. @samp{\bballs?\b} matches +@samp{ball} or @samp{balls} as a separate word.@refill + +@item \B +@cindex @samp{\B} in regexp +matches the empty string, but @emph{not} at the beginning or +end of a word. + +@item \< +@cindex @samp{\<} in regexp +matches the empty string, but only at the beginning of a word. + +@item \> +@cindex @samp{\>} in regexp +matches the empty string, but only at the end of a word. +@end table + +@kindex invalid-regexp + Not every string is a valid regular expression. For example, a string +with unbalanced square brackets is invalid (with a few exceptions, such +as @samp{[]]}), and so is a string that ends with a single @samp{\}. If +an invalid regular expression is passed to any of the search functions, +an @code{invalid-regexp} error is signaled. + +@defun regexp-quote string +This function returns a regular expression string that matches exactly +@var{string} and nothing else. This allows you to request an exact +string match when calling a function that wants a regular expression. + +@example +@group +(regexp-quote "^The cat$") + @result{} "\\^The cat\\$" +@end group +@end example + +One use of @code{regexp-quote} is to combine an exact string match with +context described as a regular expression.
For example, this searches +for the string which is the value of @code{string}, surrounded by +whitespace: + +@example +@group +(re-search-forward + (concat "\\s " (regexp-quote string) "\\s ")) +@end group +@end example +@end defun + +@node Regexp Example +@comment node-name, next, previous, up +@subsection Complex Regexp Example + + Here is a complicated regexp, used by Emacs to recognize the end of a +sentence together with any whitespace that follows. It is the value of +the variable @code{sentence-end}. + + First, we show the regexp as a string in Lisp syntax to distinguish +spaces from tab characters. The string constant begins and ends with a +double-quote. @samp{\"} stands for a double-quote as part of the +string, @samp{\\} for a backslash as part of the string, @samp{\t} for a +tab and @samp{\n} for a newline. + +@example +"[.?!][]\"')@}]*\\($\\| $\\|\t\\| \\)[ \t\n]*" +@end example + + In contrast, if you evaluate the variable @code{sentence-end}, you +will see the following: + +@example +@group +sentence-end +@result{} +"[.?!][]\"')@}]*\\($\\| $\\| \\| \\)[ +]*" +@end group +@end example + +@noindent +In this output, tab and newline appear as themselves. + + This regular expression contains four parts in succession and can be +deciphered as follows: + +@table @code +@item [.?!] +The first part of the pattern consists of three characters, a period, a +question mark and an exclamation mark, within square brackets. The +match must begin with one of these three characters. + +@item []\"')@}]* +The second part of the pattern matches any closing braces and quotation +marks, zero or more of them, that may follow the period, question mark +or exclamation mark. The @code{\"} is Lisp syntax for a double-quote in +a string. The @samp{*} at the end indicates that the immediately +preceding regular expression (a character set, in this case) may be +repeated zero or more times. 
+ +@item \\($\\|@ \\|\t\\|@ @ \\) +The third part of the pattern matches the whitespace that follows the +end of a sentence: the end of a line, or a tab, or two spaces. The +double backslashes mark the parentheses and vertical bars as regular +expression syntax; the parentheses mark the group and the vertical bars +separate alternatives. The dollar sign is used to match the end of a +line. + +@item [ \t\n]* +Finally, the last part of the pattern matches any additional whitespace +beyond the minimum needed to end a sentence. +@end table + +@node Regexp Search +@section Regular Expression Searching +@cindex regular expression searching +@cindex regexp searching +@cindex searching for regexp + + In GNU Emacs, you can search for the next match for a regexp either +incrementally or not. For incremental search commands, see @ref{Regexp +Search, , Regular Expression Search, emacs, The GNU Emacs Manual}. Here +we describe only the search functions useful in programs. The principal +one is @code{re-search-forward}. + +@deffn Command re-search-forward regexp &optional limit noerror repeat +This function searches forward in the current buffer for a string of +text that is matched by the regular expression @var{regexp}. The +function skips over any amount of text that is not matched by +@var{regexp}, and leaves point at the end of the first match found. +It returns the new value of point. + +If @var{limit} is non-@code{nil} (it must be a position in the current +buffer), then it is the upper bound to the search. No match extending +after that position is accepted. + +What happens when the search fails depends on the value of +@var{noerror}. If @var{noerror} is @code{nil}, a @code{search-failed} +error is signaled. If @var{noerror} is @code{t}, +@code{re-search-forward} does nothing and returns @code{nil}. If +@var{noerror} is neither @code{nil} nor @code{t}, then +@code{re-search-forward} moves point to @var{limit} (or the end of the +buffer) and returns @code{nil}. 
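+
+As a minimal sketch of the @var{noerror} convention, passing @code{t}
+lets a loop end quietly once no further match is found.  The function
+name @code{count-regexp-matches} is hypothetical, chosen for
+illustration only, and the sketch assumes that the regexp never matches
+the empty string:
+
+@example
+@group
+;; @r{Hypothetical helper: count the matches for REGEXP after point.}
+;; @r{Assumes REGEXP never matches the empty string; otherwise}
+;; @r{the loop would not advance.}
+(defun count-regexp-matches (regexp)
+  (let ((count 0))
+    (save-excursion
+      (while (re-search-forward regexp nil t)
+        (setq count (1+ count))))
+    count))
+@end group
+@end example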
+ +If @var{repeat} is supplied (it must be a positive number), then the +search is repeated that many times (each time starting at the end of the +previous time's match). If these successive searches succeed, the +function succeeds, moving point and returning its new value. Otherwise +the search fails. + +In the following example, point is initially before the @samp{T}. +Evaluating the search call moves point to the end of that line (between +the @samp{t} of @samp{hat} and the newline). + +@example +@group +---------- Buffer: foo ---------- +I read "@point{}The cat in the hat +comes back" twice. +---------- Buffer: foo ---------- +@end group + +@group +(re-search-forward "[a-z]+" nil t 5) + @result{} 27 + +---------- Buffer: foo ---------- +I read "The cat in the hat@point{} +comes back" twice. +---------- Buffer: foo ---------- +@end group +@end example +@end deffn + +@deffn Command re-search-backward regexp &optional limit noerror repeat +This function searches backward in the current buffer for a string of +text that is matched by the regular expression @var{regexp}, leaving +point at the beginning of the first text found. + +This function is analogous to @code{re-search-forward}, but they are +not simple mirror images. @code{re-search-forward} finds the match +whose beginning is as close as possible. If @code{re-search-backward} +were a perfect mirror image, it would find the match whose end is as +close as possible. However, in fact it finds the match whose beginning +is as close as possible. The reason is that matching a regular +expression at a given spot always works from beginning to end, and is +done at a specified beginning position. + +A true mirror-image of @code{re-search-forward} would require a special +feature for matching regexps from end to beginning. It's not worth the +trouble of implementing that. 
+@end deffn + +@defun string-match regexp string &optional start +This function returns the index of the start of the first match for +the regular expression @var{regexp} in @var{string}, or @code{nil} if +there is no match. If @var{start} is non-@code{nil}, the search starts +at that index in @var{string}. + +For example, + +@example +@group +(string-match + "quick" "The quick brown fox jumped quickly.") + @result{} 4 +@end group +@group +(string-match + "quick" "The quick brown fox jumped quickly." 8) + @result{} 27 +@end group +@end example + +@noindent +The index of the first character of the +string is 0, the index of the second character is 1, and so on. + +After this function returns, the index of the first character beyond +the match is available as @code{(match-end 0)}. @xref{Match Data}. + +@example +@group +(string-match + "quick" "The quick brown fox jumped quickly." 8) + @result{} 27 +@end group + +@group +(match-end 0) + @result{} 32 +@end group +@end example +@end defun + +@defun looking-at regexp +This function determines whether the text in the current buffer directly +following point matches the regular expression @var{regexp}. ``Directly +following'' means precisely that: the search is ``anchored'' and it can +succeed only starting with the first character following point. The +result is @code{t} if so, @code{nil} otherwise. + +This function does not move point, but it updates the match data, which +you can access using @code{match-beginning} and @code{match-end}. +@xref{Match Data}. + +In this example, point is located directly before the @samp{T}. If it +were anywhere else, the result would be @code{nil}. + +@example +@group +---------- Buffer: foo ---------- +I read "@point{}The cat in the hat +comes back" twice. 
+---------- Buffer: foo ---------- + +(looking-at "The cat in the hat$") + @result{} t +@end group +@end example +@end defun + +@ignore +@deffn Command delete-matching-lines regexp +This function is identical to @code{delete-non-matching-lines}, save +that it deletes what @code{delete-non-matching-lines} keeps. + +In the example below, point is located on the first line of text. + +@example +@group +---------- Buffer: foo ---------- +We hold these truths +to be self-evident, +that all men are created +equal, and that they are +---------- Buffer: foo ---------- +@end group + +@group +(delete-matching-lines "the") + @result{} nil + +---------- Buffer: foo ---------- +to be self-evident, +that all men are created +---------- Buffer: foo ---------- +@end group +@end example +@end deffn + +@deffn Command flush-lines regexp +This function is the same as @code{delete-matching-lines}. +@end deffn + +@defun delete-non-matching-lines regexp +This function deletes all lines following point which don't +contain a match for the regular expression @var{regexp}. +@end defun + +@deffn Command keep-lines regexp +This function is the same as @code{delete-non-matching-lines}. +@end deffn + +@deffn Command how-many regexp +This function counts the number of matches for @var{regexp} there are in +the current buffer following point. It prints this number in +the echo area, returning the string printed. +@end deffn + +@deffn Command count-matches regexp +This function is a synonym of @code{how-many}. +@end deffn + +@deffn Command list-matching-lines regexp nlines +This function is a synonym of @code{occur}. +Show all lines following point containing a match for @var{regexp}. +Display each line with @var{nlines} lines before and after, +or @code{-}@var{nlines} before if @var{nlines} is negative. +@var{nlines} defaults to @code{list-matching-lines-default-context-lines}. +Interactively it is the prefix arg. + +The lines are shown in a buffer named @samp{*Occur*}. 
+It serves as a menu to find any of the occurrences in this buffer. +@kbd{C-h m} (@code{describe-mode}) in that buffer gives help. +@end deffn + +@defopt list-matching-lines-default-context-lines +Default value is 0. +Default number of context lines to include around a @code{list-matching-lines} +match. A negative number means to include that many lines before the match. +A positive number means to include that many lines both before and after. +@end defopt +@end ignore + +@node Search and Replace +@section Search and Replace +@cindex replacement + +@defun perform-replace from-string replacements query-flag regexp-flag delimited-flag &optional repeat-count map +This function is the guts of @code{query-replace} and related commands. +It searches for occurrences of @var{from-string} and replaces some or +all of them. If @var{query-flag} is @code{nil}, it replaces all +occurrences; otherwise, it asks the user what to do about each one. + +If @var{regexp-flag} is non-@code{nil}, then @var{from-string} is +considered a regular expression; otherwise, it must match literally. If +@var{delimited-flag} is non-@code{nil}, then only replacements +surrounded by word boundaries are considered. + +The argument @var{replacements} specifies what to replace occurrences +with. If it is a string, that string is used. It can also be a list of +strings, to be used in cyclic order. + +If @var{repeat-count} is non-@code{nil}, it should be an integer, the +number of occurrences to consider. In this case, @code{perform-replace} +returns after considering that many occurrences. + +Normally, the keymap @code{query-replace-map} defines the possible user +responses. The argument @var{map}, if non-@code{nil}, is a keymap to +use instead of @code{query-replace-map}. +@end defun + +@defvar query-replace-map +This variable holds a special keymap that defines the valid user +responses for @code{query-replace} and related functions, as well as +@code{y-or-n-p} and @code{map-y-or-n-p}.
It is unusual in two ways: + +@itemize @bullet +@item +The ``key bindings'' are not commands, just symbols that are meaningful +to the functions that use this map. + +@item +Prefix keys are not supported; each key binding must be for a single-event +key sequence. This is because the functions don't use +@code{read-key-sequence} to get the input; instead, they read a single +event and look it up ``by hand.'' +@end itemize +@end defvar + +Here are the meaningful ``bindings'' for @code{query-replace-map}. +Several of them are meaningful only for @code{query-replace} and +friends. + +@table @code +@item act +Do take the action being considered---in other words, ``yes.'' + +@item skip +Do not take action for this question---in other words, ``no.'' + +@item exit +Answer this question ``no,'' and don't ask any more. + +@item act-and-exit +Answer this question ``yes,'' and don't ask any more. + +@item act-and-show +Answer this question ``yes,'' but show the results---don't advance yet +to the next question. + +@item automatic +Answer this question and all subsequent questions in the series with +``yes,'' without further user interaction. + +@item backup +Move back to the previous place that a question was asked about. + +@item edit +Enter a recursive edit to deal with this question---instead of any +other action that would normally be taken. + +@item delete-and-edit +Delete the text being considered, then enter a recursive edit to replace +it. + +@item recenter +Redisplay and center the window, then ask the same question again. + +@item quit +Perform a quit right away. Only @code{y-or-n-p} and related functions +use this answer. + +@item help +Display some help, then ask again. +@end table + +@node Match Data +@section The Match Data +@cindex match data + + Emacs keeps track of the positions of the start and end of segments of +text found during a regular expression search.
This means, for example, +that you can search for a complex pattern, such as a date in an Rmail +message, and then extract parts of the match under control of the +pattern. + + Because the match data normally describe the most recent search only, +you must be careful not to do another search inadvertently between the +search you wish to refer back to and the use of the match data. If you +can't avoid another intervening search, you must save and restore the +match data around it, to prevent it from being overwritten. + +@menu +* Simple Match Data:: Accessing single items of match data, + such as where a particular subexpression started. +* Replacing Match:: Replacing a substring that was matched. +* Entire Match Data:: Accessing the entire match data at once, as a list. +* Saving Match Data:: Saving and restoring the match data. +@end menu + +@node Simple Match Data +@subsection Simple Match Data Access + + This section explains how to use the match data to find the starting +point or ending point of the text that was matched by a particular +search, or by a particular parenthetical subexpression of a regular +expression. + +@defun match-beginning count +This function returns the position of the start of text matched by the +last regular expression searched for, or a subexpression of it. + +The argument @var{count}, a number, specifies a subexpression whose +start position is the value. If @var{count} is zero, then the value is +the position of the start of the text matched by the whole regexp. If +@var{count} is greater than zero, then the value is the position of the +beginning of the text matched by the @var{count}th subexpression. + +Subexpressions of a regular expression are those expressions grouped +inside of parentheses, @samp{\(@dots{}\)}. The @var{count}th +subexpression is found by counting occurrences of @samp{\(} from the +beginning of the whole regular expression. The first subexpression is +numbered 1, the second 2, and so on.
+ +The value is @code{nil} for a parenthetical grouping inside of a +@samp{\|} alternative that wasn't used in the match. +@end defun + +@defun match-end count +This function returns the position of the end of the text that matched +the last regular expression searched for, or a subexpression of it. +This function is otherwise similar to @code{match-beginning}. +@end defun + + Here is an example of using the match data, with a comment showing the +positions within the text: + +@example +@group +(string-match "\\(qu\\)\\(ick\\)" + "The quick fox jumped quickly.") + ;0123456789 + @result{} 4 +@end group + +@group +(match-beginning 1) ; @r{The beginning of the match} + @result{} 4 ; @r{with @samp{qu} is at index 4.} +@end group + +@group +(match-beginning 2) ; @r{The beginning of the match} + @result{} 6 ; @r{with @samp{ick} is at index 6.} +@end group + +@group +(match-end 1) ; @r{The end of the match} + @result{} 6 ; @r{with @samp{qu} is at index 6.} + +(match-end 2) ; @r{The end of the match} + @result{} 9 ; @r{with @samp{ick} is at index 9.} +@end group +@end example + + Here is another example. Point is initially located at the beginning +of the line. Searching moves point to between the space and the word +@samp{in}. The beginning of the entire match is at the 9th character of +the buffer (@samp{T}), and the beginning of the match for the first +subexpression is at the 13th character (@samp{c}). + +@example +@group +(list + (re-search-forward "The \\(cat \\)") + (match-beginning 0) + (match-beginning 1)) + @result{} (17 9 13) +@end group + +@group +---------- Buffer: foo ---------- +I read "The cat @point{}in the hat comes back" twice. + ^ ^ + 9 13 +---------- Buffer: foo ---------- +@end group +@end example + +@noindent +(In this case, the index returned is a buffer position; the first +character of the buffer counts as 1.)
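+
+  As a further sketch using the same buffer, these positions can be
+passed to @code{buffer-substring} to extract the matched text itself.
+The function name @code{matched-text} is hypothetical, chosen for
+illustration only:
+
+@example
+@group
+;; @r{Hypothetical helper: return the buffer text matched by the}
+;; @r{COUNTth subexpression of the last buffer search, or nil if}
+;; @r{that subexpression took no part in the match.}
+(defun matched-text (count)
+  (and (match-beginning count)
+       (buffer-substring (match-beginning count)
+                         (match-end count))))
+@end group
+
+@group
+(progn
+  (goto-char (point-min))
+  (re-search-forward "The \\(cat \\)")
+  (matched-text 1))
+     @result{} "cat "
+@end group
+@end example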
+ +@node Replacing Match +@subsection Replacing the Text That Matched + + This function replaces the text matched by the last search with +@var{replacement}. + +@cindex case in replacements +@defun replace-match replacement &optional fixedcase literal +This function replaces the buffer text matched by the last search, with +@var{replacement}. It applies only to buffers; you can't use +@code{replace-match} to replace a substring found with +@code{string-match}. + +If @var{fixedcase} is non-@code{nil}, then the case of the replacement +text is not changed; otherwise, the replacement text is converted to a +different case depending upon the capitalization of the text to be +replaced. If the original text is all upper case, the replacement text +is converted to upper case, except when all of the words in the original +text are only one character long. In that event, the replacement text +is capitalized. If @emph{any} of the words in the original text is +capitalized, then all of the words in the replacement text are +capitalized. + +If @var{literal} is non-@code{nil}, then @var{replacement} is inserted +exactly as it is, the only alterations being case changes as needed. +If it is @code{nil} (the default), then the character @samp{\} is treated +specially. If a @samp{\} appears in @var{replacement}, then it must be +part of one of the following sequences: + +@table @asis +@item @samp{\&} +@cindex @samp{&} in replacement +@samp{\&} stands for the entire text being replaced. + +@item @samp{\@var{n}} +@cindex @samp{\@var{n}} in replacement +@samp{\@var{n}} stands for the text that matched the @var{n}th +subexpression in the original regexp. Subexpressions are those +expressions grouped inside of @samp{\(@dots{}\)}. @var{n} is a digit. + +@item @samp{\\} +@cindex @samp{\} in replacement +@samp{\\} stands for a single @samp{\} in the replacement text. +@end table + +@code{replace-match} leaves point at the end of the replacement text, +and returns @code{t}. 
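+
+For instance, here is a minimal sketch, with buffer contents assumed
+for illustration, that swaps two words by means of @samp{\1} and
+@samp{\2}.  In Lisp string syntax each @samp{\} must be doubled, and
+@var{fixedcase} is given as @code{t} so that no case conversion takes
+place:
+
+@example
+@group
+---------- Buffer: foo ----------
+@point{}The cat in the hat
+---------- Buffer: foo ----------
+@end group
+
+@group
+(progn
+  (re-search-forward "\\(cat\\) in the \\(hat\\)")
+  (replace-match "\\2 in the \\1" t))
+
+---------- Buffer: foo ----------
+The hat in the cat@point{}
+---------- Buffer: foo ----------
+@end group
+@end example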
+@end defun + +@node Entire Match Data +@subsection Accessing the Entire Match Data + + The functions @code{match-data} and @code{set-match-data} read or +write the entire match data, all at once. + +@defun match-data +This function returns a newly constructed list containing all the +information on what text the last search matched. Element zero is the +position of the beginning of the match for the whole expression; element +one is the position of the end of the match for the expression. The +next two elements are the positions of the beginning and end of the +match for the first subexpression, and so on. In general, element +@ifinfo +number 2@var{n} +@end ifinfo +@tex +number {\mathsurround=0pt $2n$} +@end tex +corresponds to @code{(match-beginning @var{n})}; and +element +@ifinfo +number 2@var{n} + 1 +@end ifinfo +@tex +number {\mathsurround=0pt $2n+1$} +@end tex +corresponds to @code{(match-end @var{n})}. + +All the elements are markers or @code{nil} if matching was done on a +buffer, and all are integers or @code{nil} if matching was done on a +string with @code{string-match}. (In Emacs 18 and earlier versions, +markers were used even for matching on a string, except in the case +of the integer 0.) + +As always, there must be no possibility of intervening searches between +the call to a search function and the call to @code{match-data} that is +intended to access the match data for that search. + +@example +@group +(match-data) + @result{} (# + # + # + #) +@end group +@end example +@end defun + +@defun set-match-data match-list +This function sets the match data from the elements of @var{match-list}, +which should be a list that was the value of a previous call to +@code{match-data}. + +If @var{match-list} refers to a buffer that doesn't exist, you don't get +an error; that sets the match data in a meaningless but harmless way. + +@findex store-match-data +@code{store-match-data} is an alias for @code{set-match-data}. 
+@end defun + +@node Saving Match Data +@subsection Saving and Restoring the Match Data + + All asynchronous process functions (filters and sentinels) and +functions that use @code{recursive-edit} should save and restore the +match data if they do a search or if they let the user type arbitrary +commands. Saving the match data is useful in other cases as +well---whenever you want to access the match data resulting from an +earlier search, notwithstanding another intervening search. + + This example shows the problem that can arise if you fail to +attend to this requirement: + +@example +@group +(re-search-forward "The \\(cat \\)") + @result{} 48 +(foo) ; @r{Perhaps @code{foo} does} + ; @r{more searching.} +(match-end 0) + @result{} 61 ; @r{Unexpected result---not 48!} +@end group +@end example + + In Emacs versions 19 and later, you can save and restore the match +data with @code{save-match-data}: + +@defspec save-match-data body@dots{} +This special form executes @var{body}, saving and restoring the match +data around it. This is useful if you wish to do a search without +altering the match data that resulted from an earlier search. +@end defspec + + You can use @code{set-match-data} together with @code{match-data} to +imitate the effect of the special form @code{save-match-data}. This is +useful for writing code that can run in Emacs 18. Here is how: + +@example +@group +(let ((data (match-data))) + (unwind-protect + @dots{} ; @r{May change the original match data.} + (set-match-data data))) +@end group +@end example + +@ignore + Here is a function which restores the match data provided the buffer +associated with it still exists. + +@smallexample +@group +(defun restore-match-data (data) +@c It is incorrect to split the first line of a doc string. +@c If there's a problem here, it should be solved in some other way. + "Restore the match data DATA unless the buffer is missing." 
+ (catch 'foo + (let ((d data)) +@end group + (while d + (and (car d) + (null (marker-buffer (car d))) +@group + ;; @file{match-data} @r{buffer is deleted.} + (throw 'foo nil)) + (setq d (cdr d))) + (set-match-data data)))) +@end group +@end smallexample +@end ignore + +@node Searching and Case +@section Searching and Case +@cindex searching and case + + By default, searches in Emacs ignore the case of the text they are +searching through; if you specify searching for @samp{FOO}, then +@samp{Foo} or @samp{foo} is also considered a match. Regexps, and in +particular character sets, are included: thus, @samp{[aB]} would match +@samp{a} or @samp{A} or @samp{b} or @samp{B}. + + If you do not want this feature, set the variable +@code{case-fold-search} to @code{nil}. Then all letters must match +exactly, including case. This is a per-buffer-local variable; altering +the variable affects only the current buffer. (@xref{Intro to +Buffer-Local}.) Alternatively, you may change the value of +@code{default-case-fold-search}, which is the default value of +@code{case-fold-search} for buffers that do not override it. + + Note that the user-level incremental search feature handles case +distinctions differently. When given a lower case letter, it looks for +a match of either case, but when given an upper case letter, it looks +for an upper case letter only. But this has nothing to do with the +searching functions used in Lisp code. + +@defopt case-replace +This variable determines whether @code{query-replace} should preserve +case in replacements. If the variable is @code{nil}, then +@code{replace-match} should not try to convert case. +@end defopt + +@defopt case-fold-search +This buffer-local variable determines whether searches should ignore +case. If the variable is @code{nil} they do not ignore case; otherwise +they do ignore case.
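+
+Since the variable is consulted when the search is performed, a program
+can bind it temporarily with @code{let}.  This is a minimal sketch; the
+function name @code{search-exact-case} is hypothetical, chosen for
+illustration only:
+
+@example
+@group
+;; @r{Hypothetical helper: search forward for REGEXP, treating upper}
+;; @r{and lower case as distinct whatever the buffer's own setting.}
+;; @r{Returns nil, without moving point, if there is no match.}
+(defun search-exact-case (regexp)
+  (let ((case-fold-search nil))
+    (re-search-forward regexp nil t)))
+@end group
+@end example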
+@end defopt + +@defvar default-case-fold-search +The value of this variable is the default value for +@code{case-fold-search} in buffers that do not override it. This is the +same as @code{(default-value 'case-fold-search)}. +@end defvar + +@node Standard Regexps +@section Standard Regular Expressions Used in Editing +@cindex regexps used standardly in editing +@cindex standard regexps used in editing + + This section describes some variables that hold regular expressions +used for certain purposes in editing: + +@defvar page-delimiter +This is the regexp describing line-beginnings that separate pages. The +default value is @code{"^\014"} (i.e., @code{"^^L"} or @code{"^\C-l"}). +@end defvar + +@defvar paragraph-separate +This is the regular expression for recognizing the beginning of a line +that separates paragraphs. (If you change this, you may have to +change @code{paragraph-start} also.) The default value is @code{"^[ +\t\f]*$"}, which is a line that consists entirely of spaces, tabs, and +form feeds. +@end defvar + +@defvar paragraph-start +This is the regular expression for recognizing the beginning of a line +that starts @emph{or} separates paragraphs. The default value is +@code{"^[ \t\n\f]"}, which matches a line starting with a space, tab, +newline, or form feed. +@end defvar + +@defvar sentence-end +This is the regular expression describing the end of a sentence. (All +paragraph boundaries also end sentences, regardless.) The default value +is: + +@example +"[.?!][]\"')@}]*\\($\\|\t\\| \\)[ \t\n]*" +@end example + +This means a period, question mark or exclamation mark, optionally +followed by closing brackets, parentheses, braces or quotation marks, +followed by tabs, spaces or new lines. + +For a detailed explanation of this regular expression, see @ref{Regexp +Example}.
+@end defvar diff -r 99ca8123a3ca -r 3b84ed22f747 lispref/syntax.texi --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lispref/syntax.texi Mon Mar 28 05:41:05 1994 +0000 @@ -0,0 +1,707 @@ +@c -*-texinfo-*- +@c This is part of the GNU Emacs Lisp Reference Manual. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. +@c See the file elisp.texi for copying conditions. +@setfilename ../info/syntax +@node Syntax Tables, Abbrevs, Searching and Matching, Top +@chapter Syntax Tables +@cindex parsing +@cindex syntax table +@cindex text parsing + + A @dfn{syntax table} specifies the syntactic textual function of each +character. This information is used by the parsing commands, the +complex movement commands, and others to determine where words, symbols, +and other syntactic constructs begin and end. The current syntax table +controls the meaning of the word motion functions (@pxref{Word Motion}) +and the list motion functions (@pxref{List Motion}) as well as the +functions in this chapter. + +@menu +* Basics: Syntax Basics. Basic concepts of syntax tables. +* Desc: Syntax Descriptors. How characters are classified. +* Syntax Table Functions:: How to create, examine and alter syntax tables. +* Motion and Syntax:: Moving over characters with certain syntaxes. +* Parsing Expressions:: Parsing balanced expressions + using the syntax table. +* Standard Syntax Tables:: Syntax tables used by various major modes. +* Syntax Table Internals:: How syntax table information is stored. +@end menu + +@node Syntax Basics +@section Syntax Table Concepts + +@ifinfo + A @dfn{syntax table} provides Emacs with the information that +determines the syntactic use of each character in a buffer. This +information is used by the parsing commands, the complex movement +commands, and others to determine where words, symbols, and other +syntactic constructs begin and end. 
The current syntax table controls +the meaning of the word motion functions (@pxref{Word Motion}) and the +list motion functions (@pxref{List Motion}) as well as the functions in +this chapter. +@end ifinfo + + A syntax table is a vector of 256 elements; it contains one entry for +each of the 256 @sc{ASCII} characters of an 8-bit byte. Each element is +an integer that encodes the syntax of the character in question. + + Syntax tables are used only for moving across text, not for the Emacs +Lisp reader. Emacs Lisp uses built-in syntactic rules when reading Lisp +expressions, and these rules cannot be changed. + + Each buffer has its own major mode, and each major mode has its own +idea of the syntactic class of various characters. For example, in Lisp +mode, the character @samp{;} begins a comment, but in C mode, it +terminates a statement. To support these variations, Emacs makes the +choice of syntax table local to each buffer. Typically, each major +mode has its own syntax table and installs that table in each buffer +which uses that mode. Changing this table alters the syntax in all +those buffers as well as in any buffers subsequently put in that mode. +Occasionally several similar modes share one syntax table. +@xref{Example Major Modes}, for an example of how to set up a syntax +table. + +A syntax table can inherit the data for some characters from the +standard syntax table, while specifying other characters itself. The +``inherit'' syntax class means ``inherit this character's syntax from +the standard syntax table.'' Most major modes' syntax tables inherit +the syntax of character codes 0 through 31 and 128 through 255. This is +useful with character sets such as ISO Latin-1 that have additional +alphabetic characters in the range 128 to 255. Just changing the +standard syntax for these characters affects all major modes. + +@defun syntax-table-p object +This function returns @code{t} if @var{object} is a vector of length 256 +elements. 
This means that the vector may be a syntax table. However, +according to this test, any vector of length 256 is considered to be a +syntax table, no matter what its contents. +@end defun + +@node Syntax Descriptors +@section Syntax Descriptors +@cindex syntax classes + + This section describes the syntax classes and flags that denote the +syntax of a character, and how they are represented as a @dfn{syntax +descriptor}, which is a Lisp string that you pass to +@code{modify-syntax-entry} to specify the desired syntax. + + Emacs defines a number of @dfn{syntax classes}. Each syntax table +puts each character into one class. There is no necessary relationship +between the class of a character in one syntax table and its class in +any other table. + + Each class is designated by a mnemonic character which serves as the +name of the class when you need to specify a class. Usually the +designator character is one which is frequently put in that class; +however, its meaning as a designator is unvarying and independent of +what syntax that character currently has. + +@cindex syntax descriptor + A syntax descriptor is a Lisp string which specifies a syntax class, a +matching character (used only for the parenthesis classes) and flags. +The first character is the designator for a syntax class. The second +character is the character to match; if it is unused, put a space there. +Then come the characters for any desired flags. If no matching +character or flags are needed, one character is sufficient. + + For example, the descriptor for the character @samp{*} in C mode is +@samp{@w{. 23}} (i.e., punctuation, matching character slot unused, +second character of a comment-starter, first character of a +comment-ender), and the entry for @samp{/} is @samp{@w{. 14}} (i.e., +punctuation, matching character slot unused, first character of a +comment-starter, second character of a comment-ender). + +@menu +* Syntax Class Table:: Table of syntax classes. 
+* Syntax Flags:: Additional flags each character can have. +@end menu + +@node Syntax Class Table +@subsection Table of Syntax Classes + + Here is a table of syntax classes, the characters that stand for them, +their meanings, and examples of their use. + +@deffn {Syntax class} @w{whitespace character} +@dfn{Whitespace characters} (designated with @w{@samp{@ }} or @samp{-}) +separate symbols and words from each other. Typically, whitespace +characters have no other syntactic significance, and multiple whitespace +characters are syntactically equivalent to a single one. Space, tab, +newline and formfeed are almost always classified as whitespace. +@end deffn + +@deffn {Syntax class} @w{word constituent} +@dfn{Word constituents} (designated with @samp{w}) are parts of normal +English words and are typically used in variable and command names in +programs. All upper and lower case letters and the digits are typically +word constituents. +@end deffn + +@deffn {Syntax class} @w{symbol constituent} +@dfn{Symbol constituents} (designated with @samp{_}) are the extra +characters that are used in variable and command names along with word +constituents. For example, the symbol constituents class is used in +Lisp mode to indicate that certain characters may be part of symbol +names even though they are not part of English words. These characters +are @samp{$&*+-_<>}. In standard C, the only non-word-constituent +character that is valid in symbols is underscore (@samp{_}). +@end deffn + +@deffn {Syntax class} @w{punctuation character} +@dfn{Punctuation characters} (@samp{.}) are those characters that are +used as punctuation in English, or are used in some way in a programming +language to separate symbols from one another. Most programming +language modes, including Emacs Lisp mode, have no characters in this +class since the few characters that are not symbol or word constituents +all have other uses. 
+@end deffn + +@deffn {Syntax class} @w{open parenthesis character} +@deffnx {Syntax class} @w{close parenthesis character} +@cindex parenthesis syntax +Open and close @dfn{parenthesis characters} are characters used in +dissimilar pairs to surround sentences or expressions. Such a grouping +is begun with an open parenthesis character and terminated with a close. +Each open parenthesis character matches a particular close parenthesis +character, and vice versa. Normally, Emacs indicates momentarily the +matching open parenthesis when you insert a close parenthesis. +@xref{Blinking}. + +The class of open parentheses is designated with @samp{(}, and that of +close parentheses with @samp{)}. + +In English text, and in C code, the parenthesis pairs are @samp{()}, +@samp{[]}, and @samp{@{@}}. In Emacs Lisp, the delimiters for lists and +vectors (@samp{()} and @samp{[]}) are classified as parenthesis +characters. +@end deffn + +@deffn {Syntax class} @w{string quote} +@dfn{String quote characters} (designated with @samp{"}) are used in +many languages, including Lisp and C, to delimit string constants. The +same string quote character appears at the beginning and the end of a +string. Such quoted strings do not nest. + +The parsing facilities of Emacs consider a string as a single token. +The usual syntactic meanings of the characters in the string are +suppressed. + +The Lisp modes have two string quote characters: double-quote (@samp{"}) +and vertical bar (@samp{|}). @samp{|} is not used in Emacs Lisp, but it +is used in Common Lisp. C also has two string quote characters: +double-quote for strings, and single-quote (@samp{'}) for character +constants. + +English text has no string quote characters because English is not a +programming language. Although quotation marks are used in English, +we do not want them to turn off the usual syntactic properties of +other characters in the quotation. 
+@end deffn + +@deffn {Syntax class} @w{escape} +An @dfn{escape character} (designated with @samp{\}) starts an escape +sequence such as is used in C string and character constants. The +character @samp{\} belongs to this class in both C and Lisp. (In C, it +is used thus only inside strings, but it turns out to cause no trouble +to treat it this way throughout C code.) + +Characters in this class count as part of words if +@code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}. +@end deffn + +@deffn {Syntax class} @w{character quote} +A @dfn{character quote character} (designated with @samp{/}) quotes the +following character so that it loses its normal syntactic meaning. This +differs from an escape character in that only the character immediately +following is ever affected. + +Characters in this class count as part of words if +@code{words-include-escapes} is non-@code{nil}. @xref{Word Motion}. + +This class is not currently used in any standard Emacs modes. +@end deffn + +@deffn {Syntax class} @w{paired delimiter} +@dfn{Paired delimiter characters} (designated with @samp{$}) are like +string quote characters except that the syntactic properties of the +characters between the delimiters are not suppressed. Only @TeX{} mode +uses a paired identical delimiter presently---the @samp{$} that both +enters and leaves math mode. +@end deffn + +@deffn {Syntax class} @w{expression prefix} +An @dfn{expression prefix operator} (designated with @samp{'}) is used +for syntactic operators that are part of an expression if they appear +next to one. These characters in Lisp include the apostrophe, @samp{'} +(used for quoting), the comma, @samp{,} (used in macros), and @samp{#} +(used in the read syntax for certain data types). +@end deffn + +@deffn {Syntax class} @w{comment starter} +@deffnx {Syntax class} @w{comment ender} +@cindex comment syntax +The @dfn{comment starter} and @dfn{comment ender} characters are used in +various languages to delimit comments. 
These classes are designated +with @samp{<} and @samp{>}, respectively. + +English text has no comment characters. In Lisp, the semicolon +(@samp{;}) starts a comment and a newline or formfeed ends one. +@end deffn + +@deffn {Syntax class} @w{inherit} +This syntax class does not specify a syntax. It says to look in the +standard syntax table to find the syntax of this character. The +designator for this syntax code is @samp{@@}. +@end deffn + +@node Syntax Flags +@subsection Syntax Flags +@cindex syntax flags + + In addition to the classes, entries for characters in a syntax table +can include flags. There are six possible flags, represented by the +characters @samp{1}, @samp{2}, @samp{3}, @samp{4}, @samp{b} and +@samp{p}. + + All the flags except @samp{p} are used to describe multi-character +comment delimiters. The digit flags indicate that a character can +@emph{also} be part of a comment sequence, in addition to the syntactic +properties associated with its character class. The flags are +independent of the class and each other for the sake of characters such +as @samp{*} in C mode, which is a punctuation character, @emph{and} the +second character of a start-of-comment sequence (@samp{/*}), @emph{and} +the first character of an end-of-comment sequence (@samp{*/}). + +The flags for a character @var{c} are: + +@itemize @bullet +@item +@samp{1} means @var{c} is the start of a two-character comment start +sequence. + +@item +@samp{2} means @var{c} is the second character of such a sequence. + +@item +@samp{3} means @var{c} is the start of a two-character comment end +sequence. + +@item +@samp{4} means @var{c} is the second character of such a sequence. + +@item +@c Emacs 19 feature +@samp{b} means that @var{c} as a comment delimiter belongs to the +alternative ``b'' comment style. + +Emacs supports two comment styles simultaneously in any one syntax +table. This is for the sake of C++. 
Each style of comment syntax has +its own comment-start sequence and its own comment-end sequence. Each +comment must stick to one style or the other; thus, if it starts with +the comment-start sequence of style ``b'', it must also end with the +comment-end sequence of style ``b''. + +The two comment-start sequences must begin with the same character; only +the second character may differ. Mark the second character of the +``b''-style comment start sequence with the @samp{b} flag. + +A comment-end sequence (one or two characters) applies to the ``b'' +style if its first character has the @samp{b} flag set; otherwise, it +applies to the ``a'' style. + +The appropriate comment syntax settings for C++ are as follows: + +@table @asis +@item @samp{/} +@samp{124b} +@item @samp{*} +@samp{23} +@item newline +@samp{>b} +@end table + +Thus @samp{/*} is a comment-start sequence for ``a'' style, @samp{//} +is a comment-start sequence for ``b'' style, @samp{*/} is a +comment-end sequence for ``a'' style, and newline is a comment-end +sequence for ``b'' style. + +@item +@c Emacs 19 feature +@samp{p} identifies an additional ``prefix character'' for Lisp syntax. +These characters are treated as whitespace when they appear between +expressions. When they appear within an expression, they are handled +according to their usual syntax codes. + +The function @code{backward-prefix-chars} moves back over these +characters, as well as over characters whose primary syntax class is +prefix (@samp{'}). @xref{Motion and Syntax}. +@end itemize + +@node Syntax Table Functions +@section Syntax Table Functions + + In this section we describe functions for creating, accessing and +altering syntax tables. + +@defun make-syntax-table +This function creates a new syntax table. Character codes 0 through +31, and 128 through 255, are set up to inherit from the standard syntax +table. The other character codes are set up by copying what the +standard syntax table says about them. 
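+
+As a minimal sketch of how a major mode might use this function, the
+following builds a new table and then adjusts a few entries with
+@code{modify-syntax-entry}, described below.  The variable name
+@code{mylib-mode-syntax-table} and the particular entries are
+hypothetical; they are not taken from any standard mode.
+
+@example
+@group
+(defvar mylib-mode-syntax-table
+  (let ((table (make-syntax-table)))
+    ;; @r{Treat underscore as a symbol constituent.}
+    (modify-syntax-entry ?_ "_" table)
+    ;; @r{Let semicolon start a comment and newline end it.}
+    (modify-syntax-entry ?\; "<" table)
+    (modify-syntax-entry ?\n ">" table)
+    table)
+  "Syntax table for the hypothetical mylib mode.")
+@end group
+@end example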
+ +Most major mode syntax tables are created in this way. +@end defun + +@defun copy-syntax-table &optional table +This function constructs a copy of @var{table} and returns it. If +@var{table} is not supplied (or is @code{nil}), it returns a copy of the +current syntax table. Otherwise, an error is signaled if @var{table} is +not a syntax table. +@end defun + +@deffn Command modify-syntax-entry char syntax-descriptor &optional table +This function sets the syntax entry for @var{char} according to +@var{syntax-descriptor}. The syntax is changed only for @var{table}, +which defaults to the current buffer's syntax table, and not in any +other syntax table. The argument @var{syntax-descriptor} specifies the +desired syntax; this is a string beginning with a class designator +character, and optionally containing a matching character and flags as +well. @xref{Syntax Descriptors}. + +This function always returns @code{nil}. The old syntax information in +the table for this character is discarded. + +An error is signaled if the first character of the syntax descriptor is not +one of the twelve syntax class designator characters. An error is also +signaled if @var{char} is not a character. 
+ +@example +@group +@exdent @r{Examples:} + +;; @r{Put the space character in class whitespace.} +(modify-syntax-entry ?\ " ") + @result{} nil +@end group + +@group +;; @r{Make @samp{$} an open parenthesis character,} +;; @r{with @samp{^} as its matching close.} +(modify-syntax-entry ?$ "(^") + @result{} nil +@end group + +@group +;; @r{Make @samp{^} a close parenthesis character,} +;; @r{with @samp{$} as its matching open.} +(modify-syntax-entry ?^ ")$") + @result{} nil +@end group + +@group +;; @r{Make @samp{/} a punctuation character,} +;; @r{the first character of a start-comment sequence,} +;; @r{and the second character of an end-comment sequence.} +;; @r{This is used in C mode.} +(modify-syntax-entry ?/ ". 14") + @result{} nil +@end group +@end example +@end deffn + +@defun char-syntax character +This function returns the syntax class of @var{character}, represented +by its mnemonic designator character. This @emph{only} returns the +class, not any matching parenthesis or flags. + +An error is signaled if @var{character} is not a character. + +The following examples apply to C mode. The first example shows that +the syntax class of space is whitespace (represented by a space). The +second example shows that the syntax of @samp{/} is punctuation. This +does not show the fact that it is also part of comment start and end +sequences. The third example shows that open parenthesis is in the class +of open parentheses. This does not show the fact that it has a matching +character, @samp{)}. + +@example +@group +(char-to-string (char-syntax ?\ )) + @result{} " " +@end group + +@group +(char-to-string (char-syntax ?/)) + @result{} "." +@end group + +@group +(char-to-string (char-syntax ?\()) + @result{} "(" +@end group +@end example +@end defun + +@defun set-syntax-table table +This function makes @var{table} the syntax table for the current buffer. +It returns @var{table}. 
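+
+One plausible use is to switch syntax tables temporarily and then
+restore the original.  The sketch below assumes that the previous table
+should be reinstated even if an error occurs; it relies on
+@code{syntax-table}, described below, and on
+@code{text-mode-syntax-table} (@pxref{Standard Syntax Tables}).
+
+@example
+@group
+(let ((old-table (syntax-table)))
+  (unwind-protect
+      (progn
+        ;; @r{Move by words as Text mode would define them.}
+        (set-syntax-table text-mode-syntax-table)
+        (forward-word 1))
+    ;; @r{Always restore the original table.}
+    (set-syntax-table old-table)))
+@end group
+@end example
+
+Using @code{unwind-protect} here is a design choice, not a requirement:
+it keeps the buffer's own table from being lost if the body signals an
+error.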
+@end defun + +@defun syntax-table +This function returns the current syntax table, which is the table for +the current buffer. +@end defun + +@node Motion and Syntax +@section Motion and Syntax + + This section describes functions for moving across characters in +certain syntax classes. None of these functions exists in Emacs +version 18 or earlier. + +@defun skip-syntax-forward syntaxes &optional limit +This function moves point forward across characters having syntax classes +mentioned in @var{syntaxes}. It stops when it encounters the end of +the buffer, or position @var{limit} (if specified), or a character it is +not supposed to skip. +@ignore @c may want to change this. +The return value is the distance traveled, which is a nonnegative +integer. +@end ignore +@end defun + +@defun skip-syntax-backward syntaxes &optional limit +This function moves point backward across characters whose syntax +classes are mentioned in @var{syntaxes}. It stops when it encounters +the beginning of the buffer, or position @var{limit} (if specified), or a +character it is not supposed to skip. +@ignore @c may want to change this. +The return value indicates the distance traveled. It is an integer that +is zero or less. +@end ignore +@end defun + +@defun backward-prefix-chars +This function moves point backward over any number of characters with +expression prefix syntax. This includes both characters in the +expression prefix syntax class, and characters with the @samp{p} flag. +@end defun + +@node Parsing Expressions +@section Parsing Balanced Expressions + + Here are several functions for parsing and scanning balanced +expressions, also known as @dfn{sexps}, in which parentheses match in +pairs. The syntax table controls the interpretation of characters, so +these functions can be used for Lisp expressions when in Lisp mode and +for C expressions when in C mode. @xref{List Motion}, for convenient +higher-level functions for moving over balanced expressions. 
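+
+  As one illustration of what these functions make possible, the sketch
+below computes the parenthesis depth at point with
+@code{parse-partial-sexp}, which is described next.  The function name
+@code{mylib-paren-depth} is hypothetical, and the sketch assumes that
+the beginning of the accessible portion of the buffer is at the top
+level of parenthesis structure.
+
+@example
+@group
+(defun mylib-paren-depth ()
+  "Return the depth in parentheses at point."
+  ;; @r{Element 0 of the returned parse state is the depth.}
+  (car (parse-partial-sexp (point-min) (point))))
+@end group
+@end example
+
+Since the scan stops at point itself, point ends up where it started.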
+ +@defun parse-partial-sexp start limit &optional target-depth stop-before state stop-comment +This function parses a sexp in the current buffer starting at +@var{start}, not scanning past @var{limit}. It stops at @var{limit} or +when certain criteria described below are met, and sets point to the location +where parsing stops. It returns a value describing the status of the +parse at the point where it stops. + +If @var{state} is @code{nil}, @var{start} is assumed to be at the top +level of parenthesis structure, such as the beginning of a function +definition. Alternatively, you might wish to resume parsing in the +middle of the structure. To do this, you must provide a @var{state} +argument that describes the initial status of parsing. + +@cindex parenthesis depth +If the third argument @var{target-depth} is non-@code{nil}, parsing +stops if the depth in parentheses becomes equal to @var{target-depth}. +The depth starts at 0, or at whatever is given in @var{state}. + +If the fourth argument @var{stop-before} is non-@code{nil}, parsing +stops when it comes to any character that starts a sexp. If +@var{stop-comment} is non-@code{nil}, parsing stops when it comes to the +start of a comment. + +@cindex parse state +The fifth argument @var{state} is an eight-element list of the same +form as the value of this function, described below. The return value +of one call may be used to initialize the state of the parse on another +call to @code{parse-partial-sexp}. + +The result is a list of eight elements describing the final state of +the parse: + +@enumerate 0 +@item +The depth in parentheses, counting from 0. + +@item +@cindex innermost containing parentheses +The character position of the start of the innermost containing +parenthetical grouping; @code{nil} if none. + +@item +@cindex previous complete subexpression +The character position of the start of the last complete subexpression +terminated; @code{nil} if none. 
+ +@item +@cindex inside string +Non-@code{nil} if inside a string. More precisely, this is the +character that will terminate the string. + +@item +@cindex inside comment +@code{t} if inside a comment. + +@item +@cindex quote character +@code{t} if point is just after a quote character. + +@item +The minimum parenthesis depth encountered during this scan. + +@item +@code{t} if inside a comment of style ``b''. +@end enumerate + +Elements 0, 3, 4, 5 and 7 are significant in the argument @var{state}. + +@cindex indenting with parentheses +This function is most often used to compute indentation for languages +that have nested parentheses. +@end defun + +@defun scan-lists from count depth +This function scans forward @var{count} balanced parenthetical groupings +from character number @var{from}. It returns the character position +where the scan stops. + +If @var{depth} is nonzero, parenthesis depth counting begins from that +value. The only candidates for stopping are places where the depth in +parentheses becomes zero; @code{scan-lists} counts @var{count} such +places and then stops. Thus, a positive value for @var{depth} means go +out @var{depth} levels of parenthesis. + +Scanning ignores comments if @code{parse-sexp-ignore-comments} is +non-@code{nil}. + +If the scan reaches the beginning or end of the buffer (or its accessible +portion), and the depth is not zero, an error is signaled. If the depth +is zero but the count is not used up, @code{nil} is returned. +@end defun + +@defun scan-sexps from count +This function scans forward @var{count} sexps from character position +@var{from}. It returns the character position where the scan stops. + +Scanning ignores comments if @code{parse-sexp-ignore-comments} is +non-@code{nil}. + +If the scan reaches the beginning or end of (the accessible part of) the +buffer in the middle of a parenthetical grouping, an error is signaled. +If it reaches the beginning or end between groupings but before the count is +used up, @code{nil} is returned. 
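+
+Because an unbalanced grouping signals an error, callers that merely
+want to probe the text may prefer to trap it.  Here is a minimal sketch
+under that assumption; the function name @code{mylib-end-of-next-sexp}
+is hypothetical.
+
+@example
+@group
+(defun mylib-end-of-next-sexp ()
+  "Return the position after the sexp following point, or nil."
+  (condition-case nil
+      (scan-sexps (point) 1)
+    ;; @r{Treat an unbalanced grouping like running out of text.}
+    (error nil)))
+@end group
+@end example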
+@end defun + +@defvar parse-sexp-ignore-comments +@cindex skipping comments +If the value is non-@code{nil}, then comments are treated as +whitespace by the functions in this section and by @code{forward-sexp}. + +In older Emacs versions, this feature worked only when the comment +terminator was something like @samp{*/}, and appeared only to end a +comment. In languages where newlines terminate comments, it was +necessary to make this variable @code{nil}, since not every newline is the +end of a comment. This limitation no longer exists. +@end defvar + +You can use @code{forward-comment} to move forward or backward over +one comment or several comments. + +@defun forward-comment count +This function moves point forward across @var{count} comments (backward, +if @var{count} is negative). If it finds anything other than a comment +or whitespace, it stops, leaving point at the place where it stopped. +It also stops after satisfying @var{count}. +@end defun + +To move forward over all comments and whitespace following point, use +@code{(forward-comment (buffer-size))}. @code{(buffer-size)} is a good +argument to use, because the number of comments in the buffer cannot +exceed that many. + +@node Standard Syntax Tables +@section Some Standard Syntax Tables + + Each of the major modes in Emacs has its own syntax table. Here are +several of them: + +@defun standard-syntax-table +This function returns the standard syntax table, which is the syntax +table used in Fundamental mode. +@end defun + +@defvar text-mode-syntax-table +The value of this variable is the syntax table used in Text mode. +@end defvar + +@defvar c-mode-syntax-table +The value of this variable is the syntax table for C-mode buffers. +@end defvar + +@defvar emacs-lisp-mode-syntax-table +The value of this variable is the syntax table used in Emacs Lisp mode +by editing commands. (It has no effect on the Lisp @code{read} +function.) 
+@end defvar + +@node Syntax Table Internals +@section Syntax Table Internals +@cindex syntax table internals + + Each element of a syntax table is an integer that encodes the syntax +of one character: the syntax class, possible matching character, and +flags. Lisp programs don't usually work with the elements directly; the +Lisp-level syntax table functions usually work with syntax descriptors +(@pxref{Syntax Descriptors}). + + The low 8 bits of each element of a syntax table indicate the +syntax class. + +@table @asis +@item @i{Integer} +@i{Class} +@item 0 +whitespace +@item 1 +punctuation +@item 2 +word +@item 3 +symbol +@item 4 +open parenthesis +@item 5 +close parenthesis +@item 6 +expression prefix +@item 7 +string quote +@item 8 +paired delimiter +@item 9 +escape +@item 10 +character quote +@item 11 +comment-start +@item 12 +comment-end +@item 13 +inherit +@end table + + The next 8 bits are the matching opposite parenthesis (if the +character has parenthesis syntax); otherwise, they are not meaningful. +The next 6 bits are the flags. diff -r 99ca8123a3ca -r 3b84ed22f747 lispref/tips.texi --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/lispref/tips.texi Mon Mar 28 05:41:05 1994 +0000 @@ -0,0 +1,582 @@ +@c -*-texinfo-*- +@c This is part of the GNU Emacs Lisp Reference Manual. +@c Copyright (C) 1990, 1991, 1992, 1993 Free Software Foundation, Inc. +@c See the file elisp.texi for copying conditions. +@setfilename ../info/tips +@node Tips, GNU Emacs Internals, Calendar, Top +@appendix Tips and Standards +@cindex tips +@cindex standards of coding style +@cindex coding standards + + This chapter describes no additional features of Emacs Lisp. +Instead it gives advice on making effective use of the features described +in the previous chapters. + +@menu +* Style Tips:: Writing clean and robust programs. +* Compilation Tips:: Making compiled code run fast. +* Documentation Tips:: Writing readable documentation strings. 
+* Comment Tips:: Conventions for writing comments. +* Library Headers:: Standard headers for library packages. +@end menu + +@node Style Tips +@section Writing Clean Lisp Programs + + Here are some tips for avoiding common errors in writing Lisp code +intended for widespread use: + +@itemize @bullet +@item +Since all global variables share the same name space, and all functions +share another name space, you should choose a short word to distinguish +your program from other Lisp programs. Then take care to begin the +names of all global variables, constants, and functions with the chosen +prefix. This helps avoid name conflicts. + +This recommendation applies even to names for traditional Lisp +primitives that are not primitives in Emacs Lisp---even to @code{cadr}. +Believe it or not, there is more than one plausible way to define +@code{cadr}. Play it safe; append your name prefix to produce a name +like @code{foo-cadr} or @code{mylib-cadr} instead. + +If you write a function that you think ought to be added to Emacs under +a certain name, such as @code{twiddle-files}, don't call it by that name +in your program. Call it @code{mylib-twiddle-files} in your program, +and send mail to @samp{bug-gnu-emacs@@prep.ai.mit.edu} suggesting we add +it to Emacs. If and when we do, we can change the name easily enough. + +If one prefix is insufficient, your package may use two or three +alternative common prefixes, so long as they make sense. + +Separate the prefix from the rest of the symbol name with a hyphen, +@samp{-}. This will be consistent with Emacs itself and with most Emacs +Lisp programs. + +@item +It is often useful to put a call to @code{provide} in each separate +library program, at least if there is more than one entry point to the +program. + +@item +If one file @var{foo} uses a macro defined in another file @var{bar}, +@var{foo} should contain @code{(require '@var{bar})} before the first +use of the macro. 
(And @var{bar} should contain @code{(provide +'@var{bar})}, to make the @code{require} work.) This will cause +@var{bar} to be loaded when you byte-compile @var{foo}. Otherwise, you +risk compiling @var{foo} without the necessary macro loaded, and that +would produce compiled code that won't work right. @xref{Compiling +Macros}. + +@item +If you define a major mode, make sure to run a hook variable using +@code{run-hooks}, just as the existing major modes do. @xref{Hooks}. + +@item +Please do not define @kbd{C-c @var{letter}} as a key in your major +modes. These sequences are reserved for users; they are the +@strong{only} sequences reserved for users, so we cannot do without +them. + +Instead, define sequences consisting of @kbd{C-c} followed by a +non-letter. These sequences are reserved for major modes. + +Changing all the major modes in Emacs 18 so they would follow this +convention was a lot of work. Abandoning this convention would waste +that work and inconvenience the users. + +@item +You should not bind @kbd{C-h} following any prefix character (including +@kbd{C-c}). If you don't bind @kbd{C-h}, it is automatically available +as a help character for listing the subcommands of the prefix character. + +@item +You should not bind a key sequence ending in @key{ESC} except following +another @key{ESC}. (That is, it is ok to bind a sequence ending in +@kbd{@key{ESC} @key{ESC}}.) + +The reason for this rule is that a non-prefix binding for @key{ESC} in +any context prevents recognition of escape sequences as function keys in +that context. + +@item +It is a bad idea to define aliases for the Emacs primitives. +Use the standard names instead. + +@item +Redefining an Emacs primitive is an even worse idea. +It may do the right thing for a particular program, but +there is no telling what other programs might break as a result. 
+ +@item +If a file does replace any of the functions or library programs of +standard Emacs, prominent comments at the beginning of the file should +say which functions are replaced, and how the behavior of the +replacements differs from that of the originals. + +@item +If a file requires certain standard library programs to be loaded +beforehand, then the comments at the beginning of the file should say +so. + +@item +Please keep the names of your Emacs Lisp source files to 13 characters +or less. This way, if the files are compiled, the compiled files' names +will be 14 characters or less, which is short enough to fit on all kinds +of Unix systems. + +@item +Don't use @code{next-line} or @code{previous-line} in programs; nearly +always, @code{forward-line} is more convenient as well as more +predictable and robust. @xref{Text Lines}. + +@item +Don't use functions that set the mark in your Lisp code (unless you are +writing a command to set the mark). The mark is a user-level feature, +so it is incorrect to change the mark except to supply a value for the +user's benefit. @xref{The Mark}. + +In particular, don't use these functions: + +@itemize @bullet +@item +@code{beginning-of-buffer}, @code{end-of-buffer} +@item +@code{replace-string}, @code{replace-regexp} +@end itemize + +If you just want to move point, or replace a certain string, without any +of the other features intended for interactive users, you can replace +these functions with one or two lines of simple Lisp code. + +@item +The recommended way to print a message in the echo area is with +the @code{message} function, not @code{princ}. @xref{The Echo Area}. + +@item +When you encounter an error condition, call the function @code{error} +(or @code{signal}). The function @code{error} does not return. +@xref{Signaling Errors}. + +Do not use @code{message}, @code{throw}, @code{sleep-for}, +or @code{beep} to report errors. + +@item +Avoid using recursive edits. 
Instead, do what the Rmail @kbd{w} command +does: use a new local keymap that contains one command defined to +switch back to the old local keymap. Or do what the @code{edit-options} +command does: switch to another buffer and let the user switch back at +will. @xref{Recursive Editing}. + +@item +In some other systems there is a convention of choosing variable names +that begin and end with @samp{*}. We don't use that convention in Emacs +Lisp, so please don't use it in your library. (In fact, in Emacs names +of this form are conventionally used for program-generated buffers.) The +users will find Emacs more coherent if all libraries use the same +conventions. + +@item +Indent each function with @kbd{C-M-q} (@code{indent-sexp}) using the +default indentation parameters. + +@item +Don't make a habit of putting close-parentheses on lines by themselves; +Lisp programmers find this disconcerting. Once in a while, when there +is a sequence of many consecutive close-parentheses, it may make sense +to split them in one or two significant places. + +@item +Please put a copyright notice on the file if you give copies to anyone. +Use the same lines that appear at the top of the Lisp files in Emacs +itself. If you have not signed papers to assign the copyright to the +Foundation, then place your name in the copyright notice in place of the +Foundation's name. +@end itemize + +@node Compilation Tips +@section Tips for Making Compiled Code Fast +@cindex execution speed +@cindex speedups + + Here are ways of improving the execution speed of byte-compiled +lisp programs. + +@itemize @bullet +@item +@cindex profiling +@cindex timing programs +@cindex @file{profile.el} +Use the @file{profile} library to profile your program. See the file +@file{profile.el} for instructions. + +@item +Use iteration rather than recursion whenever possible. +Function calls are slow in Emacs Lisp even when a compiled function +is calling another compiled function. 
+ +@item +Using the primitive list-searching functions @code{memq}, @code{assq} or +@code{assoc} is even faster than explicit iteration. It may be worth +rearranging a data structure so that one of these primitive search +functions can be used. + +@item +Certain built-in functions are handled specially by the byte compiler +avoiding the need for an ordinary function call. It is a good idea to +use these functions rather than alternatives. To see whether a function +is handled specially by the compiler, examine its @code{byte-compile} +property. If the property is non-@code{nil}, then the function is +handled specially. + +For example, the following input will show you that @code{aref} is +compiled specially (@pxref{Array Functions}) while @code{elt} is not +(@pxref{Sequence Functions}): + +@smallexample +@group +(get 'aref 'byte-compile) + @result{} byte-compile-two-args +@end group + +@group +(get 'elt 'byte-compile) + @result{} nil +@end group +@end smallexample + +@item +If calling a small function accounts for a substantial part of your +program's running time, make the function inline. This eliminates +the function call overhead. Since making a function inline reduces +the flexibility of changing the program, don't do it unless it gives +a noticeable speedup in something slow enough for users to care about +the speed. @xref{Inline Functions}. +@end itemize + +@node Documentation Tips +@section Tips for Documentation Strings + + Here are some tips for the writing of documentation strings. + +@itemize @bullet +@item +Every command, function or variable intended for users to know about +should have a documentation string. + +@item +An internal subroutine of a Lisp program need not have a documentation +string, and you can save space by using a comment instead. + +@item +The first line of the documentation string should consist of one or two +complete sentences which stand on their own as a summary. 
In particular, +start the line with a capital letter and end with a period. +For instance, use ``Return the cons of A and B.'' in preference to +``Returns the cons of A and B@.'' + +The documentation string can have additional lines which expand on the +details of how to use the function or variable. The additional lines +should be made up of complete sentences also, but they may be filled if +that looks good. + +@item +Write documentation strings in the active voice, not the passive, and in +the present tense, not the future. For instance, use ``Return a list +containing A and B.'' instead of ``A list containing A and B will be +returned.'' + +@item +Avoid using the word ``cause'' (or its equivalents) unnecessarily. +Instead of, ``Cause Emacs to display text in boldface,'' write just +``Display text in boldface.'' + +@item +Do not start or end a documentation string with whitespace. + +@item +Format the documentation string so that it fits in an Emacs window on an +80 column screen. It is a good idea for most lines to be no wider than +60 characters. The first line can be wider if necessary to fit the +information that ought to be there. + +However, rather than simply filling the entire documentation string, you +can make it much more readable by choosing line breaks with care. +Use blank lines between topics if the documentation string is long. + +@item +@strong{Do not} indent subsequent lines of a documentation string so +that the text is lined up in the source code with the text of the first +line. This looks nice in the source code, but looks bizarre when users +view the documentation. Remember that the indentation before the +starting double-quote is not part of the string! + +@item +A variable's documentation string should start with @samp{*} if the +variable is one that users would want to set interactively often. 
If +the value is a long list, or a function, or if the variable would only +be set in init files, then don't start the documentation string with +@samp{*}. @xref{Defining Variables}. + +@item +The documentation string for a variable that is a yes-or-no flag should +start with words such as ``Non-nil means@dots{}'', to make it clear both +that the variable only has two meaningfully distinct values and which value +means ``yes''. + +@item +When a function's documentation string mentions the value of an argument +of the function, use the argument name in capital letters as if it were +a name for that value. Thus, the documentation string of the function +@code{/} refers to its second argument as @samp{DIVISOR}. + +Also use all caps for meta-syntactic variables, such as when you show +the decomposition of a list or vector into subunits, some of which may +vary. + +@item +@iftex +When a documentation string refers to a Lisp symbol, write it as it +would be printed (which usually means in lower case), with single-quotes +around it. For example: @samp{`lambda'}. There are two exceptions: +write @code{t} and @code{nil} without single-quotes. +@end iftex +@ifinfo +When a documentation string refers to a Lisp symbol, write it as it +would be printed (which usually means in lower case), with single-quotes +around it. For example: @samp{lambda}. There are two exceptions: write +t and nil without single-quotes. (In this manual, we normally do use +single-quotes for those symbols.) +@end ifinfo + +@item +Don't write key sequences directly in documentation strings. Instead, +use the @samp{\\[@dots{}]} construct to stand for them. For example, +instead of writing @samp{C-f}, write @samp{\\[forward-char]}. When the +documentation string is printed, Emacs will substitute whatever key is +currently bound to @code{forward-char}. This will usually be +@samp{C-f}, but if the user has moved key bindings, it will be the +correct key for that user. @xref{Keys in Documentation}. 
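+
+For instance, a documentation string might use the construct like this.
+The command below is a hypothetical one, invented only to show the
+notation:
+
+@smallexample
+@group
+;; Hypothetical command; only the documentation string matters here.
+(defun my-forward-twice ()
+  "Move point forward two characters.
+This has the same effect as typing \\[forward-char] twice."
+  (interactive)
+  (forward-char 2))
+@end group
+@end smallexample
+
+When the user asks for this command's documentation, for example with
+@kbd{C-h f}, the @samp{\\[forward-char]} construct should appear as the
+key currently bound to @code{forward-char}.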
+ +@item +In documentation strings for a major mode, you will want to refer to the +key bindings of that mode's local map, rather than global ones. +Therefore, use the construct @samp{\\<@dots{}>} once in the +documentation string to specify which key map to use. Do this before +the first use of @samp{\\[@dots{}]}. The text inside the +@samp{\\<@dots{}>} should be the name of the variable containing the +local keymap for the major mode. + +It is not practical to use @samp{\\[@dots{}]} very many times, because +display of the documentation string will become slow. So use this to +describe the most important commands in your major mode, and then use +@samp{\\@{@dots{}@}} to display the rest of the mode's keymap. + +@item +Don't use the term ``Elisp'', since that is or was a trademark. +Use the term ``Emacs Lisp''. +@end itemize + +@node Comment Tips +@section Tips on Writing Comments + + We recommend these conventions for where to put comments and how to +indent them: + +@table @samp +@item ; +Comments that start with a single semicolon, @samp{;}, should all be +aligned to the same column on the right of the source code. Such +comments usually explain how the code on the same line does its job. In +Lisp mode and related modes, the @kbd{M-;} (@code{indent-for-comment}) +command automatically inserts such a @samp{;} in the right place, or +aligns such a comment if it is already inserted. + +(The following examples are taken from the Emacs sources.) + +@smallexample +@group +(setq base-version-list ; there was a base + (assoc (substring fn 0 start-vn) ; version to which + file-version-assoc-list)) ; this looks like + ; a subversion +@end group +@end smallexample + +@item ;; +Comments that start with two semicolons, @samp{;;}, should be aligned to +the same level of indentation as the code. Such comments are used to +describe the purpose of the following lines or the state of the program +at that point. 
For example: + +@smallexample +@group +(prog1 (setq auto-fill-function + @dots{} + @dots{} + ;; update mode-line + (force-mode-line-update))) +@end group +@end smallexample + +These comments are also written before a function definition to explain +what the function does and how to call it properly. + +@item ;;; +Comments that start with three semicolons, @samp{;;;}, should start at +the left margin. Such comments are not used within function +definitions, but are used to make more general comments. For example: + +@smallexample +@group +;;; This Lisp code is run in Emacs +;;; when it is to operate as a server +;;; for other processes. +@end group +@end smallexample + +@item ;;;; +Comments that start with four semicolons, @samp{;;;;}, should be aligned +to the left margin and are used for headings of major sections of a +program. For example: + +@smallexample +;;;; The kill ring +@end smallexample +@end table + +@noindent +The indentation commands of the Lisp modes in Emacs, such as @kbd{M-;} +(@code{indent-for-comment}) and @key{TAB} (@code{lisp-indent-line}), +automatically indent comments according to these conventions, +depending on the number of semicolons. @xref{Comments,, +Manipulating Comments, emacs, The GNU Emacs Manual}. + + If you wish to ``comment out'' a number of lines of code, use triple +semicolons at the beginnings of the lines. + + Any character may be included in a comment, but it is advisable to +precede a character with syntactic significance in Lisp (such as +@samp{\} or unpaired @samp{(} or @samp{)}) with a @samp{\}, to prevent +it from confusing the Emacs commands for editing Lisp. + +@node Library Headers +@section Conventional Headers for Emacs Libraries +@cindex header comments +@cindex library header comments + + Emacs 19 has conventions for using special comments in Lisp libraries +to divide them into sections and give information such as who wrote +them. This section explains these conventions.
First, an example: + +@smallexample +@group +;;; lisp-mnt.el --- minor mode for Emacs Lisp maintainers + +;; Copyright (C) 1992 Free Software Foundation, Inc. +@end group + +;; Author: Eric S. Raymond +;; Maintainer: Eric S. Raymond +;; Created: 14 Jul 1992 +;; Version: 1.2 +@group +;; Keywords: docs + +;; This file is part of GNU Emacs. +@var{copying conditions}@dots{} +@end group +@end smallexample + + The very first line should have this format: + +@example +;;; @var{filename} --- @var{description} +@end example + +@noindent +The description should be complete in one line. + + After the copyright notice come several @dfn{header comment} lines, +each beginning with @samp{;; @var{header-name}:}. Here is a table of +the conventional possibilities for @var{header-name}: + +@table @samp +@item Author +This line states the name and net address of at least the principal +author of the library. + +If there are multiple authors, you can list them on continuation lines +led by @code{;;}, like this: + +@smallexample +@group +;; Author: Ashwin Ram +;; Dave Sill +;; Dave Brennan +;; Eric Raymond +@end group +@end smallexample + +@item Maintainer +This line should contain a single name/address as in the Author line, or +an address only, or the string ``FSF''. If there is no maintainer line, +the person(s) in the Author field are presumed to be the maintainers. +The example above is mildly bogus because the maintainer line is +redundant. + +The idea behind the @samp{Author} and @samp{Maintainer} lines is to make +possible a Lisp function to ``send mail to the maintainer'' without +having to mine the name out by hand. + +Be sure to surround the network address with @samp{<@dots{}>} if +you include the person's full name as well as the network address. + +@item Created +This optional line gives the original creation date of the +file. It is for historical interest only. + +@item Version +If you wish to record version numbers for the individual Lisp program, put +them in this line.
+ +@item Adapted-By +In this header line, place the name of the person who adapted the +library for installation (to make it fit the style conventions, for +example). + +@item Keywords +This line lists keywords for the @code{finder-by-keyword} help command. +This field is important; it's how people will find your package when +they're looking for things by topic area. +@end table + + Just about every Lisp library ought to have the @samp{Author} and +@samp{Keywords} header comment lines. Use the others if they are +appropriate. You can also put in header lines with other header +names---they have no standard meanings, so they can't do any harm. + + We use additional stylized comments to subdivide the contents of the +library file. Here is a table of them: + +@table @samp +@item ;;; Commentary: +This begins introductory comments that explain how the library works. +It should come right after the copying permissions. + +@item ;;; Change log: +This begins change log information stored in the library file (if you +store the change history there). For most of the Lisp +files distributed with Emacs, the change history is kept in the file +@file{ChangeLog} and not in the source file at all; these files do +not have a @samp{;;; Change log:} line. + +@item ;;; Code: +This begins the actual code of the program. + +@item ;;; @var{filename} ends here +This is the @dfn{footer line}; it appears at the very end of the file. +Its purpose is to enable people to detect truncated versions of the file +from the lack of a footer line. +@end table