comparison info/eintr-2 @ 73591:b214bd8be620

info/eintr-2: Updated Info file to Third Edition for `Introduction to Programming in Emacs Lisp'
author Robert J. Chassell <bob@rattlesnake.com>
date Tue, 31 Oct 2006 17:00:32 +0000
parents
children f93366072a0b
73590:dcc218a536a8 73591:b214bd8be620
1 This is ../info/eintr, produced by makeinfo version 4.8 from
2 emacs-lisp-intro.texi.
3
4 INFO-DIR-SECTION Emacs
5 START-INFO-DIR-ENTRY
6 * Emacs Lisp Intro: (eintr).
7 A simple introduction to Emacs Lisp programming.
8 END-INFO-DIR-ENTRY
9
10 This is an `Introduction to Programming in Emacs Lisp', for people who
11 are not programmers.
12
13 Edition 3.00, 2006 Oct 31
14
15 Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1997, 2001, 2002,
16 2003, 2004, 2005, 2006 Free Software Foundation, Inc.
17
18 Published by the:
19
20 GNU Press, Website: http://www.gnupress.org
21 a division of the General: press@gnu.org
22 Free Software Foundation, Inc. Orders: sales@gnu.org
23 51 Franklin Street, Fifth Floor Tel: +1 (617) 542-5942
24 Boston, MA 02110-1301 USA Fax: +1 (617) 542-2652
25
26
27 ISBN 1-882114-43-4
28
29 Permission is granted to copy, distribute and/or modify this document
30 under the terms of the GNU Free Documentation License, Version 1.2 or
31 any later version published by the Free Software Foundation; there
32 being no Invariant Section, with the Front-Cover Texts being "A GNU
33 Manual", and with the Back-Cover Texts as in (a) below. A copy of the
34 license is included in the section entitled "GNU Free Documentation
35 License".
36
37 (a) The FSF's Back-Cover Text is: "You have freedom to copy and modify
38 this GNU Manual, like GNU software. Copies published by the Free
39 Software Foundation raise funds for GNU development."
40
41 
42 File: eintr, Node: defvar and asterisk, Prev: See variable current value, Up: defvar
43
44 8.5.1 `defvar' and an asterisk
45 ------------------------------
46
47 In the past, Emacs used the `defvar' special form both for internal
48 variables that you would not expect a user to change and for variables
49 that you do expect a user to change. Although you can still use
50 `defvar' for user customizable variables, please use `defcustom'
51 instead, since that special form provides a path into the Customization
52 commands. (*Note Specifying Variables using `defcustom': defcustom.)
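
As a minimal sketch of that recommendation, here is how a user option
might be declared with `defcustom'.  The names `my-examples' and
`my-error-buffer-name' are made up for this illustration and are not
part of Emacs; the `:type' shown is only one reasonable choice.

```elisp
;; Hypothetical group and option names, used only for illustration.
(defgroup my-examples nil
  "A made-up customization group for this sketch."
  :group 'convenience)

(defcustom my-error-buffer-name nil
  "Buffer name to use for error output, or nil for no separate buffer."
  :type '(choice (const :tag "None" nil) string)
  :group 'my-examples)

;; As far as Lisp is concerned, the option is an ordinary variable.
my-error-buffer-name
```

After evaluating these forms, `M-x customize-variable' should offer
`my-error-buffer-name' like any other option.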
53
54 When you specified a variable using the `defvar' special form, you
55 could distinguish a readily settable variable from others by typing an
56 asterisk, `*', in the first column of its documentation string. For
57 example:
58
59 (defvar shell-command-default-error-buffer nil
60 "*Buffer name for `shell-command' ... error output.
61 ... ")
62
63 You could (and still can) use the `set-variable' command to change the
64 value of `shell-command-default-error-buffer' temporarily. However,
65 options set using `set-variable' are set only for the duration of your
66 editing session. The new values are not saved between sessions. Each
67 time Emacs starts, it reads the original value, unless you change the
68 value within your `.emacs' file, either by setting it manually or by
69 using `customize'. *Note Your `.emacs' File: Emacs Initialization.
70
71 For me, the major use of the `set-variable' command is to suggest
72 variables that I might want to set in my `.emacs' file. There are now
73 more than 700 such variables -- far too many to remember readily.
74 Fortunately, you can press <TAB> after calling the `M-x set-variable'
75 command to see the list of variables. (*Note Examining and Setting
76 Variables: (emacs)Examining.)
77
78 
79 File: eintr, Node: cons & search-fwd Review, Next: search Exercises, Prev: defvar, Up: Cutting & Storing Text
80
81 8.6 Review
82 ==========
83
84 Here is a brief summary of some recently introduced functions.
85
86 `car'
87 `cdr'
88 `car' returns the first element of a list; `cdr' returns the
89 second and subsequent elements of a list.
90
91 For example:
92
93 (car '(1 2 3 4 5 6 7))
94 => 1
95 (cdr '(1 2 3 4 5 6 7))
96 => (2 3 4 5 6 7)
97
98 `cons'
99 `cons' constructs a list by prepending its first argument to its
100 second argument.
101
102 For example:
103
104 (cons 1 '(2 3 4))
105 => (1 2 3 4)
106
107 `nthcdr'
108 Return the result of taking CDR `n' times on a list. The `rest of
109 the rest', as it were.
110
111 For example:
112
113 (nthcdr 3 '(1 2 3 4 5 6 7))
114 => (4 5 6 7)
115
116 `setcar'
117 `setcdr'
118 `setcar' changes the first element of a list; `setcdr' changes the
119 second and subsequent elements of a list.
120
121 For example:
122
123 (setq triple (list 1 2 3))
124
125 (setcar triple 37)
126
127 triple
128 => (37 2 3)
129
130 (setcdr triple '("foo" "bar"))
131
132 triple
133 => (37 "foo" "bar")
134
135 `progn'
136 Evaluate each argument in sequence and then return the value of the
137 last.
138
139 For example:
140
141 (progn 1 2 3 4)
142 => 4
143
144 `save-restriction'
145 Record whatever narrowing is in effect in the current buffer, if
146 any, and restore that narrowing after evaluating the arguments.
147
148 `search-forward'
149 Search for a string, and if the string is found, move point.
150
151 Takes four arguments:
152
153 1. The string to search for.
154
155 2. Optionally, the limit of the search.
156
157 3. Optionally, what to do if the search fails: return `nil' or
158 signal an error.
159
160 4. Optionally, how many times to repeat the search; if negative,
161 the search goes backwards.
162
163 `kill-region'
164 `delete-and-extract-region'
165 `copy-region-as-kill'
166 `kill-region' cuts the text between point and mark from the buffer
167 and stores that text in the kill ring, so you can get it back by
168 yanking.
169
170 `copy-region-as-kill' copies the text between point and mark into
171 the kill ring, from which you can get it by yanking. The function
172 does not cut or remove the text from the buffer.
173
174 `delete-and-extract-region' removes the text between two positions
175 from the buffer and returns it as a string. The text is not saved in
176 the kill ring, so you cannot yank it back. (This is not an interactive command.)
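
The two entries above that have no example, `search-forward' and
`save-restriction', can be tried out with a short sketch.  The function
names `my-search-demo' and `my-narrow-demo' are hypothetical, and
`with-temp-buffer' (not discussed in this chapter) is used only so that
your own buffers are left alone.

```elisp
(defun my-search-demo ()
  "Return (RESULT POINT FAILED-RESULT) from two searches in a scratch buffer."
  (with-temp-buffer
    (insert "one two three")
    (goto-char (point-min))
    (list (search-forward "two" nil t)      ; found: returns the new value of point
          (point)                           ; point is now just after "two"
          (search-forward "nine" nil t))))  ; not found: nil, since the third argument is t

(defun my-narrow-demo ()
  "Return the visible text while narrowed and again afterwards."
  (with-temp-buffer
    (insert "one two three")
    (list (save-restriction
            (narrow-to-region 1 4)
            (buffer-string))                ; only "one" is visible here
          (buffer-string))))                ; narrowing undone: the whole text again

(my-search-demo)   ; => (8 8 nil)
(my-narrow-demo)   ; => ("one" "one two three")
```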
177
178 
179 File: eintr, Node: search Exercises, Prev: cons & search-fwd Review, Up: Cutting & Storing Text
180
181 8.7 Searching Exercises
182 =======================
183
184 * Write an interactive function that searches for a string. If the
185 search finds the string, leave point after it and display a message
186 that says "Found!". (Do not use `search-forward' for the name of
187 this function; if you do, you will overwrite the existing version
188 of `search-forward' that comes with Emacs. Use a name such as
189 `test-search' instead.)
190
191 * Write a function that prints the third element of the kill ring in
192 the echo area, if any; if the kill ring does not contain a third
193 element, print an appropriate message.
194
195 
196 File: eintr, Node: List Implementation, Next: Yanking, Prev: Cutting & Storing Text, Up: Top
197
198 9 How Lists are Implemented
199 ***************************
200
201 In Lisp, atoms are recorded in a straightforward fashion; if the
202 implementation is not straightforward in practice, it is, nonetheless,
203 straightforward in theory. The atom `rose', for example, is recorded
204 as the four contiguous letters `r', `o', `s', `e'. A list, on the
205 other hand, is kept differently. The mechanism is equally simple, but
206 it takes a moment to get used to the idea. A list is kept using a
207 series of pairs of pointers. In the series, the first pointer in each
208 pair points to an atom or to another list, and the second pointer in
209 each pair points to the next pair, or to the symbol `nil', which marks
210 the end of the list.
211
212 A pointer itself is quite simply the electronic address of what is
213 pointed to. Hence, a list is kept as a series of electronic addresses.
214
215 * Menu:
216
217 * Lists diagrammed::
218 * Symbols as Chest::
219 * List Exercise::
220
221 
222 File: eintr, Node: Lists diagrammed, Next: Symbols as Chest, Prev: List Implementation, Up: List Implementation
223
224 Lists diagrammed
225 ================
226
227 For example, the list `(rose violet buttercup)' has three elements,
228 `rose', `violet', and `buttercup'. In the computer, the electronic
229 address of `rose' is recorded in a segment of computer memory along
230 with the address that gives the electronic address of where the atom
231 `violet' is located; and that address (the one that tells where
232 `violet' is located) is kept along with an address that tells where the
233 address for the atom `buttercup' is located.
234
235 This sounds more complicated than it is and is easier seen in a diagram:
236
237      ___ ___      ___ ___      ___ ___
238     |___|___|--> |___|___|--> |___|___|--> nil
239       |            |            |
240       |            |            |
241        --> rose     --> violet   --> buttercup
242
243
244
245 In the diagram, each box represents a word of computer memory that
246 holds a Lisp object, usually in the form of a memory address. The
247 boxes, i.e. the addresses, are in pairs. Each arrow points to what the
248 address is the address of, either an atom or another pair of addresses.
249 The first box is the electronic address of `rose' and the arrow points
250 to `rose'; the second box is the address of the next pair of boxes, the
251 first part of which is the address of `violet' and the second part of
252 which is the address of the next pair. The very last box points to the
253 symbol `nil', which marks the end of the list.
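
One way to see the pairs of boxes is to build the list one pair at a
time.  This is only an illustrative sketch; each call to `cons' below
corresponds to one pair of boxes in the diagram.

```elisp
;; Each `cons' makes one pair of boxes: the first box holds the address
;; of an element, the second the address of the rest of the list (or nil).
(cons 'rose (cons 'violet (cons 'buttercup nil)))
;; => (rose violet buttercup)

;; The hand-built list has the same contents as the quoted one.
(equal (cons 'rose (cons 'violet (cons 'buttercup nil)))
       '(rose violet buttercup))
;; => t
```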
254
255 When a variable is set to a list with a function such as `setq', it
256 stores the address of the first box in the variable. Thus, evaluation
257 of the expression
258
259 (setq bouquet '(rose violet buttercup))
260
261 creates a situation like this:
262
263     bouquet
264          |
265          |     ___ ___      ___ ___      ___ ___
266           --> |___|___|--> |___|___|--> |___|___|--> nil
267                 |            |            |
268                 |            |            |
269                  --> rose     --> violet   --> buttercup
270
271
272
273 In this example, the symbol `bouquet' holds the address of the first
274 pair of boxes.
275
276 This same list can be illustrated in a different sort of box notation
277 like this:
278
279     bouquet
280      |
281      |    --------------       ---------------       ----------------
282      |   | car   | cdr  |     | car    | cdr  |     | car     | cdr  |
283       -->| rose  |   o------->| violet |   o------->| butter- |  nil |
284          |       |      |     |        |      |     | cup     |      |
285           --------------       ---------------       ----------------
286
287
288
289 (Symbols consist of more than pairs of addresses, but the structure of
290 a symbol is made up of addresses. Indeed, the symbol `bouquet'
291 consists of a group of address-boxes, one of which is the address of
292 the printed word `bouquet', a second of which is the address of a
293 function definition attached to the symbol, if any, a third of which is
294 the address of the first pair of address-boxes for the list `(rose
295 violet buttercup)', and so on. Here we are showing that the symbol's
296 third address-box points to the first pair of address-boxes for the
297 list.)
298
299 If a symbol is set to the CDR of a list, the list itself is not
300 changed; the symbol simply has an address further down the list. (In
301 the jargon, CAR and CDR are `non-destructive'.) Thus, evaluation of
302 the following expression
303
304 (setq flowers (cdr bouquet))
305
306 produces this:
307
308
309     bouquet        flowers
310       |              |
311       |     ___ ___  |     ___ ___      ___ ___
312        --> |   |   |  --> |   |   |    |   |   |
313            |___|___|----> |___|___|--> |___|___|--> nil
314              |              |            |
315              |              |            |
316               --> rose       --> violet   --> buttercup
317
318
319
320
321 The value of `flowers' is `(violet buttercup)', which is to say, the
322 symbol `flowers' holds the address of the pair of address-boxes, the
323 first of which holds the address of `violet', and the second of which
324 holds the address of `buttercup'.
325
326 A pair of address-boxes is called a "cons cell" or "dotted pair".
327 *Note Cons Cell and List Types: (elisp)Cons Cell Type, and *Note Dotted
328 Pair Notation: (elisp)Dotted Pair Notation, for more information about
329 cons cells and dotted pairs.
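
As a small sketch of what these terms mean in practice, the following
expressions use a separate, made-up variable, `my-bouquet', so that
`bouquet' and `flowers' are left untouched.

```elisp
;; `my-bouquet' is a hypothetical variable used only for this sketch.
(setq my-bouquet '(rose violet buttercup))

(consp my-bouquet)   ; => t    a non-empty list starts with a cons cell
(consp nil)          ; => nil  the empty list is not a cons cell

;; The same list written out pair by pair in dotted pair notation:
(equal my-bouquet '(rose . (violet . (buttercup . nil))))   ; => t
```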
330
331 The function `cons' adds a new pair of addresses to the front of a
332 series of addresses like that shown above. For example, evaluating the
333 expression
334
335 (setq bouquet (cons 'lily bouquet))
336
337 produces:
338
339
340     bouquet                       flowers
341       |                             |
342       |     ___ ___        ___ ___  |     ___ ___       ___ ___
343        --> |   |   |      |   |   |  --> |   |   |     |   |   |
344            |___|___|----> |___|___|----> |___|___|---->|___|___|--> nil
345              |              |              |             |
346              |              |              |             |
347               --> lily       --> rose       --> violet    --> buttercup
348
349
350
351
352 However, this does not change the value of the symbol `flowers', as you
353 can see by evaluating the following,
354
355 (eq (cdr (cdr bouquet)) flowers)
356
357 which returns `t' for true.
358
359 Until it is reset, `flowers' still has the value `(violet buttercup)';
360 that is, it has the address of the cons cell whose first address is of
361 `violet'. Also, this does not alter any of the pre-existing cons
362 cells; they are all still there.
363
364 Thus, in Lisp, to get the CDR of a list, you just get the address of
365 the next cons cell in the series; to get the CAR of a list, you get the
366 address of the first element of the list; to `cons' a new element on a
367 list, you add a new cons cell to the front of the list. That is all
368 there is to it! The underlying structure of Lisp is brilliantly simple!
369
370 And what does the last address in a series of cons cells refer to? It
371 is the address of the empty list, of `nil'.
372
373 In summary, when a Lisp variable is set to a value, it is provided with
374 the address of the list to which the variable refers.
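
The difference between holding the same address and holding a list with
the same contents can be checked with `eq' and `equal'.  The variable
names in this sketch are hypothetical.

```elisp
;; Two separately built lists, and a second name for the first one.
(setq my-first  (list 'rose 'violet))
(setq my-second (list 'rose 'violet))
(setq my-alias  my-first)

(equal my-first my-second)   ; => t    same contents
(eq my-first my-second)      ; => nil  different cons cells, different addresses
(eq my-first my-alias)       ; => t    both variables hold the same address
```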
375
376 
377 File: eintr, Node: Symbols as Chest, Next: List Exercise, Prev: Lists diagrammed, Up: List Implementation
378
379 9.1 Symbols as a Chest of Drawers
380 =================================
381
382 In an earlier section, I suggested that you might imagine a symbol as
383 being a chest of drawers. The function definition is put in one
384 drawer, the value in another, and so on. What is put in the drawer
385 holding the value can be changed without affecting the contents of the
386 drawer holding the function definition, and vice versa.
387
388 Actually, what is put in each drawer is the address of the value or
389 function definition. It is as if you found an old chest in the attic,
390 and in one of its drawers you found a map giving you directions to
391 where the buried treasure lies.
392
393 (In addition to its name, symbol definition, and variable value, a
394 symbol has a `drawer' for a "property list" which can be used to record
395 other information. Property lists are not discussed here; see *Note
396 Property Lists: (elisp)Property Lists.)
397
398 Here is a fanciful representation:
399
400
401             Chest of Drawers            Contents of Drawers
402
403             __   o0O0o   __
404           /                 \
405          ---------------------
406         |    directions to    |            [map to]
407         |     symbol name     |             bouquet
408         |                     |
409         +---------------------+
410         |    directions to    |
411         |  symbol definition  |             [none]
412         |                     |
413         +---------------------+
414         |    directions to    |            [map to]
415         |    variable value   |             (rose violet buttercup)
416         |                     |
417         +---------------------+
418         |    directions to    |
419         |    property list    |             [not described here]
420         |                     |
421         +---------------------+
422         |/                   \|
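
You can open each drawer from Lisp.  The following is a sketch using a
made-up symbol, `my-chest-demo', and a made-up property, `color'; the
functions themselves are standard Emacs Lisp.

```elisp
;; `my-chest-demo' and the property `color' are hypothetical names.
(setq my-chest-demo '(rose violet buttercup))   ; fill the value drawer
(put 'my-chest-demo 'color 'mixed)              ; add an entry to the property list

(symbol-name 'my-chest-demo)    ; => "my-chest-demo"
(symbol-value 'my-chest-demo)   ; => (rose violet buttercup)
(fboundp 'my-chest-demo)        ; => nil  the function drawer is empty
(get 'my-chest-demo 'color)     ; => mixed
```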
423
424
425
426
427 
428 File: eintr, Node: List Exercise, Prev: Symbols as Chest, Up: List Implementation
429
430 9.2 Exercise
431 ============
432
433 Set `flowers' to `violet' and `buttercup'. Cons two more flowers on to
434 this list and set this new list to `more-flowers'. Set the CAR of
435 `flowers' to a fish. What does the `more-flowers' list now contain?
436
437 
438 File: eintr, Node: Yanking, Next: Loops & Recursion, Prev: List Implementation, Up: Top
439
440 10 Yanking Text Back
441 ********************
442
443 Whenever you cut text out of a buffer with a `kill' command in GNU
444 Emacs, you can bring it back with a `yank' command. The text that is
445 cut out of the buffer is put in the kill ring and the yank commands
446 insert the appropriate contents of the kill ring back into a buffer
447 (not necessarily the original buffer).
448
449 A simple `C-y' (`yank') command inserts the first item from the kill
450 ring into the current buffer. If the `C-y' command is followed
451 immediately by `M-y', the first element is replaced by the second
452 element. Successive `M-y' commands replace the second element with the
453 third, fourth, or fifth element, and so on. When the last element in
454 the kill ring is reached, it is replaced by the first element and the
455 cycle is repeated. (Thus the kill ring is called a `ring' rather than
456 just a `list'. However, the actual data structure that holds the text
457 is a list. *Note Handling the Kill Ring: Kill Ring, for the details of
458 how the list is handled as a ring.)
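
The idea of treating a list as a ring can be sketched in a few lines.
This is not the code Emacs uses; `my-ring-ref' and `my-ring' are
hypothetical names, and the sketch only shows how counting past the end
of a list can wrap around to its beginning.

```elisp
(defun my-ring-ref (ring n)
  "Return element N of the list RING, wrapping around past the end."
  (nth (mod n (length ring)) ring))

(setq my-ring '("some text" "a different piece of text" "yet more text"))

(my-ring-ref my-ring 0)   ; => "some text"
(my-ring-ref my-ring 2)   ; => "yet more text"
(my-ring-ref my-ring 3)   ; => "some text"  back at the first element
```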
459
460 * Menu:
461
462 * Kill Ring Overview::
463 * kill-ring-yank-pointer::
464 * yank nthcdr Exercises::
465
466 
467 File: eintr, Node: Kill Ring Overview, Next: kill-ring-yank-pointer, Prev: Yanking, Up: Yanking
468
469 10.1 Kill Ring Overview
470 =======================
471
472 The kill ring is a list of textual strings. This is what it looks like:
473
474 ("some text" "a different piece of text" "yet more text")
475
476 If this were the contents of my kill ring and I pressed `C-y', the
477 string of characters saying `some text' would be inserted in this
478 buffer where my cursor is located.
479
480 The `yank' command is also used for duplicating text by copying it.
481 The copied text is not cut from the buffer, but a copy of it is put on
482 the kill ring and is inserted by yanking it back.
483
484 Three functions are used for bringing text back from the kill ring:
485 `yank', which is usually bound to `C-y'; `yank-pop', which is usually
486 bound to `M-y'; and `rotate-yank-pointer', which is used by the two
487 other functions.
488
489 These functions refer to the kill ring through a variable called the
490 `kill-ring-yank-pointer'. Indeed, the insertion code for both the
491 `yank' and `yank-pop' functions is:
492
493 (insert (car kill-ring-yank-pointer))
494
495 (Well, no more. In GNU Emacs 22, the function has been replaced by
496 `insert-for-yank' which calls `insert-for-yank-1' repeatedly for each
497 `yank-handler' segment. In turn, `insert-for-yank-1' strips text
498 properties from the inserted text according to
499 `yank-excluded-properties'. Otherwise, it is just like `insert'. We
500 will stick with plain `insert' since it is easier to understand.)
501
502 To begin to understand how `yank' and `yank-pop' work, it is first
503 necessary to look at the `kill-ring-yank-pointer' variable and the
504 `rotate-yank-pointer' function.
505
506 
507 File: eintr, Node: kill-ring-yank-pointer, Next: yank nthcdr Exercises, Prev: Kill Ring Overview, Up: Yanking
508
509 10.2 The `kill-ring-yank-pointer' Variable
510 ==========================================
511
512 `kill-ring-yank-pointer' is a variable, just as `kill-ring' is a
513 variable. It points to something by being bound to the value of what
514 it points to, like any other Lisp variable.
515
516 Thus, if the value of the kill ring is:
517
518 ("some text" "a different piece of text" "yet more text")
519
520 and the `kill-ring-yank-pointer' points to the second element, the value
521 of `kill-ring-yank-pointer' is:
522
523 ("a different piece of text" "yet more text")
524
525 As explained in the previous chapter (*note List Implementation::), the
526 computer does not keep two different copies of the text being pointed to
527 by both the `kill-ring' and the `kill-ring-yank-pointer'. The words "a
528 different piece of text" and "yet more text" are not duplicated.
529 Instead, the two Lisp variables point to the same pieces of text. Here
530 is a diagram:
531
532     kill-ring     kill-ring-yank-pointer
533         |               |
534         |      ___ ___  |     ___ ___      ___ ___
535          ---> |   |   |  --> |   |   |    |   |   |
536               |___|___|----> |___|___|--> |___|___|--> nil
537                 |              |            |
538                 |              |            |
539                 |              |             --> "yet more text"
540                 |              |
541                 |               --> "a different piece of text"
542                 |
543                  --> "some text"
544
545
546
547
548 Both the variable `kill-ring' and the variable `kill-ring-yank-pointer'
549 are pointers. But the kill ring itself is usually described as if it
550 were actually what it is composed of. The `kill-ring' is spoken of as
551 if it were the list rather than that it points to the list.
552 Conversely, the `kill-ring-yank-pointer' is spoken of as pointing to a
553 list.
554
555 These two ways of talking about the same thing sound confusing at first
556 but make sense on reflection. The kill ring is generally thought of as
557 the complete structure of data that holds the information of what has
558 recently been cut out of the Emacs buffers. The
559 `kill-ring-yank-pointer', on the other hand, serves to indicate--that
560 is, to `point to'--that part of the kill ring of which the first
561 element (the CAR) will be inserted.
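
To see this sharing without disturbing your real kill ring, you can
imitate the two variables with stand-ins.  In this sketch,
`my-kill-ring' and `my-yank-pointer' are hypothetical names playing the
parts of `kill-ring' and `kill-ring-yank-pointer'.

```elisp
;; Stand-ins for `kill-ring' and `kill-ring-yank-pointer'.
(setq my-kill-ring
      (list "some text" "a different piece of text" "yet more text"))
(setq my-yank-pointer (cdr my-kill-ring))

(car my-yank-pointer)                    ; => "a different piece of text"
(eq my-yank-pointer (cdr my-kill-ring))  ; => t  the same cons cell, not a copy
(eq (car my-yank-pointer)
    (nth 1 my-kill-ring))                ; => t  the very same string
```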
562
563 
564 File: eintr, Node: yank nthcdr Exercises, Prev: kill-ring-yank-pointer, Up: Yanking
565
566 10.3 Exercises with `yank' and `nthcdr'
567 =======================================
568
569 * Using `C-h v' (`describe-variable'), look at the value of your
570 kill ring. Add several items to your kill ring; look at its value
571 again. Using `M-y' (`yank-pop'), move all the way around the kill
572 ring. How many items were in your kill ring? Find the value of
573 `kill-ring-max'. Was your kill ring full, or could you have kept
574 more blocks of text within it?
575
576 * Using `nthcdr' and `car', construct a series of expressions to
577 return the first, second, third, and fourth elements of a list.
578
579 
580 File: eintr, Node: Loops & Recursion, Next: Regexp Search, Prev: Yanking, Up: Top
581
582 11 Loops and Recursion
583 **********************
584
585 Emacs Lisp has two primary ways to cause an expression, or a series of
586 expressions, to be evaluated repeatedly: one uses a `while' loop, and
587 the other uses "recursion".
588
589 Repetition can be very valuable. For example, to move forward four
590 sentences, you need only write a program that will move forward one
591 sentence and then repeat the process four times. Since a computer does
592 not get bored or tired, such repetitive action does not have the
593 deleterious effects that excessive or the wrong kinds of repetition can
594 have on humans.
595
596 People mostly write Emacs Lisp functions using `while' loops and their
597 kin; but you can use recursion, which provides a very powerful way to
598 think about and then to solve problems(1).
599
600 * Menu:
601
602 * while::
603 * dolist dotimes::
604 * Recursion::
605 * Looping exercise::
606
607 ---------- Footnotes ----------
608
609 (1) You can write recursive functions to be frugal or wasteful of
610 mental or computer resources; as it happens, methods that people find
611 easy--that are frugal of `mental resources'--sometimes use considerable
612 computer resources. Emacs was designed to run on machines that we now
613 consider limited and its default settings are conservative. You may
614 want to increase the values of `max-specpdl-size' and
615 `max-lisp-eval-depth'. In my `.emacs' file, I set them to 15 and 30
616 times their default value.
617
618 
619 File: eintr, Node: while, Next: dolist dotimes, Prev: Loops & Recursion, Up: Loops & Recursion
620
621 11.1 `while'
622 ============
623
624 The `while' special form tests whether the value returned by evaluating
625 its first argument is true or false. This is similar to what the Lisp
626 interpreter does with an `if'; what the interpreter does next, however,
627 is different.
628
629 In a `while' expression, if the value returned by evaluating the first
630 argument is false, the Lisp interpreter skips the rest of the
631 expression (the "body" of the expression) and does not evaluate it.
632 However, if the value is true, the Lisp interpreter evaluates the body
633 of the expression and then again tests whether the first argument to
634 `while' is true or false. If the value returned by evaluating the
635 first argument is again true, the Lisp interpreter again evaluates the
636 body of the expression.
637
638 The template for a `while' expression looks like this:
639
640 (while TRUE-OR-FALSE-TEST
641   BODY...)
642
643 * Menu:
644
645 * Looping with while::
646 * Loop Example::
647 * print-elements-of-list::
648 * Incrementing Loop::
649 * Decrementing Loop::
650
651 
652 File: eintr, Node: Looping with while, Next: Loop Example, Prev: while, Up: while
653
654 Looping with `while'
655 --------------------
656
657 So long as the true-or-false-test of the `while' expression returns a
658 true value when it is evaluated, the body is repeatedly evaluated.
659 This process is called a loop since the Lisp interpreter repeats the
660 same thing again and again, like an airplane doing a loop. When the
661 result of evaluating the true-or-false-test is false, the Lisp
662 interpreter does not evaluate the rest of the `while' expression and
663 `exits the loop'.
664
665 Clearly, if the value returned by evaluating the first argument to
666 `while' is always true, the body following will be evaluated again and
667 again ... and again ... forever. Conversely, if the value returned is
668 never true, the expressions in the body will never be evaluated. The
669 craft of writing a `while' loop consists of choosing a mechanism such
670 that the true-or-false-test returns true just the number of times that
671 you want the subsequent expressions to be evaluated, and then have the
672 test return false.
673
674 The value returned by evaluating a `while' is the value of the
675 true-or-false-test. An interesting consequence of this is that a
676 `while' loop that evaluates without error will return `nil' or false
677 regardless of whether it has looped 1 or 100 times or none at all. A
678 `while' expression that evaluates successfully never returns a true
679 value! What this means is that `while' is always evaluated for its
680 side effects, which is to say, the consequences of evaluating the
681 expressions within the body of the `while' loop. This makes sense. It
682 is not the mere act of looping that is desired, but the consequences of
683 what happens when the expressions in the loop are repeatedly evaluated.
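
A short sketch makes the point concrete.  The variables `my-count' and
`my-while-value' are hypothetical names used only here.

```elisp
(setq my-count 0)

;; Save whatever the `while' expression itself returns.
(setq my-while-value
      (while (< my-count 3)
        (setq my-count (1+ my-count))))

my-while-value   ; => nil  the loop's own value
my-count         ; => 3    the side effect is what matters
```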
684
685 
686 File: eintr, Node: Loop Example, Next: print-elements-of-list, Prev: Looping with while, Up: while
687
688 11.1.1 A `while' Loop and a List
689 --------------------------------
690
691 A common way to control a `while' loop is to test whether a list has
692 any elements. If it does, the loop is repeated; but if it does not,
693 the repetition is ended. Since this is an important technique, we will
694 create a short example to illustrate it.
695
696 A simple way to test whether a list has elements is to evaluate the
697 list: if it has no elements, it is an empty list and will return the
698 empty list, `()', which is a synonym for `nil' or false. On the other
699 hand, a list with elements will return those elements when it is
700 evaluated. Since Emacs Lisp considers as true any value that is not
701 `nil', a list that returns elements will test true in a `while' loop.
702
703 For example, you can set the variable `empty-list' to `nil' by
704 evaluating the following `setq' expression:
705
706 (setq empty-list ())
707
708 After evaluating the `setq' expression, you can evaluate the variable
709 `empty-list' in the usual way, by placing the cursor after the symbol
710 and typing `C-x C-e'; `nil' will appear in your echo area:
711
712 empty-list
713
714 On the other hand, if you set a variable to be a list with elements, the
715 list will appear when you evaluate the variable, as you can see by
716 evaluating the following two expressions:
717
718 (setq animals '(gazelle giraffe lion tiger))
719
720 animals
721
722 Thus, to create a `while' loop that tests whether there are any items
723 in the list `animals', the first part of the loop will be written like
724 this:
725
726 (while animals
727 ...
728
729 When the `while' tests its first argument, the variable `animals' is
730 evaluated. It returns a list. So long as the list has elements, the
731 `while' considers the results of the test to be true; but when the list
732 is empty, it considers the results of the test to be false.
733
734 To prevent the `while' loop from running forever, some mechanism needs
735 to be provided to empty the list eventually. An oft-used technique is
736 to have one of the subsequent forms in the `while' expression set the
737 value of the list to be the CDR of the list. Each time the `cdr'
738 function is evaluated, the list will be made shorter, until eventually
739 only the empty list will be left. At this point, the test of the
740 `while' loop will return false, and the arguments to the `while' will
741 no longer be evaluated.
742
743 For example, the list of animals bound to the variable `animals' can be
744 set to be the CDR of the original list with the following expression:
745
746 (setq animals (cdr animals))
747
748 If you have evaluated the previous expressions and then evaluate this
749 expression, you will see `(giraffe lion tiger)' appear in the echo
750 area. If you evaluate the expression again, `(lion tiger)' will appear
751 in the echo area. If you evaluate it again and yet again, `(tiger)'
752 appears and then the empty list, shown by `nil'.
753
754 A template for a `while' loop that uses the `cdr' function repeatedly
755 to cause the true-or-false-test eventually to test false looks like
756 this:
757
758 (while TEST-WHETHER-LIST-IS-EMPTY
759   BODY...
760   SET-LIST-TO-CDR-OF-LIST)
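
As one possible way of filling in this template, here is a sketch of a
function that adds up the numbers in a list.  The name `my-sum-of-list'
is hypothetical; the function is not part of Emacs.

```elisp
(defun my-sum-of-list (numbers)
  "Return the sum of NUMBERS, a list of numbers."
  (let ((total 0))
    (while numbers                           ; test whether list is empty
      (setq total (+ total (car numbers)))   ; body
      (setq numbers (cdr numbers)))          ; set list to CDR of list
    total))

(my-sum-of-list '(1 2 3 4))   ; => 10
(my-sum-of-list '())          ; => 0  the body is never evaluated
```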
761
762 This test and use of `cdr' can be put together in a function that goes
763 through a list and prints each element of the list on a line of its own.
764
765 
766 File: eintr, Node: print-elements-of-list, Next: Incrementing Loop, Prev: Loop Example, Up: while
767
768 11.1.2 An Example: `print-elements-of-list'
769 -------------------------------------------
770
771 The `print-elements-of-list' function illustrates a `while' loop with a
772 list.
773
774 The function requires several lines for its output. If you are reading
775 this in a recent instance of GNU Emacs, you can evaluate the following
776 expression inside of Info, as usual.
777
778 If you are using an earlier version of Emacs, you need to copy the
779 necessary expressions to your `*scratch*' buffer and evaluate them
780 there. This is because the echo area had only one line in the earlier
781 versions.
782
783 You can copy the expressions by marking the beginning of the region
784 with `C-<SPC>' (`set-mark-command'), moving the cursor to the end of
785 the region and then copying the region using `M-w' (`kill-ring-save',
786 which calls `copy-region-as-kill' and then provides visual feedback).
787 In the `*scratch*' buffer, you can yank the expressions back by typing
788 `C-y' (`yank').
789
790 After you have copied the expressions to the `*scratch*' buffer,
791 evaluate each expression in turn. Be sure to evaluate the last
792 expression, `(print-elements-of-list animals)', by typing `C-u C-x
793 C-e', that is, by giving an argument to `eval-last-sexp'. This will
794 cause the result of the evaluation to be printed in the `*scratch*'
795 buffer instead of being printed in the echo area. (Otherwise you will
796 see something like this in your echo area:
797 `^Jgazelle^J^Jgiraffe^J^Jlion^J^Jtiger^Jnil', in which each `^J' stands
798 for a `newline'.)
799
800 In a recent instance of GNU Emacs, you can evaluate these expressions
801 directly in the Info buffer, and the echo area will grow to show the
802 results.
803
804 (setq animals '(gazelle giraffe lion tiger))
805
806 (defun print-elements-of-list (list)
807   "Print each element of LIST on a line of its own."
808   (while list
809     (print (car list))
810     (setq list (cdr list))))
811
812 (print-elements-of-list animals)
813
814 When you evaluate the three expressions in sequence, you will see this:
815
816 gazelle
817
818 giraffe
819
820 lion
821
822 tiger
823 nil
824
825 Each element of the list is printed on a line of its own (that is what
826 the function `print' does) and then the value returned by the function
827 is printed. Since the last expression in the function is the `while'
828 loop, and since `while' loops always return `nil', a `nil' is printed
829 after the last element of the list.
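The point about the returned `nil' can be checked apart from the printing. The following is only a sketch and is not part of the original text; the name `collect-elements-of-list' is hypothetical. It walks the list in the same way, but gathers the elements instead of printing them, and names the gathered list after the loop so that the function returns it rather than the `nil' of the `while'.

```elisp
;; Hypothetical helper, for illustration only.
(defun collect-elements-of-list (list)
  "Return a new list holding each element of LIST, in order."
  (let ((collected nil))
    (while list
      (setq collected (cons (car list) collected))
      (setq list (cdr list)))
    ;; The `while' form above evaluated to nil; return the
    ;; gathered elements instead, restored to their original order.
    (nreverse collected)))

(collect-elements-of-list '(gazelle giraffe lion tiger))
;; => (gazelle giraffe lion tiger)

(while nil)
;; => nil
```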
830
831 
832 File: eintr, Node: Incrementing Loop, Next: Decrementing Loop, Prev: print-elements-of-list, Up: while
833
834 11.1.3 A Loop with an Incrementing Counter
835 ------------------------------------------
836
837 A loop is not useful unless it stops when it ought. Besides
838 controlling a loop with a list, a common way of stopping a loop is to
839 write the first argument as a test that returns false when the correct
840 number of repetitions are complete. This means that the loop must have
841 a counter--an expression that counts how many times the loop repeats
842 itself.
843
844 The test can be an expression such as `(< count desired-number)' which
845 returns `t' for true if the value of `count' is less than the
846 `desired-number' of repetitions and `nil' for false if the value of
847 `count' is equal to or is greater than the `desired-number'. The
848 expression that increments the count can be a simple `setq' such as
849 `(setq count (1+ count))', where `1+' is a built-in function in Emacs
850 Lisp that adds 1 to its argument. (The expression `(1+ count)' has the
851 same result as `(+ count 1)', but is easier for a human to read.)
852
853 The template for a `while' loop controlled by an incrementing counter
854 looks like this:
855
856 SET-COUNT-TO-INITIAL-VALUE
857 (while (< count desired-number)         ; true-or-false-test
858   BODY...
859   (setq count (1+ count)))              ; incrementer
860
861 Note that you need to set the initial value of `count'; usually it is
862 set to 1.
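
As a sketch (not part of the original text), the template can be filled in so that it counts how many times the body runs. The names `count-repetitions', `start' and `repetitions' are hypothetical. The result depends on the initial value: with the test `(< count desired-number)', a counter that starts at 1 repeats the body one time fewer than `desired-number', while a counter that starts at 0 repeats it exactly `desired-number' times.

```elisp
;; Hypothetical helper, for illustration only.
(defun count-repetitions (start desired-number)
  "Return how many times the loop body runs when the count begins at START."
  (let ((count start)
        (repetitions 0))
    (while (< count desired-number)         ; true-or-false-test
      (setq repetitions (1+ repetitions))   ; body
      (setq count (1+ count)))              ; incrementer
    repetitions))

(count-repetitions 1 5)
;; => 4

(count-repetitions 0 5)
;; => 5
```

If you want the counter to start at 1 and still repeat `desired-number' times, the test can use `<=' instead of `<'.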
863
864 * Menu:
865
866 * Incrementing Example::
867 * Inc Example parts::
868 * Inc Example altogether::
869
870 
871 File: eintr, Node: Incrementing Example, Next: Inc Example parts, Prev: Incrementing Loop, Up: Incrementing Loop
872
873 Example with incrementing counter
874 .................................
875
876 Suppose you are playing on the beach and decide to make a triangle of
877 pebbles, putting one pebble in the first row, two in the second row,
878 three in the third row and so on, like this:
879
880
881       *
882      * *
883     * * *
884    * * * *
885
886
887 (About 2500 years ago, Pythagoras and others developed the beginnings of
888 number theory by considering questions such as this.)
889
890 Suppose you want to know how many pebbles you will need to make a
891 triangle with 7 rows.
892
893 Clearly, what you need to do is add up the numbers from 1 to 7. There
894 are two ways to do this; start with the smallest number, one, and add up
895 the list in sequence, 1, 2, 3, 4 and so on; or start with the largest
896 number and add the list going down: 7, 6, 5, 4 and so on. Because both
897 mechanisms illustrate common ways of writing `while' loops, we will
898 create two examples, one counting up and the other counting down. In
899 this first example, we will start with 1 and add 2, 3, 4 and so on.
900
901 If you are just adding up a short list of numbers, the easiest way to do
902 it is to add up all the numbers at once. However, if you do not know
903 ahead of time how many numbers your list will have, or if you want to be
904 prepared for a very long list, then you need to design your addition so
905 that what you do is repeat a simple process many times instead of doing
906 a more complex process once.
907
908 For example, instead of adding up all the pebbles all at once, what you
909 can do is add the number of pebbles in the first row, 1, to the number
910 in the second row, 2, and then add the total of those two rows to the
911 third row, 3. Then you can add the number in the fourth row, 4, to the
912 total of the first three rows; and so on.
913
914 The critical characteristic of the process is that each repetitive
915 action is simple. In this case, at each step we add only two numbers,
916 the number of pebbles in the row and the total already found. This
917 process of adding two numbers is repeated again and again until the last
918 row has been added to the total of all the preceding rows. In a more
919 complex loop the repetitive action might not be so simple, but it will
920 be simpler than doing everything all at once.
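
The step-by-step process can be made visible with a small sketch, which is not part of the original text. The function name `running-totals' and the variable `totals' are hypothetical. At each repetition the loop adds only two numbers, the row and the total so far, and remembers the new total so you can see every intermediate result.

```elisp
;; Hypothetical helper, for illustration only.
(defun running-totals (number-of-rows)
  "Return the partial sums 1, 1+2, 1+2+3, ... up to NUMBER-OF-ROWS rows."
  (let ((total 0)
        (row-number 1)
        (totals nil))
    (while (<= row-number number-of-rows)
      (setq total (+ total row-number))     ; add only two numbers
      (setq totals (cons total totals))     ; remember this step
      (setq row-number (1+ row-number)))
    (nreverse totals)))

(running-totals 7)
;; => (1 3 6 10 15 21 28)
```

The last number in the list is the total for the whole triangle.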
921
922 
923 File: eintr, Node: Inc Example parts, Next: Inc Example altogether, Prev: Incrementing Example, Up: Incrementing Loop
924
925 The parts of the function definition
926 ....................................
927
928 The preceding analysis gives us the bones of our function definition:
929 first, we will need a variable that we can call `total' that will be
930 the total number of pebbles. This will be the value returned by the
931 function.
932
933 Second, we know that the function will require an argument: this
934 argument will be the total number of rows in the triangle. It can be
935 called `number-of-rows'.
936
937 Finally, we need a variable to use as a counter. We could call this
938 variable `counter', but a better name is `row-number'. That is because
939 what the counter does in this function is count rows, and a program
940 should be written to be as understandable as possible.
941
942 When the Lisp interpreter first starts evaluating the expressions in the
943 function, the value of `total' should be set to zero, since we have not
944 added anything to it. Then the function should add the number of
945 pebbles in the first row to the total, and then add the number of
946 pebbles in the second to the total, and then add the number of pebbles
947 in the third row to the total, and so on, until there are no more rows
948 left to add.
949
950 Both `total' and `row-number' are used only inside the function, so
951 they can be declared as local variables with `let' and given initial
952 values. Clearly, the initial value for `total' should be 0. The
953 initial value of `row-number' should be 1, since we start with the
954 first row. This means that the `let' statement will look like this:
955
956 (let ((total 0)
957       (row-number 1))
958   BODY...)
959
960 After the internal variables are declared and bound to their initial
961 values, we can begin the `while' loop. The expression that serves as
962 the test should return a value of `t' for true so long as the
963 `row-number' is less than or equal to the `number-of-rows'. (If the
964 expression tests true only so long as the row number is less than the
965 number of rows in the triangle, the last row will never be added to the
966 total; hence the row number has to be either less than or equal to the
967 number of rows.)
968
969 Lisp provides the `<=' function that returns true if the value of its
970 first argument is less than or equal to the value of its second
971 argument and false otherwise. So the expression that the `while' will
972 evaluate as its test should look like this:
973
974 (<= row-number number-of-rows)
975
976 The total number of pebbles can be found by repeatedly adding the number
977 of pebbles in a row to the total already found. Since the number of
978 pebbles in the row is equal to the row number, the total can be found by
979 adding the row number to the total. (Clearly, in a more complex
980 situation, the number of pebbles in the row might be related to the row
981 number in a more complicated way; if this were the case, the row number
982 would be replaced by the appropriate expression.)
983
984 (setq total (+ total row-number))
985
986 What this does is set the new value of `total' to the sum of the
987 number of pebbles in the row and the previous total.
988
989 After setting the value of `total', the conditions need to be
990 established for the next repetition of the loop, if there is one. This
991 is done by incrementing the value of the `row-number' variable, which
992 serves as a counter. After the `row-number' variable has been
993 incremented, the true-or-false-test at the beginning of the `while'
994 loop tests whether its value is still less than or equal to the value
995 of the `number-of-rows' and if it is, adds the new value of the
996 `row-number' variable to the `total' of the previous repetition of the
997 loop.
998
999 The built-in Emacs Lisp function `1+' adds 1 to a number, so the
1000 `row-number' variable can be incremented with this expression:
1001
1002 (setq row-number (1+ row-number))
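
The remark above, that the number of pebbles in a row might depend on the row number in a more complicated way, can be sketched as follows. This example is not part of the original text, and the name `sum-of-odd-rows' is hypothetical. It assumes a pattern in which row N holds 2N - 1 pebbles (1, 3, 5, 7 and so on); the only change to the loop is that the row number in the addition is replaced by an expression computed from it.

```elisp
;; Hypothetical variation, for illustration only.
(defun sum-of-odd-rows (number-of-rows)
  "Add up rows in which row N holds 2N - 1 pebbles."
  (let ((total 0)
        (row-number 1))
    (while (<= row-number number-of-rows)
      ;; The row number is replaced by an expression computed from it.
      (setq total (+ total (1- (* 2 row-number))))
      (setq row-number (1+ row-number)))
    total))

(sum-of-odd-rows 4)
;; => 16
```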
1003
1004 
1005 File: eintr, Node: Inc Example altogether, Prev: Inc Example parts, Up: Incrementing Loop
1006
1007 Putting the function definition together
1008 ........................................
1009
1010 We have created the parts for the function definition; now we need to
1011 put them together.
1012
1013 First, the contents of the `while' expression:
1014
1015 (while (<= row-number number-of-rows)   ; true-or-false-test
1016   (setq total (+ total row-number))
1017   (setq row-number (1+ row-number)))    ; incrementer
1018
1019 Along with the `let' expression varlist, this very nearly completes the
1020 body of the function definition. However, it requires one final
1021 element, the need for which is somewhat subtle.
1022
1023 The final touch is to place the variable `total' on a line by itself
1024 after the `while' expression. Otherwise, the value returned by the
1025 whole function is the value of the last expression that is evaluated in
1026 the body of the `let', and this is the value returned by the `while',
1027 which is always `nil'.
1028
1029 This may not be evident at first sight. It almost looks as if the
1030 incrementing expression is the last expression of the whole function.
1031 But that expression is part of the body of the `while'; it is the last
1032 element of the list that starts with the symbol `while'. Moreover, the
1033 whole of the `while' loop is a list within the body of the `let'.
1034
1035 In outline, the function will look like this:
1036
1037 (defun NAME-OF-FUNCTION (ARGUMENT-LIST)
1038   "DOCUMENTATION..."
1039   (let (VARLIST)
1040     (while (TRUE-OR-FALSE-TEST)
1041       BODY-OF-WHILE... )
1042     ... ))                    ; Need final expression here.
1043
1044 The result of evaluating the `let' is what is going to be returned by
1045 the `defun' since the `let' is not embedded within any containing list,
1046 except for the `defun' as a whole. However, if the `while' is the last
1047 element of the `let' expression, the function will always return `nil'.
1048 This is not what we want! Instead, what we want is the value of the
1049 variable `total'. This is returned by simply placing the symbol as the
1050 last element of the list starting with `let'. It gets evaluated after
1051 the preceding elements of the list are evaluated, which means it gets
1052 evaluated after it has been assigned the correct value for the total.
1053
1054 It may be easier to see this by printing the list starting with `let'
1055 all on one line. This format makes it evident that the VARLIST and
1056 `while' expressions are the second and third elements of the list
1057 starting with `let', and the `total' is the last element:
1058
1059 (let (VARLIST) (while (TRUE-OR-FALSE-TEST) BODY-OF-WHILE... ) total)
1060
1061 Putting everything together, the `triangle' function definition looks
1062 like this:
1063
1064 (defun triangle (number-of-rows)    ; Version with
1065                                     ;   incrementing counter.
1066   "Add up the number of pebbles in a triangle.
1067 The first row has one pebble, the second row two pebbles,
1068 the third row three pebbles, and so on.
1069 The argument is NUMBER-OF-ROWS."
1070   (let ((total 0)
1071         (row-number 1))
1072     (while (<= row-number number-of-rows)
1073       (setq total (+ total row-number))
1074       (setq row-number (1+ row-number)))
1075     total))
1076
1077 After you have installed `triangle' by evaluating the function, you can
1078 try it out. Here are two examples:
1079
1080 (triangle 4)
1081
1082 (triangle 7)
1083
1084 The sum of the first four numbers is 10 and the sum of the first seven
1085 numbers is 28.
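
One way to check these results is against the well-known closed formula for the sum of the numbers from 1 to N, which is N(N + 1)/2. The sketch below is not part of the original text, and the name `triangle-closed-form' is hypothetical. It uses no loop at all, so it makes an independent check on the looping version.

```elisp
;; Hypothetical helper, for illustration only.
(defun triangle-closed-form (number-of-rows)
  "Return NUMBER-OF-ROWS times (NUMBER-OF-ROWS + 1), divided by 2."
  ;; One of two consecutive integers is even, so the division is exact.
  (/ (* number-of-rows (1+ number-of-rows)) 2))

(triangle-closed-form 4)
;; => 10

(triangle-closed-form 7)
;; => 28
```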
1086
1087 
1088 File: eintr, Node: Decrementing Loop, Prev: Incrementing Loop, Up: while
1089
1090 11.1.4 Loop with a Decrementing Counter
1091 ---------------------------------------
1092
1093 Another common way to write a `while' loop is to write the test so that
1094 it determines whether a counter is greater than zero. So long as the
1095 counter is greater than zero, the loop is repeated. But when the
1096 counter is equal to or less than zero, the loop is stopped. For this
1097 to work, the counter has to start out greater than zero and then be
1098 made smaller and smaller by a form that is evaluated repeatedly.
1099
1100 The test will be an expression such as `(> counter 0)' which returns
1101 `t' for true if the value of `counter' is greater than zero, and `nil'
1102 for false if the value of `counter' is equal to or less than zero. The
1103 expression that makes the number smaller and smaller can be a simple
1104 `setq' such as `(setq counter (1- counter))', where `1-' is a built-in
1105 function in Emacs Lisp that subtracts 1 from its argument.
1106
1107 The template for a decrementing `while' loop looks like this:
1108
1109 (while (> counter 0)                    ; true-or-false-test
1110   BODY...
1111   (setq counter (1- counter)))          ; decrementer
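
As a sketch (not part of the original text), the template can be filled in so that it records each value the counter takes while the loop runs. The names `values-counted-down' and `seen' are hypothetical. The example also shows why the counter has to start out greater than zero: otherwise the test is false the first time and the body never runs.

```elisp
;; Hypothetical helper, for illustration only.
(defun values-counted-down (counter)
  "Return, in order, the values COUNTER takes while the loop repeats."
  (let ((seen nil))
    (while (> counter 0)                    ; true-or-false-test
      (setq seen (cons counter seen))       ; body
      (setq counter (1- counter)))          ; decrementer
    (nreverse seen)))

(values-counted-down 3)
;; => (3 2 1)

(values-counted-down 0)
;; => nil
```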
1112
1113 * Menu:
1114
1115 * Decrementing Example::
1116 * Dec Example parts::
1117 * Dec Example altogether::
1118
1119 
1120 File: eintr, Node: Decrementing Example, Next: Dec Example parts, Prev: Decrementing Loop, Up: Decrementing Loop
1121
1122 Example with decrementing counter
1123 .................................
1124
1125 To illustrate a loop with a decrementing counter, we will rewrite the
1126 `triangle' function so the counter decreases to zero.
1127
1128 This is the reverse of the earlier version of the function. In this
1129 case, to find out how many pebbles are needed to make a triangle with 3
1130 rows, add the number of pebbles in the third row, 3, to the number in
1131 the preceding row, 2, and then add the total of those two rows to the
1132 row that precedes them, which is 1.
1133
1134 Likewise, to find the number of pebbles in a triangle with 7 rows, add
1135 the number of pebbles in the seventh row, 7, to the number in the
1136 preceding row, which is 6, and then add the total of those two rows to
1137 the row that precedes them, which is 5, and so on. As in the previous
1138 example, each addition only involves adding two numbers, the total of
1139 the rows already added up and the number of pebbles in the row that is
1140 being added to the total. This process of adding two numbers is
1141 repeated again and again until there are no more pebbles to add.
1142
1143 We know how many pebbles to start with: the number of pebbles in the
1144 last row is equal to the number of rows. If the triangle has seven
1145 rows, the number of pebbles in the last row is 7. Likewise, we know how
1146 many pebbles are in the preceding row: it is one less than the number in
1147 the row.
1148
1149 
1150 File: eintr, Node: Dec Example parts, Next: Dec Example altogether, Prev: Decrementing Example, Up: Decrementing Loop
1151
1152 The parts of the function definition
1153 ....................................
1154
1155 We start with three variables: the total number of rows in the
1156 triangle; the number of pebbles in a row; and the total number of
1157 pebbles, which is what we want to calculate. These variables can be
1158 named `number-of-rows', `number-of-pebbles-in-row', and `total',
1159 respectively.
1160
1161 Both `total' and `number-of-pebbles-in-row' are used only inside the
1162 function and are declared with `let'. The initial value of `total'
1163 should, of course, be zero. However, the initial value of
1164 `number-of-pebbles-in-row' should be equal to the number of rows in the
1165 triangle, since the addition will start with the longest row.
1166
1167 This means that the beginning of the `let' expression will look like
1168 this:
1169
1170 (let ((total 0)
1171       (number-of-pebbles-in-row number-of-rows))
1172   BODY...)
1173
1174 The total number of pebbles can be found by repeatedly adding the number
1175 of pebbles in a row to the total already found, that is, by repeatedly
1176 evaluating the following expression:
1177
1178 (setq total (+ total number-of-pebbles-in-row))
1179
1180 After the `number-of-pebbles-in-row' is added to the `total', the
1181 `number-of-pebbles-in-row' should be decremented by one, since the next
1182 time the loop repeats, the preceding row will be added to the total.
1183
1184 The number of pebbles in a preceding row is one less than the number of
1185 pebbles in a row, so the built-in Emacs Lisp function `1-' can be used
1186 to compute the number of pebbles in the preceding row. This can be
1187 done with the following expression:
1188
1189 (setq number-of-pebbles-in-row
1190       (1- number-of-pebbles-in-row))
1191
1192 Finally, we know that the `while' loop should stop making repeated
1193 additions when there are no pebbles in a row. So the test for the
1194 `while' loop is simply:
1195
1196 (while (> number-of-pebbles-in-row 0)
1197
1198 
1199 File: eintr, Node: Dec Example altogether, Prev: Dec Example parts, Up: Decrementing Loop
1200
1201 Putting the function definition together
1202 ........................................
1203
1204 We can put these expressions together to create a function definition
1205 that works. However, on examination, we find that one of the local
1206 variables is unneeded!
1207
1208 The function definition looks like this:
1209
1210 ;;; First subtractive version.
1211 (defun triangle (number-of-rows)
1212   "Add up the number of pebbles in a triangle."
1213   (let ((total 0)
1214         (number-of-pebbles-in-row number-of-rows))
1215     (while (> number-of-pebbles-in-row 0)
1216       (setq total (+ total number-of-pebbles-in-row))
1217       (setq number-of-pebbles-in-row
1218             (1- number-of-pebbles-in-row)))
1219     total))
1220
1221 As written, this function works.
1222
1223 However, we do not need `number-of-pebbles-in-row'.
1224
1225 When the `triangle' function is evaluated, the symbol `number-of-rows'
1226 will be bound to a number, giving it an initial value. That number can
1227 be changed in the body of the function as if it were a local variable,
1228 without any fear that such a change will affect the value of the
1229 variable outside of the function. This is a very useful characteristic
1230 of Lisp; it means that the variable `number-of-rows' can be used
1231 anywhere in the function where `number-of-pebbles-in-row' is used.
1232
1233 Here is a second version of the function written a bit more cleanly:
1234
1235 (defun triangle (number)                ; Second version.
1236   "Return sum of numbers 1 through NUMBER inclusive."
1237   (let ((total 0))
1238     (while (> number 0)
1239       (setq total (+ total number))
1240       (setq number (1- number)))
1241     total))
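
The claim that changing the argument inside the function does not change anything outside it can be tested directly. The sketch below is not part of the original text. It repeats the second version under the hypothetical name `triangle-second-version', so as not to replace any `triangle' you have installed, and then calls it with a variable, `rows', that is also hypothetical. Inside the function the argument is counted down to zero, yet afterwards `rows' still has its original value.

```elisp
;; A copy of the second version under a hypothetical name.
(defun triangle-second-version (number)
  "Return sum of numbers 1 through NUMBER inclusive."
  (let ((total 0))
    (while (> number 0)
      (setq total (+ total number))
      (setq number (1- number)))   ; changes only the function's own NUMBER
    total))

(let ((rows 4))
  ;; Return both the computed total and the caller's variable.
  (list (triangle-second-version rows) rows))
;; => (10 4)
```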
1242
1243 In brief, a properly written `while' loop will consist of three parts:
1244
1245 1. A test that will return false after the loop has repeated itself
1246 the correct number of times.
1247
1248 2. An expression the evaluation of which will return the value desired
1249 after being repeatedly evaluated.
1250
1251 3. An expression to change the value passed to the true-or-false-test
1252 so that the test returns false after the loop has repeated itself
1253 the right number of times.
1254
1255 
1256 File: eintr, Node: dolist dotimes, Next: Recursion, Prev: while, Up: Loops & Recursion
1257
1258 11.2 Save your time: `dolist' and `dotimes'
1259 ===========================================
1260
1261 In addition to `while', both `dolist' and `dotimes' provide for
1262 looping. Sometimes these are quicker to write than the equivalent
1263 `while' loop. Both are Lisp macros. (*Note Macros: (elisp)Macros. )
1264
1265 `dolist' works like a `while' loop that `CDRs down a list': `dolist'
1266 automatically shortens the list each time it loops--takes the CDR of
1267 the list--and binds the CAR of each shorter version of the list to the
1268 first of its arguments.
1269
1270 `dotimes' loops a specific number of times: you specify the number.
1271
1272 * Menu:
1273
1274 * dolist::
1275 * dotimes::
1276
1277 
1278 File: eintr, Node: dolist, Next: dotimes, Prev: dolist dotimes, Up: dolist dotimes
1279
1280 The `dolist' Macro
1281 ..................
1282
1283 Suppose, for example, you want to reverse a list, so that "first"
1284 "second" "third" becomes "third" "second" "first".
1285
1286 In practice, you would use the `reverse' function, like this:
1287
1288 (setq animals '(gazelle giraffe lion tiger))
1289
1290 (reverse animals)
1291
1292 Here is how you could reverse the list using a `while' loop:
1293
1294 (setq animals '(gazelle giraffe lion tiger))
1295
1296 (defun reverse-list-with-while (list)
1297   "Using while, reverse the order of LIST."
1298   (let (value)  ; make sure list starts empty
1299     (while list
1300       (setq value (cons (car list) value))
1301       (setq list (cdr list)))
1302     value))
1303
1304 (reverse-list-with-while animals)
1305
1306 And here is how you could use the `dolist' macro:
1307
1308 (setq animals '(gazelle giraffe lion tiger))
1309
1310 (defun reverse-list-with-dolist (list)
1311   "Using dolist, reverse the order of LIST."
1312   (let (value)  ; make sure list starts empty
1313     (dolist (element list value)
1314       (setq value (cons element value)))))
1315
1316 (reverse-list-with-dolist animals)
1317
1318 In Info, you can place your cursor after the closing parenthesis of
1319 each expression and type `C-x C-e'; in each case, you should see
1320
1321 (tiger lion giraffe gazelle)
1322
1323 in the echo area.
1324
1325 For this example, the existing `reverse' function is obviously best.
1326 The `while' loop is just like our first example (*note A `while' Loop
1327 and a List: Loop Example.). The `while' first checks whether the list
1328 has elements; if so, it constructs a new list by adding the first
1329 element of the list to the existing list (which in the first iteration
1330 of the loop is `nil'). Since the second element is prepended in front
1331 of the first element, and the third element is prepended in front of
1332 the second element, the list is reversed.
1333
1334 In the expression using a `while' loop, the `(setq list (cdr list))'
1335 expression shortens the list, so the `while' loop eventually stops. In
1336 addition, it provides the `cons' expression with a new first element by
1337 creating a new and shorter list at each repetition of the loop.
1338
1339 The `dolist' expression does very much the same as the `while'
1340 expression, except that the `dolist' macro does some of the work you
1341 have to do when writing a `while' expression.
1342
1343 Like a `while' loop, a `dolist' loops. What is different is that it
1344 automatically shortens the list each time it loops -- it `CDRs down the
1345 list' on its own -- and it automatically binds the CAR of each shorter
1346 version of the list to the first of its arguments.
1347
1348 In the example, the CAR of each shorter version of the list is referred
1349 to using the symbol `element', the list itself is called `list', and
1350 the value returned is called `value'. The remainder of the `dolist'
1351 expression is the body.
1352
1353 The `dolist' expression binds the CAR of each shorter version of the
1354 list to `element' and then evaluates the body of the expression; and
1355 repeats the loop. The result is returned in `value'.
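
The same three pieces, the element symbol, the list, and the returned value, can be seen in a second sketch, which is not part of the original text. The names `sum-list-with-dolist', `numbers' and `sum' are hypothetical. Here the body adds each element to a running sum instead of consing it onto a list, and the sum is named as the value for `dolist' to return.

```elisp
;; Hypothetical helper, for illustration only.
(defun sum-list-with-dolist (numbers)
  "Using dolist, return the sum of the elements of NUMBERS."
  (let ((sum 0))                       ; start the sum at zero
    (dolist (element numbers sum)      ; ELEMENT is each CAR in turn
      (setq sum (+ sum element)))))    ; body

(sum-list-with-dolist '(1 2 3 4))
;; => 10
```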
1356
1357 
1358 File: eintr, Node: dotimes, Prev: dolist, Up: dolist dotimes
1359
1360 The `dotimes' Macro
1361 ...................
1362
1363 The `dotimes' macro is similar to `dolist', except that it loops a
1364 specific number of times.
1365
1366 The first argument to `dotimes' is assigned the numbers 0, 1, 2 and so
1367 forth each time around the loop, and the value of the third argument is
1368 returned. You need to provide the value of the second argument, which
1369 is how many times the macro loops.
1370
1371 For example, the following binds the numbers from 0 up to, but not
1372 including, the number 3 to the first argument, NUMBER, and then
1373 constructs a list of the three numbers. (The first number is 0, the
1374 second number is 1, and the third number is 2; this makes a total of
1375 three numbers in all, starting with zero as the first number.)
1376
1377 (let (value)      ; otherwise a value is a void variable
1378   (dotimes (number 3 value)
1379     (setq value (cons number value))))
1380
1381 => (2 1 0)
1382
1383 `dotimes' returns `value', so the way to use `dotimes' is to operate on
1384 some expression NUMBER number of times and then return the result,
1385 either as a list or an atom.
1386
1387 Here is an example of a `defun' that uses `dotimes' to add up the
1388 number of pebbles in a triangle.
1389
1390 (defun triangle-using-dotimes (number-of-rows)
1391   "Using dotimes, add up the number of pebbles in a triangle."
1392   (let ((total 0))  ; otherwise a total is a void variable
1393     (dotimes (number number-of-rows total)
1394       (setq total (+ total (1+ number))))))
1395
1396 (triangle-using-dotimes 4)
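
The reason `triangle-using-dotimes' adds `(1+ number)' rather than `number' is that `dotimes' counts from zero, while the rows are counted from one. The sketch below is not part of the original text, and the names `numbers-seen-by-dotimes' and `seen' are hypothetical. It lists, in order, the values that `dotimes' binds to its first argument. For four rows those values are 0 through 3, so adding 1 to each gives the row numbers 1 through 4, and their sum is 10.

```elisp
;; Hypothetical helper, for illustration only.
(defun numbers-seen-by-dotimes (count)
  "Return, in order, the values `dotimes' binds when it loops COUNT times."
  (let (seen)
    (dotimes (number count)
      (setq seen (cons number seen)))
    (nreverse seen)))

(numbers-seen-by-dotimes 4)
;; => (0 1 2 3)
```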
1397
1398 
1399 File: eintr, Node: Recursion, Next: Looping exercise, Prev: dolist dotimes, Up: Loops & Recursion
1400
1401 11.3 Recursion
1402 ==============
1403
1404 A recursive function contains code that tells the Lisp interpreter to
1405 call a program that runs exactly like itself, but with slightly
1406 different arguments. The code runs exactly the same because it has the
1407 same name. However, even though the program has the same name, it is
1408 not the same entity. It is different. In the jargon, it is a
1409 different `instance'.
1410
1411 Eventually, if the program is written correctly, the `slightly
1412 different arguments' will become sufficiently different from the first
1413 arguments that the final instance will stop.
1414
1415 * Menu:
1416
1417 * Building Robots::
1418 * Recursive Definition Parts::
1419 * Recursion with list::
1420 * Recursive triangle function::
1421 * Recursion with cond::
1422 * Recursive Patterns::
1423 * No Deferment::
1424 * No deferment solution::
1425
1426 
1427 File: eintr, Node: Building Robots, Next: Recursive Definition Parts, Prev: Recursion, Up: Recursion
1428
1429 11.3.1 Building Robots: Extending the Metaphor
1430 ----------------------------------------------
1431
1432 It is sometimes helpful to think of a running program as a robot that
1433 does a job. In doing its job, a recursive function calls on a second
1434 robot to help it. The second robot is identical to the first in every
1435 way, except that the second robot helps the first and has been passed
1436 different arguments than the first.
1437
1438 In a recursive function, the second robot may call a third; and the
1439 third may call a fourth, and so on. Each of these is a different
1440 entity; but all are clones.
1441
1442 Since each robot has slightly different instructions--the arguments
1443 will differ from one robot to the next--the last robot should know when
1444 to stop.
1445
1446 Let's expand on the metaphor in which a computer program is a robot.
1447
1448 A function definition provides the blueprints for a robot. When you
1449 install a function definition, that is, when you evaluate a `defun'
1450 special form, you install the necessary equipment to build robots. It
1451 is as if you were in a factory, setting up an assembly line. Robots
1452 with the same name are built according to the same blueprints. So they
1453 have, as it were, the same `model number', but a different `serial
1454 number'.
1455
1456 We often say that a recursive function `calls itself'. What we mean is
1457 that the instructions in a recursive function cause the Lisp
1458 interpreter to run a different function that has the same name and does
1459 the same job as the first, but with different arguments.
1460
1461 It is important that the arguments differ from one instance to the
1462 next; otherwise, the process will never stop.
1463
1464 
1465 File: eintr, Node: Recursive Definition Parts, Next: Recursion with list, Prev: Building Robots, Up: Recursion
1466
1467 11.3.2 The Parts of a Recursive Definition
1468 ------------------------------------------
1469
1470 A recursive function typically contains a conditional expression which
1471 has three parts:
1472
1473 1. A true-or-false-test that determines whether the function is called
1474 again, here called the "do-again-test".
1475
1476 2. The name of the function. When this name is called, a new
1477 instance of the function--a new robot, as it were--is created and
1478 told what to do.
1479
1480 3. An expression that returns a different value each time the
1481 function is called, here called the "next-step-expression".
1482 Consequently, the argument (or arguments) passed to the new
1483 instance of the function will be different from that passed to the
1484 previous instance. This causes the conditional expression, the
1485 "do-again-test", to test false after the correct number of
1486 repetitions.
1487
1488 Recursive functions can be much simpler than any other kind of
1489 function. Indeed, when people first start to use them, they often look
1490 so mysteriously simple as to be incomprehensible. Like riding a
1491 bicycle, reading a recursive function definition takes a certain knack
1492 which is hard at first but then seems simple.
1493
1494 There are several different common recursive patterns. A very simple
1495 pattern looks like this:
1496
1497 (defun NAME-OF-RECURSIVE-FUNCTION (ARGUMENT-LIST)
1498   "DOCUMENTATION..."
1499   (if DO-AGAIN-TEST
1500     BODY...
1501     (NAME-OF-RECURSIVE-FUNCTION
1502          NEXT-STEP-EXPRESSION)))
1503
1504 Each time a recursive function is evaluated, a new instance of it is
1505 created and told what to do. The arguments tell the instance what to
1506 do.
1507
1508 An argument is bound to the value of the next-step-expression. Each
1509 instance runs with a different value of the next-step-expression.
1510
1511 The value in the next-step-expression is used in the do-again-test.
1512
1513 The value returned by the next-step-expression is passed to the new
1514 instance of the function, which evaluates it (or some
1515 transmogrification of it) to determine whether to continue or stop.
1516 The next-step-expression is designed so that the do-again-test returns
1517 false when the function should no longer be repeated.
1518
1519 The do-again-test is sometimes called the "stop condition", since it
1520 stops the repetitions when it tests false.
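
The three parts can be picked out in a small function. The sketch below is not part of the original text, and the name `length-recursively' is hypothetical. It is a slight variation on the template, since the value returned by the new instance is used by `1+' instead of simply following the body. The do-again-test is the list itself, the next-step-expression is its CDR, and the value 0 is returned when the test is false and the repetitions stop.

```elisp
;; Hypothetical example, for illustration only.
(defun length-recursively (list)
  "Return the number of elements in LIST.  Uses recursion."
  (if list                              ; do-again-test
      (1+ (length-recursively           ; recursive call
           (cdr list)))                 ; next-step-expression
    0))                                 ; test is false: stop

(length-recursively '(gazelle giraffe lion tiger))
;; => 4
```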
1521
1522 
1523 File: eintr, Node: Recursion with list, Next: Recursive triangle function, Prev: Recursive Definition Parts, Up: Recursion
1524
1525 11.3.3 Recursion with a List
1526 ----------------------------
1527
1528 The example of a `while' loop that printed the elements of a list of
1529 animals can be written recursively. Here is the code, including an
1530 expression to set the value of the variable `animals' to a list.
1531
1532 If you are using GNU Emacs 20 or before, this example must be copied to
1533 the `*scratch*' buffer and each expression must be evaluated there.
1534 Use `C-u C-x C-e' to evaluate the `(print-elements-recursively
1535 animals)' expression so that the results are printed in the buffer;
1536 otherwise the Lisp interpreter will try to squeeze the results into the
1537 one line of the echo area.
1538
1539 Also, place your cursor immediately after the last closing parenthesis
1540 of the `print-elements-recursively' function, before the comment.
1541 Otherwise, the Lisp interpreter will try to evaluate the comment.
1542
1543 If you are using a more recent version, you can evaluate this
1544 expression directly in Info.
1545
1546 (setq animals '(gazelle giraffe lion tiger))
1547
1548 (defun print-elements-recursively (list)
1549   "Print each element of LIST on a line of its own.
1550 Uses recursion."
1551   (if list                              ; do-again-test
1552       (progn
1553         (print (car list))              ; body
1554         (print-elements-recursively     ; recursive call
1555          (cdr list)))))                 ; next-step-expression
1556
1557 (print-elements-recursively animals)
1558
1559 The `print-elements-recursively' function first tests whether there is
1560 any content in the list; if there is, the function prints the first
1561 element of the list, the CAR of the list. Then the function `invokes
1562 itself', giving itself as its argument not the whole list but the
1563 second and subsequent elements of the list, the CDR of the list.
1564
1565 Put another way, if the list is not empty, the function invokes another
1566 instance of code that is similar to the initial code, but is a
1567 different thread of execution, with different arguments than the first
1568 instance.
1569
1570 Put in yet another way, if the list is not empty, the first robot
1571 assembles a second robot and tells it what to do; the second robot is
1572 a different individual from the first, but is the same model.
1573
1574 When the second evaluation occurs, the `if' expression is evaluated
1575 and, if it tests true, the function prints the first element of the
1576 list it receives as its argument (which is the second element of the
1577 original list). Then the function `calls itself' with the CDR of the
1578 list it is invoked with, which (the second time around) is the CDR of
1579 the CDR of the original list.
1580
1581 Note that although we say that the function `calls itself', what we
1582 mean is that the Lisp interpreter assembles and instructs a new
1583 instance of the program. The new instance is a clone of the first, but
1584 is a separate individual.
1585
1586 Each time the function `invokes itself', it invokes itself on a shorter
1587 version of the original list. It creates a new instance that works on
1588 a shorter list.
1589
1590 Eventually, the function invokes itself on an empty list. It creates a
1591 new instance whose argument is `nil'. The conditional expression tests
1592 the value of `list'. Since the value of `list' is `nil', the `if'
1593 expression tests false so the then-part is not evaluated. The function
1594 as a whole then returns `nil'.
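
One way to watch the list shrink is to collect the argument that each
instance receives, instead of printing its first element. What follows
is only a sketch; the name `shrinking-lists' is made up for this
illustration and is not part of Emacs.

```elisp
(defun shrinking-lists (elements)
  "Return a list of the arguments seen by each recursive instance.
Hypothetical helper, for illustration only."
  (if elements                                ; do-again-test
      (cons elements                          ; record this instance's argument
            (shrinking-lists (cdr elements))) ; next-step-expression
    (list nil)))                              ; the last instance receives nil

(shrinking-lists '(gazelle giraffe lion tiger))
;; => ((gazelle giraffe lion tiger) (giraffe lion tiger) (lion tiger) (tiger) nil)
```

Each element of the result is the argument of one instance; the final
`nil' is the empty list that makes the do-again-test fail.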
1595
1596 When you evaluate `(print-elements-recursively animals)' in the
1597 `*scratch*' buffer, you see this result:
1598
1599 gazelle
1600
1601 giraffe
1602
1603 lion
1604
1605 tiger
1606 nil
1607
1608 
1609 File: eintr, Node: Recursive triangle function, Next: Recursion with cond, Prev: Recursion with list, Up: Recursion
1610
1611 11.3.4 Recursion in Place of a Counter
1612 --------------------------------------
1613
1614 The `triangle' function described in a previous section can also be
1615 written recursively. It looks like this:
1616
1617 (defun triangle-recursively (number)
1618 "Return the sum of the numbers 1 through NUMBER inclusive.
1619 Uses recursion."
1620 (if (= number 1) ; do-again-test
1621 1 ; then-part
1622 (+ number ; else-part
1623 (triangle-recursively ; recursive call
1624 (1- number))))) ; next-step-expression
1625
1626 (triangle-recursively 7)
1627
1628 You can install this function by evaluating it and then try it by
1629 evaluating `(triangle-recursively 7)'. (Remember to put your cursor
1630 immediately after the last parenthesis of the function definition,
1631 before the comment.) The function evaluates to 28.
1632
1633 To understand how this function works, let's consider what happens in
1634 the various cases when the function is passed 1, 2, 3, or 4 as the
1635 value of its argument.
1636
1637 * Menu:
1638
1639 * Recursive Example arg of 1 or 2::
1640 * Recursive Example arg of 3 or 4::
1641
1642 
1643 File: eintr, Node: Recursive Example arg of 1 or 2, Next: Recursive Example arg of 3 or 4, Prev: Recursive triangle function, Up: Recursive triangle function
1644
1645 An argument of 1 or 2
1646 .....................
1647
1648 First, what happens if the value of the argument is 1?
1649
1650 The function has an `if' expression after the documentation string. It
1651 tests whether the value of `number' is equal to 1; if so, Emacs
1652 evaluates the then-part of the `if' expression, which returns the
1653 number 1 as the value of the function. (A triangle with one row has
1654 one pebble in it.)
1655
1656 Suppose, however, that the value of the argument is 2. In this case,
1657 Emacs evaluates the else-part of the `if' expression.
1658
1659 The else-part consists of an addition, the recursive call to
1660 `triangle-recursively' and a decrementing action; and it looks like
1661 this:
1662
1663 (+ number (triangle-recursively (1- number)))
1664
1665 When Emacs evaluates this expression, the innermost expression is
1666 evaluated first; then the other parts in sequence. Here are the steps
1667 in detail:
1668
1669 Step 1 Evaluate the innermost expression.
1670 The innermost expression is `(1- number)' so Emacs decrements the
1671 value of `number' from 2 to 1.
1672
1673 Step 2 Evaluate the `triangle-recursively' function.
1674 The Lisp interpreter creates an individual instance of
1675 `triangle-recursively'. It does not matter that this function is
1676 contained within itself. Emacs passes the result of Step 1 as the
1677 argument used by this instance of the `triangle-recursively'
1678 function.
1679
1680 In this case, Emacs evaluates `triangle-recursively' with an
1681 argument of 1. This means that this evaluation of
1682 `triangle-recursively' returns 1.
1683
1684 Step 3 Evaluate the value of `number'.
1685 The variable `number' is the second element of the list that
1686 starts with `+'; its value is 2.
1687
1688 Step 4 Evaluate the `+' expression.
1689 The `+' expression receives two arguments, the first from the
1690 evaluation of `number' (Step 3) and the second from the evaluation
1691 of `triangle-recursively' (Step 2).
1692
1693 The result of the addition is the sum of 2 plus 1, and the number
1694 3 is returned, which is correct. A triangle with two rows has
1695 three pebbles in it.
1696
1697 
1698 File: eintr, Node: Recursive Example arg of 3 or 4, Prev: Recursive Example arg of 1 or 2, Up: Recursive triangle function
1699
1700 An argument of 3 or 4
1701 .....................
1702
1703 Suppose that `triangle-recursively' is called with an argument of 3.
1704
1705 Step 1 Evaluate the do-again-test.
1706 The `if' expression is evaluated first. This is the do-again-test
1707 and returns false, so the else-part of the `if' expression is
1708 evaluated. (Note that in this example, the do-again-test causes
1709 the function to call itself when it tests false, not when it tests
1710 true.)
1711
1712 Step 2 Evaluate the innermost expression of the else-part.
1713 The innermost expression of the else-part is evaluated, which
1714 decrements 3 to 2. This is the next-step-expression.
1715
1716 Step 3 Evaluate the `triangle-recursively' function.
1717 The number 2 is passed to the `triangle-recursively' function.
1718
1719 We know what happens when Emacs evaluates `triangle-recursively'
1720 with an argument of 2. After going through the sequence of
1721 actions described earlier, it returns a value of 3. So that is
1722 what will happen here.
1723
1724 Step 4 Evaluate the addition.
1725 3 will be passed as an argument to the addition and will be added
1726 to the number with which the function was called, which is 3.
1727
1728 The value returned by the function as a whole will be 6.
1729
1730 Now that we know what will happen when `triangle-recursively' is called
1731 with an argument of 3, it is evident what will happen if it is called
1732 with an argument of 4:
1733
1734 In the recursive call, the evaluation of
1735
1736 (triangle-recursively (1- 4))
1737
1738 will return the value of evaluating
1739
1740 (triangle-recursively 3)
1741
1742 which is 6, and this value will be added to 4 by the addition in the
1743 else-part.
1744
1745 The value returned by the function as a whole will be 10.
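
The four cases discussed so far can be checked in one go. This is a
minimal sketch that repeats the definition so it can be evaluated on
its own; `mapcar' simply calls the function on each number in turn.

```elisp
(defun triangle-recursively (number)
  "Return the sum of the numbers 1 through NUMBER inclusive.
Uses recursion."
  (if (= number 1)                    ; do-again-test
      1                               ; then-part
    (+ number                         ; else-part
       (triangle-recursively          ; recursive call
        (1- number)))))               ; next-step-expression

(mapcar #'triangle-recursively '(1 2 3 4))
;; => (1 3 6 10)
```

Do not try an argument of 0 or less with this definition: `(= number
1)' never tests true for such a value, so the function keeps invoking
itself until Emacs gives up and signals an error about excessive
nesting.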
1746
1747 Each time `triangle-recursively' is evaluated, it evaluates a version
1748 of itself--a different instance of itself--with a smaller argument,
1749 until the argument is small enough so that it does not evaluate itself.
1750
1751 Note that this particular design for a recursive function requires that
1752 operations be deferred.
1753
1754 Before `(triangle-recursively 7)' can calculate its answer, it must
1755 call `(triangle-recursively 6)'; and before `(triangle-recursively 6)'
1756 can calculate its answer, it must call `(triangle-recursively 5)'; and
1757 so on. That is to say, the calculation that `(triangle-recursively 7)'
1758 makes must be deferred until `(triangle-recursively 6)' makes its
1759 calculation; and `(triangle-recursively 6)' must defer until
1760 `(triangle-recursively 5)' completes; and so on.
1761
1762 If each of these instances of `triangle-recursively' is thought of as
1763 different robots, the first robot must wait for the second to complete
1764 its job, which must wait until the third completes, and so on.
1765
1766 There is a way around this kind of waiting, which we will discuss in
1767 *Note Recursion without Deferments: No Deferment.
1768
1769 
1770 File: eintr, Node: Recursion with cond, Next: Recursive Patterns, Prev: Recursive triangle function, Up: Recursion
1771
1772 11.3.5 Recursion Example Using `cond'
1773 -------------------------------------
1774
1775 The version of `triangle-recursively' described earlier is written with
1776 the `if' special form. It can also be written using another special
1777 form called `cond'. The name of the special form `cond' is an
1778 abbreviation of the word `conditional'.
1779
1780 Although the `cond' special form is not used as often in the Emacs Lisp
1781 sources as `if', it is used often enough to justify explaining it.
1782
1783 The template for a `cond' expression looks like this:
1784
1785 (cond
1786 BODY...)
1787
1788 where the BODY is a series of lists.
1789
1790 Written out more fully, the template looks like this:
1791
1792 (cond
1793 (FIRST-TRUE-OR-FALSE-TEST FIRST-CONSEQUENT)
1794 (SECOND-TRUE-OR-FALSE-TEST SECOND-CONSEQUENT)
1795 (THIRD-TRUE-OR-FALSE-TEST THIRD-CONSEQUENT)
1796 ...)
1797
1798 When the Lisp interpreter evaluates the `cond' expression, it evaluates
1799 the first element (the CAR or true-or-false-test) of the first
1800 expression in a series of expressions within the body of the `cond'.
1801
1802 If the true-or-false-test returns `nil' the rest of that expression,
1803 the consequent, is skipped and the true-or-false-test of the next
1804 expression is evaluated. When an expression is found whose
1805 true-or-false-test returns a value that is not `nil', the consequent of
1806 that expression is evaluated. The consequent can be one or more
1807 expressions. If the consequent consists of more than one expression,
1808 the expressions are evaluated in sequence and the value of the last one
1809 is returned. If the expression does not have a consequent, the value
1810 of the true-or-false-test is returned.
1811
1812 If none of the true-or-false-tests test true, the `cond' expression
1813 returns `nil'.
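
The rules above can be tried out with a few small expressions. These
are illustrative sketches only; the symbols `first', `second',
`ignored' and `third' are arbitrary names chosen for the example.

```elisp
;; The first two tests return nil, so their consequents are skipped.
;; The third clause has two consequent expressions; the value of the
;; last one is returned.
(cond ((> 1 2) 'first)
      ((= 1 2) 'second)
      ((< 1 2) 'ignored 'third))
;; => third

;; A clause with no consequent returns the value of its test.
(cond ((member 'lion '(gazelle lion tiger))))
;; => (lion tiger)

;; No test is true, so the whole expression returns nil.
(cond ((> 1 2) 'first)
      ((= 1 2) 'second))
;; => nil
```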
1814
1815 Written using `cond', the `triangle' function looks like this:
1816
1817 (defun triangle-using-cond (number)
1818 (cond ((<= number 0) 0)
1819 ((= number 1) 1)
1820 ((> number 1)
1821 (+ number (triangle-using-cond (1- number))))))
1822
1823 In this example, the `cond' returns 0 if the number is less than or
1824 equal to 0, it returns 1 if the number is 1 and it evaluates `(+ number
1825 (triangle-using-cond (1- number)))' if the number is greater than 1.
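
Here is a short sketch that exercises each clause. The definition is
repeated so the example can be evaluated by itself; the argument values
are arbitrary choices for illustration.

```elisp
(defun triangle-using-cond (number)
  (cond ((<= number 0) 0)
        ((= number 1) 1)
        ((> number 1)
         (+ number (triangle-using-cond (1- number))))))

(mapcar #'triangle-using-cond '(-3 0 1 4 7))
;; => (0 0 1 10 28)

;; None of the three tests is true for 0.5, so `cond' returns nil.
(triangle-using-cond 0.5)
;; => nil
```

Unlike the `if' version, this definition stops at once when it is given
zero or a negative number, because the first clause catches that case.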
1826
1827 
1828 File: eintr, Node: Recursive Patterns, Next: No Deferment, Prev: Recursion with cond, Up: Recursion
1829
1830 11.3.6 Recursive Patterns
1831 -------------------------
1832
1833 Here are three common recursive patterns. Each involves a list.
1834 Recursion does not need to involve lists, but Lisp is designed for lists
1835 and this provides a sense of its primal capabilities.
1836
1837 * Menu:
1838
1839 * Every::
1840 * Accumulate::
1841 * Keep::
1842
1843 
1844 File: eintr, Node: Every, Next: Accumulate, Prev: Recursive Patterns, Up: Recursive Patterns
1845
1846 Recursive Pattern: _every_
1847 ..........................
1848
1849 In the `every' recursive pattern, an action is performed on every
1850 element of a list.
1851
1852 The basic pattern is:
1853
1854 * If a list be empty, return `nil'.
1855
1856 * Else, act on the beginning of the list (the CAR of the list)
1857 - through a recursive call by the function on the rest (the
1858 CDR) of the list,
1859
1860 - and, optionally, combine the acted-on element, using
1861 `cons', with the results of acting on the rest.
1862
1863 Here is an example:
1864
1865 (defun square-each (numbers-list)
1866 "Square each number in NUMBERS-LIST, recursively."
1867 (if (not numbers-list) ; do-again-test
1868 nil
1869 (cons
1870 (* (car numbers-list) (car numbers-list))
1871 (square-each (cdr numbers-list))))) ; next-step-expression
1872
1873 (square-each '(1 2 3))
1874 => (1 4 9)
1875
1876 If `numbers-list' is empty, do nothing. But if it has content,
1877 construct a list combining the square of the first number in the list
1878 with the result of the recursive call.
1879
1880 (The example follows the pattern exactly: `nil' is returned if the
1881 numbers' list is empty. In practice, you would write the conditional
1882 so it carries out the action when the numbers' list is not empty.)
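
For comparison, here is a sketch of the same function with the
conditional turned around so that it acts when the list is not empty.
The name `square-each-nonempty' is invented for this illustration. An
`if' without an else-part returns `nil' when its test fails, which is
just what the pattern calls for.

```elisp
(defun square-each-nonempty (numbers-list)
  "Square each number in NUMBERS-LIST, recursively.
Hypothetical variant that tests for a non-empty list."
  (if numbers-list                                ; do-again-test
      (cons
       (* (car numbers-list) (car numbers-list))
       (square-each-nonempty (cdr numbers-list))))) ; next-step-expression

(square-each-nonempty '(1 2 3))
;; => (1 4 9)
```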
1883
1884 The `print-elements-recursively' function (*note Recursion with a List:
1885 Recursion with list.) is another example of an `every' pattern, except
1886 in this case, rather than bring the results together using `cons', we
1887 print each element of output.
1888
1889 The `print-elements-recursively' function looks like this:
1890
1891 (setq animals '(gazelle giraffe lion tiger))
1892
1893 (defun print-elements-recursively (list)
1894 "Print each element of LIST on a line of its own.
1895 Uses recursion."
1896 (if list ; do-again-test
1897 (progn
1898 (print (car list)) ; body
1899 (print-elements-recursively ; recursive call
1900 (cdr list))))) ; next-step-expression
1901
1902 (print-elements-recursively animals)
1903
1904 The pattern for `print-elements-recursively' is:
1905
1906 * If the list be empty, do nothing.
1907
1908 * But if the list has at least one element,
1909 - act on the beginning of the list (the CAR of the list),
1910
1911 - and make a recursive call on the rest (the CDR) of the
1912 list.
1913
1914 
1915 File: eintr, Node: Accumulate, Next: Keep, Prev: Every, Up: Recursive Patterns
1916
1917 Recursive Pattern: _accumulate_
1918 ...............................
1919
1920 Another recursive pattern is called the `accumulate' pattern. In the
1921 `accumulate' recursive pattern, an action is performed on every element
1922 of a list and the result of that action is accumulated with the results
1923 of performing the action on the other elements.
1924
1925 This is very like the `every' pattern using `cons', except that `cons'
1926 is not used, but some other combiner.
1927
1928 The pattern is:
1929
1930 * If a list be empty, return zero or some other constant.
1931
1932 * Else, act on the beginning of the list (the CAR of the list),
1933 - and combine that acted-on element, using `+' or some
1934 other combining function, with
1935
1936 - a recursive call by the function on the rest (the CDR) of
1937 the list.
1938
1939 Here is an example:
1940
1941 (defun add-elements (numbers-list)
1942 "Add the elements of NUMBERS-LIST together."
1943 (if (not numbers-list)
1944 0
1945 (+ (car numbers-list) (add-elements (cdr numbers-list)))))
1946
1947 (add-elements '(1 2 3 4))
1948 => 10
1949
1950 *Note Making a List of Files: Files List, for an example of the
1951 accumulate pattern.
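
The constant and the combining function go together. As a sketch,
assuming we want a product rather than a sum, the combiner becomes `*'
and the constant returned for the empty list becomes 1. The name
`multiply-elements' is made up for this example.

```elisp
(defun multiply-elements (numbers-list)
  "Multiply the elements of NUMBERS-LIST together.
Hypothetical example of the accumulate pattern."
  (if (not numbers-list)
      1                               ; constant for the empty list
    (* (car numbers-list)             ; combiner
       (multiply-elements (cdr numbers-list)))))

(multiply-elements '(1 2 3 4))
;; => 24
```

Returning 0 for the empty list here would make every product 0, which
is why the pattern says `zero or some other constant'.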
1952
1953 
1954 File: eintr, Node: Keep, Prev: Accumulate, Up: Recursive Patterns
1955
1956 Recursive Pattern: _keep_
1957 .........................
1958
1959 A third recursive pattern is called the `keep' pattern. In the `keep'
1960 recursive pattern, each element of a list is tested; the element is
1961 acted on and the results are kept only if the element meets a criterion.
1962
1963 Again, this is very like the `every' pattern, except the element is
1964 skipped unless it meets a criterion.
1965
1966 The pattern has three parts:
1967
1968 * If a list be empty, return `nil'.
1969
1970 * Else, if the beginning of the list (the CAR of the list) passes
1971 a test
1972 - act on that element and combine it, using `cons', with
1973
1974 - a recursive call by the function on the rest (the CDR) of
1975 the list.
1976
1977 * Otherwise, if the beginning of the list (the CAR of the list) fails
1978 the test
1979 - skip that element,
1980
1981 - and, recursively call the function on the rest (the CDR)
1982 of the list.
1983
1984 Here is an example that uses `cond':
1985
1986 (defun keep-three-letter-words (word-list)
1987 "Keep three letter words in WORD-LIST."
1988 (cond
1989 ;; First do-again-test: stop-condition
1990 ((not word-list) nil)
1991
1992 ;; Second do-again-test: when to act
1993 ((eq 3 (length (symbol-name (car word-list))))
1994 ;; combine acted-on element with recursive call on shorter list
1995 (cons (car word-list) (keep-three-letter-words (cdr word-list))))
1996
1997 ;; Third do-again-test: when to skip element;
1998 ;; recursively call shorter list with next-step expression
1999 (t (keep-three-letter-words (cdr word-list)))))
2000
2001 (keep-three-letter-words '(one two three four five six))
2002 => (one two six)
2003
2004 It goes without saying that you need not use `nil' as the test for when
2005 to stop; and you can, of course, combine these patterns.
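
As one possible combination, the following sketch mixes the `keep' and
`accumulate' patterns: it tests each word as before, but counts the
words that pass instead of collecting them. The function name
`count-three-letter-words' is invented for this illustration.

```elisp
(defun count-three-letter-words (word-list)
  "Return the number of three letter words in WORD-LIST.
Hypothetical example combining the keep and accumulate patterns."
  (cond
   ;; Stop-condition: an empty list contributes zero.
   ((not word-list) 0)

   ;; The word passes the test: add one to the count for the rest.
   ((= 3 (length (symbol-name (car word-list))))
    (1+ (count-three-letter-words (cdr word-list))))

   ;; The word fails the test: skip it.
   (t (count-three-letter-words (cdr word-list)))))

(count-three-letter-words '(one two three four five six))
;; => 3
```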
2006
2007 
2008 File: eintr, Node: No Deferment, Next: No deferment solution, Prev: Recursive Patterns, Up: Recursion
2009
2010 11.3.7 Recursion without Deferments
2011 -----------------------------------
2012
2013 Let's consider again what happens with the `triangle-recursively'
2014 function. We will find that the intermediate calculations are deferred
2015 until all can be done.
2016
2017 Here is the function definition:
2018
2019 (defun triangle-recursively (number)
2020 "Return the sum of the numbers 1 through NUMBER inclusive.
2021 Uses recursion."
2022 (if (= number 1) ; do-again-test
2023 1 ; then-part
2024 (+ number ; else-part
2025 (triangle-recursively ; recursive call
2026 (1- number))))) ; next-step-expression
2027
2028 What happens when we call this function with an argument of 7?
2029
2030 The first instance of the `triangle-recursively' function adds the
2031 number 7 to the value returned by a second instance of
2032 `triangle-recursively', an instance that has been passed an argument of
2033 6. That is to say, the first calculation is:
2034
2035 (+ 7 (triangle-recursively 6))
2036
2037 The first instance of `triangle-recursively'--you may want to think of
2038 it as a little robot--cannot complete its job. It must hand off the
2039 calculation for `(triangle-recursively 6)' to a second instance of the
2040 program, to a second robot. This second individual is completely
2041 different from the first one; it is, in the jargon, a `different
2042 instantiation'. Or, put another way, it is a different robot. It is
2043 the same model as the first; it calculates triangle numbers
2044 recursively; but it has a different serial number.
2045
2046 And what does `(triangle-recursively 6)' return? It returns the number
2047 6 added to the value returned by evaluating `triangle-recursively' with
2048 an argument of 5. Using the robot metaphor, it asks yet another robot
2049 to help it.
2050
2051 Now the total is:
2052
2053 (+ 7 6 (triangle-recursively 5))
2054
2055 And what happens next?
2056
2057 (+ 7 6 5 (triangle-recursively 4))
2058
2059 Each time `triangle-recursively' is called, except for the last time,
2060 it creates another instance of the program--another robot--and asks it
2061 to make a calculation.
2062
2063 Eventually, the full addition is set up and performed:
2064
2065 (+ 7 6 5 4 3 2 1)
2066
2067 This design for the function defers the calculation of the first step
2068 until the second can be done, and defers that until the third can be
2069 done, and so on. Each deferment means the computer must remember what
2070 is being waited on. This is not a problem when there are only a few
2071 steps, as in this example. But it can be a problem when there are more
2072 steps.
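
The flat additions shown above are a convenient shorthand. Strictly
speaking, each instance holds its own pending addition, so the
expressions nest. Here is a sketch of what is being waited on for an
argument of 4, which is small enough to write out in full:

```elisp
;; What each instance of `triangle-recursively' is waiting on:
;;   (triangle-recursively 4)
;;   (+ 4 (triangle-recursively 3))
;;   (+ 4 (+ 3 (triangle-recursively 2)))
;;   (+ 4 (+ 3 (+ 2 (triangle-recursively 1))))
;; Only when the innermost instance returns 1 can the additions be
;; carried out, from the inside outwards:
(+ 4 (+ 3 (+ 2 1)))
;; => 10
```

The depth of the nesting grows with the argument; that growth is what
the computer has to remember.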
2073
2074 
2075 File: eintr, Node: No deferment solution, Prev: No Deferment, Up: Recursion
2076
2077 11.3.8 No Deferment Solution
2078 ----------------------------
2079
2080 The solution to the problem of deferred operations is to write in a
2081 manner that does not defer operations(1). This requires writing to a
2082 different pattern, often one that involves writing two function
2083 definitions, an `initialization' function and a `helper' function.
2084
2085 The `initialization' function sets up the job; the `helper' function
2086 does the work.
2087
2088 Here are the two function definitions for adding up numbers. They are
2089 so simple, I find them hard to understand.
2090
2091 (defun triangle-initialization (number)
2092 "Return the sum of the numbers 1 through NUMBER inclusive.
2093 This is the `initialization' component of a two function
2094 duo that uses recursion."
2095 (triangle-recursive-helper 0 0 number))
2096
2097 (defun triangle-recursive-helper (sum counter number)
2098 "Return SUM, using COUNTER, through NUMBER inclusive.
2099 This is the `helper' component of a two function duo
2100 that uses recursion."
2101 (if (> counter number)
2102 sum
2103 (triangle-recursive-helper (+ sum counter) ; sum
2104 (1+ counter) ; counter
2105 number))) ; number
2106
2107 Install both function definitions by evaluating them, then call
2108 `triangle-initialization' with 2 rows:
2109
2110 (triangle-initialization 2)
2111 => 3
2112
2113 The `initialization' function calls the first instance of the `helper'
2114 function with three arguments: zero, zero, and a number which is the
2115 number of rows in the triangle.
2116
2117 The first two arguments passed to the `helper' function are
2118 initialization values. These values are changed when
2119 `triangle-recursive-helper' invokes new instances.(2)
2120
2121 Let's see what happens when we have a triangle that has one row. (This
2122 triangle will have one pebble in it!)
2123
2124 `triangle-initialization' will call its helper with the arguments
2125 `0 0 1'. That function will run the conditional test, `(> counter
2126 number)':
2127
2128 (> 0 1)
2129
2130 and find that the result is false, so it will invoke the else-part of
2131 the `if' clause:
2132
2133 (triangle-recursive-helper
2134 (+ sum counter) ; sum plus counter => sum
2135 (1+ counter) ; increment counter => counter
2136 number) ; number stays the same
2137
2138 which will first compute:
2139
2140 (triangle-recursive-helper (+ 0 0) ; sum
2141 (1+ 0) ; counter
2142 1) ; number
2143 which is:
2144
2145 (triangle-recursive-helper 0 1 1)
2146
2147 Again, `(> counter number)' will be false, so again, the Lisp
2148 interpreter will evaluate `triangle-recursive-helper', creating a new
2149 instance with new arguments.
2150
2151 This new instance will be:
2152
2153 (triangle-recursive-helper
2154 (+ sum counter) ; sum plus counter => sum
2155 (1+ counter) ; increment counter => counter
2156 number) ; number stays the same
2157
2158 which is:
2159
2160 (triangle-recursive-helper 1 2 1)
2161
2162 In this case, the `(> counter number)' test will be true! So the
2163 instance will return the value of the sum, which will be 1, as expected.
2164
2165 Now, let's pass `triangle-initialization' an argument of 2, to find out
2166 how many pebbles there are in a triangle with two rows.
2167
2168 That function calls `(triangle-recursive-helper 0 0 2)'.
2169
2170 In stages, the instances called will be:
2171
2172 sum counter number
2173 (triangle-recursive-helper 0 1 2)
2174
2175 (triangle-recursive-helper 1 2 2)
2176
2177 (triangle-recursive-helper 3 3 2)
2178
2179 When the last instance is called, the `(> counter number)' test will be
2180 true, so the instance will return the value of `sum', which will be 3.
2181
2182 This kind of pattern helps when you are writing functions that can use
2183 many resources in a computer.
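
It may help to see that the three arguments of the `helper' function
play the same role as the variables of a `while' loop. The following
is only a sketch for comparison; the name `triangle-with-loop' is made
up and the function is not part of the two function duo.

```elisp
(defun triangle-with-loop (number)
  "Return the sum of the numbers 1 through NUMBER inclusive.
Hypothetical loop that mirrors `triangle-recursive-helper'."
  (let ((sum 0)                       ; first initialization value
        (counter 0))                  ; second initialization value
    (while (<= counter number)        ; opposite of (> counter number)
      (setq sum (+ sum counter))      ; sum plus counter => sum
      (setq counter (1+ counter)))    ; increment counter => counter
    sum))

(triangle-with-loop 2)
;; => 3

(triangle-with-loop 7)
;; => 28
```

At every step, everything that needs to be remembered is held in `sum',
`counter', and `number'; nothing is left waiting.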
2184
2185 ---------- Footnotes ----------
2186
2187 (1) The phrase "tail recursive" is used to describe such a process, one
2188 that uses `constant space'.
2189
2190 (2) The jargon is mildly confusing: `triangle-recursive-helper' uses a
2191 process that is iterative in a procedure that is recursive. The
2192 process is called iterative because the computer need only record the
2193 three values, `sum', `counter', and `number'; the procedure is
2194 recursive because the function `calls itself'. On the other hand, both
2195 the process and the procedure used by `triangle-recursively' are called
2196 recursive. The word `recursive' has different meanings in the two
2197 contexts.
2198
2199 
2200 File: eintr, Node: Looping exercise, Prev: Recursion, Up: Loops & Recursion
2201
2202 11.4 Looping Exercise
2203 =====================
2204
2205 * Write a function similar to `triangle' in which each row has a
2206 value which is the square of the row number. Use a `while' loop.
2207
2208 * Write a function similar to `triangle' that multiplies instead of
2209 adds the values.
2210
2211 * Rewrite these two functions recursively. Rewrite these functions
2212 using `cond'.
2213
2214 * Write a function for Texinfo mode that creates an index entry at
2215 the beginning of a paragraph for every `@dfn' within the paragraph.
2216 (In a Texinfo file, `@dfn' marks a definition. This book is
2217 written in Texinfo.)
2218
2219 Many of the functions you will need are described in two of the
2220 previous chapters, *Note Cutting and Storing Text: Cutting &
2221 Storing Text, and *Note Yanking Text Back: Yanking. If you use
2222 `forward-paragraph' to put the index entry at the beginning of the
2223 paragraph, you will have to use `C-h f' (`describe-function') to
2224 find out how to make the command go backwards.
2225
2226 For more information, see *Note Indicating Definitions:
2227 (texinfo)Indicating.
2228
2229 
2230 File: eintr, Node: Regexp Search, Next: Counting Words, Prev: Loops & Recursion, Up: Top
2231
2232 12 Regular Expression Searches
2233 ******************************
2234
2235 Regular expression searches are used extensively in GNU Emacs. The two
2236 functions, `forward-sentence' and `forward-paragraph', illustrate these
2237 searches well. They use regular expressions to find where to move
2238 point. The phrase `regular expression' is often written as `regexp'.
2239
2240 Regular expression searches are described in *Note Regular Expression
2241 Search: (emacs)Regexp Search, as well as in *Note Regular Expressions:
2242 (elisp)Regular Expressions. In writing this chapter, I am presuming
2243 that you have at least a mild acquaintance with them. The major point
2244 to remember is that regular expressions permit you to search for
2245 patterns as well as for literal strings of characters. For example,
2246 the code in `forward-sentence' searches for the pattern of possible
2247 characters that could mark the end of a sentence, and moves point to
2248 that spot.
2249
2250 Before looking at the code for the `forward-sentence' function, it is
2251 worth considering what the pattern that marks the end of a sentence
2252 must be. The pattern is discussed in the next section; following that
2253 is a description of the regular expression search function,
2254 `re-search-forward'. The `forward-sentence' function is described in
2255 the section following. Finally, the `forward-paragraph' function is
2256 described in the last section of this chapter. `forward-paragraph' is
2257 a complex function that introduces several new features.
2258
2259 * Menu:
2260
2261 * sentence-end::
2262 * re-search-forward::
2263 * forward-sentence::
2264 * forward-paragraph::
2265 * etags::
2266 * Regexp Review::
2267 * re-search Exercises::
2268
2269 
2270 File: eintr, Node: sentence-end, Next: re-search-forward, Prev: Regexp Search, Up: Regexp Search
2271
2272 12.1 The Regular Expression for `sentence-end'
2273 ==============================================
2274
2275 The symbol `sentence-end' is bound to the pattern that marks the end of
2276 a sentence. What should this regular expression be?
2277
2278 Clearly, a sentence may be ended by a period, a question mark, or an
2279 exclamation mark. Indeed, only clauses that end with one of those three
2280 characters should be considered the end of a sentence. This means that
2281 the pattern should include the character set:
2282
2283 [.?!]
2284
2285 However, we do not want `forward-sentence' merely to jump to a period,
2286 a question mark, or an exclamation mark, because such a character might
2287 be used in the middle of a sentence. A period, for example, is used
2288 after abbreviations. So other information is needed.
2289
2290 According to convention, you type two spaces after every sentence, but
2291 only one space after a period, a question mark, or an exclamation mark
2292 in the body of a sentence. So a period, a question mark, or an
2293 exclamation mark followed by two spaces is a good indicator of an end
2294 of sentence. However, in a file, the two spaces may instead be a tab
2295 or the end of a line. This means that the regular expression should
2296 include these three items as alternatives.
2297
2298 This group of alternatives will look like this:
2299
2300 \\($\\| \\|  \\)
2301        ^   ^^
2302       TAB  SPC
2303
2304 Here, `$' indicates the end of the line, and I have pointed out where
2305 the tab and two spaces are inserted in the expression. Both are
2306 inserted by putting the actual characters into the expression.
2307
2308 Two backslashes, `\\', are required before the parentheses and vertical
2309 bars: the first backslash quotes the following backslash in Emacs; and
2310 the second indicates that the following character, the parenthesis or
2311 the vertical bar, is special.
2312
2313 Also, a sentence may be followed by one or more carriage returns, like
2314 this:
2315
2316 [
2317 ]*
2318
2319 Like tabs and spaces, a carriage return is inserted into a regular
2320 expression by inserting it literally. The asterisk indicates that the
2321 <RET> is repeated zero or more times.
2322
2323 But a sentence end does not consist only of a period, a question mark or
2324 an exclamation mark followed by appropriate space: a closing quotation
2325 mark or a closing brace of some kind may precede the space. Indeed more
2326 than one such mark or brace may precede the space. These require an
2327 expression that looks like this:
2328
2329 []\"')}]*
2330
2331 In this expression, the first `]' is the first character in the
2332 expression; the second character is `"', which is preceded by a `\' to
2333 tell Emacs the `"' is _not_ special. The last three characters are
2334 `'', `)', and `}'.
2335
2336 All this suggests what the regular expression pattern for matching the
2337 end of a sentence should be; and, indeed, if we evaluate `sentence-end'
2338 we find that it returns the following value:
2339
2340 sentence-end
2341 => "[.?!][]\"')}]*\\($\\| \\|  \\)[
2342 ]*"
2343
2344 (Well, not in GNU Emacs 22; that is because of an effort to make the
2345 process simpler. When the variable's value is `nil', Emacs uses the
2346 value returned by the function `sentence-end', which is constructed
2347 from the variables `sentence-end-base', `sentence-end-double-space',
2348 `sentence-end-without-period', and `sentence-end-without-space'. The
2349 critical variable is `sentence-end-base'; its global value is similar
2350 to the one described above but it also contains two additional
2351 quotation marks. These have differing degrees of curliness. The
2352 `sentence-end-without-period' variable, when true, tells Emacs that a
2353 sentence may end without a period, such as text in Thai.)
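
To see the pattern at work without touching the real `sentence-end',
here is a sketch that binds the same regular expression to a local
variable, `my-sentence-end' (a name invented for this example), and
tries it on two strings with `string-match'. The tab and the newline
are written as `\t' and `\n' so they are visible in the code.

```elisp
(let ((my-sentence-end "[.?!][]\"')}]*\\($\\|\t\\|  \\)[\n]*"))
  (list
   ;; The question mark and closing quote are followed by two spaces.
   (string-match my-sentence-end "Is it so?\"  Yes.")
   ;; The period after "Mr" is followed by one space, so it is passed
   ;; over; the match is the final period at the end of the string.
   (string-match my-sentence-end "Mr. Smith left.")))
;; => (8 14)
```

`string-match' returns the index at which the match starts, counting
from zero. In the second string the abbreviation is skipped, which is
the point of requiring two spaces.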
2354
2355 
2356 File: eintr, Node: re-search-forward, Next: forward-sentence, Prev: sentence-end, Up: Regexp Search
2357
2358 12.2 The `re-search-forward' Function
2359 =====================================
2360
2361 The `re-search-forward' function is very like the `search-forward'
2362 function. (*Note The `search-forward' Function: search-forward.)
2363
2364 `re-search-forward' searches for a regular expression. If the search
2365 is successful, it leaves point immediately after the last character in
2366 the target. If the search is backwards, it leaves point just before
2367 the first character in the target. You may tell `re-search-forward' to
2368 return `t' for true. (Moving point is therefore a `side effect'.)
2369
2370 Like `search-forward', the `re-search-forward' function takes four
2371 arguments:
2372
2373 1. The first argument is the regular expression that the function
2374 searches for. The regular expression will be a string between
2375 quotation marks.
2376
2377 2. The optional second argument limits how far the function will
2378 search; it is a bound, which is specified as a position in the
2379 buffer.
2380
2381 3. The optional third argument specifies how the function responds to
2382 failure: `nil' as the third argument causes the function to signal
2383 an error (and print a message) when the search fails; any other
2384 value causes it to return `nil' if the search fails and `t' if the
2385 search succeeds.
2386
2387 4. The optional fourth argument is the repeat count. A negative
2388 repeat count causes `re-search-forward' to search backwards.
2389
2390 The template for `re-search-forward' looks like this:
2391
2392 (re-search-forward "REGULAR-EXPRESSION"
2393 LIMIT-OF-SEARCH
2394 WHAT-TO-DO-IF-SEARCH-FAILS
2395 REPEAT-COUNT)
2396
2397 The second, third, and fourth arguments are optional. However, if you
2398 want to pass a value to either or both of the last two arguments, you
2399 must also pass a value to all the preceding arguments. Otherwise, the
2400 Lisp interpreter will mistake which argument you are passing the value
2401 to.
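
The four arguments can be tried out in a temporary buffer. The sketch below is illustrative only: the function name `demo-search-end', the buffer text and the regular expression are all made up.

```elisp
;; Hypothetical demonstration of the bound and repeat-count arguments.
(defun demo-search-end (bound count)
  "Search for the COUNTth word starting with `b', no further than BOUND.
Return the position of point after the search, or nil if it fails."
  (with-temp-buffer
    (insert "alpha beta gamma beta")
    (goto-char (point-min))
    ;; The third argument, t, makes a failed search return nil
    ;; instead of signalling an error.
    (when (re-search-forward "b[a-z]+" bound t count)
      (point))))

(demo-search-end nil 1)   ; => 11, just after the first `beta'
(demo-search-end nil 2)   ; => 22, just after the second `beta'
(demo-search-end 11 2)    ; => nil, the bound excludes the second `beta'
```

Note that `nil' has to be passed for the bound in the first two calls; otherwise `t' would be taken as the bound.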
2402
2403 In the `forward-sentence' function, the regular expression will be the
2404 value of the variable `sentence-end'. In simple form, that is:
2405
2406 "[.?!][]\"')}]*\\($\\| \\| \\)[
2407 ]*"
2408
2409 The limit of the search will be the end of the paragraph (since a
2410 sentence cannot go beyond a paragraph). If the search fails, the
2411 function will return `nil'; and the repeat count will be provided by
2412 the argument to the `forward-sentence' function.
2413
2414 
2415 File: eintr, Node: forward-sentence, Next: forward-paragraph, Prev: re-search-forward, Up: Regexp Search
2416
2417 12.3 `forward-sentence'
2418 =======================
2419
2420 The command to move the cursor forward a sentence is a straightforward
2421 illustration of how to use regular expression searches in Emacs Lisp.
2422 Indeed, the function looks longer and more complicated than it is; this
2423 is because the function is designed to go backwards as well as forwards;
2424 and, optionally, over more than one sentence. The function is usually
2425 bound to the key command `M-e'.
2426
2427 * Menu:
2428
2429 * Complete forward-sentence::
2430 * fwd-sentence while loops::
2431 * fwd-sentence re-search::
2432
2433 
2434 File: eintr, Node: Complete forward-sentence, Next: fwd-sentence while loops, Prev: forward-sentence, Up: forward-sentence
2435
2436 Complete `forward-sentence' function definition
2437 -----------------------------------------------
2438
2439 Here is the code for `forward-sentence':
2440
2441 (defun forward-sentence (&optional arg)
2442   "Move forward to next `sentence-end'. With argument, repeat.
2443 With negative argument, move backward repeatedly to `sentence-beginning'.
2444
2445 The variable `sentence-end' is a regular expression that matches ends of
2446 sentences. Also, every paragraph boundary terminates sentences as well."
2447   (interactive "p")
2448   (or arg (setq arg 1))
2449   (let ((opoint (point))
2450         (sentence-end (sentence-end)))
2451     (while (< arg 0)
2452       (let ((pos (point))
2453             (par-beg (save-excursion (start-of-paragraph-text) (point))))
2454         (if (and (re-search-backward sentence-end par-beg t)
2455                  (or (< (match-end 0) pos)
2456                      (re-search-backward sentence-end par-beg t)))
2457             (goto-char (match-end 0))
2458           (goto-char par-beg)))
2459       (setq arg (1+ arg)))
2460     (while (> arg 0)
2461       (let ((par-end (save-excursion (end-of-paragraph-text) (point))))
2462         (if (re-search-forward sentence-end par-end t)
2463             (skip-chars-backward " \t\n")
2464           (goto-char par-end)))
2465       (setq arg (1- arg)))
2466     (constrain-to-field nil opoint t)))
2467
2468 The function looks long at first sight and it is best to look at its
2469 skeleton first, and then its muscle. The way to see the skeleton is to
2470 look at the expressions that start in the left-most columns:
2471
2472 (defun forward-sentence (&optional arg)
2473   "DOCUMENTATION..."
2474   (interactive "p")
2475   (or arg (setq arg 1))
2476   (let ((opoint (point)) (sentence-end (sentence-end)))
2477     (while (< arg 0)
2478       (let ((pos (point))
2479             (par-beg (save-excursion (start-of-paragraph-text) (point))))
2480         REST-OF-BODY-OF-WHILE-LOOP-WHEN-GOING-BACKWARDS
2481     (while (> arg 0)
2482       (let ((par-end (save-excursion (end-of-paragraph-text) (point))))
2483         REST-OF-BODY-OF-WHILE-LOOP-WHEN-GOING-FORWARDS
2484     HANDLE-FORMS-AND-EQUIVALENT
2485
2486 This looks much simpler! The function definition consists of
2487 documentation, an `interactive' expression, an `or' expression, a `let'
2488 expression, and `while' loops.
2489
2490 Let's look at each of these parts in turn.
2491
2492 We note that the documentation is thorough and understandable.
2493
2494 The function has an `interactive "p"' declaration. This means that the
2495 processed prefix argument, if any, is passed to the function as its
2496 argument. (This will be a number.) If the function is not passed an
2497 argument (it is optional) then the argument `arg' will be bound to 1.
2498
2499 When `forward-sentence' is called non-interactively without an
2500 argument, `arg' is bound to `nil'. The `or' expression handles this.
2501 What it does is either leave the value of `arg' as it is, but only if
2502 `arg' is bound to a value; or it sets the value of `arg' to 1, in the
2503 case when `arg' is bound to `nil'.
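
The idiom is easy to try in isolation. In this sketch the function name `demo-repeat-count' is made up; only the `or' expression is taken from `forward-sentence'.

```elisp
;; Hypothetical function that does nothing but apply the default.
(defun demo-repeat-count (&optional arg)
  "Return ARG, or 1 when ARG is nil or omitted."
  (or arg (setq arg 1))   ; the `setq' runs only when ARG is nil
  arg)

(demo-repeat-count)      ; => 1
(demo-repeat-count 3)    ; => 3
(demo-repeat-count -2)   ; => -2
```

Since `or' stops at the first argument that is not `nil', a number that was passed in is never overwritten.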
2504
2505 Next is a `let'. That specifies the values of two local variables,
2506 `opoint' and `sentence-end'. The local value of point, from before the
2507 search, is used in the `constrain-to-field' function which handles
2508 forms and equivalents. The `sentence-end' variable is set by the
2509 `sentence-end' function.
2510
2511 
2512 File: eintr, Node: fwd-sentence while loops, Next: fwd-sentence re-search, Prev: Complete forward-sentence, Up: forward-sentence
2513
2514 The `while' loops
2515 -----------------
2516
2517 Two `while' loops follow. The first `while' has a true-or-false-test
2518 that tests true if the prefix argument for `forward-sentence' is a
2519 negative number. This is for going backwards. The body of this loop
2520 is similar to the body of the second `while' clause, but it is not
2521 exactly the same. We will skip this `while' loop and concentrate on
2522 the second `while' loop.
2523
2524 The second `while' loop is for moving point forward. Its skeleton
2525 looks like this:
2526
2527 (while (> arg 0) ; true-or-false-test
2528   (let VARLIST
2529     (if (TRUE-OR-FALSE-TEST)
2530         THEN-PART
2531       ELSE-PART))
2532   (setq arg (1- arg))) ; `while' loop decrementer
2533
2534 The `while' loop is of the decrementing kind. (*Note A Loop with a
2535 Decrementing Counter: Decrementing Loop.) It has a true-or-false-test
2536 that tests true so long as the counter (in this case, the variable
2537 `arg') is greater than zero; and it has a decrementer that subtracts 1
2538 from the value of the counter every time the loop repeats.
2539
2540 If no prefix argument is given to `forward-sentence', which is the most
2541 common way the command is used, this `while' loop will run once, since
2542 the value of `arg' will be 1.
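
To see the counter at work, here is a sketch of a decrementing loop that merely counts how often its body runs. The name `demo-count-loop-runs' is hypothetical, and the counting stands in for the real work of moving point.

```elisp
;; Hypothetical function: same test and decrementer as the second
;; `while' loop, with a trivial body.
(defun demo-count-loop-runs (arg)
  "Return how many times a decrementing loop runs for ARG."
  (let ((runs 0))
    (while (> arg 0)            ; true-or-false-test
      (setq runs (1+ runs))     ; stands in for the real body
      (setq arg (1- arg)))      ; decrementer
    runs))

(demo-count-loop-runs 1)    ; => 1
(demo-count-loop-runs 3)    ; => 3
(demo-count-loop-runs -2)   ; => 0
```

A negative argument never enters this loop at all, which is why `forward-sentence' needs a separate loop for going backwards.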
2543
2544 The body of the `while' loop consists of a `let' expression, which
2545 creates and binds a local variable, and has, as its body, an `if'
2546 expression.
2547
2548 The body of the `while' loop looks like this:
2549
2550 (let ((par-end
2551        (save-excursion (end-of-paragraph-text) (point))))
2552   (if (re-search-forward sentence-end par-end t)
2553       (skip-chars-backward " \t\n")
2554     (goto-char par-end)))
2555
2556 The `let' expression creates and binds the local variable `par-end'.
2557 As we shall see, this local variable is designed to provide a bound or
2558 limit to the regular expression search. If the search fails to find a
2559 proper sentence ending in the paragraph, it will stop on reaching the
2560 end of the paragraph.
2561
2562 But first, let us examine how `par-end' is bound to the value of the
2563 end of the paragraph. What happens is that the `let' sets the value of
2564 `par-end' to the value returned when the Lisp interpreter evaluates the
2565 expression
2566
2567 (save-excursion (end-of-paragraph-text) (point))
2568
2569 In this expression, `(end-of-paragraph-text)' moves point to the end of
2570 the paragraph, `(point)' returns the value of point, and then
2571 `save-excursion' restores point to its original position. Thus, the
2572 `let' binds `par-end' to the value returned by the `save-excursion'
2573 expression, which is the position of the end of the paragraph. (The
2574 `(end-of-paragraph-text)' function uses `forward-paragraph', which we
2575 will discuss shortly.)
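
The following sketch shows that the `save-excursion' expression yields a position without leaving point there. The function name `demo-paragraph-end' and the sample text are made up; `end-of-paragraph-text' is the same helper that `forward-sentence' calls.

```elisp
;; Hypothetical demonstration in a temporary buffer.
(defun demo-paragraph-end (text)
  "Return a list: point before, the paragraph end, and point after."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let* ((before (point))
           (par-end (save-excursion (end-of-paragraph-text) (point))))
      ;; Point has been restored, so it still equals BEFORE.
      (list before par-end (point)))))

(demo-paragraph-end "First sentence.  Second one.\n\nNext paragraph.")
```

The first and last elements of the returned list are the same, since `save-excursion' puts point back; only the middle element records where the paragraph ends.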
2576
2577 Emacs next evaluates the body of the `let', which is an `if' expression
2578 that looks like this:
2579
2580 (if (re-search-forward sentence-end par-end t) ; if-part
2581     (skip-chars-backward " \t\n") ; then-part
2582   (goto-char par-end)) ; else-part
2583
2584 The `if' tests whether its first argument is true and if so, evaluates
2585 its then-part; otherwise, the Emacs Lisp interpreter evaluates the
2586 else-part. The true-or-false-test of the `if' expression is the
2587 regular expression search.
2588
2589 It may seem odd to have what looks like the `real work' of the
2590 `forward-sentence' function buried here, but this is a common way this
2591 kind of operation is carried out in Lisp.
2592
2593 
2594 File: eintr, Node: fwd-sentence re-search, Prev: fwd-sentence while loops, Up: forward-sentence
2595
2596 The regular expression search
2597 -----------------------------
2598
2599 The `re-search-forward' function searches for the end of the sentence,
2600 that is, for the pattern defined by the `sentence-end' regular
2601 expression. If the pattern is found--if the end of the sentence is
2602 found--then the `re-search-forward' function does two things:
2603
2604 1. The `re-search-forward' function carries out a side effect, which
2605 is to move point to the end of the occurrence found.
2606
2607 2. The `re-search-forward' function returns a value of true. This is
2608 the value received by the `if', and means that the search was
2609 successful.
2610
2611 The side effect, the movement of point, is completed before the `if'
2612 function is handed the value returned by the successful conclusion of
2613 the search.
2614
2615 When the `if' function receives the value of true from a successful
2616 call to `re-search-forward', the `if' evaluates the then-part, which is
2617 the expression `(skip-chars-backward " \t\n")'. This expression moves
2618 backwards over any blank spaces, tabs or carriage returns until a
2619 printed character is found and then leaves point after the character.
2620 Since point has already been moved to the end of the pattern that marks
2621 the end of the sentence, this action leaves point right after the
2622 closing printed character of the sentence, which is usually a period.
2623
2624 On the other hand, if the `re-search-forward' function fails to find a
2625 pattern marking the end of the sentence, the function returns false.
2626 The false then causes the `if' to evaluate its third argument, which is
2627 `(goto-char par-end)': it moves point to the end of the paragraph.
2628
2629 (And if the text is in a form or equivalent, and point may not move
2630 fully, then the `constrain-to-field' function comes into play.)
2631
2632 Regular expression searches are exceptionally useful and the pattern
2633 illustrated by `re-search-forward', in which the search is the test of
2634 an `if' expression, is handy. You will see or write code incorporating
2635 this pattern often.
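
Here is a sketch of that pattern on its own. The function name `demo-first-sentence', the simplified regular expression and the sample strings are all made up; the real `forward-sentence' uses the value of `sentence-end' and the end of the paragraph as its bound.

```elisp
;; Hypothetical function: the search is the test of the `if'.
(defun demo-first-sentence (text)
  "Return TEXT up to the end of its first sentence."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((limit (point-max)))
      (if (re-search-forward "[.?!][ \n]*" limit t)
          ;; Found: point is after the trailing whitespace; back up.
          (skip-chars-backward " \t\n")
        ;; Not found: go to the limit instead.
        (goto-char limit))
      (buffer-substring (point-min) (point)))))

(demo-first-sentence "Hello there.   More text")   ; => "Hello there."
(demo-first-sentence "no terminator")              ; => "no terminator"
```

In both cases point ends up somewhere sensible, which is what lets the caller use the function without checking whether the search succeeded.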
2636
2637 
2638 File: eintr, Node: forward-paragraph, Next: etags, Prev: forward-sentence, Up: Regexp Search
2639
2640 12.4 `forward-paragraph': a Goldmine of Functions
2641 =================================================
2642
2643 The `forward-paragraph' function moves point forward to the end of the
2644 paragraph. It is usually bound to `M-}' and makes use of a number of
2645 functions that are important in themselves, including `let*',
2646 `match-beginning', and `looking-at'.
2647
2648 The function definition for `forward-paragraph' is considerably longer
2649 than the function definition for `forward-sentence' because it works
2650 with a paragraph, each line of which may begin with a fill prefix.
2651
2652 A fill prefix consists of a string of characters that are repeated at
2653 the beginning of each line. For example, in Lisp code, it is a
2654 convention to start each line of a paragraph-long comment with `;;; '.
2655 In Text mode, four blank spaces make up another common fill prefix,
2656 creating an indented paragraph. (*Note Fill Prefix: (emacs)Fill
2657 Prefix, for more information about fill prefixes.)
2658
2659 The existence of a fill prefix means that in addition to being able to
2660 find the end of a paragraph whose lines begin on the left-most column,
2661 the `forward-paragraph' function must be able to find the end of a
2662 paragraph when all or many of the lines in the buffer begin with the
2663 fill prefix.
2664
2665 Moreover, it is sometimes practical to ignore a fill prefix that
2666 exists, especially when blank lines separate paragraphs. This is an
2667 added complication.
2668
2669 * Menu:
2670
2671 * forward-paragraph in brief::
2672 * fwd-para let::
2673 * fwd-para while::
2674
2675 
2676 File: eintr, Node: forward-paragraph in brief, Next: fwd-para let, Prev: forward-paragraph, Up: forward-paragraph
2677
2678 Shortened `forward-paragraph' function definition
2679 -------------------------------------------------
2680
2681 Rather than print all of the `forward-paragraph' function, we will only
2682 print parts of it. Read without preparation, the function can be
2683 daunting!
2684
2685 In outline, the function looks like this:
2686
2687 (defun forward-paragraph (&optional arg)
2688   "DOCUMENTATION..."
2689   (interactive "p")
2690   (or arg (setq arg 1))
2691   (let*
2692       VARLIST
2693     (while (and (< arg 0) (not (bobp))) ; backward-moving-code
2694       ...
2695     (while (and (> arg 0) (not (eobp))) ; forward-moving-code
2696       ...
2697
2698 The first parts of the function are routine: the function's argument
2699 list consists of one optional argument. Documentation follows.
2700
2701 The lower case `p' in the `interactive' declaration means that the
2702 processed prefix argument, if any, is passed to the function. This
2703 will be a number, and is the repeat count of how many paragraphs point
2704 will move. The `or' expression in the next line handles the common
2705 case when no argument is passed to the function, which occurs if the
2706 function is called from other code rather than interactively. This
2707 case was described earlier. (*Note The `forward-sentence' function:
2708 forward-sentence.) Now we reach the end of the familiar part of this
2709 function.
2710
2711 
2712 File: eintr, Node: fwd-para let, Next: fwd-para while, Prev: forward-paragraph in brief, Up: forward-paragraph
2713
2714 The `let*' expression
2715 ---------------------
2716
2717 The next line of the `forward-paragraph' function begins a `let*'
2718 expression. This is different from `let'. The symbol is `let*', not
2719 `let'.
2720
2721 The `let*' special form is like `let' except that Emacs sets each
2722 variable in sequence, one after another, and variables in the latter
2723 part of the varlist can make use of the values to which Emacs set
2724 variables in the earlier part of the varlist.
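
A tiny sketch makes the difference concrete. The function name `demo-square-area' and its variables are made up for illustration.

```elisp
;; Hypothetical function: the second binding refers to the first.
(defun demo-square-area (side)
  "Return the area of a square whose sides have length SIDE."
  (let* ((width side)
         (area (* width width)))   ; `width' is already bound here
    area))

(demo-square-area 4)   ; => 16
```

With a plain `let', the expression `(* width width)' would be evaluated before `width' received its local value.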
2725
2726 (*Note `save-excursion' in `append-to-buffer': append save-excursion.)
2727
2728 In the `let*' expression in this function, Emacs binds a total of seven
2729 variables: `opoint', `fill-prefix-regexp', `parstart', `parsep',
2730 `sp-parstart', `start', and `found-start'.
2731
2732 The variable `parsep' appears twice, first, to remove instances of `^',
2733 and second, to handle fill prefixes.
2734
2735 The variable `opoint' is just the value of `point'. As you can guess,
2736 it is used in a `constrain-to-field' expression, just as in
2737 `forward-sentence'.
2738
2739 The variable `fill-prefix-regexp' is set to the value returned by
2740 evaluating the following list:
2741
2742 (and fill-prefix
2743      (not (equal fill-prefix ""))
2744      (not paragraph-ignore-fill-prefix)
2745      (regexp-quote fill-prefix))
2746
2747 This is an expression whose first element is the `and' special form.
2748
2749 As we learned earlier (*note The `kill-new' function: kill-new
2750 function.), the `and' special form evaluates each of its arguments
2751 until one of the arguments returns a value of `nil', in which case the
2752 `and' expression returns `nil'; however, if none of the arguments
2753 returns a value of `nil', the value resulting from evaluating the last
2754 argument is returned. (Since such a value is not `nil', it is
2755 considered true in Lisp.) In other words, an `and' expression returns
2756 a true value only if all its arguments are true.
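
The behavior of `and' can be checked with a short sketch. The function `demo-usable-prefix' is hypothetical; `upcase' merely stands in for whatever the last argument computes.

```elisp
;; Hypothetical function: return a transformed prefix, or nil.
(defun demo-usable-prefix (prefix)
  "Return PREFIX in upper case if it is a non-empty string, else nil."
  (and prefix                      ; nil stops the evaluation here
       (not (equal prefix ""))     ; so does an empty string
       (upcase prefix)))           ; otherwise this value is returned

(demo-usable-prefix "ab ")   ; => "AB "
(demo-usable-prefix "")      ; => nil
(demo-usable-prefix nil)     ; => nil
```

In the last call `upcase' is never reached; had it been called with `nil', it would have signalled an error.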
2757
2758 In this case, the variable `fill-prefix-regexp' is bound to a non-`nil'
2759 value only if the following four expressions produce a true (i.e., a
2760 non-`nil') value when they are evaluated; otherwise,
2761 `fill-prefix-regexp' is bound to `nil'.
2762
2763 `fill-prefix'
2764 When this variable is evaluated, the value of the fill prefix, if
2765 any, is returned. If there is no fill prefix, this variable
2766 returns `nil'.
2767
2768 `(not (equal fill-prefix ""))'
2769 This expression checks whether an existing fill prefix is an empty
2770 string, that is, a string with no characters in it. An empty
2771 string is not a useful fill prefix.
2772
2773 `(not paragraph-ignore-fill-prefix)'
2774 This expression returns `nil' if the variable
2775 `paragraph-ignore-fill-prefix' has been turned on by being set to a
2776 true value such as `t'.
2777
2778 `(regexp-quote fill-prefix)'
2779 This is the last argument to the `and' special form. If all the
2780 arguments to the `and' are true, the value resulting from
2781 evaluating this expression will be returned by the `and' expression
2782 and bound to the variable `fill-prefix-regexp'.
2783
2784 The result of evaluating this `and' expression successfully is that
2785 `fill-prefix-regexp' will be bound to the value of `fill-prefix' as
2786 modified by the `regexp-quote' function. What `regexp-quote' does is
2787 read a string and return a regular expression that will exactly match
2788 the string and match nothing else. This means that
2789 `fill-prefix-regexp' will be set to a value that will exactly match the
2790 fill prefix if the fill prefix exists. Otherwise, the variable will be
2791 set to `nil'.
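
As an illustration of `regexp-quote', consider a fill prefix that contains an asterisk, which is special in a regular expression. The helper `demo-starts-with-prefix-p' is made up for this sketch.

```elisp
;; The asterisk is protected with a backslash; the space is left alone.
(regexp-quote "* ")   ; => "\\* "

;; Hypothetical helper: does LINE begin with the literal text PREFIX?
(defun demo-starts-with-prefix-p (line prefix)
  "Return t if LINE starts with PREFIX taken literally."
  ;; "\\`" anchors the match at the start of the string.
  (and (string-match (concat "\\`" (regexp-quote prefix)) line) t))

(demo-starts-with-prefix-p "* item" "* ")   ; => t
(demo-starts-with-prefix-p "  item" "* ")   ; => nil
```

The quoted string can be pasted into a longer pattern without its characters being mistaken for operators.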
2792
2793 The next two local variables in the `let*' expression are designed to
2794 remove instances of `^' from `parstart' and `parsep', the local
2795 variables that indicate the paragraph start and the paragraph separator.
2796 The next expression sets `parsep' again. That is to handle fill
2797 prefixes.
2798
2799 This is the setting that requires the definition to use `let*' rather
2800 than `let'. The true-or-false-test for the `if' depends on whether the
2801 variable `fill-prefix-regexp' evaluates to `nil' or some other value.
2802
2803 If `fill-prefix-regexp' does not have a value, Emacs evaluates the
2804 else-part of the `if' expression and binds `parsep' to its local value.
2805 (`parsep' is a regular expression that matches what separates
2806 paragraphs.)
2807
2808 But if `fill-prefix-regexp' does have a value, Emacs evaluates the
2809 then-part of the `if' expression and binds `parsep' to a regular
2810 expression that includes the `fill-prefix-regexp' as part of the
2811 pattern.
2812
2813 Specifically, `parsep' is set to the original value of the paragraph
2814 separate regular expression concatenated with an alternative expression
2815 that consists of the `fill-prefix-regexp' followed by optional
2816 whitespace to the end of the line. (The whitespace is defined by
2817 `"[ \t]*$"'.) The `\\|' defines this portion of the regexp as an
2818 alternative to `parsep'.
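
The construction can be sketched as follows. The function name `demo-separator-line-p' is made up, and the starting value given to `parsep' is only assumed to be a typical paragraph separator; the `concat' follows the description above.

```elisp
;; Hypothetical function: build the combined separator and test a line.
(defun demo-separator-line-p (line fill-prefix)
  "Return t if LINE counts as a paragraph separator given FILL-PREFIX."
  (let* ((fill-prefix-regexp (regexp-quote fill-prefix))
         ;; Assumed starting value: a line holding only whitespace.
         (parsep "[ \t\f]*$")
         ;; Rebind `parsep' with the fill prefix as an alternative.
         (parsep (concat parsep "\\|" fill-prefix-regexp "[ \t]*$")))
    (with-temp-buffer
      (insert line)
      (goto-char (point-min))
      (looking-at parsep))))

(demo-separator-line-p "" ";;; ")                ; => t
(demo-separator-line-p ";;; " ";;; ")            ; => t
(demo-separator-line-p ";;; some text" ";;; ")   ; => nil
```

A line that holds nothing but the fill prefix is thus treated like a blank line, while a line with text after the prefix is not.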
2819
2820 According to a comment in the code, the next local variable,
2821 `sp-parstart', is used for searching, and then the final two, `start'
2822 and `found-start', are set to `nil'.
2823
2824 Now we get into the body of the `let*'. The first part of the body of
2825 the `let*' deals with the case when the function is given a negative
2826 argument and is therefore moving backwards. We will skip this section.
2827
2828 
2829 File: eintr, Node: fwd-para while, Prev: fwd-para let, Up: forward-paragraph
2830
2831 The forward motion `while' loop
2832 -------------------------------
2833
2834 The second part of the body of the `let*' deals with forward motion.
2835 It is a `while' loop that repeats itself so long as the value of `arg'
2836 is greater than zero. In the most common use of the function, the
2837 value of the argument is 1, so the body of the `while' loop is
2838 evaluated exactly once, and the cursor moves forward one paragraph.
2839
2840 This part handles three situations: when point is between paragraphs,
2841 when there is a fill prefix and when there is no fill prefix.
2842
2843 The `while' loop looks like this:
2844
2845 ;; going forwards and not at the end of the buffer
2846 (while (and (> arg 0) (not (eobp)))
2847
2848   ;; between paragraphs
2849   ;; Move forward over separator lines...
2850   (while (and (not (eobp))
2851               (progn (move-to-left-margin) (not (eobp)))
2852               (looking-at parsep))
2853     (forward-line 1))
2854   ;; This decrements the loop
2855   (unless (eobp) (setq arg (1- arg)))
2856   ;; ... and one more line.
2857   (forward-line 1)
2858
2859   (if fill-prefix-regexp
2860       ;; There is a fill prefix; it overrides parstart;
2861       ;; we go forward line by line
2862       (while (and (not (eobp))
2863                   (progn (move-to-left-margin) (not (eobp)))
2864                   (not (looking-at parsep))
2865                   (looking-at fill-prefix-regexp))
2866         (forward-line 1))
2867
2868     ;; There is no fill prefix;
2869     ;; we go forward character by character
2870     (while (and (re-search-forward sp-parstart nil 1)
2871                 (progn (setq start (match-beginning 0))
2872                        (goto-char start)
2873                        (not (eobp)))
2874                 (progn (move-to-left-margin)
2875                        (not (looking-at parsep)))
2876                 (or (not (looking-at parstart))
2877                     (and use-hard-newlines
2878                          (not (get-text-property (1- start) 'hard)))))
2879       (forward-char 1))
2880
2881     ;; and if there is no fill prefix and if we are not at the end,
2882     ;; go to whatever was found in the regular expression search
2883     ;; for sp-parstart
2884     (if (< (point) (point-max))
2885         (goto-char start))))
2886
2887 We can see that this is a decrementing counter `while' loop, using the
2888 expression `(setq arg (1- arg))' as the decrementer. That expression
2889 is not far from the `while', but is hidden in another Lisp macro, an
2890 `unless' macro. Unless we are at the end of the buffer -- that is what
2891 the `eobp' function determines; it is an abbreviation of `End Of Buffer
2892 P' -- we decrease the value of `arg' by one.
2893
2894 (If we are at the end of the buffer, we cannot go forward any more and
2895 the next loop of the `while' expression will test false since the test
2896 is an `and' with `(not (eobp))'. The `not' function means exactly what
2897 you expect; it is another name for `null', a function that returns true
2898 when its argument is false.)
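
The decrementer can be tried by itself. In this sketch the function name `demo-decrement-unless-at-end' is hypothetical; the `unless' expression is the one from the loop.

```elisp
;; Hypothetical function: apply the decrementer once in a buffer
;; that holds TEXT, with point at the beginning.
(defun demo-decrement-unless-at-end (text arg)
  "Return ARG minus 1, or ARG unchanged if the buffer has no text left."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (unless (eobp) (setq arg (1- arg)))
    arg))

(demo-decrement-unless-at-end "abc" 2)   ; => 1
(demo-decrement-unless-at-end "" 2)      ; => 2, already at the end
```

In an empty buffer the beginning is also the end, so `eobp' returns true and the count is left alone.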
2899
2900 Interestingly, the loop count is not decremented until we leave the
2901 space between paragraphs, unless we come to the end of buffer or stop
2902 seeing the local value of the paragraph separator.
2903
2904 That second `while' also has a `(move-to-left-margin)' expression. The
2905 function is self-explanatory. It is inside a `progn' expression and
2906 not the last element of its body, so it is only invoked for its side
2907 effect, which is to move point to the left margin of the current line.
2908
2909 The `looking-at' function is also self-explanatory; it returns true if
2910 the text after point matches the regular expression given as its
2911 argument.
2912
2913 The rest of the body of the loop looks difficult at first, but makes
2914 sense as you come to understand it.
2915
2916 First consider what happens if there is a fill prefix:
2917
2918 (if fill-prefix-regexp
2919     ;; There is a fill prefix; it overrides parstart;
2920     ;; we go forward line by line
2921     (while (and (not (eobp))
2922                 (progn (move-to-left-margin) (not (eobp)))
2923                 (not (looking-at parsep))
2924                 (looking-at fill-prefix-regexp))
2925       (forward-line 1))
2926
2927 This expression moves point forward line by line so long as four
2928 conditions are true:
2929
2930 1. Point is not at the end of the buffer.
2931
2932 2. We can move to the left margin of the text and are not at the end
2933 of the buffer.
2934
2935 3. The text following point does not separate paragraphs.
2936
2937 4. The pattern following point is the fill prefix regular expression.
2938
2939 The last condition may be puzzling, until you remember that point was
2940 moved to the beginning of the line early in the `forward-paragraph'
2941 function. This means that if the text has a fill prefix, the
2942 `looking-at' function will see it.
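
A simplified sketch of this line-by-line motion follows. It keeps only the first and last of the four conditions; the function name `demo-skip-prefixed-lines' and the sample text are made up.

```elisp
;; Hypothetical function: walk down from the first line while each
;; line begins with PREFIX.
(defun demo-skip-prefixed-lines (text prefix)
  "Return the number of the first line of TEXT not starting with PREFIX."
  (let ((prefix-regexp (regexp-quote prefix)))
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (while (and (not (eobp))
                  (looking-at prefix-regexp))
        (forward-line 1))          ; leaves point at the start of a line
      (line-number-at-pos))))

(demo-skip-prefixed-lines ";; a\n;; b\ncode\n;; c\n" ";; ")   ; => 3
```

Since `forward-line' always leaves point at the beginning of a line, `looking-at' sees the fill prefix whenever the line has one.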
2943
2944 Consider what happens when there is no fill prefix.
2945
2946 (while (and (re-search-forward sp-parstart nil 1)
2947             (progn (setq start (match-beginning 0))
2948                    (goto-char start)
2949                    (not (eobp)))
2950             (progn (move-to-left-margin)
2951                    (not (looking-at parsep)))
2952             (or (not (looking-at parstart))
2953                 (and use-hard-newlines
2954                      (not (get-text-property (1- start) 'hard)))))
2955   (forward-char 1))
2956
2957 This `while' loop has us searching forward for `sp-parstart', which is
2958 the combination of possible whitespace with the local value of the
2959 start of a paragraph or of a paragraph separator. (The latter two are
2960 within an expression starting `\(?:' so that they are not referenced by
2961 the `match-beginning' function.)
2962
2963 The two expressions,
2964
2965 (setq start (match-beginning 0))
2966 (goto-char start)
2967
2968 mean go to the start of the text matched by the regular expression
2969 search.
2970
2971 The `(match-beginning 0)' expression is new. It returns a number
2972 specifying the location of the start of the text that was matched by
2973 the last search.
2974
2975 The `match-beginning' function is used here because of a characteristic
2976 of a forward search: a successful forward search, regardless of whether
2977 it is a plain search or a regular expression search, moves point to the
2978 end of the text that is found. In this case, a successful search moves
2979 point to the end of the pattern for `sp-parstart'.
2980
2981 However, we want to put point at the end of the current paragraph, not
2982 somewhere else. Indeed, since the search possibly includes the
2983 paragraph separator, point may end up at the beginning of the next one
2984 unless we use an expression that includes `match-beginning'.
2985
2986 When given an argument of 0, `match-beginning' returns the position
2987 that is the start of the text matched by the most recent search. In
2988 this case, the most recent search looks for `sp-parstart'. The
2989 `(match-beginning 0)' expression returns the beginning position of that
2990 pattern, rather than the end position of that pattern.
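
The difference between the two positions is easy to see in a temporary buffer. In this sketch the function name `demo-match-positions', the buffer text and the pattern are made up.

```elisp
;; Hypothetical demonstration: where the match starts, where the
;; search leaves point, and where point is after going back.
(defun demo-match-positions ()
  "Return the match start, point after the search, and point after moving."
  (with-temp-buffer
    (insert "one two three")
    (goto-char (point-min))
    (re-search-forward "t[a-z]+" nil t)     ; matches `two'
    (let ((after-search (point))
          (start (match-beginning 0)))
      (goto-char start)
      (list start after-search (point)))))

(demo-match-positions)   ; => (5 8 5)
```

The search leaves point at 8, just after `two'; going to `(match-beginning 0)' puts it back at 5, just before the `t'.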
2991
2992 (Incidentally, when passed a positive number as an argument, the
2993 `match-beginning' function returns the location of point at that
2994 parenthesized expression in the last search unless that parenthesized
2995 expression begins with `\(?:'. I don't know why `\(?:' appears here
2996 since the argument is 0.)
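
A short sketch shows how a group that begins with `\(?:' is left out of the numbering. The function name `demo-group-starts', the pattern and the sample string are made up.

```elisp
;; Hypothetical function: the first group is not numbered, so
;; group 1 is the one that matches `value'.
(defun demo-group-starts (text)
  "Return where the whole match and the first numbered group start."
  (when (string-match "\\(?:key\\)=\\(value\\)" text)
    (list (match-beginning 0) (match-beginning 1))))

(demo-group-starts "key=value")   ; => (0 4)
```

With an argument of 0 the grouping makes no difference to the result, as the first element of the list shows.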
2997
2998 The last expression when there is no fill prefix is
2999
3000 (if (< (point) (point-max))
3001     (goto-char start))))
3002
3003 This says that if there is no fill prefix and if we are not at the end,
3004 point should move to the beginning of whatever was found by the regular
3005 expression search for `sp-parstart'.
3006
3007 The full definition for the `forward-paragraph' function not only
3008 includes code for going forwards, but also code for going backwards.
3009
3010 If you are reading this inside of GNU Emacs and you want to see the
3011 whole function, you can type `C-h f' (`describe-function') and the name
3012 of the function. This gives you the function documentation and the
3013 name of the library containing the function's source. Place point over
3014 the name of the library and press the RET key; you will be taken
3015 directly to the source. (Be sure to install your sources! Without
3016 them, you are like a person who tries to drive a car with his eyes
3017 shut!)
3018
3019 
3020 File: eintr, Node: etags, Next: Regexp Review, Prev: forward-paragraph, Up: Regexp Search
3021
3022 12.5 Create Your Own `TAGS' File
3023 ================================
3024
3025 Besides `C-h f' (`describe-function'), another way to see the source of
3026 a function is to type `M-.' (`find-tag') and the name of the function
3027 when prompted for it. This is a good habit to get into. This will
3028 take you directly to the source. If the `find-tag' function first asks
3029 you for the name of a `TAGS' table, give it the name of a `TAGS' file
3030 such as `/usr/local/src/emacs/src/TAGS'. (The exact path to your
3031 `TAGS' file depends on how your copy of Emacs was installed. I just
3032 told you the location that provides both my C and my Emacs Lisp
3033 sources.)
3034
3035 You can also create your own `TAGS' file for directories that lack one.
3036
3037 The `M-.' (`find-tag') command takes you directly to the source for a
3038 function, variable, node, or other source. The function depends on
3039 tags tables to tell it where to go.
3040
3041 You often need to build and install tags tables yourself. They are not
3042 built automatically. A tags table is called a `TAGS' file; the name is
3043 in upper case letters.
3044
3045 You can create a `TAGS' file by calling the `etags' program that comes
3046 as a part of the Emacs distribution. Usually, `etags' is compiled and
3047 installed when Emacs is built. (`etags' is not an Emacs Lisp function
3048 or a part of Emacs; it is a C program.)
3049
3050 To create a `TAGS' file, first switch to the directory in which you
3051 want to create the file. In Emacs you can do this with the `M-x cd'
3052 command, or by visiting a file in the directory, or by listing the
3053 directory with `C-x d' (`dired'). Then run the compile command, with
3054 `etags *.el' as the command to execute:
3055
3056 M-x compile RET etags *.el RET
3057
3058 to create a `TAGS' file.
3059
3060 For example, if you have a large number of files in your `~/emacs'
3061 directory, as I do--I have 137 `.el' files in it, of which I load
3062 12--you can create a `TAGS' file for the Emacs Lisp files in that
3063 directory.
3064
3065 The `etags' program takes all the usual shell `wildcards'. For
3066 example, if you have two directories for which you want a single `TAGS'
3067 file, type `etags *.el ../elisp/*.el', where `../elisp/' is the second
3068 directory:
3069
3070 M-x compile RET etags *.el ../elisp/*.el RET
3071
3072 Type
3073
3074 M-x compile RET etags --help RET
3075
3076 to see a list of the options accepted by `etags' as well as a list of
3077 supported languages.
3078
3079 The `etags' program handles more than 20 languages, including Emacs
3080 Lisp, Common Lisp, Scheme, C, C++, Ada, Fortran, Java, LaTeX, Pascal,
3081 Perl, Python, Texinfo, makefiles, and most assemblers. You do not
3082 normally need to specify the language; the program recognizes the
3083 language in an input file according to its file name and contents.
3084
3085 `etags' is very helpful when you are writing code yourself and want to
3086 refer back to functions you have already written. Just run `etags'
3087 again at intervals as you write new functions, so they become part of
3088 the `TAGS' file.
3089
3090 If you think an appropriate `TAGS' file already exists for what you
3091 want, but do not know where it is, you can use the `locate' program to
3092 attempt to find it.
3093
3094 Type `M-x locate <RET> TAGS <RET>' and Emacs will list for you the full
3095 path names of all your `TAGS' files. On my system, this command lists
3096 34 `TAGS' files. On the other hand, a `plain vanilla' system I
3097 recently installed did not contain any `TAGS' files.
3098
3099 If the tags table you want has been created, you can use the `M-x
3100 visit-tags-table' command to specify it. Otherwise, you will need to
3101 create the tags table yourself and then use `M-x visit-tags-table'.
3102
3103 Building Tags in the Emacs sources
3104 ..................................
3105
3106 The GNU Emacs sources come with a `Makefile' that contains a
3107 sophisticated `etags' command that creates, collects, and merges tags
3108 tables from all over the Emacs sources and puts the information into
3109 one `TAGS' file in the `src/' directory below the top level of your
3110 Emacs source directory.
3111
3112 To build this `TAGS' file, go to the top level of your Emacs source
3113 directory and run the compile command `make tags':
3114
3115 M-x compile RET make tags RET
3116
3117 (The `make tags' command works well with the GNU Emacs sources, as well
3118 as with some other source packages.)
3119
3120 For more information, see *Note Tag Tables: (emacs)Tags.
3121
3122 
3123 File: eintr, Node: Regexp Review, Next: re-search Exercises, Prev: etags, Up: Regexp Search
3124
3125 12.6 Review
3126 ===========
3127
3128 Here is a brief summary of some recently introduced functions.
3129
3130 `while'
3131 Repeatedly evaluate the body of the expression so long as the first
3132 element of the body tests true. Then return `nil'. (The
3133 expression is evaluated only for its side effects.)
3134
3135 For example:
3136
3137 (let ((foo 2))
3138 (while (> foo 0)
3139 (insert (format "foo is %d.\n" foo))
3140 (setq foo (1- foo))))
3141
3142 => foo is 2.
3143 foo is 1.
3144 nil
3145
3146 (The `insert' function inserts its arguments at point; the
3147 `format' function returns a string formatted from its arguments
3148 the way `message' formats its arguments; `\n' produces a new line.)
3149
3150 `re-search-forward'
3151 Search for a pattern, and if the pattern is found, move point to
3152 rest just after it.
3153
3154 Takes four arguments, like `search-forward':
3155
3156 1. A regular expression that specifies the pattern to search for.
3157 (Remember to put quotation marks around this argument!)
3158
3159 2. Optionally, the limit of the search.
3160
3161 3. Optionally, what to do if the search fails: return `nil' or
3162 signal an error.
3163
3164 4. Optionally, how many times to repeat the search; if negative,
3165 the search goes backwards.
3166
3167 `let*'
3168 Bind some variables locally to particular values, and then
3169 evaluate the remaining arguments, returning the value of the last
3170 one. While binding the local variables, use the local values of
3171 variables bound earlier, if any.
3172
3173 For example:
3174
3175 (let* ((foo 7)
3176 (bar (* 3 foo)))
3177 (message "`bar' is %d." bar))
3178 => `bar' is 21.
3179
3180 `match-beginning'
3181 Return the position of the start of the text found by the last
3182 regular expression search.
3183
3184 `looking-at'
3185 Return `t' for true if the text after point matches the argument,
3186 which should be a regular expression.
3187
3188 `eobp'
3189 Return `t' for true if point is at the end of the accessible part
3190 of a buffer. The end of the accessible part is the end of the
3191 buffer if the buffer is not narrowed; it is the end of the
3192 narrowed part if the buffer is narrowed.
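
Several of these functions can be tried together in a temporary buffer.
This is only an illustrative sketch; the name `my-review-demo' is
hypothetical and does not appear elsewhere in this introduction.

```elisp
(defun my-review-demo ()
  "Exercise four of the reviewed functions in a temporary buffer.
Hypothetical example; returns a list of the four results."
  (with-temp-buffer
    (insert "alpha beta")
    (goto-char (point-min))
    (list
     ;; `looking-at' tests the text after point without moving point.
     (looking-at "alpha")
     ;; Bounded at position 6, the search cannot reach `beta',
     ;; which starts at position 7; the third argument makes the
     ;; failed search return nil.
     (re-search-forward "b[a-z]+" 6 t)
     ;; Unbounded, the same search succeeds and moves point.
     (progn (re-search-forward "b[a-z]+" nil t)
            (match-beginning 0))
     ;; Point now rests after `beta', at the end of the buffer.
     (eobp))))

(my-review-demo)   ; => (t nil 7 t)
```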
3193
3194 
3195 File: eintr, Node: re-search Exercises, Prev: Regexp Review, Up: Regexp Search
3196
3197 12.7 Exercises with `re-search-forward'
3198 =======================================
3199
3200 * Write a function to search for a regular expression that matches
3201 two or more blank lines in sequence.
3202
3203 * Write a function to search for duplicated words, such as `the the'.
3204 *Note Syntax of Regular Expressions: (emacs)Regexps, for
3205 information on how to write a regexp (a regular expression) to
3206 match a string that is composed of two identical halves. You can
3207 devise several regexps; some are better than others. The function
3208 I use is described in an appendix, along with several regexps.
3209 *Note `the-the' Duplicated Words Function: the-the.
3210
3211 
3212 File: eintr, Node: Counting Words, Next: Words in a defun, Prev: Regexp Search, Up: Top
3213
3214 13 Counting: Repetition and Regexps
3215 ***********************************
3216
3217 Repetition and regular expression searches are powerful tools that you
3218 often use when you write code in Emacs Lisp. This chapter illustrates
3219 the use of regular expression searches through the construction of word
3220 count commands using `while' loops and recursion.
3221
3222 * Menu:
3223
3224 * Why Count Words::
3225 * count-words-region::
3226 * recursive-count-words::
3227 * Counting Exercise::
3228
3229 
3230 File: eintr, Node: Why Count Words, Next: count-words-region, Prev: Counting Words, Up: Counting Words
3231
3232 Counting words
3233 ==============
3234
3235 The standard Emacs distribution contains a function for counting the
3236 number of lines within a region. However, there is no corresponding
3237 function for counting words.
3238
3239 Certain types of writing ask you to count words. Thus, if you write an
3240 essay, you may be limited to 800 words; if you write a novel, you may
3241 discipline yourself to write 1000 words a day. It seems odd to me that
3242 Emacs lacks a word count command. Perhaps people use Emacs mostly for
3243 code or types of documentation that do not require word counts; or
3244 perhaps they restrict themselves to the operating system word count
3245 command, `wc'. Alternatively, people may follow the publishers'
3246 convention and compute a word count by dividing the number of
3247 characters in a document by five. In any event, here are commands to
3248 count words.
3249
3250 
3251 File: eintr, Node: count-words-region, Next: recursive-count-words, Prev: Why Count Words, Up: Counting Words
3252
3253 13.1 The `count-words-region' Function
3254 ======================================
3255
3256 A word count command could count words in a line, paragraph, region, or
3257 buffer. What should the command cover? You could design the command
3258 to count the number of words in a complete buffer. However, the Emacs
3259 tradition encourages flexibility--you may want to count words in just a
3260 section, rather than all of a buffer. So it makes more sense to design
3261 the command to count the number of words in a region. Once you have a
3262 `count-words-region' command, you can, if you wish, count words in a
3263 whole buffer by marking it with `C-x h' (`mark-whole-buffer').
3264
3265 Clearly, counting words is a repetitive act: starting from the
3266 beginning of the region, you count the first word, then the second
3267 word, then the third word, and so on, until you reach the end of the
3268 region. This means that word counting is ideally suited to recursion
3269 or to a `while' loop.
3270
3271 * Menu:
3272
3273 * Design count-words-region::
3274 * Whitespace Bug::
3275
3276 
3277 File: eintr, Node: Design count-words-region, Next: Whitespace Bug, Prev: count-words-region, Up: count-words-region
3278
3279 Designing `count-words-region'
3280 ------------------------------
3281
3282 First, we will implement the word count command with a `while' loop,
3283 then with recursion. The command will, of course, be interactive.
3284
3285 The template for an interactive function definition is, as always:
3286
3287 (defun NAME-OF-FUNCTION (ARGUMENT-LIST)
3288 "DOCUMENTATION..."
3289 (INTERACTIVE-EXPRESSION...)
3290 BODY...)
3291
3292 What we need to do is fill in the slots.
3293
3294 The name of the function should be self-explanatory and similar to the
3295 existing `count-lines-region' name. This makes the name easier to
3296 remember. `count-words-region' is a good choice.
3297
3298 The function counts words within a region. This means that the
3299 argument list must contain symbols that are bound to the two positions,
3300 the beginning and end of the region. These two positions can be called
3301 `beginning' and `end' respectively. The first line of the
3302 documentation should be a single sentence, since that is all that is
3303 printed as documentation by a command such as `apropos'. The
3304 interactive expression will be of the form `(interactive "r")', since
3305 that will cause Emacs to pass the beginning and end of the region to
3306 the function's argument list. All this is routine.
3307
3308 The body of the function needs to be written to do three tasks: first,
3309 to set up conditions under which the `while' loop can count words,
3310 second, to run the `while' loop, and third, to send a message to the
3311 user.
3312
3313 When a user calls `count-words-region', point may be at the beginning
3314 or the end of the region. However, the counting process must start at
3315 the beginning of the region. This means we will want to put point
3316 there if it is not already there. Executing `(goto-char beginning)'
3317 ensures this. Of course, we will want to return point to its expected
3318 position when the function finishes its work. For this reason, the
3319 body must be enclosed in a `save-excursion' expression.
3320
3321 The central part of the body of the function consists of a `while' loop
3322 in which one expression jumps point forward word by word, and another
3323 expression counts those jumps. The true-or-false-test of the `while'
3324 loop should test true so long as point should jump forward, and false
3325 when point is at the end of the region.
3326
3327 We could use `(forward-word 1)' as the expression for moving point
3328 forward word by word, but it is easier to see what Emacs identifies as a
3329 `word' if we use a regular expression search.
3330
3331 A regular expression search that finds the pattern for which it is
3332 searching leaves point after the last character matched. This means
3333 that a succession of successful word searches will move point forward
3334 word by word.
3335
3336 As a practical matter, we want the regular expression search to jump
3337 over whitespace and punctuation between words as well as over the words
3338 themselves. A regexp that refuses to jump over interword whitespace
3339 would never jump more than one word! This means that the regexp should
3340 include the whitespace and punctuation that follows a word, if any, as
3341 well as the word itself. (A word may end a buffer and not have any
3342 following whitespace or punctuation, so that part of the regexp must be
3343 optional.)
3344
3345 Thus, what we want for the regexp is a pattern defining one or more
3346 word constituent characters followed, optionally, by one or more
3347 characters that are not word constituents. The regular expression for
3348 this is:
3349
3350 \w+\W*
3351
3352 The buffer's syntax table determines which characters are and are not
3353 word constituents. (*Note What Constitutes a Word or Symbol?: Syntax,
3354 for more about syntax. Also, see *Note Syntax: (emacs)Syntax, and
3355 *Note Syntax Tables: (elisp)Syntax Tables.)
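
To see the syntax table at work, you can compare what `\w+' matches
before and after a character is made a word constituent. The helper
below is a sketch with a hypothetical name, `my-first-word'; it assumes
the standard syntax table, in which `_' is a symbol constituent rather
than a word constituent.

```elisp
(defun my-first-word (text &optional extra-word-char)
  "Return the first run of word constituents in TEXT.
If EXTRA-WORD-CHAR is a character, treat it as a word constituent.
Hypothetical helper, not part of the original text."
  (with-temp-buffer
    (when extra-word-char
      ;; Work on a private copy so the standard table is untouched.
      (set-syntax-table (copy-syntax-table (syntax-table)))
      (modify-syntax-entry extra-word-char "w"))
    (insert text)
    (goto-char (point-min))
    (re-search-forward "\\w+")
    (match-string 0)))

(my-first-word "foo_bar baz")      ; => "foo"
(my-first-word "foo_bar baz" ?_)   ; => "foo_bar"
```

The regexp is the same in both calls; only the syntax table differs.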
3356
3357 The search expression looks like this:
3358
3359 (re-search-forward "\\w+\\W*")
3360
3361 (Note that paired backslashes precede the `w' and `W'. A single
3362 backslash has special meaning to the Emacs Lisp interpreter. It
3363 indicates that the following character is interpreted differently than
3364 usual. For example, the two characters, `\n', stand for `newline',
3365 rather than for a backslash followed by `n'. Two backslashes in a row
3366 stand for an ordinary, `unspecial' backslash, which in this case is
3367 followed by a letter, the combination of which is important to
3368 `re-search-forward'.)
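
You can check both points, the doubled backslashes and the word-by-word
movement, with a short experiment. This is a sketch only; the name
`my-word-jumps' is hypothetical.

```elisp
(defun my-word-jumps (text n)
  "Return the positions of point after each of N word searches in TEXT.
Hypothetical helper, not part of the original text."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let (positions)
      (dotimes (_i n)
        ;; Each successful search leaves point after the word and
        ;; after any non-word characters that follow it.
        (re-search-forward "\\w+\\W*")
        (push (point) positions))
      (nreverse positions))))

;; The Lisp reader turns each pair of backslashes into one, so the
;; string passed to the search holds the six characters \w+\W*.
(length "\\w+\\W*")                  ; => 6
(my-word-jumps "one two three" 3)    ; => (5 9 14)
```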
3369
3370 We need a counter to count how many words there are; this variable must
3371 first be set to 0 and then incremented each time Emacs goes around the
3372 `while' loop. The incrementing expression is simply:
3373
3374 (setq count (1+ count))
3375
3376 Finally, we want to tell the user how many words there are in the
3377 region. The `message' function is intended for presenting this kind of
3378 information to the user. The message has to be phrased so that it
3379 reads properly regardless of how many words there are in the region: we
3380 don't want to say that "there are 1 words in the region". The conflict
3381 between singular and plural is ungrammatical. We can solve this
3382 problem by using a conditional expression that evaluates different
3383 messages depending on the number of words in the region. There are
3384 three possibilities: no words in the region, one word in the region,
3385 and more than one word. This means that the `cond' special form is
3386 appropriate.
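
The three cases can be tried on their own before they are placed in the
command. In this sketch, the helper name `my-word-count-message' is
hypothetical, and it returns the string rather than displaying it so
that the result is easy to inspect.

```elisp
(defun my-word-count-message (count)
  "Return a grammatical message for COUNT words.
Hypothetical helper, not part of the original text."
  (cond ((zerop count) "The region does NOT have any words.")
        ((= 1 count) "The region has 1 word.")
        (t (format "The region has %d words." count))))

(my-word-count-message 0)    ; => "The region does NOT have any words."
(my-word-count-message 1)    ; => "The region has 1 word."
(my-word-count-message 13)   ; => "The region has 13 words."
```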
3387
3388 All this leads to the following function definition:
3389
3390 ;;; First version; has bugs!
3391 (defun count-words-region (beginning end)
3392 "Print number of words in the region.
3393 Words are defined as at least one word-constituent
3394 character followed by zero or more characters that
3395 are not word-constituents. The buffer's syntax
3396 table determines which characters these are."
3397 (interactive "r")
3398 (message "Counting words in region ... ")
3399
3400 ;;; 1. Set up appropriate conditions.
3401 (save-excursion
3402 (goto-char beginning)
3403 (let ((count 0))
3404
3405 ;;; 2. Run the while loop.
3406 (while (< (point) end)
3407 (re-search-forward "\\w+\\W*")
3408 (setq count (1+ count)))
3409
3410 ;;; 3. Send a message to the user.
3411 (cond ((zerop count)
3412 (message
3413 "The region does NOT have any words."))
3414 ((= 1 count)
3415 (message
3416 "The region has 1 word."))
3417 (t
3418 (message
3419 "The region has %d words." count))))))
3420
3421 As written, the function works, but not in all circumstances.
3422
3423 
3424 File: eintr, Node: Whitespace Bug, Prev: Design count-words-region, Up: count-words-region
3425
3426 13.1.1 The Whitespace Bug in `count-words-region'
3427 -------------------------------------------------
3428
3429 The `count-words-region' command described in the preceding section has
3430 two bugs, or rather, one bug with two manifestations. First, if you
3431 mark a region containing only whitespace in the middle of some text,
3432 the `count-words-region' command tells you that the region contains one
3433 word! Second, if you mark a region containing only whitespace at the
3434 end of the buffer or the accessible portion of a narrowed buffer, the
3435 command displays an error message that looks like this:
3436
3437 Search failed: "\\w+\\W*"
3438
3439 If you are reading this in Info in GNU Emacs, you can test for these
3440 bugs yourself.
3441
3442 First, evaluate the function in the usual manner to install it. Here
3443 is a copy of the definition. Place your cursor after the closing
3444 parenthesis and type `C-x C-e' to install it.
3445
3446 ;; First version; has bugs!
3447 (defun count-words-region (beginning end)
3448 "Print number of words in the region.
3449 Words are defined as at least one word-constituent character followed
3450 by zero or more characters that are not word-constituents. The buffer's
3451 syntax table determines which characters these are."
3452 (interactive "r")
3453 (message "Counting words in region ... ")
3454
3455 ;;; 1. Set up appropriate conditions.
3456 (save-excursion
3457 (goto-char beginning)
3458 (let ((count 0))
3459
3460 ;;; 2. Run the while loop.
3461 (while (< (point) end)
3462 (re-search-forward "\\w+\\W*")
3463 (setq count (1+ count)))
3464
3465 ;;; 3. Send a message to the user.
3466 (cond ((zerop count)
3467 (message "The region does NOT have any words."))
3468 ((= 1 count) (message "The region has 1 word."))
3469 (t (message "The region has %d words." count))))))
3470
3471 If you wish, you can also install this keybinding by evaluating it:
3472
3473 (global-set-key "\C-c=" 'count-words-region)
3474
3475 To conduct the first test, set mark and point to the beginning and end
3476 of the following line and then type `C-c =' (or `M-x
3477 count-words-region' if you have not bound `C-c ='):
3478
3479 one two three
3480
3481 Emacs will tell you, correctly, that the region has three words.
3482
3483 Repeat the test, but place mark at the beginning of the line and place
3484 point just _before_ the word `one'. Again type the command `C-c =' (or
3485 `M-x count-words-region'). Emacs should tell you that the region has
3486 no words, since it is composed only of the whitespace at the beginning
3487 of the line. But instead Emacs tells you that the region has one word!
3488
3489 For the third test, copy the sample line to the end of the `*scratch*'
3490 buffer and then type several spaces at the end of the line. Place mark
3491 right after the word `three' and point at the end of line. (The end of
3492 the line will be the end of the buffer.) Type `C-c =' (or `M-x
3493 count-words-region') as you did before. Again, Emacs should tell you
3494 that the region has no words, since it is composed only of the
3495 whitespace at the end of the line. Instead, Emacs displays an error
3496 message saying `Search failed'.
3497
3498 The two bugs stem from the same problem.
3499
3500 Consider the first manifestation of the bug, in which the command tells
3501 you that the whitespace at the beginning of the line contains one word.
3502 What happens is this: The `M-x count-words-region' command moves point
3503 to the beginning of the region. The `while' loop tests whether the value of
3504 point is smaller than the value of `end', which it is. Consequently,
3505 the regular expression search looks for and finds the first word. It
3506 leaves point after the word. `count' is set to one. The `while' loop
3507 repeats; but this time the value of point is larger than the value of
3508 `end', so the loop exits, and the function displays a message saying
3509 that the number of words in the region is one. In brief, the regular
3510 expression search looks for and finds the word even though it is outside
3511 the marked region.
3512
3513 In the second manifestation of the bug, the region is whitespace at the
3514 end of the buffer. Emacs says `Search failed'. What happens is that
3515 the true-or-false-test in the `while' loop tests true, so the search
3516 expression is executed. But since there are no more words in the
3517 buffer, the search fails.
3518
3519 In both manifestations of the bug, the search extends or attempts to
3520 extend outside of the region.
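
Both manifestations can be reproduced without marking a region by hand.
The sketch below copies the loop of the first version into a
non-interactive function; the name `my-naive-count' is hypothetical, and
the positions are those of the sample text inserted in the example.

```elisp
(defun my-naive-count (beginning end)
  "Count words between BEGINNING and END with the unbounded search.
Hypothetical, non-interactive copy of the first version's loop."
  (save-excursion
    (goto-char beginning)
    (let ((count 0))
      (while (< (point) end)
        (re-search-forward "\\w+\\W*")
        (setq count (1+ count)))
      count)))

(with-temp-buffer
  (insert "   one two three   ")
  (list
   ;; Leading whitespace only (positions 1 to 4): the search runs
   ;; past END, finds `one', and the loop reports one word.
   (my-naive-count 1 4)
   ;; Trailing whitespace only: no word remains in the buffer, so
   ;; the search signals a `search-failed' error.
   (condition-case err
       (my-naive-count 17 (point-max))
     (search-failed (car err)))))
;; => (1 search-failed)
```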
3521
3522 The solution is to limit the search to the region--this is a fairly
3523 simple action, but as you may have come to expect, it is not quite as
3524 simple as you might think.
3525
3526 As we have seen, the `re-search-forward' function takes a search
3527 pattern as its first argument. But in addition to this first,
3528 mandatory argument, it accepts three optional arguments. The optional
3529 second argument bounds the search. The optional third argument, if
3530 `t', causes the function to return `nil' rather than signal an error if
3531 the search fails. The optional fourth argument is a repeat count. (In
3532 Emacs, you can see a function's documentation by typing `C-h f', the
3533 name of the function, and then <RET>.)
3534
3535 In the `count-words-region' definition, the value of the end of the
3536 region is held by the variable `end' which is passed as an argument to
3537 the function. Thus, we can add `end' as an argument to the regular
3538 expression search expression:
3539
3540 (re-search-forward "\\w+\\W*" end)
3541
3542 However, if you make only this change to the `count-words-region'
3543 definition and then test the new version of the definition on a stretch
3544 of whitespace, you will receive an error message saying `Search failed'.
3545
3546 What happens is this: the search is limited to the region, and fails as
3547 you expect because there are no word-constituent characters in the
3548 region. Since it fails, we receive an error message. But we do not
3549 want to receive an error message in this case; we want to receive the
3550 message that "The region does NOT have any words."
3551
3552 The solution to this problem is to provide `re-search-forward' with a
3553 third argument of `t', which causes the function to return `nil' rather
3554 than signal an error if the search fails.
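
A small experiment shows what the bounded search does when it fails
quietly. This is a sketch; the helper name `my-quiet-search' is
hypothetical.

```elisp
(defun my-quiet-search (text bound)
  "Search TEXT for a word, up to BOUND; return the result and point.
Hypothetical helper, not part of the original text."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (list (re-search-forward "\\w+\\W*" bound t)
          (point))))

;; The region from 1 to 4 holds only whitespace: the search returns
;; nil and point stays at position 1.
(my-quiet-search "   one" 4)   ; => (nil 1)
```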
3555
3556 However, if you make this change and try it, you will see the message
3557 "Counting words in region ... " and ... you will keep on seeing that
3558 message ..., until you type `C-g' (`keyboard-quit').
3559
3560 Here is what happens: the search is limited to the region, as before,
3561 and it fails because there are no word-constituent characters in the
3562 region, as expected. Consequently, the `re-search-forward' expression
3563 returns `nil'. It does nothing else. In particular, it does not move
3564 point, which it does as a side effect if it finds the search target.
3565 After the `re-search-forward' expression returns `nil', the next
3566 expression in the `while' loop is evaluated. This expression
3567 increments the count. Then the loop repeats. The true-or-false-test
3568 tests true because the value of point is still less than the value of
3569 end, since the `re-search-forward' expression did not move point. ...
3570 and the cycle repeats ...
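
The runaway loop can be watched safely if you cap the number of passes.
In this sketch the cap and the name `my-runaway-count' are both
hypothetical additions; they exist only so that the example ends.

```elisp
(defun my-runaway-count (text end max-passes)
  "Run the faulty loop over TEXT up to END for at most MAX-PASSES passes.
Return the count and the final value of point.
Hypothetical helper; MAX-PASSES exists only so that the example ends."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let ((count 0)
          (passes 0))
      (while (and (< (point) end) (< passes max-passes))
        ;; A failed search returns nil and leaves point where it was...
        (re-search-forward "\\w+\\W*" end t)
        ;; ...but the count is incremented regardless.
        (setq count (1+ count))
        (setq passes (1+ passes)))
      (list count (point)))))

(my-runaway-count "   one" 4 5)   ; => (5 1)
```

After five passes the count is five, yet point has not moved from
position 1; without the cap, the loop would continue until interrupted.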
3571
3572 The `count-words-region' definition requires yet another modification,
3573 to cause the true-or-false-test of the `while' loop to test false if
3574 the search fails. Put another way, there are two conditions that must
3575 be satisfied in the true-or-false-test before the word count variable
3576 is incremented: point must still be within the region and the search
3577 expression must have found a word to count.
3578
3579 Since both the first condition and the second condition must be true
3580 together, the two expressions, the region test and the search
3581 expression, can be joined with an `and' special form and embedded in
3582 the `while' loop as the true-or-false-test, like this:
3583
3584 (and (< (point) end) (re-search-forward "\\w+\\W*" end t))
3585
3586 (*Note The `kill-new' function: kill-new function, for information
3587 about `and'.)
3588
3589 The `re-search-forward' expression returns a non-`nil' value (the new
3590 position of point) if the search succeeds and, as a side effect, moves
3591 point. Consequently, as words are found, point is moved through the
3592 region. When the search expression fails to find another word, or when
3593 point reaches the end of the region, the true-or-false-test tests false,
3594 the `while' loop exits, and the `count-words-region' function displays
3594 one or another of its messages.
3595
3596 After incorporating these final changes, the `count-words-region' function works
3597 without bugs (or at least, without bugs that I have found!). Here is
3598 what it looks like:
3599
3600 ;;; Final version: `while'
3601 (defun count-words-region (beginning end)
3602 "Print number of words in the region."
3603 (interactive "r")
3604 (message "Counting words in region ... ")
3605
3606 ;;; 1. Set up appropriate conditions.
3607 (save-excursion
3608 (let ((count 0))
3609 (goto-char beginning)
3610
3611 ;;; 2. Run the while loop.
3612 (while (and (< (point) end)
3613 (re-search-forward "\\w+\\W*" end t))
3614 (setq count (1+ count)))
3615
3616 ;;; 3. Send a message to the user.
3617 (cond ((zerop count)
3618 (message
3619 "The region does NOT have any words."))
3620 ((= 1 count)
3621 (message
3622 "The region has 1 word."))
3623 (t
3624 (message
3625 "The region has %d words." count))))))
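
For testing, it can be handy to have the same loop in a form that
returns the count instead of displaying a message. The following is a
sketch under that assumption; the name `my-count-words-in-string' is
hypothetical.

```elisp
(defun my-count-words-in-string (string)
  "Return the number of words in STRING.
Hypothetical helper that reuses the loop from the final version."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((count 0)
          (end (point-max)))
      ;; The same two-part test: point is still before END, and the
      ;; bounded, quiet search found another word.
      (while (and (< (point) end)
                  (re-search-forward "\\w+\\W*" end t))
        (setq count (1+ count)))
      count)))

(my-count-words-in-string "one two three")   ; => 3
(my-count-words-in-string "     ")           ; => 0
(my-count-words-in-string "")                ; => 0
```

The whitespace-only and empty strings are the cases that tripped up the
first version.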
3626
3627 
3628 File: eintr, Node: recursive-count-words, Next: Counting Exercise, Prev: count-words-region, Up: Counting Words
3629
3630 13.2 Count Words Recursively
3631 ============================
3632
3633 You can write the function for counting words recursively as well as
3634 with a `while' loop. Let's see how this is done.
3635
3636 First, we need to recognize that the `count-words-region' function has
3637 three jobs: it sets up the appropriate conditions for counting to
3638 occur; it counts the words in the region; and it sends a message to the
3639 user telling how many words there are.
3640
3641 If we write a single recursive function to do everything, we will
3642 receive a message for every recursive call. If the region contains 13
3643 words, we will receive thirteen messages, one right after the other.
3644 We don't want this! Instead, we must write two functions to do the
3645 job, one of which (the recursive function) will be used inside of the
3646 other. One function will set up the conditions and display the
3647 message; the other will return the word count.
3648
3649 Let us start with the function that causes the message to be displayed.
3650 We can continue to call this `count-words-region'.
3651
3652 This is the function that the user will call. It will be interactive.
3653 Indeed, it will be similar to our previous versions of this function,
3654 except that it will call `recursive-count-words' to determine how many
3655 words are in the region.
3656
3657 We can readily construct a template for this function, based on our
3658 previous versions:
3659
3660 ;; Recursive version; uses regular expression search
3661 (defun count-words-region (beginning end)
3662 "DOCUMENTATION..."
3663 (INTERACTIVE-EXPRESSION...)
3664
3665 ;;; 1. Set up appropriate conditions.
3666 (EXPLANATORY MESSAGE)
3667 (SET-UP FUNCTIONS...
3668
3669 ;;; 2. Count the words.
3670 RECURSIVE CALL
3671
3672 ;;; 3. Send a message to the user.
3673 MESSAGE PROVIDING WORD COUNT))
3674
3675 The definition looks straightforward, except that somehow the count
3676 returned by the recursive call must be passed to the message displaying
3677 the word count. A little thought suggests that this can be done by
3678 making use of a `let' expression: we can bind a variable in the varlist
3679 of a `let' expression to the number of words in the region, as returned
3680 by the recursive call; and then the `cond' expression, using binding,
3681 can display the value to the user.
3682
3683 Often, one thinks of the binding within a `let' expression as somehow
3684 secondary to the `primary' work of a function. But in this case, what
3685 you might consider the `primary' job of the function, counting words,
3686 is done within the `let' expression.
3687
3688 Using `let', the function definition looks like this:
3689
3690 (defun count-words-region (beginning end)
3691 "Print number of words in the region."
3692 (interactive "r")
3693
3694 ;;; 1. Set up appropriate conditions.
3695 (message "Counting words in region ... ")
3696 (save-excursion
3697 (goto-char beginning)
3698
3699 ;;; 2. Count the words.
3700 (let ((count (recursive-count-words end)))
3701
3702 ;;; 3. Send a message to the user.
3703 (cond ((zerop count)
3704 (message
3705 "The region does NOT have any words."))
3706 ((= 1 count)
3707 (message
3708 "The region has 1 word."))
3709 (t
3710 (message
3711 "The region has %d words." count))))))
3712
3713 Next, we need to write the recursive counting function.
3714
3715 A recursive function has at least three parts: the `do-again-test', the
3716 `next-step-expression', and the recursive call.
3717
3718 The do-again-test determines whether the function will or will not be
3719 called again. Since we are counting words in a region and can use a
3720 function that moves point forward for every word, the do-again-test can
3721 check whether point is still within the region. The do-again-test
3722 should find the value of point and determine whether point is before,
3723 at, or after the value of the end of the region. We can use the
3724 `point' function to locate point. Clearly, we must pass the value of
3725 the end of the region to the recursive counting function as an argument.
3726
3727 In addition, the do-again-test should also test whether the search
3728 finds a word. If it does not, the function should not call itself
3729 again.
3730
3731 The next-step-expression changes a value so that when the recursive
3732 function is supposed to stop calling itself, it stops. More precisely,
3733 the next-step-expression changes a value so that at the right time, the
3734 do-again-test stops the recursive function from calling itself again.
3735 In this case, the next-step-expression can be the expression that moves
3736 point forward, word by word.
3737
3738 The third part of a recursive function is the recursive call.
3739
3740 Somewhere, we also need a part that does the `work' of the
3741 function, a part that does the counting. A vital part!
3742
3743 But already, we have an outline of the recursive counting function:
3744
3745 (defun recursive-count-words (region-end)
3746 "DOCUMENTATION..."
3747 DO-AGAIN-TEST
3748 NEXT-STEP-EXPRESSION
3749 RECURSIVE CALL)
3750
3751 Now we need to fill in the slots. Let's start with the simplest cases
3752 first: if point is at or beyond the end of the region, there cannot be
3753 any words in the region, so the function should return zero. Likewise,
3754 if the search fails, there are no words to count, so the function
3755 should return zero.
3756
3757 On the other hand, if point is within the region and the search
3758 succeeds, the function should call itself again.
3759
3760 Thus, the do-again-test should look like this:
3761
3762 (and (< (point) region-end)
3763 (re-search-forward "\\w+\\W*" region-end t))
3764
3765 Note that the search expression is part of the do-again-test--the
3766 function returns a non-`nil' value if its search succeeds and `nil' if it fails.
3767 (*Note The Whitespace Bug in `count-words-region': Whitespace Bug, for
3768 an explanation of how `re-search-forward' works.)
3769
3770 The do-again-test is the true-or-false test of an `if' clause.
3771 Clearly, if the do-again-test succeeds, the then-part of the `if'
3772 clause should call the function again; but if it fails, the else-part
3773 should return zero since either point is outside the region or the
3774 search failed because there were no words to find.
3775
3776 But before considering the recursive call, we need to consider the
3777 next-step-expression. What is it? Interestingly, it is the search
3778 part of the do-again-test.
3779
3780 In addition to returning a true value or `nil' for the do-again-test,
3781 `re-search-forward' moves point forward as a side effect of a
3782 successful search. This is the action that changes the value of point
3783 so that the recursive function stops calling itself when point
3784 completes its movement through the region. Consequently, the
3785 `re-search-forward' expression is the next-step-expression.
3786
3787 In outline, then, the body of the `recursive-count-words' function
3788 looks like this:
3789
3790      (if DO-AGAIN-TEST-AND-NEXT-STEP-COMBINED
3791          ;; then
3792          RECURSIVE-CALL-RETURNING-COUNT
3793        ;; else
3794        RETURN-ZERO)
3795
3796 How to incorporate the mechanism that counts?
3797
3798 If you are not used to writing recursive functions, a question like
3799 this can be troublesome. But it can and should be approached
3800 systematically.
3801
3802 We know that the counting mechanism should be associated in some way
3803 with the recursive call. Indeed, since the next-step-expression moves
3804 point forward by one word, and since a recursive call is made for each
3805 word, the counting mechanism must be an expression that adds one to the
3806 value returned by a call to `recursive-count-words'.
3807
3808 Consider several cases:
3809
3810 * If there are two words in the region, the function should return
3811 the value that results from adding one, for the first word, to
3812 the number returned when it counts the remaining words in the
3813 region, which in this case is one.
3814
3815 * If there is one word in the region, the function should return
3816 the value that results from adding one, for that word, to the
3817 number returned when it counts the remaining words in the
3818 region, which in this case is zero.
3819
3820 * If there are no words in the region, the function should return
3821 zero.
3822
3823 From the sketch we can see that the else-part of the `if' returns zero
3824 for the case of no words. This means that the then-part of the `if'
3825 must return a value resulting from adding one to the value returned
3826 from a count of the remaining words.
3827
3828 The expression will look like this, where `1+' is a function that adds
3829 one to its argument.
3830
3831 (1+ (recursive-count-words region-end))
3832
3833 The whole `recursive-count-words' function will then look like this:
3834
3835      (defun recursive-count-words (region-end)
3836        "DOCUMENTATION..."
3837
3838      ;;; 1. do-again-test
3839        (if (and (< (point) region-end)
3840                 (re-search-forward "\\w+\\W*" region-end t))
3841
3842      ;;; 2. then-part: the recursive call
3843            (1+ (recursive-count-words region-end))
3844
3845      ;;; 3. else-part
3846          0))
3847
3848 Let's examine how this works:
3849
3850 If there are no words in the region, the else part of the `if'
3851 expression is evaluated and consequently the function returns zero.
3852
3853 If there is one word in the region, the value of point is less than the
3854 value of `region-end' and the search succeeds. In this case, the
3855 true-or-false-test of the `if' expression tests true, and the then-part
3856 of the `if' expression is evaluated. The counting expression is
3857 evaluated. This expression returns a value (which will be the value
3858 returned by the whole function) that is the sum of one added to the
3859 value returned by a recursive call.
3860
3861 Meanwhile, the next-step-expression has caused point to jump over the
3862 first (and in this case only) word in the region. This means that when
3863 `(recursive-count-words region-end)' is evaluated a second time, as a
3864 result of the recursive call, the value of point will be equal to or
3865 greater than the value of `region-end'. So this time,
3866 `recursive-count-words' will return zero. The zero will be added to
3867 one, and the original evaluation of `recursive-count-words' will return
3868 one plus zero, which is one, which is the correct amount.
3869
3870 Clearly, if there are two words in the region, the first call to
3871 `recursive-count-words' returns one added to the value returned by
3872 calling `recursive-count-words' on a region containing the remaining
3873 word--that is, it adds one to one, producing two, which is the correct
3874 amount.
3875
3876 Similarly, if there are three words in the region, the first call to
3877 `recursive-count-words' returns one added to the value returned by
3878 calling `recursive-count-words' on a region containing the remaining
3879 two words--and so on and so on.
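
The walk-through above can be checked without touching a real buffer. The following is a small sketch under stated assumptions: `recursive-count-words' is repeated so that the example is self-contained, and `eintr-demo-count-string' is a hypothetical helper that inserts a string into a temporary buffer and counts from its beginning.

```elisp
(defun recursive-count-words (region-end)
  "Number of words between point and REGION-END."
  (if (and (< (point) region-end)
           (re-search-forward "\\w+\\W*" region-end t))
      ;; then-part: one for the word just skipped, plus the rest
      (1+ (recursive-count-words region-end))
    ;; else-part: nothing left to count
    0))

(defun eintr-demo-count-string (string)
  "Return the word count of STRING using `recursive-count-words'.
Hypothetical helper for this illustration only."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (recursive-count-words (point-max))))

;; Three words unwind as (1+ (1+ (1+ 0))).
(eintr-demo-count-string "three little words")   ; => 3
(eintr-demo-count-string "word")                 ; => 1
(eintr-demo-count-string "")                     ; => 0
```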
3880
3881 With full documentation the two functions look like this:
3882
3883 The recursive function:
3884
3885      (defun recursive-count-words (region-end)
3886        "Number of words between point and REGION-END."
3887
3888      ;;; 1. do-again-test
3889        (if (and (< (point) region-end)
3890                 (re-search-forward "\\w+\\W*" region-end t))
3891
3892      ;;; 2. then-part: the recursive call
3893            (1+ (recursive-count-words region-end))
3894
3895      ;;; 3. else-part
3896          0))
3897
3898 The wrapper:
3899
3900      ;;; Recursive version
3901      (defun count-words-region (beginning end)
3902        "Print number of words in the region.
3903
3904      Words are defined as at least one word-constituent
3905      character followed by at least one character that is
3906      not a word-constituent. The buffer's syntax table
3907      determines which characters these are."
3908        (interactive "r")
3909        (message "Counting words in region ... ")
3910        (save-excursion
3911          (goto-char beginning)
3912          (let ((count (recursive-count-words end)))
3913            (cond ((zerop count)
3914                   (message
3915                    "The region does NOT have any words."))
3916                  ((= 1 count)
3917                   (message "The region has 1 word."))
3918                  (t
3919                   (message
3920                    "The region has %d words." count))))))
3921
3922 
3923 File: eintr, Node: Counting Exercise, Prev: recursive-count-words, Up: Counting Words
3924
3925 13.3 Exercise: Counting Punctuation
3926 ===================================
3927
3928 Using a `while' loop, write a function to count the number of
3929 punctuation marks in a region--period, comma, semicolon, colon,
3930 exclamation mark, and question mark. Do the same using recursion.
3931
3932 
3933 File: eintr, Node: Words in a defun, Next: Readying a Graph, Prev: Counting Words, Up: Top
3934
3935 14 Counting Words in a `defun'
3936 ******************************
3937
3938 Our next project is to count the number of words in a function
3939 definition. Clearly, this can be done using some variant of
3940 `count-words-region'. *Note Counting Words: Repetition and Regexps:
3941 Counting Words. If we are just going to count the words in one
3942 definition, it is easy enough to mark the definition with the `C-M-h'
3943 (`mark-defun') command, and then call `count-words-region'.
3944
3945 However, I am more ambitious: I want to count the words and symbols in
3946 every definition in the Emacs sources and then print a graph that shows
3947 how many functions there are of each length: how many contain 40 to 49
3948 words or symbols, how many contain 50 to 59 words or symbols, and so
3949 on. I have often been curious how long a typical function is, and this
3950 will tell.
3951
3952 * Menu:
3953
3954 * Divide and Conquer::
3955 * Words and Symbols::
3956 * Syntax::
3957 * count-words-in-defun::
3958 * Several defuns::
3959 * Find a File::
3960 * lengths-list-file::
3961 * Several files::
3962 * Several files recursively::
3963 * Prepare the data::
3964
3965 
3966 File: eintr, Node: Divide and Conquer, Next: Words and Symbols, Prev: Words in a defun, Up: Words in a defun
3967
3968 Divide and Conquer
3969 ==================
3970
3971 Described in one phrase, the histogram project is daunting; but divided
3972 into numerous small steps, each of which we can take one at a time, the
3973 project becomes less fearsome. Let us consider what the steps must be:
3974
3975 * First, write a function to count the words in one definition. This
3976 includes the problem of handling symbols as well as words.
3977
3978 * Second, write a function to list the numbers of words in each
3979 function in a file. This function can use the
3980 `count-words-in-defun' function.
3981
3982 * Third, write a function to list the numbers of words in each
3983 function in each of several files. This entails automatically
3984 finding the various files, switching to them, and counting the
3985 words in the definitions within them.
3986
3987 * Fourth, write a function to convert the list of numbers that we
3988 created in step three to a form that will be suitable for printing
3989 as a graph.
3990
3991 * Fifth, write a function to print the results as a graph.
3992
3993 This is quite a project! But if we take each step slowly, it will not
3994 be difficult.
3995
3996 
3997 File: eintr, Node: Words and Symbols, Next: Syntax, Prev: Divide and Conquer, Up: Words in a defun
3998
3999 14.1 What to Count?
4000 ===================
4001
4002 When we first start thinking about how to count the words in a function
4003 definition, the first question is (or ought to be) what are we going to
4004 count? When we speak of `words' with respect to a Lisp function
4005 definition, we are actually speaking, in large part, of `symbols'. For
4006 example, the following `multiply-by-seven' function contains the five
4007 symbols `defun', `multiply-by-seven', `number', `*', and `7'. In
4008 addition, in the documentation string, it contains the four words
4009 `Multiply', `NUMBER', `by', and `seven'. The symbol `number' is
4010 repeated, so the definition contains a total of ten words and symbols.
4011
4012      (defun multiply-by-seven (number)
4013        "Multiply NUMBER by seven."
4014        (* 7 number))
4015
4016 However, if we mark the `multiply-by-seven' definition with `C-M-h'
4017 (`mark-defun'), and then call `count-words-region' on it, we will find
4018 that `count-words-region' claims the definition has eleven words, not
4019 ten! Something is wrong!
4020
4021 The problem is twofold: `count-words-region' does not count the `*' as
4022 a word, and it counts the single symbol, `multiply-by-seven', as
4023 containing three words. The hyphens are treated as if they were
4024 interword spaces rather than intraword connectors: `multiply-by-seven'
4025 is counted as if it were written `multiply by seven'.
4026
4027 The cause of this confusion is the regular expression search within the
4028 `count-words-region' definition that moves point forward word by word.
4029 In the canonical version of `count-words-region', the regexp is:
4030
4031 "\\w+\\W*"
4032
4033 This regular expression is a pattern defining one or more word
4034 constituent characters possibly followed by one or more characters that
4035 are not word constituents. What is meant by `word constituent
4036 characters' brings us to the issue of syntax, which is worth a section
4037 of its own.
4038
4039 
4040 File: eintr, Node: Syntax, Next: count-words-in-defun, Prev: Words and Symbols, Up: Words in a defun
4041
4042 14.2 What Constitutes a Word or Symbol?
4043 =======================================
4044
4045 Emacs treats different characters as belonging to different "syntax
4046 categories". For example, the regular expression, `\\w+', is a pattern
4047 specifying one or more _word constituent_ characters. Word constituent
4048 characters are members of one syntax category. Other syntax categories
4049 include the class of punctuation characters, such as the period and the
4050 comma, and the class of whitespace characters, such as the blank space
4051 and the tab character. (For more information, see *Note Syntax:
4052 (emacs)Syntax, and *Note Syntax Tables: (elisp)Syntax Tables.)
4053
4054 Syntax tables specify which characters belong to which categories.
4055 Usually, a hyphen is not specified as a `word constituent character'.
4056 Instead, it is specified as being in the `class of characters that are
4057 part of symbol names but not words.' This means that the
4058 `count-words-region' function treats it in the same way it treats an
4059 interword white space, which is why `count-words-region' counts
4060 `multiply-by-seven' as three words.
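
You can ask Emacs which class a character belongs to with `char-syntax', which returns a one-character code for the class. The sketch below is only an illustration: `eintr-demo-syntax-classes' is a hypothetical name, and the example assumes the Emacs Lisp syntax table (`emacs-lisp-mode-syntax-table') rather than whatever table the current buffer happens to use.

```elisp
(defun eintr-demo-syntax-classes ()
  "Return the syntax class codes of a few sample characters.
Hypothetical demonstration function; it uses the Emacs Lisp syntax table."
  (with-syntax-table emacs-lisp-mode-syntax-table
    (mapcar (lambda (char) (string (char-syntax char)))
            ;; a letter, a digit, a hyphen, an asterisk, a space, a paren
            (list ?a ?7 ?- ?* ?\s ?\())))

(eintr-demo-syntax-classes)   ; => ("w" "w" "_" "_" " " "(")
```

Here `w' marks a word constituent, `_' a symbol constituent that is not a word constituent, and a blank marks whitespace.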
4061
4062 There are two ways to cause Emacs to count `multiply-by-seven' as one
4063 symbol: modify the syntax table or modify the regular expression.
4064
4065 We could redefine a hyphen as a word constituent character by modifying
4066 the syntax table that Emacs keeps for each mode. This action would
4067 serve our purpose, except that a hyphen is merely the most common
4068 character within symbols that is not typically a word constituent
4069 character; there are others, too.
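
As a rough sketch of this first approach, the following code changes the hyphen to a word constituent in a *copy* of the Emacs Lisp syntax table, so the real table is left alone. The function name `eintr-demo-count-with-word-hyphen' is hypothetical and exists only for this illustration.

```elisp
(defun eintr-demo-count-with-word-hyphen (string)
  "Count words in STRING with the hyphen treated as a word constituent.
Hypothetical demonstration function, not part of the original text."
  ;; Work on a copy so that the real Emacs Lisp syntax table is untouched.
  (let ((table (copy-syntax-table emacs-lisp-mode-syntax-table))
        (count 0))
    (modify-syntax-entry ?- "w" table)
    (with-temp-buffer
      (set-syntax-table table)
      (insert string)
      (goto-char (point-min))
      (while (re-search-forward "\\w+\\W*" nil t)
        (setq count (1+ count))))
    count))

(eintr-demo-count-with-word-hyphen "multiply-by-seven")   ; => 1
;; The underscore is still only a symbol constituent, so it splits the name.
(eintr-demo-count-with-word-hyphen "multiply_by_seven")   ; => 3
```

The second call shows why this approach is incomplete: every other symbol character would need its own entry.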
4070
4071 Alternatively, we can redefine the regular expression used in the
4072 `count-words-region' definition so as to include symbols. This procedure has
4073 the merit of clarity, but the task is a little tricky.
4074
4075 The first part is simple enough: the pattern must match "at least one
4076 character that is a word or symbol constituent". Thus:
4077
4078 "\\(\\w\\|\\s_\\)+"
4079
4080 The `\\(' is the first part of the grouping construct that includes the
4081 `\\w' and the `\\s_' as alternatives, separated by the `\\|'. The
4082 `\\w' matches any word-constituent character and the `\\s_' matches any
4083 character that is part of a symbol name but not a word-constituent
4084 character. The `+' following the group indicates that the word or
4085 symbol constituent characters must be matched at least once.
4086
4087 However, the second part of the regexp is more difficult to design.
4088 What we want is to follow the first part with "optionally one or more
4089 characters that are not constituents of a word or symbol". At first, I
4090 thought I could define this with the following:
4091
4092 "\\(\\W\\|\\S_\\)*"
4093
4094 The upper case `W' and `S' match characters that are _not_ word or
4095 symbol constituents. Unfortunately, this expression matches any
4096 character that is either not a word constituent or not a symbol
4097 constituent. This matches any character!
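
A quick way to convince yourself of this is to anchor the faulty pattern to a whole string and see that it still matches. This is a minimal sketch; `eintr-demo-matches-everything-p' is a hypothetical name, and the Emacs Lisp syntax table is assumed.

```elisp
(defun eintr-demo-matches-everything-p (string)
  "Return t if the faulty pattern matches all of STRING.
Hypothetical demonstration function, not part of the original text."
  (with-syntax-table emacs-lisp-mode-syntax-table
    ;; The anchors at both ends force the match to cover the whole string.
    (eq 0 (string-match "\\`\\(\\W\\|\\S_\\)*\\'" string))))

;; Letters are not symbol-only characters and hyphens are not word
;; characters, so every character satisfies one alternative or the other.
(eintr-demo-matches-everything-p "multiply-by-seven")   ; => t
(eintr-demo-matches-everything-p "(* 7 number)")        ; => t
```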
4098
4099 I then noticed that every word or symbol in my test region was followed
4100 by white space (blank space, tab, or newline). So I tried placing a
4101 pattern to match one or more blank spaces after the pattern for one or
4102 more word or symbol constituents. This failed, too. Words and symbols
4103 are often separated by whitespace, but in actual code parentheses may
4104 follow symbols and punctuation may follow words. So finally, I
4105 designed a pattern in which the word or symbol constituents are
4106 followed optionally by characters that are not white space and then
4107 followed optionally by white space.
4108
4109 Here is the full regular expression:
4110
4111 "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*"
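
To compare the two regular expressions side by side, here is a sketch that counts successive matches of each in the text of `multiply-by-seven'. The helper `eintr-demo-count-matches' and the variable `eintr-demo-defun-text' are hypothetical names used only here; the scan assumes the Emacs Lisp syntax table.

```elisp
(defun eintr-demo-count-matches (regexp string)
  "Return the number of successive matches of REGEXP in STRING.
Hypothetical helper; it scans STRING in a temporary buffer that uses
the Emacs Lisp syntax table."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert string)
    (goto-char (point-min))
    (let ((count 0))
      (while (and (< (point) (point-max))
                  (re-search-forward regexp nil t))
        (setq count (1+ count)))
      count)))

(defvar eintr-demo-defun-text
  "(defun multiply-by-seven (number)
  \"Multiply NUMBER by seven.\"
  (* 7 number))"
  "Text of the sample definition, used only for this illustration.")

;; Word-only pattern: splits the symbol at its hyphens and skips `*'.
(eintr-demo-count-matches "\\w+\\W*" eintr-demo-defun-text)    ; => 11

;; Word-or-symbol pattern from this section.
(eintr-demo-count-matches
 "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" eintr-demo-defun-text)   ; => 10
```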
4112
4113 
4114 File: eintr, Node: count-words-in-defun, Next: Several defuns, Prev: Syntax, Up: Words in a defun
4115
4116 14.3 The `count-words-in-defun' Function
4117 ========================================
4118
4119 We have seen that there are several ways to write a `count-words-region'
4120 function. To write a `count-words-in-defun', we need merely adapt one
4121 of these versions.
4122
4123 The version that uses a `while' loop is easy to understand, so I am
4124 going to adapt that. Because `count-words-in-defun' will be part of a
4125 more complex program, it need not be interactive and it need not
4126 display a message but just return the count. These considerations
4127 simplify the definition a little.
4128
4129 On the other hand, `count-words-in-defun' will be used within a buffer
4130 that contains function definitions. Consequently, it is reasonable to
4131 ask that the function determine whether it is called when point is
4132 within a function definition, and if it is, to return the count for
4133 that definition. This adds complexity to the definition, but saves us
4134 from needing to pass arguments to the function.
4135
4136 These considerations lead us to prepare the following template:
4137
4138      (defun count-words-in-defun ()
4139        "DOCUMENTATION..."
4140        (SET UP...
4141           (WHILE LOOP...)
4142             RETURN COUNT)
4143
4144 As usual, our job is to fill in the slots.
4145
4146 First, the set up.
4147
4148 We are presuming that this function will be called within a buffer
4149 containing function definitions. Point will either be within a
4150 function definition or not. For `count-words-in-defun' to work, point
4151 must move to the beginning of the definition, a counter must start at
4152 zero, and the counting loop must stop when point reaches the end of the
4153 definition.
4154
4155 The `beginning-of-defun' function searches backwards for an opening
4156 delimiter such as a `(' at the beginning of a line, and moves point to
4157 that position, or else to the limit of the search. In practice, this
4158 means that `beginning-of-defun' moves point to the beginning of an
4159 enclosing or preceding function definition, or else to the beginning of
4160 the buffer. We can use `beginning-of-defun' to place point where we
4161 wish to start.
4162
4163 The `while' loop requires a counter to keep track of the words or
4164 symbols being counted. A `let' expression can be used to create a
4165 local variable for this purpose, and bind it to an initial value of
4166 zero.
4167
4168 The `end-of-defun' function works like `beginning-of-defun' except that
4169 it moves point to the end of the definition. `end-of-defun' can be
4170 used as part of an expression that determines the position of the end
4171 of the definition.
4172
4173 The set up for `count-words-in-defun' takes shape rapidly: first we
4174 move point to the beginning of the definition, then we create a local
4175 variable to hold the count, and finally, we record the position of the
4176 end of the definition so the `while' loop will know when to stop
4177 looping.
4178
4179 The code looks like this:
4180
4181      (beginning-of-defun)
4182      (let ((count 0)
4183            (end (save-excursion (end-of-defun) (point))))
4184
4185 The code is simple. The only slight complication is likely to concern
4186 `end': it is bound to the position of the end of the definition by a
4187 `save-excursion' expression that returns the value of point after
4188 `end-of-defun' temporarily moves it to the end of the definition.
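
The following sketch shows that the `save-excursion' really does hand back the end position while leaving point where it was. The function name `eintr-demo-end-of-defun-position' and the tiny sample definition are hypothetical, chosen only for this illustration.

```elisp
(defun eintr-demo-end-of-defun-position ()
  "Show that `save-excursion' leaves point alone while finding the end.
Hypothetical demonstration function, not part of the original text."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert "(defun demo ()\n  1)\n")
    (goto-char (point-min))
    (let ((end (save-excursion (end-of-defun) (point))))
      (list (= (point) (point-min))    ; point has not moved
            (= end (point-max))))))    ; END lies just past the definition

(eintr-demo-end-of-defun-position)   ; => (t t)
```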
4189
4190 The second part of the `count-words-in-defun', after the set up, is the
4191 `while' loop.
4192
4193 The loop must contain an expression that jumps point forward word by
4194 word and symbol by symbol, and another expression that counts the
4195 jumps. The true-or-false-test for the `while' loop should test true so
4196 long as point should jump forward, and false when point is at the end
4197 of the definition. We have already redefined the regular expression
4198 for this (*note Syntax::), so the loop is straightforward:
4199
4200      (while (and (< (point) end)
4201                  (re-search-forward
4202                   "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" end t))
4203        (setq count (1+ count)))
4204
4205 The third part of the function definition returns the count of words
4206 and symbols. This part is the last expression within the body of the
4207 `let' expression, and can be, very simply, the local variable `count',
4208 which when evaluated returns the count.
4209
4210 Put together, the `count-words-in-defun' definition looks like this:
4211
4212      (defun count-words-in-defun ()
4213        "Return the number of words and symbols in a defun."
4214        (beginning-of-defun)
4215        (let ((count 0)
4216              (end (save-excursion (end-of-defun) (point))))
4217          (while
4218              (and (< (point) end)
4219                   (re-search-forward
4220                    "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*"
4221                    end t))
4222            (setq count (1+ count)))
4223          count))
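
Because the function finds its own boundaries, it counts only the definition that contains point, even when the buffer holds several. Here is a sketch that checks this in a temporary buffer; `count-words-in-defun' is repeated so the example is self-contained, and `eintr-demo-count-second-defun' is a hypothetical name.

```elisp
(defun count-words-in-defun ()
  "Return the number of words and symbols in a defun."
  (beginning-of-defun)
  (let ((count 0)
        (end (save-excursion (end-of-defun) (point))))
    (while
        (and (< (point) end)
             (re-search-forward
              "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*"
              end t))
      (setq count (1+ count)))
    count))

(defun eintr-demo-count-second-defun ()
  "Count the words and symbols of the definition that contains point.
Hypothetical demonstration function, not part of the original text."
  (with-temp-buffer
    (set-syntax-table emacs-lisp-mode-syntax-table)
    (insert "(defun one () 1)\n\n"
            "(defun multiply-by-seven (number)\n"
            "  \"Multiply NUMBER by seven.\"\n"
            "  (* 7 number))\n")
    ;; Leave point inside the second definition, within its
    ;; documentation string.
    (goto-char (point-min))
    (search-forward "seven.")
    (count-words-in-defun)))

(eintr-demo-count-second-defun)   ; => 10
```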
4224
4225 How to test this? The function is not interactive, but it is easy to
4226 put a wrapper around the function to make it interactive; we can use
4227 almost the same code as for the recursive version of
4228 `count-words-region':
4229
4230      ;;; Interactive version.
4231      (defun count-words-defun ()
4232        "Number of words and symbols in a function definition."
4233        (interactive)
4234        (message
4235         "Counting words and symbols in function definition ... ")
4236        (let ((count (count-words-in-defun)))
4237          (cond
4238           ((zerop count)
4239            (message
4240             "The definition does NOT have any words or symbols."))
4241           ((= 1 count)
4242            (message
4243             "The definition has 1 word or symbol."))
4244           (t
4245            (message
4246             "The definition has %d words or symbols." count)))))
4247
4248 Let's re-use `C-c =' as a convenient keybinding:
4249
4250 (global-set-key "\C-c=" 'count-words-defun)
4251
4252 Now we can try out `count-words-defun': install both
4253 `count-words-in-defun' and `count-words-defun', and set the keybinding,
4254 and then place the cursor within the following definition:
4255
4256      (defun multiply-by-seven (number)
4257        "Multiply NUMBER by seven."
4258        (* 7 number))
4259           => 10
4260
4261 Success! The definition has 10 words and symbols.
4262
4263 The next problem is to count the numbers of words and symbols in
4264 several definitions within a single file.
4265
4266 
4267 File: eintr, Node: Several defuns, Next: Find a File, Prev: count-words-in-defun, Up: Words in a defun
4268
4269 14.4 Count Several `defuns' Within a File
4270 =========================================
4271
4272 A file such as `simple.el' may have a hundred or more function
4273 definitions within it. Our long term goal is to collect statistics on
4274 many files, but as a first step, our immediate goal is to collect
4275 statistics on one file.
4276
4277 The information will be a series of numbers, each number being the
4278 length of a function definition. We can store the numbers in a list.
4279
4280 We know that we will want to incorporate the information regarding one
4281 file with information about many other files; this means that the
4282 function for counting definition lengths within one file need only
4283 return the list of lengths. It need not and should not display any
4284 messages.
4285
4286 The word count commands contain one expression to jump point forward
4287 word by word and another expression to count the jumps. The function
4288 to return the lengths of definitions can be designed to work the same
4289 way, with one expression to jump point forward definition by definition
4290 and another expression to construct the lengths' list.
4291
4292 This statement of the problem makes it elementary to write the function
4293 definition. Clearly, we will start the count at the beginning of the
4294 file, so the first command will be `(goto-char (point-min))'. Next, we
4295 start the `while' loop; and the true-or-false test of the loop can be a
4296 regular expression search for the next function definition--so long as
4297 the search succeeds, point is moved forward and then the body of the
4298 loop is evaluated. The body needs an expression that constructs the
4299 lengths' list. `cons', the list construction command, can be used to
4300 create the list. That is almost all there is to it.
4301
4302 Here is what this fragment of code looks like:
4303
4304      (goto-char (point-min))
4305      (while (re-search-forward "^(defun" nil t)
4306        (setq lengths-list
4307              (cons (count-words-in-defun) lengths-list)))
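
The fragment assumes that `lengths-list' already exists. As a minimal, self-contained sketch of the same `cons' technique, here plain numbers stand in for the counts; `eintr-demo-collect' is a hypothetical name used only for illustration.

```elisp
(defun eintr-demo-collect (numbers)
  "Collect NUMBERS one at a time with `cons' and return the new list.
Hypothetical demonstration function, not part of the original text."
  (let ((lengths-list nil))
    (while numbers
      ;; Each new element goes on the front of the list.
      (setq lengths-list (cons (car numbers) lengths-list))
      (setq numbers (cdr numbers)))
    lengths-list))

(eintr-demo-collect '(10 20 30))             ; => (30 20 10)
(reverse (eintr-demo-collect '(10 20 30)))   ; => (10 20 30)
```

Since `cons' adds to the front, the most recently counted item comes first; `reverse' restores the original order if that matters.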
4308
4309 What we have left out is the mechanism for finding the file that
4310 contains the function definitions.
4311
4312 In previous examples, we either used this, the Info file, or we
4313 switched back and forth to some other buffer, such as the `*scratch*'
4314 buffer.
4315
4316 Finding a file is a new process that we have not yet discussed.
4317
4318 
4319 File: eintr, Node: Find a File, Next: lengths-list-file, Prev: Several defuns, Up: Words in a defun
4320
4321 14.5 Find a File
4322 ================
4323
4324 To find a file in Emacs, you use the `C-x C-f' (`find-file') command.
4325 This command is almost, but not quite right for the lengths problem.
4326
4327 Let's look at the source for `find-file':
4328
4329      (defun find-file (filename)
4330        "Edit file FILENAME.
4331      Switch to a buffer visiting file FILENAME,
4332      creating one if none already exists."
4333        (interactive "FFind file: ")
4334        (switch-to-buffer (find-file-noselect filename)))
4335
4336 (The most recent version of the `find-file' function definition permits
4337 you to specify optional wildcards to visit multiple files; that makes the
4338 definition more complex and we will not discuss it here, since it is
4339 not relevant. You can see its source using either `M-.' (`find-tag')
4340 or `C-h f' (`describe-function').)
4341
4342 The definition I am showing possesses short but complete documentation
4343 and an interactive specification that prompts you for a file name when
4344 you use the command interactively. The body of the definition contains
4345 two functions, `find-file-noselect' and `switch-to-buffer'.
4346
4347 According to its documentation as shown by `C-h f' (the
4348 `describe-function' command), the `find-file-noselect' function reads
4349 the named file into a buffer and returns the buffer. (Its most recent
4350 version includes an optional wildcards argument, too, as well as
4351 another to read a file literally and another to suppress warning
4352 messages. These optional arguments are irrelevant.)
4353
4354 However, the `find-file-noselect' function does not select the buffer
4355 in which it puts the file. Emacs does not switch its attention (or
4356 yours if you are using `find-file-noselect') to the named buffer. That
4357 is what `switch-to-buffer' does: it switches the buffer to which Emacs'
4358 attention is directed; and it switches the buffer displayed in the
4359 window to the new buffer. We have discussed buffer switching
4360 elsewhere. (*Note Switching Buffers::.)
4361
4362 In this histogram project, we do not need to display each file on the
4363 screen as the program determines the length of each definition within
4364 it. Instead of employing `switch-to-buffer', we can work with
4365 `set-buffer', which redirects the attention of the computer program to
4366 a different buffer but does not redisplay it on the screen. So instead
4367 of calling on `find-file' to do the job, we must write our own
4368 expression.
4369
4370 The task is easy: use `find-file-noselect' and `set-buffer'.
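
Here is a hedged sketch of that combination. It writes a one-line scratch file, visits it with `find-file-noselect', and looks inside with `set-buffer'. The function name `eintr-demo-peek-at-file', the temporary file prefix, and the file's contents are all hypothetical, chosen only for this illustration.

```elisp
(defun eintr-demo-peek-at-file ()
  "Visit a scratch file without selecting it and return two observations.
Hypothetical demonstration function; the file is temporary and is
deleted afterwards."
  (let ((file (make-temp-file "eintr-demo-" nil ".el"))
        (original (current-buffer))
        buffer)
    (unwind-protect
        (progn
          (write-region "(defun demo () 1)\n" nil file)
          (setq buffer (find-file-noselect file))
          (list
           ;; `find-file-noselect' returned a buffer but did not select it.
           (eq (current-buffer) original)
           ;; `set-buffer' redirects attention without displaying anything;
           ;; `save-excursion' brings us back afterwards.
           (save-excursion
             (set-buffer buffer)
             (buffer-size))))
      ;; Clean up the buffer and the scratch file in every case.
      (when (buffer-live-p buffer)
        (kill-buffer buffer))
      (delete-file file))))

(eintr-demo-peek-at-file)   ; => (t 18)
```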
4371
4372 
4373 File: eintr, Node: lengths-list-file, Next: Several files, Prev: Find a File, Up: Words in a defun
4374
4375 14.6 `lengths-list-file' in Detail
4376 ==================================
4377
4378 The core of the `lengths-list-file' function is a `while' loop
4379 containing a function to move point forward `defun by defun' and a
4380 function to count the number of words and symbols in each defun. This
4381 core must be surrounded by functions that do various other tasks,
4382 including finding the file, and ensuring that point starts out at the
4383 beginning of the file. The function definition looks like this:
4384
4385      (defun lengths-list-file (filename)
4386        "Return list of definitions' lengths within FILENAME.
4387      The returned list is a list of numbers.
4388      Each number is the number of words or
4389      symbols in one function definition."
4390        (message "Working on `%s' ... " filename)
4391        (save-excursion
4392          (let ((buffer (find-file-noselect filename))
4393                (lengths-list))
4394            (set-buffer buffer)
4395            (setq buffer-read-only t)
4396            (widen)
4397            (goto-char (point-min))
4398            (while (re-search-forward "^(defun" nil t)
4399              (setq lengths-list
4400                    (cons (count-words-in-defun) lengths-list)))
4401            (kill-buffer buffer)
4402            lengths-list)))
4403
4404 The function is passed one argument, the name of the file on which it
4405 will work. It has four lines of documentation, but no interactive
4406 specification. Since people worry that a computer is broken if they
4407 don't see anything going on, the first line of the body is a message.
4408
4409 The next line contains a `save-excursion' that returns Emacs' attention
4410 to the current buffer when the function completes. This is useful in
4411 case you embed this function in another function that presumes point is
4412 restored to the original buffer.
4413
4414 In the varlist of the `let' expression, Emacs finds the file and binds
4415 the local variable `buffer' to the buffer containing the file. At the
4416 same time, Emacs creates `lengths-list' as a local variable.
4417
4418 Next, Emacs switches its attention to the buffer.
4419
4420 In the following line, Emacs makes the buffer read-only. Ideally, this
4421 line is not necessary. None of the functions for counting words and
4422 symbols in a function definition should change the buffer. Besides,
4423 the buffer is not going to be saved, even if it were changed. This
4424 line is entirely the consequence of great, perhaps excessive, caution.
4425 The reason for the caution is that this function and those it calls
4426 work on the sources for Emacs and it is very inconvenient if they are
4427 inadvertently modified. It goes without saying that I did not realize
4428 a need for this line until an experiment went awry and started to
4429 modify my Emacs source files ...
4430
4431 Next comes a call to widen the buffer if it is narrowed. This function
4432 is usually not needed--Emacs creates a fresh buffer if none already
4433 exists; but if a buffer visiting the file already exists Emacs returns
4434 that one. In this case, the buffer may be narrowed and must be
4435 widened. If we wanted to be fully `user-friendly', we would arrange to
4436 save the restriction and the location of point, but we won't.
4437
4438 The `(goto-char (point-min))' expression moves point to the beginning
4439 of the buffer.
4440
4441 Then comes a `while' loop in which the `work' of the function is
4442 carried out. In the loop, Emacs determines the length of each
4443 definition and constructs a lengths' list containing the information.
4444
4445 Emacs kills the buffer after working through it. This is to save space
4446 inside of Emacs. My version of GNU Emacs 19 contained over 300 source
4447 files of interest; GNU Emacs 22 contains over a thousand source files.
4448 Another function will apply `lengths-list-file' to each of the files.
4449
4450 Finally, the last expression within the `let' expression is the
4451 `lengths-list' variable; its value is returned as the value of the
4452 whole function.
4453
4454 You can try this function by installing it in the usual fashion. Then
4455 place your cursor after the following expression and type `C-x C-e'
4456 (`eval-last-sexp').
4457
4458 (lengths-list-file
4459 "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/debug.el")
4460
4461 (You may need to change the pathname of the file; the one here is for
4462 GNU Emacs version 22.0.100. To change the expression, copy it to the
4463 `*scratch*' buffer and edit it.)
4464
4465 (Also, to see the full length of the list, rather than a truncated
4466 version, you may have to evaluate the following:
4467
4468 (custom-set-variables '(eval-expression-print-length nil))
4469
4470 (*Note Specifying Variables using `defcustom': defcustom.) Then
4471 evaluate the `lengths-list-file' expression.)
4472
4473 The lengths' list for `debug.el' takes less than a second to produce
4474 and looks like this in GNU Emacs 22:
4475
4476 (83 113 105 144 289 22 30 97 48 89 25 52 52 88 28 29 77 49 43 290 232 587)
4477
4478 Using my old machine, the version 19 lengths' list for `debug.el' took
4479 seven seconds to produce and looked like this:
4480
4481 (75 41 80 62 20 45 44 68 45 12 34 235)
4482
4483 (The newer version of `debug.el' contains more defuns than the earlier
4484 one; and my new machine is much faster than the old one.)
4485
4486 Note that the length of the last definition in the file is first in the
4487 list.
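
If you do not have the Emacs sources at hand, you can still try the function on a small file of your own. The sketch below is an illustration under stated assumptions: both functions are repeated (with a shortened documentation string) so that the example is self-contained, and `eintr-demo-lengths', the temporary file, and its two tiny definitions are hypothetical.

```elisp
(defun count-words-in-defun ()
  "Return the number of words and symbols in a defun."
  (beginning-of-defun)
  (let ((count 0)
        (end (save-excursion (end-of-defun) (point))))
    (while
        (and (< (point) end)
             (re-search-forward
              "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*"
              end t))
      (setq count (1+ count)))
    count))

(defun lengths-list-file (filename)
  "Return list of definitions' lengths within FILENAME."
  (message "Working on `%s' ... " filename)
  (save-excursion
    (let ((buffer (find-file-noselect filename))
          (lengths-list))
      (set-buffer buffer)
      (setq buffer-read-only t)
      (widen)
      (goto-char (point-min))
      (while (re-search-forward "^(defun" nil t)
        (setq lengths-list
              (cons (count-words-in-defun) lengths-list)))
      (kill-buffer buffer)
      lengths-list)))

(defun eintr-demo-lengths ()
  "Run `lengths-list-file' on a scratch file holding two definitions.
Hypothetical demonstration function; the file is deleted afterwards."
  (let ((file (make-temp-file "eintr-demo-" nil ".el")))
    (unwind-protect
        (progn
          ;; The first definition has 3 words and symbols, the second 6.
          (write-region
           "(defun one () 1)\n\n(defun add-two (n)\n  (+ n 2))\n"
           nil file)
          (lengths-list-file file))
      (delete-file file))))

(eintr-demo-lengths)   ; => (6 3)
```

The count for `add-two', the last definition in the file, comes first in the result.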
4488
4489 
4490 File: eintr, Node: Several files, Next: Several files recursively, Prev: lengths-list-file, Up: Words in a defun
4491
4492 14.7 Count Words in `defuns' in Different Files
4493 ===============================================
4494
4495 In the previous section, we created a function that returns a list of
4496 the lengths of each definition in a file. Now, we want to define a
4497 function to return a master list of the lengths of the definitions in a
4498 list of files.
4499
4500 Working on each of a list of files is a repetitious act, so we can use
4501 either a `while' loop or recursion.
4502
4503 * Menu:
4504
4505 * lengths-list-many-files::
4506 * append::
4507
4508 
4509 File: eintr, Node: lengths-list-many-files, Next: append, Prev: Several files, Up: Several files
4510
4511 Determine the lengths of `defuns'
4512 ---------------------------------
4513
4514 The design using a `while' loop is routine. The argument passed to the
4515 function is a list of files. As we saw earlier (*note Loop Example::),
4516 you can write a `while' loop so that the body of the loop is evaluated
4517 if such a list contains elements, but the loop exits if the list is
4518 empty. For this design to work, the body of the loop must contain an
4519 expression that shortens the list each time the body is evaluated, so
4520 that eventually the list is empty. The usual technique is to set the
4521 value of the list to the value of the CDR of the list each time the
4522 body is evaluated.
4523
4524 The template looks like this:
4525
4526 (while TEST-WHETHER-LIST-IS-EMPTY
4527 BODY...
4528 SET-LIST-TO-CDR-OF-LIST)
4529
4530 Also, we remember that a `while' loop returns `nil' (the result of
4531 evaluating the true-or-false-test), not the result of any evaluation
4532 within its body. (The evaluations within the body of the loop are done
4533 for their side effects.) However, the expression that sets the
4534 lengths' list is part of the body--and that is the value that we want
4535 returned by the function as a whole. To do this, we enclose the
4536 `while' loop within a `let' expression, and arrange that the last
4537 element of the `let' expression contains the value of the lengths'
4538 list. (*Note Loop Example with an Incrementing Counter: Incrementing
4539 Example.)
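
As a small illustration of this pattern, here is a sketch that is not
part of the word-counting program. It assumes a hypothetical function,
`sketch-sum-list', that walks a list of numbers with a `while' loop and
returns the accumulated total as the last element of a `let'
expression.

```elisp
(defun sketch-sum-list (numbers)
  "Return the sum of the numbers in the list NUMBERS."
  (let ((total 0))
    ;; true-or-false-test: continue while the list has elements.
    (while numbers
      (setq total (+ total (car numbers)))
      ;; Shorten the list so the loop eventually ends.
      (setq numbers (cdr numbers)))
    ;; The `while' loop itself returns nil; return the total instead.
    total))

(sketch-sum-list '(1 2 3 4))
;; => 10
```

The value of the function is the value of `total', the last expression
in the body of the `let', rather than the `nil' returned by `while'.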
4540
4541 These considerations lead us directly to the function itself:
4542
4543      ;;; Use `while' loop.
4544      (defun lengths-list-many-files (list-of-files)
4545        "Return list of lengths of defuns in LIST-OF-FILES."
4546        (let (lengths-list)
4547
4548      ;;; true-or-false-test
4549          (while list-of-files
4550            (setq lengths-list
4551                  (append
4552                   lengths-list
4553
4554      ;;; Generate a lengths' list.
4555                   (lengths-list-file
4556                    (expand-file-name (car list-of-files)))))
4557
4558      ;;; Make files' list shorter.
4559            (setq list-of-files (cdr list-of-files)))
4560
4561      ;;; Return final value of lengths' list.
4562          lengths-list))
4563
4564 `expand-file-name' is a built-in function that converts a file name to
4565 its absolute, long, path name form, interpreting a relative name with
4566 respect to the directory in which the function is called.
4567
4568 Thus, if `expand-file-name' is called on `debug.el' when Emacs is
4569 visiting the `/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/'
4570 directory,
4571
4572 debug.el
4573
4574 becomes
4575
4576 /usr/local/share/emacs/22.0.100/lisp/emacs-lisp/debug.el
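
To try this without first visiting that directory, you can name the
directory yourself. The following sketch passes the directory as the
optional second argument to `expand-file-name'; the second form
achieves the same thing by binding `default-directory', the variable
the function consults when no directory is given. The path is the
one used above and is only an example; the result shown assumes a
GNU/Linux style file system.

```elisp
;; Resolve the relative name against an explicitly named directory.
(expand-file-name
 "debug.el"
 "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/")
;; => "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/debug.el"

;; The same, but with the directory supplied by `default-directory'.
(let ((default-directory
        "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/"))
  (expand-file-name "debug.el"))
```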
4577
4578 The only other new element of this function definition is the as yet
4579 unstudied function `append', which merits a short section for itself.
4580
4581 
4582 File: eintr, Node: append, Prev: lengths-list-many-files, Up: Several files
4583
4584 14.7.1 The `append' Function
4585 ----------------------------
4586
4587 The `append' function attaches one list to another. Thus,
4588
4589 (append '(1 2 3 4) '(5 6 7 8))
4590
4591 produces the list
4592
4593 (1 2 3 4 5 6 7 8)
4594
4595 This is exactly how we want to attach two lengths' lists produced by
4596 `lengths-list-file' to each other. The results contrast with `cons',
4597
4598 (cons '(1 2 3 4) '(5 6 7 8))
4599
4600 which constructs a new list in which the first argument to `cons'
4601 becomes the first element of the new list:
4602
4603 ((1 2 3 4) 5 6 7 8)
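
Two further cases matter for `lengths-list-many-files'. The
expressions below are only illustrations, with made-up numbers: the
first shows that appending to an empty list, which is the value of
`lengths-list' on the first pass through the loop, simply yields the
other list; the second shows that `append' accepts more than two
lists.

```elisp
;; An empty first list contributes nothing to the result.
(append nil '(83 113 105))
;; => (83 113 105)

;; More than two lists can be attached in one call.
(append '(1 2) '(3) '(4 5))
;; => (1 2 3 4 5)
```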
4604
4605 
4606 File: eintr, Node: Several files recursively, Next: Prepare the data, Prev: Several files, Up: Words in a defun
4607
4608 14.8 Recursively Count Words in Different Files
4609 ===============================================
4610
4611 Besides a `while' loop, you can work on each of a list of files with
4612 recursion. A recursive version of `lengths-list-many-files' is short
4613 and simple.
4614
4615 The recursive function has the usual parts: the `do-again-test', the
4616 `next-step-expression', and the recursive call. The `do-again-test'
4617 determines whether the function should call itself again, which it will
4618 do if the `list-of-files' contains any remaining elements; the
4619 `next-step-expression' resets the `list-of-files' to the CDR of itself,
4620 so eventually the list will be empty; and the recursive call calls
4621 itself on the shorter list. The complete function is shorter than this
4622 description!
4623
4624      (defun recursive-lengths-list-many-files (list-of-files)
4625        "Return list of lengths of each defun in LIST-OF-FILES."
4626        (if list-of-files                     ; do-again-test
4627            (append
4628             (lengths-list-file
4629              (expand-file-name (car list-of-files)))
4630             (recursive-lengths-list-many-files
4631              (cdr list-of-files)))))
4632
4633 In a sentence, the function returns the lengths' list for the first of
4634 the `list-of-files' appended to the result of calling itself on the
4635 rest of the `list-of-files'.
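
The same shape can be tried without any files. The following sketch
assumes a hypothetical function, `sketch-append-all', that takes a
list of lists in place of a list of file names; each inner list
stands in for the lengths' list that `lengths-list-file' would
return.

```elisp
(defun sketch-append-all (list-of-lists)
  "Return one list made of the elements of each list in LIST-OF-LISTS."
  (if list-of-lists                     ; do-again-test
      (append
       (car list-of-lists)
       ;; recursive call on the shorter list (the next-step-expression
       ;; is the `cdr')
       (sketch-append-all (cdr list-of-lists)))))

(sketch-append-all '((283 263) (38 32 29) (85 181)))
;; => (283 263 38 32 29 85 181)
```

When the list is empty, the `if' has no else-part and so returns
`nil', which is exactly the empty list that the last `append' needs.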
4636
4637 Here is a test of `recursive-lengths-list-many-files', along with the
4638 results of running `lengths-list-file' on each of the files
4639 individually.
4640
4641 Install `recursive-lengths-list-many-files' and `lengths-list-file', if
4642 necessary, and then evaluate the following expressions. You may need
4643 to change the files' pathnames; those here work when this Info file and
4644 the Emacs sources are located in their customary places. To change the
4645 expressions, copy them to the `*scratch*' buffer, edit them, and then
4646 evaluate them.
4647
4648 The results are shown after the `=>'. (These results are for files
4649 from Emacs Version 22.0.100; files from other versions of Emacs may
4650 produce different results.)
4651
4652 (cd "/usr/local/share/emacs/22.0.100/")
4653
4654 (lengths-list-file "./lisp/macros.el")
4655 => (283 263 480 90)
4656
4657 (lengths-list-file "./lisp/mail/mailalias.el")
4658 => (38 32 29 95 178 180 321 218 324)
4659
4660 (lengths-list-file "./lisp/makesum.el")
4661 => (85 181)
4662
4663 (recursive-lengths-list-many-files
4664 '("./lisp/macros.el"
4665 "./lisp/mail/mailalias.el"
4666 "./lisp/makesum.el"))
4667 => (283 263 480 90 38 32 29 95 178 180 321 218 324 85 181)
4668
4669 The `recursive-lengths-list-many-files' function produces the output we
4670 want.
4671
4672 The next step is to prepare the data in the list for display in a graph.
4673
4674 
4675 File: eintr, Node: Prepare the data, Prev: Several files recursively, Up: Words in a defun
4676
4677 14.9 Prepare the Data for Display in a Graph
4678 ============================================
4679
4680 The `recursive-lengths-list-many-files' function returns a list of
4681 numbers. Each number records the length of a function definition.
4682 What we need to do now is transform this data into a list of numbers
4683 suitable for generating a graph. The new list will tell how many
4684 function definitions contain fewer than 10 words and symbols, how many
4685 contain between 10 and 19 words and symbols, how many contain between
4686 20 and 29 words and symbols, and so on.
4687
4688 In brief, we need to go through the lengths' list produced by the
4689 `recursive-lengths-list-many-files' function and count the number of
4690 defuns within each range of lengths, and produce a list of those
4691 numbers.
4692
4693 Based on what we have done before, we can readily foresee that it
4694 should not be too hard to write a function that `CDRs' down the
4695 lengths' list, looks at each element, determines which length range it
4696 is in, and increments a counter for that range.
4697
4698 However, before beginning to write such a function, we should consider
4699 the advantages of sorting the lengths' list first, so the numbers are
4700 ordered from smallest to largest. First, sorting will make it easier
4701 to count the numbers in each range, since two adjacent numbers will
4702 either be in the same length range or in adjacent ranges. Second, by
4703 inspecting a sorted list, we can discover the highest and lowest
4704 number, and thereby determine the largest and smallest length range
4705 that we will need.
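
The second advantage can be sketched in a few lines. The function
name `sketch-length-extremes' is hypothetical, and the numbers are
taken from the test results shown earlier; the sketch sorts a copy of
the list and then reads the smallest number from the front and the
largest from the end.

```elisp
(defun sketch-length-extremes (lengths)
  "Return a list of the smallest and the largest number in LENGTHS."
  ;; Sort a copy, so that the caller's list is left alone.
  (let ((sorted (sort (copy-sequence lengths) '<)))
    (list (car sorted)
          (car (last sorted)))))

(sketch-length-extremes '(283 263 480 90 38 32 29 95))
;; => (29 480)
```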
4706
4707 * Menu:
4708
4709 * Sorting::
4710 * Files List::
4711 * Counting function definitions::
4712
4713 
4714 File: eintr, Node: Sorting, Next: Files List, Prev: Prepare the data, Up: Prepare the data
4715
4716 14.9.1 Sorting Lists
4717 --------------------
4718
4719 Emacs contains a function to sort lists, called (as you might guess)
4720 `sort'. The `sort' function takes two arguments, the list to be
4721 sorted, and a predicate that determines whether the first of two list
4722 elements is "less" than the second.
4723
4724 As we saw earlier (*note Using the Wrong Type Object as an Argument:
4725 Wrong Type of Argument.), a predicate is a function that determines
4726 whether some property is true or false. The `sort' function will
4727 reorder a list according to whatever property the predicate uses; this
4728 means that `sort' can be used to sort non-numeric lists by non-numeric
4729 criteria--it can, for example, alphabetize a list.
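
For instance, a list of file names can be put into alphabetical order
with the `string-lessp' predicate. This is only an illustration; the
names are examples. The list is built with `list' rather than written
as a quoted constant because `sort' may rearrange the list it is
given.

```elisp
(sort (list "makesum.el" "abbrev.el" "macros.el") 'string-lessp)
;; => ("abbrev.el" "macros.el" "makesum.el")
```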
4730
4731 The `<' function is used when sorting a numeric list. For example,
4732
4733 (sort '(4 8 21 17 33 7 21 7) '<)
4734
4735 produces this:
4736
4737 (4 7 7 8 17 21 21 33)
4738
4739 (Note that in this example, both the arguments are quoted so that the
4740 symbols are not evaluated before being passed to `sort' as arguments.)
4741
4742 Sorting the list returned by the `recursive-lengths-list-many-files'
4743 function is straightforward; it uses the `<' function:
4744
4745 (sort
4746 (recursive-lengths-list-many-files
4747 '("./lisp/macros.el"
4748 "./lisp/mail/mailalias.el"
4749 "./lisp/makesum.el"))
4750 '<)
4751
4752 which produces:
4753
4754 (29 32 38 85 90 95 178 180 181 218 263 283 321 324 480)
4755
4756 (Note that in this example, the first argument to `sort' is not quoted,
4757 since the expression must be evaluated so as to produce the list that
4758 is passed to `sort'.)
4759
4760 
4761 File: eintr, Node: Files List, Next: Counting function definitions, Prev: Sorting, Up: Prepare the data
4762
4763 14.9.2 Making a List of Files
4764 -----------------------------
4765
4766 The `recursive-lengths-list-many-files' function requires a list of
4767 files as its argument. For our test examples, we constructed such a
4768 list by hand; but the Emacs Lisp source directory is too large for us
4769 to do that. Instead, we will write a function to do the job for
4770 us. In this function, we will use both a `while' loop and a recursive
4771 call.
4772
4773 We did not have to write a function like this for older versions of GNU
4774 Emacs, since they placed all the `.el' files in one directory.
4775 Instead, we were able to use the `directory-files' function, which
4776 lists the names of files that match a specified pattern within a single
4777 directory.
4778
4779 However, recent versions of Emacs place Emacs Lisp files in
4780 sub-directories of the top level `lisp' directory. This re-arrangement
4781 eases navigation. For example, all the mail related files are in a
4782 `lisp' sub-directory called `mail'. But at the same time, this
4783 arrangement forces us to create a file listing function that descends
4784 into the sub-directories.
4785
4786 We can create this function, called `files-in-below-directory', using
4787 familiar functions such as `car', `nthcdr', and `substring' in
4788 conjunction with an existing function called
4789 `directory-files-and-attributes'. This latter function not only lists
4790 all the filenames in a directory, including the names of
4791 sub-directories, but also their attributes.
4792
4793 To restate our goal: to create a function that will enable us to feed
4794 filenames to `recursive-lengths-list-many-files' as a list that looks
4795 like this (but with more elements):
4796
4797 ("./lisp/macros.el"
4798 "./lisp/mail/rmail.el"
4799 "./lisp/makesum.el")
4800
4801 The `directory-files-and-attributes' function returns a list of lists.
4802 Each of the lists within the main list consists of 13 elements. The
4803 first element is a string that contains the name of the file - which,
4804 in GNU/Linux, may be a `directory file', that is to say, a file with
4805 the special attributes of a directory. The second element of the list
4806 is `t' for a directory, a string for a symbolic link (the string is the
4807 name linked to), or `nil'.
4808
4809 For example, the first `.el' file in the `lisp/' directory is
4810 `abbrev.el'. Its name is
4811 `/usr/local/share/emacs/22.0.100/lisp/abbrev.el' and it is not a
4812 directory or a symbolic link.
4813
4814 This is how `directory-files-and-attributes' lists that file and its
4815 attributes:
4816
4817 ("abbrev.el"
4818 nil
4819 1
4820 1000
4821 100
4822 (17733 259)
4823 (17491 28834)
4824 (17596 62124)
4825 13157
4826 "-rw-rw-r--"
4827 nil
4828 2971624
4829 773)
4830
4831 On the other hand, `mail/' is a directory within the `lisp/' directory.
4832 The beginning of its listing looks like this:
4833
4834 ("mail"
4835 t
4836 ...
4837 )
4838
4839 (To learn about the different attributes, look at the documentation of
4840 `file-attributes'. Bear in mind that the `file-attributes' function
4841 does not list the filename, so its first element is
4842 `directory-files-and-attributes''s second element.)
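
Since each entry is an ordinary list, `car' and `cdr' are enough to
pick it apart. The sketch below assumes a hypothetical helper,
`sketch-entry-directory-p', and applies it to shortened copies of the
two listings shown above rather than to a real directory.

```elisp
(defun sketch-entry-directory-p (entry)
  "Return t if ENTRY describes a directory.
ENTRY has the form of one element of the list returned by
`directory-files-and-attributes'."
  ;; The first element is the name; the second is t for a directory.
  (eq t (car (cdr entry))))

(sketch-entry-directory-p '("mail" t))
;; => t

(sketch-entry-directory-p '("abbrev.el" nil 1 1000 100))
;; => nil
```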
4843
4844 We will want our new function, `files-in-below-directory', to list the
4845 `.el' files in the directory it is told to check, and in any
4846 directories below that directory.
4847
4848 This gives us a hint on how to construct `files-in-below-directory':
4849 within a directory, the function should add `.el' filenames to a list;
4850 and if, within a directory, the function comes upon a sub-directory, it
4851 should go into that sub-directory and repeat its actions.
4852
4853 However, we should note that every directory contains a name that
4854 refers to itself, called `.' ("dot"), and a name that refers to its
4855 parent directory, called `..' ("double dot"). (In `/', the root
4856 directory, `..' refers to itself, since `/' has no parent.) Clearly,
4857 we do not want our `files-in-below-directory' function to enter those
4858 directories, since they always lead us, directly or indirectly, to the
4859 current directory.
4860
4861 Consequently, our `files-in-below-directory' function must do several
4862 tasks:
4863
4864 * Check to see whether it is looking at a filename that ends in
4865 `.el'; and if so, add its name to a list.
4866
4867 * Check to see whether it is looking at a filename that is the name
4868 of a directory; and if so,
4869
4870 - Check to see whether it is looking at `.' or `..'; and if so
4871 skip it.
4872
4873 - Or else, go into that directory and repeat the process.
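
Both checks on the name can be made with `substring', which accepts a
negative index that counts from the end of the string. The helper
names below are hypothetical and serve only to show the two tests in
isolation; the path names are examples.

```elisp
(defun sketch-el-file-name-p (name)
  "Return t if the string NAME ends in `.el'."
  ;; NAME must be at least three characters long, or `substring'
  ;; signals an error; absolute file names always are.
  (equal ".el" (substring name -3)))

(defun sketch-dot-directory-name-p (name)
  "Return t if the string NAME ends in a period, as `.' and `..' do."
  (equal "." (substring name -1)))

(sketch-el-file-name-p "/usr/local/lisp/abbrev.el")
;; => t

(sketch-dot-directory-name-p "/usr/local/lisp/..")
;; => t

(sketch-dot-directory-name-p "/usr/local/lisp/mail")
;; => nil
```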
4874
4875 Let's write a function definition to do these tasks. We will use a
4876 `while' loop to move from one filename to another within a directory,
4877 checking what needs to be done; and we will use a recursive call to
4878 repeat the actions on each sub-directory. The recursive pattern is
4879 `accumulate' (*note Recursive Pattern: _accumulate_: Accumulate.),
4880 using `append' as the combiner.
4881
4882 Here is the function:
4883
4884      (defun files-in-below-directory (directory)
4885        "List the .el files in DIRECTORY and in its sub-directories."
4886        ;; Although the function will be used non-interactively,
4887        ;; it will be easier to test if we make it interactive.
4888        ;; The directory will have a name such as
4889        ;;  "/usr/local/share/emacs/22.0.100/lisp/"
4890        (interactive "DDirectory name: ")
4891        (let (el-files-list
4892              (current-directory-list
4893               (directory-files-and-attributes directory t)))
4894          ;; while we are in the current directory
4895          (while current-directory-list
4896            (cond
4897             ;; check to see whether filename ends in `.el'
4898             ;; and if so, append its name to a list.
4899             ((equal ".el" (substring (car (car current-directory-list)) -3))
4900              (setq el-files-list
4901                    (cons (car (car current-directory-list)) el-files-list)))
4902             ;; check whether filename is that of a directory
4903             ((eq t (car (cdr (car current-directory-list))))
4904              ;; decide whether to skip or recurse
4905              (if
4906                  (equal "."
4907                         (substring (car (car current-directory-list)) -1))
4908                  ;; then do nothing since filename is that of
4909                  ;;   current directory or parent, "." or ".."
4910                  ()
4911                ;; else descend into the directory and repeat the process
4912                (setq el-files-list
4913                      (append
4914                       (files-in-below-directory
4915                        (car (car current-directory-list)))
4916                       el-files-list)))))
4917            ;; move to the next filename in the list; this also
4918            ;; shortens the list so the while loop eventually comes to an end
4919            (setq current-directory-list (cdr current-directory-list)))
4920          ;; return the filenames
4921          el-files-list))
4922
4923 The `files-in-below-directory' function takes one
4924 argument, the name of a directory.
4925
4926 Thus, on my system,
4927
4928 (length
4929 (files-in-below-directory "/usr/local/share/emacs/22.0.100/lisp/"))
4930
4931 tells me that my Lisp sources directory contains 1031 `.el' files.
4932
4933 `files-in-below-directory' returns a list in reverse alphabetical
4934 order. An expression to sort the list in alphabetical order looks like
4935 this:
4936
4937 (sort
4938 (files-in-below-directory "/usr/local/share/emacs/22.0.100/lisp/")
4939 'string-lessp)
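
For comparison only: Emacs versions from 25 onward provide a built-in
function, `directory-files-recursively', that performs a similar
descent. It is not available in the Emacs 22 described here, and
writing the function ourselves remains the point of this section.
The sketch below assumes such a newer Emacs; the function
`sketch-el-files-demo' and the file names in it are hypothetical. It
builds a small temporary directory tree, lists the `.el' files in it,
and then deletes the tree.

```elisp
(defun sketch-el-files-demo ()
  "Return the names of the `.el' files in a temporary directory tree."
  (let* ((top (make-temp-file "sketch-el-" t)) ; t means: make a directory
         (sub (expand-file-name "sub" top)))
    (make-directory sub)
    ;; Create three empty files; only two of them end in `.el'.
    (write-region "" nil (expand-file-name "a.el" top) nil 'silent)
    (write-region "" nil (expand-file-name "b.el" sub) nil 'silent)
    (write-region "" nil (expand-file-name "notes.txt" top) nil 'silent)
    (prog1
        ;; Sort the full names, then strip the directory parts.
        (mapcar 'file-name-nondirectory
                (sort (directory-files-recursively top "\\.el\\'")
                      'string-lessp))
      ;; Remove the temporary tree again.
      (delete-directory top t))))

(sketch-el-files-demo)
;; => ("a.el" "b.el")
```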
4940
4941 
4942 File: eintr, Node: Counting function definitions, Prev: Files List, Up: Prepare the data
4943
4944 14.9.3 Counting function definitions
4945 ------------------------------------
4946
4947 Our immediate goal is to generate a list that tells us how many
4948 function definitions contain fewer than 10 words and symbols, how many
4949 contain between 10 and 19 words and symbols, how many contain between
4950 20 and 29 words and symbols, and so on.
4951
4952 With a sorted list of numbers, this is easy: count how many elements of
4953 the list are smaller than 10, then, after moving past the numbers just
4954 counted, count how many are smaller than 20, then, after moving past
4955 the numbers just counted, count how many are smaller than 30, and so
4956 on. Each of the numbers, 10, 20, 30, 40, and the like, is one larger
4957 than the top of that range. We can call the list of such numbers the
4958 `top-of-ranges' list.
4959
4960 If we wished, we could generate this list automatically, but it is
4961 simpler to write a list manually. Here it is:
4962
4963 (defvar top-of-ranges
4964 '(10 20 30 40 50
4965 60 70 80 90 100
4966 110 120 130 140 150
4967 160 170 180 190 200
4968 210 220 230 240 250
4969 260 270 280 290 300)
4970 "List specifying ranges for `defuns-per-range'.")
4971
4972 To change the ranges, we edit this list.
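
If you would rather generate the list than type it, one possibility,
offered here only as a sketch, is the `number-sequence' function,
which returns the numbers from its first argument up to its second
in steps given by its third.

```elisp
;; The numbers 10, 20, 30, ... up to and including 300.
(number-sequence 10 300 10)
```

The list this produces has the same thirty elements as the
handwritten `top-of-ranges' list above.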
4973
4974 Next, we need to write the function that creates the list of the number
4975 of definitions within each range. Clearly, this function must take the
4976 `sorted-lengths' and the `top-of-ranges' lists as arguments.
4977
4978 The `defuns-per-range' function must do two things again and again: it
4979 must count the number of definitions within a range specified by the
4980 current top-of-range value; and it must shift to the next higher value
4981 in the `top-of-ranges' list after counting the number of definitions in
4982 the current range. Since each of these actions is repetitive, we can
4983 use `while' loops for the job. One loop counts the number of
4984 definitions in the range defined by the current top-of-range value, and
4985 the other loop selects each of the top-of-range values in turn.
4986
4987 Several entries of the `sorted-lengths' list are counted for each
4988 range; this means that the loop for the `sorted-lengths' list will be
4989 inside the loop for the `top-of-ranges' list, like a small gear inside
4990 a big gear.
4991
4992 The inner loop counts the number of definitions within the range. It
4993 is a simple counting loop of the type we have seen before. (*Note A
4994 loop with an incrementing counter: Incrementing Loop.) The
4995 true-or-false test of the loop tests whether the value from the
4996 `sorted-lengths' list is smaller than the current value of the top of
4997 the range. If it is, the function increments the counter and tests the
4998 next value from the `sorted-lengths' list.
4999
5000 The inner loop looks like this:
5001
5002 (while LENGTH-ELEMENT-SMALLER-THAN-TOP-OF-RANGE
5003 (setq number-within-range (1+ number-within-range))
5004 (setq sorted-lengths (cdr sorted-lengths)))
5005
5006 The outer loop must start with the lowest value of the `top-of-ranges'
5007 list, and then be set to each of the succeeding higher values in turn.
5008 This can be done with a loop like this:
5009
5010 (while top-of-ranges
5011 BODY-OF-LOOP...
5012 (setq top-of-ranges (cdr top-of-ranges)))
5013
5014 Put together, the two loops look like this:
5015
5016 (while top-of-ranges
5017
5018 ;; Count the number of elements within the current range.
5019 (while LENGTH-ELEMENT-SMALLER-THAN-TOP-OF-RANGE
5020 (setq number-within-range (1+ number-within-range))
5021 (setq sorted-lengths (cdr sorted-lengths)))
5022
5023 ;; Move to next range.
5024 (setq top-of-ranges (cdr top-of-ranges)))
5025
5026 In addition, in each circuit of the outer loop, Emacs should record the
5027 number of definitions within that range (the value of
5028 `number-within-range') in a list. We can use `cons' for this purpose.
5029 (*Note `cons': cons.)
5030
5031 The `cons' function works fine, except that the list it constructs will
5032 contain the number of definitions for the highest range at its
5033 beginning and the number of definitions for the lowest range at its
5034 end. This is because `cons' attaches new elements of the list to the
5035 beginning of the list, and since the two loops are working their way
5036 through the lengths' list from the lower end first, the
5037 `defuns-per-range-list' will end up largest number first. But we will
5038 want to print our graph with smallest values first and the larger
5039 later. The solution is to reverse the order of the
5040 `defuns-per-range-list'. We can do this using the `nreverse' function,
5041 which reverses the order of a list.
5042
5043 For example,
5044
5045 (nreverse '(1 2 3 4))
5046
5047 produces:
5048
5049 (4 3 2 1)
5050
5051 Note that the `nreverse' function is "destructive"--that is, it changes
5052 the list to which it is applied; this contrasts with the `car' and
5053 `cdr' functions, which are non-destructive. In this case, we do not
5054 want the original `defuns-per-range-list', so it does not matter that
5055 it is destroyed. (The `reverse' function provides a reversed copy of a
5056 list, leaving the original list as is.)
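
The effect of building a list with `cons' and then reversing it can
be seen in a small sketch. The function `sketch-times-ten' is
hypothetical; it stands in for the outer loop of `defuns-per-range',
collecting one value per pass.

```elisp
(defun sketch-times-ten (numbers)
  "Return a list of each number in NUMBERS multiplied by ten."
  (let (result)
    (while numbers
      ;; `cons' puts each new value at the front, so RESULT grows
      ;; in reverse order: (10), then (20 10), then (30 20 10).
      (setq result (cons (* 10 (car numbers)) result))
      (setq numbers (cdr numbers)))
    ;; RESULT is not needed in its reversed form, so the destructive
    ;; `nreverse' is safe here.
    (nreverse result)))

(sketch-times-ten '(1 2 3))
;; => (10 20 30)
```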
5057
5058 Put all together, the `defuns-per-range' looks like this:
5059
5060      (defun defuns-per-range (sorted-lengths top-of-ranges)
5061        "SORTED-LENGTHS defuns in each TOP-OF-RANGES range."
5062        (let ((top-of-range (car top-of-ranges))
5063              (number-within-range 0)
5064              defuns-per-range-list)
5065
5066          ;; Outer loop.
5067          (while top-of-ranges
5068
5069            ;; Inner loop.
5070            (while (and
5071                    ;; Need number for numeric test.
5072                    (car sorted-lengths)
5073                    (< (car sorted-lengths) top-of-range))
5074
5075              ;; Count number of definitions within current range.
5076              (setq number-within-range (1+ number-within-range))
5077              (setq sorted-lengths (cdr sorted-lengths)))
5078
5079            ;; Exit inner loop but remain within outer loop.
5080
5081            (setq defuns-per-range-list
5082                  (cons number-within-range defuns-per-range-list))
5083            (setq number-within-range 0)      ; Reset count to zero.
5084
5085            ;; Move to next range.
5086            (setq top-of-ranges (cdr top-of-ranges))
5087            ;; Specify next top of range value.
5088            (setq top-of-range (car top-of-ranges)))
5089
5090          ;; Exit outer loop and count the number of defuns larger than
5091          ;;   the largest top-of-range value.
5092          (setq defuns-per-range-list
5093                (cons
5094                 (length sorted-lengths)
5095                 defuns-per-range-list))
5096
5097          ;; Return a list of the number of definitions within each range,
5098          ;;   smallest to largest.
5099          (nreverse defuns-per-range-list)))
5100
5101 The function is straightforward except for one subtle feature. The
5102 true-or-false test of the inner loop looks like this:
5103
5104 (and (car sorted-lengths)
5105 (< (car sorted-lengths) top-of-range))
5106
5107 instead of like this:
5108
5109 (< (car sorted-lengths) top-of-range)
5110
5111 The purpose of the test is to determine whether the first item in the
5112 `sorted-lengths' list is less than the value of the top of the range.
5113
5114 The simple version of the test works fine unless the `sorted-lengths'
5115 list has a `nil' value. In that case, the `(car sorted-lengths)'
5116 expression returns `nil'. The `<' function cannot compare a
5117 number to `nil', which is an empty list, so Emacs signals an error and
5118 stops the function from attempting to continue to execute.
5119
5120 The `sorted-lengths' list always becomes `nil' when the counter reaches
5121 the end of the list. This means that any attempt to use the
5122 `defuns-per-range' function with the simple version of the test will
5123 fail.
5124
5125 We solve the problem by using the `(car sorted-lengths)' expression in
5126 conjunction with the `and' expression. The `(car sorted-lengths)'
5127 expression returns a non-`nil' value so long as the list has at least
5128 one number within it, but returns `nil' if the list is empty. The
5129 `and' expression first evaluates the `(car sorted-lengths)' expression,
5130 and if it is `nil', returns false _without_ evaluating the `<'
5131 expression. But if the `(car sorted-lengths)' expression returns a
5132 non-`nil' value, the `and' expression evaluates the `<' expression, and
5133 returns that value as the value of the `and' expression.
5134
5135 This way, we avoid an error. (*Note The `kill-new' function: kill-new
5136 function, for information about `and'.)
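
The difference between the two tests can be tried directly. In the
sketch below, `sketch-first-below-p' is a hypothetical function that
wraps the guarded test; the last expression uses `condition-case'
only to show, without stopping evaluation, that the unguarded test
signals an error on an empty list.

```elisp
(defun sketch-first-below-p (sorted-lengths top-of-range)
  "Return t if the first of SORTED-LENGTHS is less than TOP-OF-RANGE."
  (and (car sorted-lengths)
       (< (car sorted-lengths) top-of-range)))

(sketch-first-below-p '(85 86 110) 110)
;; => t

;; With an empty list, `and' stops at its first argument.
(sketch-first-below-p nil 110)
;; => nil

;; The simple test tries to compare nil with a number.
(condition-case nil
    (< (car nil) 110)
  (wrong-type-argument 'error-signaled))
;; => error-signaled
```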
5137
5138 Here is a short test of the `defuns-per-range' function. First,
5139 evaluate the expression that binds (a shortened) `top-of-ranges' list
5140 to the list of values, then evaluate the expression for binding the
5141 `sorted-lengths' list, and then evaluate the `defuns-per-range'
5142 function.
5143
5144 ;; (Shorter list than we will use later.)
5145 (setq top-of-ranges
5146 '(110 120 130 140 150
5147 160 170 180 190 200))
5148
5149 (setq sorted-lengths
5150 '(85 86 110 116 122 129 154 176 179 200 265 300 300))
5151
5152 (defuns-per-range sorted-lengths top-of-ranges)
5153
5154 The list returned looks like this:
5155
5156 (2 2 2 0 0 1 0 2 0 0 4)
5157
5158 Indeed, there are two elements of the `sorted-lengths' list smaller
5159 than 110, two elements between 110 and 119, two elements between 120
5160 and 129, and so on. There are four elements with a value of 200 or
5161 larger.
5162
5163 
5164 File: eintr, Node: Readying a Graph, Next: Emacs Initialization, Prev: Words in a defun, Up: Top
5165
5166 15 Readying a Graph
5167 *******************
5168
5169 Our goal is to construct a graph showing the numbers of function
5170 definitions of various lengths in the Emacs Lisp sources.
5171
5172 As a practical matter, if you were creating a graph, you would probably
5173 use a program such as `gnuplot' to do the job. (`gnuplot' is nicely
5174 integrated into GNU Emacs.) In this case, however, we create one from
5175 scratch, and in the process we will re-acquaint ourselves with some of
5176 what we learned before and learn more.
5177
5178 In this chapter, we will first write a simple graph printing function.
5179 This first definition will be a "prototype", a rapidly written function
5180 that enables us to reconnoiter this unknown graph-making territory. We
5181 will discover dragons, or find that they are myth. After scouting the
5182 terrain, we will feel more confident and enhance the function to label
5183 the axes automatically.
5184
5185 * Menu:
5186
5187 * Columns of a graph::
5188 * graph-body-print::
5189 * recursive-graph-body-print::
5190 * Printed Axes::
5191 * Line Graph Exercise::
5192
5193 
5194 File: eintr, Node: Columns of a graph, Next: graph-body-print, Prev: Readying a Graph, Up: Readying a Graph
5195
5196 Printing the Columns of a Graph
5197 ===============================
5198
5199 Since Emacs is designed to be flexible and work with all kinds of
5200 terminals, including character-only terminals, the graph will need to
5201 be made from one of the `typewriter' symbols. An asterisk will do; as
5202 we enhance the graph-printing function, we can make the choice of
5203 symbol a user option.
5204
5205 We can call this function `graph-body-print'; it will take a
5206 `numbers-list' as its only argument. At this stage, we will not label
5207 the graph, but only print its body.
5208
5209 The `graph-body-print' function inserts a vertical column of asterisks
5210 for each element in the `numbers-list'. The height of each column is
5211 determined by the value of that element of the `numbers-list'.
5212
5213 Inserting columns is a repetitive act; that means that this function can
5214 be written either with a `while' loop or recursively.
5215
5216 Our first challenge is to discover how to print a column of asterisks.
5217 Usually, in Emacs, we print characters onto a screen horizontally, line
5218 by line, by typing. We have two routes we can follow: write our own
5219 column-insertion function or discover whether one exists in Emacs.
5220
5221 To see whether there is one in Emacs, we can use the `M-x apropos'
5222 command. This command is like the `C-h a' (`command-apropos') command,
5223 except that the latter finds only those functions that are commands.
5224 The `M-x apropos' command lists all symbols that match a regular
5225 expression, including functions that are not interactive.
5226
5227 What we want to look for is some command that prints or inserts
5228 columns. Very likely, the name of the function will contain either the
5229 word `print' or the word `insert' or the word `column'. Therefore, we
5230 can simply type `M-x apropos RET print\|insert\|column RET' and look at
5231 the result. On my system, this command once took quite some time,
5232 and then produced a list of 79 functions and variables. Now it does
5233 not take much time at all and produces a list of 211 functions and
5234 variables. Scanning down the list, the only function that looks as if
5235 it might do the job is `insert-rectangle'.
5236
5237 Indeed, this is the function we want; its documentation says:
5238
5239 insert-rectangle:
5240 Insert text of RECTANGLE with upper left corner at point.
5241 RECTANGLE's first line is inserted at point,
5242 its second line is inserted at a point vertically under point, etc.
5243 RECTANGLE should be a list of strings.
5244 After this command, the mark is at the upper left corner
5245 and point is at the lower right corner.
5246
5247 We can run a quick test, to make sure it does what we expect of it.
5248
5249 Here is the result of placing the cursor after the `insert-rectangle'
5250 expression and typing `C-u C-x C-e' (`eval-last-sexp'). The function
5251 inserts the strings `"first"', `"second"', and `"third"' at and below
5252 point. Also the function returns `nil'.
5253
5254      (insert-rectangle '("first" "second" "third"))first
5255                                                    second
5256                                                    thirdnil
5257
5258 Of course, we won't be inserting the text of the `insert-rectangle'
5259 expression itself into the buffer in which we are making the graph, but
5260 will call the function from our program. We shall, however, have to
5261 make sure that point is in the buffer at the place where the
5262 `insert-rectangle' function will insert its column of strings.
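
Calling the function from a program can be sketched as follows. The
function name `sketch-rectangle-as-string' is hypothetical; it does
its work in a temporary buffer, where point starts at the beginning,
and returns the buffer's contents as a string so that the effect of
`insert-rectangle' is easy to inspect.

```elisp
(defun sketch-rectangle-as-string (rectangle)
  "Insert RECTANGLE into an empty temporary buffer; return the text."
  (with-temp-buffer
    ;; Point is at the top left of the (empty) buffer.
    (insert-rectangle rectangle)
    (buffer-string)))

(sketch-rectangle-as-string '("first" "second" "third"))
;; => "first\nsecond\nthird"
```

Each string ends up on its own line, one below the other, starting in
the column where point was when the function was called.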
5263
5264 If you are reading this in Info, you can see how this works by
5265 switching to another buffer, such as the `*scratch*' buffer, placing
5266 point somewhere in the buffer, typing `M-:', typing the
5267 `insert-rectangle' expression into the minibuffer at the prompt, and
5268 then typing <RET>. This causes Emacs to evaluate the expression in the
5269 minibuffer, but to use as the value of point the position of point in
5270 the `*scratch*' buffer. (`M-:' is the keybinding for
5271 `eval-expression'. Also, `nil' does not appear in the `*scratch*'
5272 buffer since the expression is evaluated in the minibuffer.)
5273
5274 We find when we do this that point ends up at the end of the last
5275 inserted line--that is to say, this function moves point as a
5276 side-effect. If we were to repeat the command, with point at this
5277 position, the next insertion would be below and to the right of the
5278 previous insertion. We don't want this! If we are going to make a bar
5279 graph, the columns need to be beside each other.
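
The movement of point can also be checked without disturbing the
buffer you are working in.  The following is a minimal sketch, not
part of the graph program: it runs `insert-rectangle' in a temporary
buffer and reports the inserted text together with whether point
finished at the end of the last inserted line.  The name
`my-insert-rectangle-demo' is made up for this illustration.

```elisp
(defun my-insert-rectangle-demo ()
  "Hypothetical helper: insert a rectangle in a temporary buffer.
Return a list of the resulting text and whether point ended up
at the end of the buffer, that is, after the last inserted string."
  (with-temp-buffer
    (insert-rectangle '("first" "second" "third"))
    (list (buffer-string)
          (= (point) (point-max)))))

(my-insert-rectangle-demo)
```

Evaluating the last expression should return a list whose first
element holds the three strings on successive lines and whose second
element is `t', which confirms that point has moved to the lower
right corner.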
5280
5281 So we discover that each cycle of the column-inserting `while' loop
5282 must reposition point to the place we want it, and that place will be
5283 at the top, not the bottom, of the column. Moreover, we remember that
5284 when we print a graph, we do not expect all the columns to be the same
5285 height. This means that the top of each column may be at a different
5286 height from the previous one. We cannot simply reposition point to the
5287 same line each time, but moved over to the right--or perhaps we can...
5288
5289 We are planning to make the columns of the bar graph out of asterisks.
5290 The number of asterisks in the column is the number specified by the
5291 current element of the `numbers-list'. We need to construct a list of
5292 asterisks of the right length for each call to `insert-rectangle'. If
5293 this list consists solely of the requisite number of asterisks, then we
5294 will have to position point the right number of lines above the base for
5295 the graph to print correctly. This could be difficult.
5296
5297 Alternatively, if we can figure out some way to pass `insert-rectangle'
5298 a list of the same length each time, then we can place point on the
5299 same line each time, but move it over one column to the right for each
5300 new column. If we do this, however, some of the entries in the list
5301 passed to `insert-rectangle' must be blanks rather than asterisks. For
5302 example, if the maximum height of the graph is 5, but the height of the
5303 column is 3, then `insert-rectangle' requires an argument that looks
5304 like this:
5305
5306 (" " " " "*" "*" "*")
5307
5308 This last proposal is not so difficult, so long as we can determine the
5309 column height. There are two ways for us to specify the column height:
5310 we can arbitrarily state what it will be, which would work fine for
5311 graphs of that height; or we can search through the list of numbers and
5312 use the maximum height of the list as the maximum height of the graph.
5313 If the latter operation were difficult, then the former procedure would
5314 be easiest, but there is a function built into Emacs that determines
5315 the maximum of its arguments. We can use that function. The function
5316 is called `max' and it returns the largest of all its arguments, which
5317 must be numbers. Thus, for example,
5318
5319 (max 3 4 6 5 7 3)
5320
5321 returns 7. (A corresponding function called `min' returns the smallest
5322 of all its arguments.)
5323
5324 However, we cannot simply call `max' on the `numbers-list'; the `max'
5325 function expects numbers as its arguments, not a list of numbers. Thus,
5326 the following expression,
5327
5328 (max '(3 4 6 5 7 3))
5329
5330 produces the following error message:
5331
5332 Wrong type argument: number-or-marker-p, (3 4 6 5 7 3)
5333
5334 We need a function that passes a list of arguments to a function. This
5335 function is `apply'. This function `applies' its first argument (a
5336 function) to its remaining arguments, the last of which may be a list.
5337
5338 For example,
5339
5340 (apply 'max 3 4 7 3 '(4 8 5))
5341
5342 returns 8.
5343
5344 (Incidentally, I don't know how you would learn of this function
5345 without a book such as this. It is possible to discover other
5346 functions, like `search-forward' or `insert-rectangle', by guessing at
5347 a part of their names and then using `apropos'. Even though its base
5348 in metaphor is clear--`apply' its first argument to the rest--I doubt a
5349 novice would come up with that particular word when using `apropos' or
5350 other aid. Of course, I could be wrong; after all, the function was
5351 first named by someone who had to invent it.)
5352
5353 The second and subsequent arguments to `apply' are optional, so we can
5354 use `apply' to call a function and pass the elements of a list to it,
5355 like this, which also returns 8:
5356
5357 (apply 'max '(4 8 5))
5358
5359 This latter way is how we will use `apply'. The
5360 `recursive-lengths-list-many-files' function returns a numbers' list to
5361 which we can apply `max' (we could also apply `max' to the sorted
5362 numbers' list; it does not matter whether the list is sorted or not).
5363
5364 Hence, the operation for finding the maximum height of the graph is
5365 this:
5366
5367 (setq max-graph-height (apply 'max numbers-list))
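
As a hedged illustration of the two points just made, the sketch
below first shows that handing the list itself to `max' signals an
error, and then wraps the `apply' idiom in a small function.  The
name `my-max-of-list' is hypothetical, and the guard against an empty
list is an addition of this sketch: `max' needs at least one
argument.

```elisp
(defun my-max-of-list (numbers)
  "Hypothetical helper: return the largest element of NUMBERS.
Return nil when NUMBERS is empty, since `max' needs at least one
argument."
  (when numbers
    (apply 'max numbers)))

;; Passing the list itself to `max' signals `wrong-type-argument';
;; `condition-case' catches the error so we can look at its name.
(condition-case err
    (max '(3 4 6 5 7 3))
  (wrong-type-argument (car err)))

;; Spreading the list with `apply' gives the answer we want.
(my-max-of-list '(3 4 6 5 7 3))
```

The `condition-case' expression returns the symbol
`wrong-type-argument'; the last expression returns 7.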
5368
5369 Now we can return to the question of how to create a list of strings
5370 for a column of the graph. Told the maximum height of the graph and
5371 the number of asterisks that should appear in the column, the function
5372 should return a list of strings for the `insert-rectangle' command to
5373 insert.
5374
5375 Each column is made up of asterisks or blanks. Since the function is
5376 passed the value of the height of the column and the number of
5377 asterisks in the column, the number of blanks can be found by
5378 subtracting the number of asterisks from the height of the column.
5379 Given the number of blanks and the number of asterisks, two `while'
5380 loops can be used to construct the list:
5381
5382 ;;; First version.
5383 (defun column-of-graph (max-graph-height actual-height)
5384   "Return list of strings that is one column of a graph."
5385   (let ((insert-list nil)
5386         (number-of-top-blanks
5387          (- max-graph-height actual-height)))
5388
5389     ;; Fill in asterisks.
5390     (while (> actual-height 0)
5391       (setq insert-list (cons "*" insert-list))
5392       (setq actual-height (1- actual-height)))
5393
5394     ;; Fill in blanks.
5395     (while (> number-of-top-blanks 0)
5396       (setq insert-list (cons " " insert-list))
5397       (setq number-of-top-blanks
5398             (1- number-of-top-blanks)))
5399
5400     ;; Return whole list.
5401     insert-list))
5402
5403 If you install this function and then evaluate the following expression
5404 you will see that it returns the list as desired:
5405
5406 (column-of-graph 5 3)
5407
5408 returns
5409
5410 (" " " " "*" "*" "*")
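
The two `while' loops are not the only way to build such a list.  As
an alternative sketch, and not the definition used in the rest of
this chapter, the built-in `make-list' function can produce each run
of identical strings and `append' can join them.  The name
`my-column-of-graph' is hypothetical, and the sketch assumes that
ACTUAL-HEIGHT is never greater than MAX-GRAPH-HEIGHT.

```elisp
(defun my-column-of-graph (max-graph-height actual-height)
  "Hypothetical variant of `column-of-graph' built with `make-list'.
Return MAX-GRAPH-HEIGHT strings: blanks first, then ACTUAL-HEIGHT
asterisks.  Assumes ACTUAL-HEIGHT is not more than MAX-GRAPH-HEIGHT."
  (append (make-list (- max-graph-height actual-height) " ")
          (make-list actual-height "*")))

(my-column-of-graph 5 3)
```

The last expression returns the same five-element list as the
version written with `while' loops.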
5411
5412 As written, `column-of-graph' contains a major flaw: the symbols used
5413 for the blank and for the marked entries in the column are `hard-coded'
5414 as a space and asterisk. This is fine for a prototype, but you, or
5415 another user, may wish to use other symbols. For example, in testing
5416 the graph function, you may want to use a period in place of the
5417 space, to make sure the point is being repositioned properly each time
5418 the `insert-rectangle' function is called; or you might want to
5419 substitute a `+' sign or other symbol for the asterisk. You might even
5420 want to make a graph-column that is more than one display column wide.
5421 The program should be more flexible. The way to do that is to replace
5422 the blank and the asterisk with two variables that we can call
5423 `graph-blank' and `graph-symbol' and define those variables separately.
5424
5425 Also, the documentation is not well written. These considerations lead
5426 us to the second version of the function:
5427
5428 (defvar graph-symbol "*"
5429   "String used as symbol in graph, usually an asterisk.")
5430
5431 (defvar graph-blank " "
5432   "String used as blank in graph, usually a blank space.
5433 graph-blank must be the same number of columns wide
5434 as graph-symbol.")
5435
5436 (For an explanation of `defvar', see *Note Initializing a Variable with
5437 `defvar': defvar.)
5438
5439 ;;; Second version.
5440 (defun column-of-graph (max-graph-height actual-height)
5441   "Return MAX-GRAPH-HEIGHT strings; ACTUAL-HEIGHT are graph-symbols.
5442 The graph-symbols are contiguous entries at the end
5443 of the list.
5444 The list will be inserted as one column of a graph.
5445 The strings are either graph-blank or graph-symbol."
5446
5447   (let ((insert-list nil)
5448         (number-of-top-blanks
5449          (- max-graph-height actual-height)))
5450
5451     ;; Fill in `graph-symbols'.
5452     (while (> actual-height 0)
5453       (setq insert-list (cons graph-symbol insert-list))
5454       (setq actual-height (1- actual-height)))
5455
5456     ;; Fill in `graph-blanks'.
5457     (while (> number-of-top-blanks 0)
5458       (setq insert-list (cons graph-blank insert-list))
5459       (setq number-of-top-blanks
5460             (1- number-of-top-blanks)))
5461
5462     ;; Return whole list.
5463     insert-list))
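
One benefit of moving the strings into variables is that a test can
change them temporarily with `let'.  The sketch below illustrates the
idea with stand-ins so that it can be evaluated on its own: the names
`my-graph-symbol', `my-graph-blank', and `my-flexible-column' are
hypothetical and merely mirror the definitions above.

```elisp
(defvar my-graph-symbol "*"
  "Hypothetical stand-in for `graph-symbol'.")

(defvar my-graph-blank " "
  "Hypothetical stand-in for `graph-blank'.")

(defun my-flexible-column (max-graph-height actual-height)
  "Hypothetical stand-in for the second `column-of-graph'.
Return MAX-GRAPH-HEIGHT strings, the last ACTUAL-HEIGHT of which
are `my-graph-symbol'; the rest are `my-graph-blank'."
  (append (make-list (- max-graph-height actual-height) my-graph-blank)
          (make-list actual-height my-graph-symbol)))

;; Temporarily show blanks as periods and marks as plus signs.
(let ((my-graph-blank ".")
      (my-graph-symbol "+"))
  (my-flexible-column 5 3))
```

The `let' expression returns a list of two periods followed by three
plus signs.  Outside the `let', the variables keep their usual
values, so later calls again produce spaces and asterisks.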
5464
5465 If we wished, we could rewrite `column-of-graph' a third time to
5466 provide optionally for a line graph as well as for a bar graph. This
5467 would not be hard to do. One way to think of a line graph is that it
5468 is no more than a bar graph in which the part of each bar that is below
5469 the top is blank. To construct a column for a line graph, the function
5470 first constructs a list of blanks that is one shorter than the value,
5471 then it uses `cons' to attach a graph symbol to the list; then it uses
5472 `cons' again to attach the `top blanks' to the list.
5473
5474 It is easy to see how to write such a function, but since we don't need
5475 it, we will not do it. But the job could be done, and if it were done,
5476 it would be done with `column-of-graph'. Even more important, it is
5477 worth noting that few changes would have to be made anywhere else. The
5478 enhancement, if we ever wish to make it, is simple.
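
For the curious, here is one way the description above might be
turned into code.  It is only a sketch under stated assumptions: the
name `my-column-of-line-graph' is hypothetical, the strings are
hard-coded for brevity, and the function assumes that ACTUAL-HEIGHT
is at least 1 and no greater than MAX-GRAPH-HEIGHT.

```elisp
(defun my-column-of-line-graph (max-graph-height actual-height)
  "Hypothetical sketch of a line-graph column.
Return MAX-GRAPH-HEIGHT strings in which only the entry at
ACTUAL-HEIGHT, counted from the bottom, is an asterisk.
Assumes 1 <= ACTUAL-HEIGHT <= MAX-GRAPH-HEIGHT."
  (let ((insert-list nil)
        (blanks-below (1- actual-height))
        (number-of-top-blanks
         (- max-graph-height actual-height)))

    ;; Blanks below the mark: one fewer than the value.
    (while (> blanks-below 0)
      (setq insert-list (cons " " insert-list))
      (setq blanks-below (1- blanks-below)))

    ;; The mark itself.
    (setq insert-list (cons "*" insert-list))

    ;; Blanks above the mark.
    (while (> number-of-top-blanks 0)
      (setq insert-list (cons " " insert-list))
      (setq number-of-top-blanks
            (1- number-of-top-blanks)))

    insert-list))

(my-column-of-line-graph 5 3)
```

The last expression returns a list of five strings in which only the
third string is an asterisk.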
5479
5480 Now, finally, we come to our first actual graph printing function.
5481 This prints the body of a graph, not the labels for the vertical and
5482 horizontal axes, so we can call this `graph-body-print'.
5483
5484 
5485 File: eintr, Node: graph-body-print, Next: recursive-graph-body-print, Prev: Columns of a graph, Up: Readying a Graph
5486
5487 15.1 The `graph-body-print' Function
5488 ====================================
5489
5490 After our preparation in the preceding section, the `graph-body-print'
5491 function is straightforward. The function will print column after
5492 column of asterisks and blanks, using the elements of a numbers' list
5493 to specify the number of asterisks in each column. This is a
5494 repetitive act, which means we can use a decrementing `while' loop or
5495 recursive function for the job. In this section, we will write the
5496 definition using a `while' loop.
5497
5498 The `column-of-graph' function requires the height of the graph as an
5499 argument, so we should determine and record that as a local variable.
5500
5501 This leads us to the following template for the `while' loop version of
5502 this function:
5503
5504 (defun graph-body-print (numbers-list)
5505   "DOCUMENTATION..."
5506   (let ((height ...
5507          ...))
5508
5509     (while numbers-list
5510       INSERT-COLUMNS-AND-REPOSITION-POINT
5511       (setq numbers-list (cdr numbers-list)))))
5512
5513 We need to fill in the slots of the template.
5514
5515 Clearly, we can use the `(apply 'max numbers-list)' expression to
5516 determine the height of the graph.
5517
5518 The `while' loop will cycle through the `numbers-list' one element at a
5519 time. As it is shortened by the `(setq numbers-list (cdr
5520 numbers-list))' expression, the CAR of each instance of the list is the
5521 value of the argument for `column-of-graph'.
5522
5523 At each cycle of the `while' loop, the `insert-rectangle' function
5524 inserts the list returned by `column-of-graph'. Since the
5525 `insert-rectangle' function moves point to the lower right of the
5526 inserted rectangle, we need to save the location of point at the time
5527 the rectangle is inserted, move back to that position after the
5528 rectangle is inserted, and then move horizontally to the next place
5529 from which `insert-rectangle' is called.
5530
5531 If the inserted columns are one character wide, as they will be if
5532 single blanks and asterisks are used, the repositioning command is
5533 simply `(forward-char 1)'; however, the width of a column may be
5534 greater than one. This means that the repositioning command should be
5535 written `(forward-char symbol-width)'. The `symbol-width' itself is
5536 the length of a `graph-blank' and can be found using the expression
5537 `(length graph-blank)'. The best place to bind the `symbol-width'
5538 variable to the value of the width of a graph column is in the varlist of
5539 the `let' expression.
5540
5541 These considerations lead to the following function definition:
5542
5543 (defun graph-body-print (numbers-list)
5544   "Print a bar graph of the NUMBERS-LIST.
5545 The numbers-list consists of the Y-axis values."
5546
5547   (let ((height (apply 'max numbers-list))
5548         (symbol-width (length graph-blank))
5549         from-position)
5550
5551     (while numbers-list
5552       (setq from-position (point))
5553       (insert-rectangle
5554        (column-of-graph height (car numbers-list)))
5555       (goto-char from-position)
5556       (forward-char symbol-width)
5557       ;; Draw graph column by column.
5558       (sit-for 0)
5559       (setq numbers-list (cdr numbers-list)))
5560     ;; Place point for X axis labels.
5561     (forward-line height)
5562     (insert "\n")
5563   ))
5564
5565 The one unexpected expression in this function is the `(sit-for 0)'
5566 expression in the `while' loop. This expression makes the graph
5567 printing operation more interesting to watch than it would be
5568 otherwise. The expression causes Emacs to `sit' or do nothing for a
5569 zero length of time and then redraw the screen. Placed here, it causes
5570 Emacs to redraw the screen column by column. Without it, Emacs would
5571 not redraw the screen until the function exits.
5572
5573 We can test `graph-body-print' with a short list of numbers.
5574
5575 1. Install `graph-symbol', `graph-blank', `column-of-graph', which
5576 are in *Note Columns of a graph::, and `graph-body-print'.
5577
5578 2. Copy the following expression:
5579
5580 (graph-body-print '(1 2 3 4 6 4 3 5 7 6 5 2 3))
5581
5582 3. Switch to the `*scratch*' buffer and place the cursor where you
5583 want the graph to start.
5584
5585 4. Type `M-:' (`eval-expression').
5586
5587 5. Yank the `graph-body-print' expression into the minibuffer with
5588 `C-y' (`yank').
5589
5590 6. Press <RET> to evaluate the `graph-body-print' expression.
5591
5592 Emacs will print a graph like this:
5593
5594         *
5595     *   **
5596     *  ****
5597    *** ****
5598   ********* *
5599  ************
5600 *************
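
The save, insert, and reposition cycle can also be tried without
drawing into a buffer you care about.  The following is a
self-contained sketch rather than the function defined above: it
draws into a temporary buffer and returns the result as a string.
The names `my-column-strings' and `my-graph-body-string' are
hypothetical, the symbols are hard-coded as single characters, and
`dolist' replaces the `while' loop.

```elisp
(defun my-column-strings (max-height height)
  "Hypothetical helper: MAX-HEIGHT strings, the last HEIGHT asterisks."
  (append (make-list (- max-height height) " ")
          (make-list height "*")))

(defun my-graph-body-string (numbers)
  "Hypothetical helper: return the graph body for NUMBERS as a string."
  (with-temp-buffer
    (let ((height (apply 'max numbers)))
      (dolist (n numbers)
        (let ((from-position (point)))
          (insert-rectangle (my-column-strings height n))
          ;; Return to the top of the column just drawn,
          ;; then step one character to the right.
          (goto-char from-position)
          (forward-char 1))))
    (buffer-string)))

(my-graph-body-string '(1 3 2))
```

For the list `(1 3 2)' the returned string has three lines: the top
line holds a single asterisk in the middle column, and the bottom
line holds three asterisks.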
5601
5602 
5603 File: eintr, Node: recursive-graph-body-print, Next: Printed Axes, Prev: graph-body-print, Up: Readying a Graph
5604
5605 15.2 The `recursive-graph-body-print' Function
5606 ==============================================
5607
5608 The `graph-body-print' function may also be written recursively. The
5609 recursive solution is divided into two parts: an outside `wrapper' that
5610 uses a `let' expression to determine the values of several variables
5611 that need only be found once, such as the maximum height of the graph,
5612 and an inside function that is called recursively to print the graph.
5613
5614 The `wrapper' is uncomplicated:
5615
5616 (defun recursive-graph-body-print (numbers-list)
5617   "Print a bar graph of the NUMBERS-LIST.
5618 The numbers-list consists of the Y-axis values."
5619   (let ((height (apply 'max numbers-list))
5620         (symbol-width (length graph-blank))
5621         from-position)
5622     (recursive-graph-body-print-internal
5623      numbers-list
5624      height
5625      symbol-width)))
5626
5627 The recursive function is a little more difficult. It has four parts:
5628 the `do-again-test', the printing code, the recursive call, and the
5629 `next-step-expression'. The `do-again-test' is an `if' expression that
5630 determines whether the `numbers-list' contains any remaining elements;
5631 if it does, the function prints one column of the graph using the
5632 printing code and calls itself again. The function calls itself again
5633 according to the value produced by the `next-step-expression' which
5634 causes the call to act on a shorter version of the `numbers-list'.
5635
5636 (defun recursive-graph-body-print-internal
5637   (numbers-list height symbol-width)
5638   "Print a bar graph.
5639 Used within recursive-graph-body-print function."
5640
5641   (if numbers-list
5642       (progn
5643         (setq from-position (point))
5644         (insert-rectangle
5645          (column-of-graph height (car numbers-list)))
5646         (goto-char from-position)
5647         (forward-char symbol-width)
5648         (sit-for 0)     ; Draw graph column by column.
5649         (recursive-graph-body-print-internal
5650          (cdr numbers-list) height symbol-width))))
5651
5652 After installation, this expression can be tested; here is a sample:
5653
5654 (recursive-graph-body-print '(3 2 5 6 7 5 3 4 6 4 3 2 1))
5655
5656 Here is what `recursive-graph-body-print' produces:
5657
5658     *
5659    **   *
5660   ****  *
5661   **** ***
5662 * *********
5663 ************
5664 *************
5665
5666 Either of these two functions, `graph-body-print' or
5667 `recursive-graph-body-print', creates the body of a graph.
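
In the recursive definition above, the inner function sets
`from-position', a variable bound by the `let' in the wrapper; this
relies on the wrapper's binding being visible inside the inner
function, as it is under dynamic binding.  As a hedged variation, the
sketch below binds the saved position locally in each call, so it
does not depend on any outer binding.  The names
`my-recursive-columns' and `my-recursive-graph-string' are
hypothetical, the symbols are hard-coded, and the graph is drawn in a
temporary buffer.

```elisp
(defun my-recursive-columns (numbers height)
  "Hypothetical helper: insert one column per element of NUMBERS.
HEIGHT is the height of the tallest column."
  (when numbers
    ;; The saved position is local to this call.
    (let ((from-position (point)))
      (insert-rectangle
       (append (make-list (- height (car numbers)) " ")
               (make-list (car numbers) "*")))
      (goto-char from-position)
      (forward-char 1))
    ;; Recursive call on a shorter list.
    (my-recursive-columns (cdr numbers) height)))

(defun my-recursive-graph-string (numbers)
  "Hypothetical wrapper: return the graph body for NUMBERS as a string."
  (with-temp-buffer
    (my-recursive-columns numbers (apply 'max numbers))
    (buffer-string)))

(my-recursive-graph-string '(3 2 1))
```

For the list `(3 2 1)' the returned string has three lines, with one
asterisk in the first column of the top line and three asterisks on
the bottom line.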
5668
5669 
5670 File: eintr, Node: Printed Axes, Next: Line Graph Exercise, Prev: recursive-graph-body-print, Up: Readying a Graph
5671
5672 15.3 Need for Printed Axes
5673 ==========================
5674
5675 A graph needs printed axes, so you can orient yourself. For a do-once
5676 project, it may be reasonable to draw the axes by hand using Emacs'
5677 Picture mode; but a graph drawing function may be used more than once.
5678
5679 For this reason, I have written enhancements to the basic
5680 `graph-body-print' function that automatically print labels for the
5681 horizontal and vertical axes. Since the label printing functions do
5682 not contain much new material, I have placed their description in an
5683 appendix. *Note A Graph with Labelled Axes: Full Graph.
5684
5685 
5686 File: eintr, Node: Line Graph Exercise, Prev: Printed Axes, Up: Readying a Graph
5687
5688 15.4 Exercise
5689 =============
5690
5691 Write a line graph version of the graph printing functions.
5692
5693 
5694 File: eintr, Node: Emacs Initialization, Next: Debugging, Prev: Readying a Graph, Up: Top
5695
5696 16 Your `.emacs' File
5697 *********************
5698
5699 "You don't have to like Emacs to like it" - this seemingly paradoxical
5700 statement is the secret of GNU Emacs. The plain, `out of the box'
5701 Emacs is a generic tool. Most people who use it customize it to suit
5702 themselves.
5703
5704 GNU Emacs is mostly written in Emacs Lisp; this means that by writing
5705 expressions in Emacs Lisp you can change or extend Emacs.
5706
5707 * Menu:
5708
5709 * Default Configuration::
5710 * Site-wide Init::
5711 * defcustom::
5712 * Beginning a .emacs File::
5713 * Text and Auto-fill::
5714 * Mail Aliases::
5715 * Indent Tabs Mode::
5716 * Keybindings::
5717 * Keymaps::
5718 * Loading Files::
5719 * Autoload::
5720 * Simple Extension::
5721 * X11 Colors::
5722 * Miscellaneous::
5723 * Mode Line::
5724
5725 
5726 File: eintr, Node: Default Configuration, Next: Site-wide Init, Prev: Emacs Initialization, Up: Emacs Initialization
5727
5728 Emacs' Default Configuration
5729 ============================
5730
5731 There are those who appreciate Emacs' default configuration. After
5732 all, Emacs starts you in C mode when you edit a C file, starts you in
5733 Fortran mode when you edit a Fortran file, and starts you in
5734 Fundamental mode when you edit an unadorned file. This all makes
5735 sense, if you do not know who is going to use Emacs. Who knows what a
5736 person hopes to do with an unadorned file? Fundamental mode is the
5737 right default for such a file, just as C mode is the right default for
5738 editing C code. (Enough programming languages have syntaxes that
5739 enable them to share or nearly share features, so C mode is now
5740 provided by CC mode, the `C Collection'.)
5741
5742 But when you do know who is going to use Emacs--you, yourself--then it
5743 makes sense to customize Emacs.
5744
5745 For example, I seldom want Fundamental mode when I edit an otherwise
5746 undistinguished file; I want Text mode. This is why I customize Emacs:
5747 so it suits me.
5748
5749 You can customize and extend Emacs by writing or adapting a `~/.emacs'
5750 file. This is your personal initialization file; its contents, written
5751 in Emacs Lisp, tell Emacs what to do.(1)
5752
5753 A `~/.emacs' file contains Emacs Lisp code. You can write this code
5754 yourself; or you can use Emacs' `customize' feature to write the code
5755 for you. You can combine your own expressions and auto-written
5756 Customize expressions in your `.emacs' file.
5757
5758 (I myself prefer to write my own expressions, except for those,
5759 particularly fonts, that I find easier to manipulate using the
5760 `customize' command. I combine the two methods.)
5761
5762 Most of this chapter is about writing expressions yourself. It
5763 describes a simple `.emacs' file; for more information, see *Note The
5764 Init File: (emacs)Init File, and *Note The Init File: (elisp)Init File.
5765
5766 ---------- Footnotes ----------
5767
5768 (1) You may also add `.el' to `~/.emacs' and call it a `~/.emacs.el'
5769 file. In the past, you were forbidden to type the extra keystrokes
5770 that the name `~/.emacs.el' requires, but now you may. The new format
5771 is consistent with the Emacs Lisp file naming conventions; the old
5772 format saves typing.
5773
5774 
5775 File: eintr, Node: Site-wide Init, Next: defcustom, Prev: Default Configuration, Up: Emacs Initialization
5776
5777 16.1 Site-wide Initialization Files
5778 ===================================
5779
5780 In addition to your personal initialization file, Emacs automatically
5781 loads various site-wide initialization files, if they exist. These
5782 have the same form as your `.emacs' file, but are loaded by everyone.
5783
5784 Two site-wide initialization files, `site-load.el' and `site-init.el',
5785 are loaded into Emacs and then `dumped' if a `dumped' version of Emacs
5786 is created, as is most common. (Dumped copies of Emacs load more
5787 quickly. However, once a file is loaded and dumped, a change to it
5788 does not lead to a change in Emacs unless you load it yourself or
5789 re-dump Emacs. *Note Building Emacs: (elisp)Building Emacs, and the
5790 `INSTALL' file.)
5791
5792 Three other site-wide initialization files are loaded automatically
5793 each time you start Emacs, if they exist. These are `site-start.el',
5794 which is loaded _before_ your `.emacs' file, and `default.el', and the
5795 terminal type file, which are both loaded _after_ your `.emacs' file.
5796
5797 Settings and definitions in your `.emacs' file will overwrite
5798 conflicting settings and definitions in a `site-start.el' file, if it
5799 exists; but the settings and definitions in a `default.el' or terminal
5800 type file will overwrite those in your `.emacs' file. (You can prevent
5801 interference from a terminal type file by setting `term-file-prefix' to
5802 `nil'. *Note A Simple Extension: Simple Extension.)
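
As a small illustration of the load order, a `.emacs' file can opt
out of the files that would otherwise be loaded after it.  This is a
sketch under the assumption that you want neither file; the variable
`inhibit-default-init' is not discussed in this chapter, but it is
the standard variable that controls whether `default.el' is loaded.

```elisp
;; In ~/.emacs: a sketch, assuming you want to skip both files.

;; Do not load the site's `default.el' after this file.
(setq inhibit-default-init t)

;; Do not load a terminal type file after this file.
(setq term-file-prefix nil)
```

Neither setting affects `site-start.el', which has already been
loaded by the time your `.emacs' file is read.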
5803
5804 The `INSTALL' file that comes in the distribution contains descriptions
5805 of the `site-init.el' and `site-load.el' files.
5806
5807 The `loadup.el', `startup.el', and `loaddefs.el' files control loading.
5808 These files are in the `lisp' directory of the Emacs distribution and
5809 are worth perusing.
5810
5811 The `loaddefs.el' file contains a good many suggestions as to what to
5812 put into your own `.emacs' file, or into a site-wide initialization
5813 file.
5814
5815 
5816 File: eintr, Node: defcustom, Next: Beginning a .emacs File, Prev: Site-wide Init, Up: Emacs Initialization
5817
5818 16.2 Specifying Variables using `defcustom'
5819 ===========================================
5820
5821 You can specify variables using `defcustom' so that you and others can
5822 then use Emacs' `customize' feature to set their values. (You cannot
5823 use `customize' to write function definitions; but you can write
5824 `defuns' in your `.emacs' file. Indeed, you can write any Lisp
5825 expression in your `.emacs' file.)
5826
5827 The `customize' feature depends on the `defcustom' special form.
5828 Although you can use `defvar' or `setq' for variables that users set,
5829 the `defcustom' special form is designed for the job.
5830
5831 You can use your knowledge of `defvar' for writing the first three
5832 arguments for `defcustom'. The first argument to `defcustom' is the
5833 name of the variable. The second argument is the variable's initial
5834 value, if any; and this value is set only if the value has not already
5835 been set. The third argument is the documentation.
5836
5837 The fourth and subsequent arguments to `defcustom' specify types and
5838 options; these are not featured in `defvar'. (These arguments are
5839 optional.)
5840
5841 Each of these arguments consists of a keyword followed by a value.
5842 Each keyword starts with the colon character `:'.
5843
5844 For example, the customizable user option variable `text-mode-hook'
5845 looks like this:
5846
5847 (defcustom text-mode-hook nil
5848   "Normal hook run when entering Text mode and many related modes."
5849   :type 'hook
5850   :options '(turn-on-auto-fill flyspell-mode)
5851   :group 'data)
5852
5853 The name of the variable is `text-mode-hook'; its default value is `nil';
5854 and its documentation string tells you what it does.
5855
5856 The `:type' keyword tells Emacs the kind of data to which
5857 `text-mode-hook' should be set and how to display the value in a
5858 Customization buffer.
5859
5860 The `:options' keyword specifies a suggested list of values for the
5861 variable. Currently, you can use `:options' only for a hook. The list
5862 is only a suggestion; it is not exclusive; a person who sets the
5863 variable may set it to other values; the list shown following the
5864 `:options' keyword is intended to offer convenient choices to a user.
5865
5866 Finally, the `:group' keyword tells the Emacs Customization command in
5867 which group the variable is located. This tells where to find it.
5868
5869 For more information, see *Note Writing Customization Definitions:
5870 (elisp)Customization.
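
To see the same pieces in a definition of your own, here is a hedged
sketch that turns a graph symbol into a customizable option.  The
group `my-graph' and the variable `my-graph-mark' are hypothetical
names invented for this example; the parent group `convenience' is
simply one plausible place to put them.

```elisp
(defgroup my-graph nil
  "Hypothetical customization group for graph printing."
  :group 'convenience)

(defcustom my-graph-mark "*"
  "String used as symbol in graph, usually an asterisk."
  :type 'string        ; Customize shows a text field for the value.
  :group 'my-graph)    ; Customize lists the option in this group.
```

After evaluating these two expressions, `M-x customize-variable
<RET> my-graph-mark <RET>' should bring up a Customization buffer
for the new option.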
5871
5872 Consider `text-mode-hook' as an example.
5873
5874 There are two ways to customize this variable. You can use the
5875 customization command or write the appropriate expressions yourself.
5876
5877 Using the customization command, you can type:
5878
5879 M-x customize
5880
5881 and find that the group for editing files of data is called `data'.
5882 Enter that group. Text Mode Hook is the first member. You can click
5883 on its various options, such as `turn-on-auto-fill', to set the values.
5884 After you click on the button to
5885
5886 Save for Future Sessions
5887
5888 Emacs will write an expression into your `.emacs' file. It will look
5889 like this:
5890
5891 (custom-set-variables
5892   ;; custom-set-variables was added by Custom.
5893   ;; If you edit it by hand, you could mess it up, so be careful.
5894   ;; Your init file should contain only one such instance.
5895   ;; If there is more than one, they won't work right.
5896  '(text-mode-hook (quote (turn-on-auto-fill text-mode-hook-identify))))
5897
5898 (The `text-mode-hook-identify' function tells
5899 `toggle-text-mode-auto-fill' which buffers are in Text mode. It comes
5900 on automatically.)
5901
5902 The `custom-set-variables' function works somewhat differently than a
5903 `setq'. While I have never learned the differences, I modify the
5904 `custom-set-variables' expressions in my `.emacs' file by hand: I make
5905 the changes in what appears to me to be a reasonable manner and have
5906 not had any problems. Others prefer to use the Customization command
5907 and let Emacs do the work for them.
5908
5909 Another `custom-set-...' function is `custom-set-faces'. This function
5910 sets the various font faces. Over time, I have set a considerable
5911 number of faces. Some of the time, I re-set them using `customize';
5912 other times, I simply edit the `custom-set-faces' expression in my
5913 `.emacs' file itself.
5914
5915 The second way to customize your `text-mode-hook' is to set it yourself
5916 in your `.emacs' file using code that has nothing to do with the
5917 `custom-set-...' functions.
5918
5919 When you do this, and later use `customize', you will see a message
5920 that says
5921
5922 CHANGED outside Customize; operating on it here may be unreliable.
5923
5924 This message is only a warning. If you click on the button to
5925
5926 Save for Future Sessions
5927
5928 Emacs will write a `custom-set-...' expression near the end of your
5929 `.emacs' file that will be evaluated after your hand-written
5930 expression. It will, therefore, overrule your hand-written expression.
5931 No harm will be done. When you do this, however, be careful to
5932 remember which expression is active; if you forget, you may confuse
5933 yourself.
5934
5935 So long as you remember where the values are set, you will have no
5936 trouble. In any event, the values are always set in your
5937 initialization file, which is usually called `.emacs'.
5938
5939 I myself use `customize' for hardly anything. Mostly, I write
5940 expressions myself.
5941
5942 Incidentally, `defsubst' defines an inline function. The syntax is
5943 just like that of `defun'. `defconst' defines a symbol as a constant.
5944 The intent is that neither programs nor users should ever change a
5945 value set by `defconst'.
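
A brief sketch of both forms follows.  The names
`my-graph-default-symbol' and `my-number-of-top-blanks' are
hypothetical and serve only to show the syntax.

```elisp
(defconst my-graph-default-symbol "*"
  "Hypothetical constant: the symbol a graph uses unless told otherwise.")

(defsubst my-number-of-top-blanks (max-graph-height actual-height)
  "Hypothetical inline helper: count the blanks above a column."
  (- max-graph-height actual-height))

(my-number-of-top-blanks 5 3)
```

The last expression returns 2.  A function defined with `defsubst'
is called just like one defined with `defun'; the difference shows
only when the calling code is byte-compiled.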
5946
5947 
5948 File: eintr, Node: Beginning a .emacs File, Next: Text and Auto-fill, Prev: defcustom, Up: Emacs Initialization
5949
5950 16.3 Beginning a `.emacs' File
5951 ==============================
5952
5953 When you start Emacs, it loads your `.emacs' file unless you tell it
5954 not to by specifying `-q' on the command line. (The `emacs -q' command
5955 gives you a plain, out-of-the-box Emacs.)
5956
5957 A `.emacs' file contains Lisp expressions. Often, these are no more
5958 than expressions to set values; sometimes they are function definitions.
5959
5960 *Note The Init File `~/.emacs': (emacs)Init File, for a short
5961 description of initialization files.
5962
5963 This chapter goes over some of the same ground, but is a walk among
5964 extracts from a complete, long-used `.emacs' file--my own.
5965
5966 The first part of the file consists of comments: reminders to myself.
5967 By now, of course, I remember these things, but when I started, I did
5968 not.
5969
5970 ;;;; Bob's .emacs file
5971 ; Robert J. Chassell
5972 ; 26 September 1985
5973
5974 Look at that date! I started this file a long time ago. I have been
5975 adding to it ever since.
5976
5977 ; Each section in this file is introduced by a
5978 ; line beginning with four semicolons; and each
5979 ; entry is introduced by a line beginning with
5980 ; three semicolons.
5981
5982 This describes the usual conventions for comments in Emacs Lisp.
5983 Everything on a line that follows a semicolon is a comment. Two,
5984 three, and four semicolons are used as section and subsection markers.
5985 (*Note Comments: (elisp)Comments, for more about comments.)
5986
5987 ;;;; The Help Key
5988 ; Control-h is the help key;
5989 ; after typing control-h, type a letter to
5990 ; indicate the subject about which you want help.
5991 ; For an explanation of the help facility,
5992 ; type control-h two times in a row.
5993
5994 Just remember: type `C-h' two times for help.
5995
5996 ; To find out about any mode, type control-h m
5997 ; while in that mode. For example, to find out
5998 ; about mail mode, enter mail mode and then type
5999 ; control-h m.
6000
6001 `Mode help', as I call this, is very helpful. Usually, it tells you
6002 all you need to know.
6003
6004 Of course, you don't need to include comments like these in your
6005 `.emacs' file. I included them in mine because I kept forgetting about
6006 Mode help or the conventions for comments--but I was able to remember
6007 to look here to remind myself.
6008
6009 
6010 File: eintr, Node: Text and Auto-fill, Next: Mail Aliases, Prev: Beginning a .emacs File, Up: Emacs Initialization
6011
6012 16.4 Text and Auto Fill Mode
6013 ============================
6014
6015 Now we come to the part that `turns on' Text mode and Auto Fill mode.
6016
6017 ;;; Text mode and Auto Fill mode
6018 ; The next two lines put Emacs into Text mode
6019 ; and Auto Fill mode, and are for writers who
6020 ; want to start writing prose rather than code.
6021
6022 (setq default-major-mode 'text-mode)
6023 (add-hook 'text-mode-hook 'turn-on-auto-fill)
6024
6025 Here is the first part of this `.emacs' file that does something
6026 besides remind a forgetful human!
6027
6028 The first of the two lines in parentheses tells Emacs to turn on Text
6029 mode when you find a file, _unless_ that file should go into some other
6030 mode, such as C mode.
6031
6032 When Emacs reads a file, it looks at the extension to the file name, if
6033 any. (The extension is the part that comes after a `.'.) If the file
6034 ends with a `.c' or `.h' extension then Emacs turns on C mode. Also,
6035 Emacs looks at the first nonblank line of the file; if the line says
6036 `-*- C -*-', Emacs turns on C mode. Emacs possesses a list of
6037 extensions and specifications that it uses automatically. In addition,
6038 Emacs looks near the last page for a per-buffer, "local variables
6039 list", if any.
6040
6041 *Note How Major Modes are Chosen: (emacs)Choosing Modes.
6042
6043 *Note Local Variables in Files: (emacs)File Variables.
6044
6045 Now, back to the `.emacs' file.
6046
6047 Here is the line again; how does it work?
6048
6049 (setq default-major-mode 'text-mode)
6050
6051 This line is a short, but complete Emacs Lisp expression.
6052
6053 We are already familiar with `setq'. It sets the following variable,
6054 `default-major-mode', to the subsequent value, which is `text-mode'.
6055 The single quote mark before `text-mode' tells Emacs to deal directly
6056 with the `text-mode' symbol, not with whatever it might stand for.
6057 *Note Setting the Value of a Variable: set & setq, for a reminder of
6058 how `setq' works. The main point is that there is no difference
6059 between the procedure you use to set a value in your `.emacs' file and
6060 the procedure you use anywhere else in Emacs.
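
A minimal sketch of what the quote does. The variable name `my-demo-mode-choice' is hypothetical and is not part of the `.emacs' file discussed here.

```elisp
;; Hypothetical variable, used only for illustration.
(defvar my-demo-mode-choice nil
  "Holds a major mode symbol for this demonstration.")

;; The quote hands `setq' the symbol `text-mode' itself,
;; rather than the value of a variable named `text-mode'.
(setq my-demo-mode-choice 'text-mode)

(symbolp my-demo-mode-choice)   ; non-nil: the value is a symbol
(fboundp my-demo-mode-choice)   ; non-nil: that symbol names a function
```

In Emacs 23.2 and later, `default-major-mode' is marked obsolete; the equivalent there is `(setq-default major-mode 'text-mode)'.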
6061
6062 Here is the next line:
6063
6064 (add-hook 'text-mode-hook 'turn-on-auto-fill)
6065
6066 In this line, the `add-hook' command adds `turn-on-auto-fill' to the
6067 variable `text-mode-hook'.
6068
6069 `turn-on-auto-fill' is the name of a program, that, you guessed it!,
6070 turns on Auto Fill mode.
6071
6072 Every time Emacs turns on Text mode, Emacs runs the commands `hooked'
6073 onto Text mode. So every time Emacs turns on Text mode, Emacs also
6074 turns on Auto Fill mode.
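
The hook mechanism can be sketched on its own, without Text mode. Every name beginning with `my-demo-' below is hypothetical; `my-demo-note-fill' merely stands in for a function such as `turn-on-auto-fill'.

```elisp
;; Hypothetical hook variable and functions, for illustration only.
(defvar my-demo-hook nil
  "A hook run by `my-demo-run'.")

(defvar my-demo-log nil
  "Records which hook functions have run, most recent first.")

(defun my-demo-note-fill ()
  "Stand-in for a function such as `turn-on-auto-fill'."
  (push 'fill my-demo-log))

;; Adding the same function twice leaves only one copy on the hook.
(add-hook 'my-demo-hook 'my-demo-note-fill)
(add-hook 'my-demo-hook 'my-demo-note-fill)

(defun my-demo-run ()
  "Run every function hooked onto `my-demo-hook'."
  (run-hooks 'my-demo-hook))

(my-demo-run)
```

A major mode does much the same thing as `my-demo-run' when it starts: it runs whatever functions are on its hook.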
6075
6076 In brief, the first line causes Emacs to enter Text mode when you edit a
6077 file, unless the file name extension, a first non-blank line, or local
6078 variables tell Emacs otherwise.
6079
6080 Text mode, among other actions, sets the syntax table to work
6081 conveniently for writers. In Text mode, Emacs considers an apostrophe
6082 as part of a word like a letter; but Emacs does not consider a period
6083 or a space as part of a word. Thus, `M-f' moves you over `it's'. On
6084 the other hand, in C mode, `M-f' stops just after the `t' of `it's'.
6085
6086 The second line causes Emacs to turn on Auto Fill mode when it turns on
6087 Text mode. In Auto Fill mode, Emacs automatically breaks a line that
6088 is too wide and brings the excessively wide part of the line down to
6089 the next line. Emacs breaks lines between words, not within them.
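
The line breaking can be tried without typing. The sketch below does not turn on Auto Fill mode; it assumes that `fill-region', which fills to the same `fill-column' width, is a fair stand-in. `my-demo-fill' is a hypothetical helper.

```elisp
(defun my-demo-fill (text width)
  "Return TEXT refilled so that no line is wider than WIDTH columns."
  (with-temp-buffer
    (insert text)
    ;; `fill-column' is the width at which lines are broken.
    (setq fill-column width)
    (fill-region (point-min) (point-max))
    (buffer-string)))

(my-demo-fill "Emacs breaks lines between words, not within them." 20)
```

The result contains the same words in the same order, with newlines in place of some of the spaces.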
6090
6091 When Auto Fill mode is turned off, lines continue to the right as you
6092 type them. Depending on how you set the value of `truncate-lines', the
6093 words you type either disappear off the right side of the screen, or
6094 else are shown, in a rather ugly and unreadable manner, as a
6095 continuation line on the screen.
6096
6097 In addition, in this part of my `.emacs' file, I tell the Emacs fill
6098 commands to insert two spaces after a colon:
6099
6100 (setq colon-double-space t)
6101
6102 
6103 File: eintr, Node: Mail Aliases, Next: Indent Tabs Mode, Prev: Text and Auto-fill, Up: Emacs Initialization
6104
6105 16.5 Mail Aliases
6106 =================
6107
6108 Here is a `setq' that `turns on' mail aliases, along with more
6109 reminders.
6110
6111 ;;; Mail mode
6112 ; To enter mail mode, type `C-x m'
6113 ; To enter RMAIL (for reading mail),
6114 ; type `M-x rmail'
6115
6116 (setq mail-aliases t)
6117
6118 This `setq' command sets the value of the variable `mail-aliases' to
6119 `t'. Since `t' means true, the line says, in effect, "Yes, use mail
6120 aliases."
6121
6122 Mail aliases are convenient short names for long email addresses or for
6123 lists of email addresses. The file where you keep your `aliases' is
6124 `~/.mailrc'. You write an alias like this:
6125
6126 alias geo george@foobar.wiz.edu
6127
6128 When you write a message to George, address it to `geo'; the mailer
6129 will automatically expand `geo' to the full address.
6130
6131 
6132 File: eintr, Node: Indent Tabs Mode, Next: Keybindings, Prev: Mail Aliases, Up: Emacs Initialization
6133
6134 16.6 Indent Tabs Mode
6135 =====================
6136
6137 By default, Emacs inserts tabs in place of multiple spaces when it
6138 formats a region. (For example, you might indent many lines of text
6139 all at once with the `indent-region' command.) Tabs look fine on a
6140 terminal or with ordinary printing, but they produce badly indented
6141 output when you use TeX or Texinfo since TeX ignores tabs.
6142
6143 The following turns off Indent Tabs mode:
6144
6145 ;;; Prevent Extraneous Tabs
6146 (setq-default indent-tabs-mode nil)
6147
6148 Note that this line uses `setq-default' rather than the `setq' command
6149 that we have seen before. The `setq-default' command sets values only
6150 in buffers that do not have their own local values for the variable.
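
A small sketch of the difference between the default value and a buffer's own value. `my-demo-tabs-values' is a hypothetical helper written for this illustration.

```elisp
;; Set the default, as in the `.emacs' file above.
(setq-default indent-tabs-mode nil)

(defun my-demo-tabs-values ()
  "Return a list (LOCAL DEFAULT) of `indent-tabs-mode' values."
  (with-temp-buffer
    ;; `indent-tabs-mode' becomes buffer-local when set with `setq',
    ;; so this changes only the temporary buffer.
    (setq indent-tabs-mode t)
    (list indent-tabs-mode (default-value 'indent-tabs-mode))))

(my-demo-tabs-values)
```

The temporary buffer sees its own value, `t', while the default that every other buffer inherits stays `nil'.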
6151
6152 *Note Tabs vs. Spaces: (emacs)Just Spaces.
6153
6154 *Note Local Variables in Files: (emacs)File Variables.
6155
6156 
6157 File: eintr, Node: Keybindings, Next: Keymaps, Prev: Indent Tabs Mode, Up: Emacs Initialization
6158
6159 16.7 Some Keybindings
6160 =====================
6161
6162 Now for some personal keybindings:
6163
6164 ;;; Compare windows
6165 (global-set-key "\C-cw" 'compare-windows)
6166
6167 `compare-windows' is a nifty command that compares the text in your
6168 current window with text in the next window. It makes the comparison
6169 by starting at point in each window, moving over text in each window as
6170 far as they match. I use this command all the time.
6171
6172 This also shows how to set a key globally, for all modes.
6173
6174 The command is `global-set-key'. It is followed by the keybinding. In
6175 a `.emacs' file, the keybinding is written as shown: `\C-c' stands for
6176 `control-c', which means `press the control key and the `c' key at the
6177 same time'. The `w' means `press the `w' key'. The keybinding is
6178 surrounded by double quotation marks. In documentation, you would
6179 write this as `C-c w'. (If you were binding a <META> key, such as
6180 `M-c', rather than a <CTRL> key, you would write `\M-c'. *Note
6181 Rebinding Keys in Your Init File: (emacs)Init Rebinding, for details.)
6182
6183 The command invoked by the keys is `compare-windows'. Note that
6184 `compare-windows' is preceded by a single quote; otherwise, Emacs would
6185 first try to evaluate the symbol to determine its value.
6186
6187 These three things, the double quotation marks, the backslash before
6188 the `C', and the single quote mark, are necessary parts of a keybinding
6189 that I tend to forget. Fortunately, I have come to remember that I
6190 should look at my existing `.emacs' file, and adapt what is there.
6191
6192 As for the keybinding itself: `C-c w'. This combines the prefix key,
6193 `C-c', with a single character, in this case, `w'. This set of keys,
6194 `C-c' followed by a single character, is strictly reserved for
6195 individuals' own use. (I call these `own' keys, since these are for my
6196 own use.) You should always be able to create such a keybinding for
6197 your own use without stomping on someone else's keybinding. If you
6198 ever write an extension to Emacs, please avoid taking any of these keys
6199 for public use. Create a key like `C-c C-w' instead. Otherwise, we
6200 will run out of `own' keys.
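
If the backslash notation is hard to remember, the `kbd' macro accepts the notation used in documentation instead. This is an alternative sketch, not what the `.emacs' file above uses.

```elisp
;; Same binding as above, written in documentation-style notation.
(global-set-key (kbd "C-c w") 'compare-windows)

;; `key-description' converts the string form back to that notation.
(key-description "\C-cw")

;; `lookup-key' shows which command a key sequence runs in a keymap.
(lookup-key (current-global-map) (kbd "C-c w"))
```

Both spellings name the same key sequence, so either can be used to look up or change the binding later.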
6201
6202 Here is another keybinding, with a comment:
6203
6204 ;;; Keybinding for `occur'
6205 ; I use occur a lot, so let's bind it to a key:
6206 (global-set-key "\C-co" 'occur)
6207
6208 The `occur' command shows all the lines in the current buffer that
6209 contain a match for a regular expression. Matching lines are shown in
6210 a buffer called `*Occur*'. That buffer serves as a menu to jump to
6211 occurrences.
6212
6213 Here is how to unbind a key, so it does not work:
6214
6215 ;;; Unbind `C-x f'
6216 (global-unset-key "\C-xf")
6217
6218 There is a reason for this unbinding: I found I inadvertently typed
6219 `C-x f' when I meant to type `C-x C-f'. Rather than find a file, as I
6220 intended, I accidentally set the width for filled text, almost always
6221 to a width I did not want. Since I hardly ever reset my default width,
6222 I simply unbound the key.
6223
6224 The following rebinds an existing key:
6225
6226 ;;; Rebind `C-x C-b' for `buffer-menu'
6227 (global-set-key "\C-x\C-b" 'buffer-menu)
6228
6229 By default, `C-x C-b' runs the `list-buffers' command. This command
6230 lists your buffers in _another_ window. Since I almost always want to
6231 do something in that window, I prefer the `buffer-menu' command, which
6232 not only lists the buffers, but moves point into that window.
6233
6234 
6235 File: eintr, Node: Keymaps, Next: Loading Files, Prev: Keybindings, Up: Emacs Initialization
6236
6237 16.8 Keymaps
6238 ============
6239
6240 Emacs uses "keymaps" to record which keys call which commands. When
6241 you use `global-set-key' to set the keybinding for a single command in
6242 all parts of Emacs, you are specifying the keybinding in
6243 `current-global-map'.
6244
6245 Specific modes, such as C mode or Text mode, have their own keymaps;
6246 the mode-specific keymaps override the global map that is shared by all
6247 buffers.
6248
6249 The `global-set-key' function binds, or rebinds, a key in the global keymap.
6250 For example, the following binds the key `C-x C-b' to the function
6251 `buffer-menu':
6252
6253 (global-set-key "\C-x\C-b" 'buffer-menu)
6254
6255 Mode-specific keymaps are bound using the `define-key' function, which
6256 takes a specific keymap as an argument, as well as the key and the
6257 command. For example, my `.emacs' file contains the following
6258 expression to bind the `texinfo-insert-@group' command to `C-c C-c g':
6259
6260 (define-key texinfo-mode-map "\C-c\C-cg" 'texinfo-insert-@group)
6261
6262 The `texinfo-insert-@group' function itself is a little extension to
6263 Texinfo mode that inserts `@group' into a Texinfo file. I use this
6264 command all the time and prefer to type the three strokes `C-c C-c g'
6265 rather than the six strokes `@ g r o u p'. (`@group' and its matching
6266 `@end group' are commands that keep all enclosed text together on one
6267 page; many multi-line examples in this book are surrounded by `@group
6268 ... @end group'.)
6269
6270 Here is the `texinfo-insert-@group' function definition:
6271
6272 (defun texinfo-insert-@group ()
6273 "Insert the string @group in a Texinfo buffer."
6274 (interactive)
6275 (beginning-of-line)
6276 (insert "@group\n"))
6277
6278 (Of course, I could have used Abbrev mode to save typing, rather than
6279 write a function to insert a word; but I prefer key strokes consistent
6280 with other Texinfo mode key bindings.)
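
One caution: `texinfo-mode-map' exists only after the Texinfo library has been loaded, so a `define-key' evaluated before then signals an error. A hedged sketch of one way to defer the binding, using `eval-after-load'; the function definition is repeated so the example stands alone.

```elisp
(defun texinfo-insert-@group ()
  "Insert the string @group in a Texinfo buffer."
  (interactive)
  (beginning-of-line)
  (insert "@group\n"))

;; Defer the binding until the `texinfo' library has been loaded;
;; if it is already loaded, the form is evaluated right away.
(eval-after-load "texinfo"
  '(define-key texinfo-mode-map "\C-c\C-cg" 'texinfo-insert-@group))
```

Loading the library explicitly before the `define-key' expression would work as well; deferring simply avoids loading it in sessions that never visit a Texinfo file.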
6281
6282 You will see numerous `define-key' expressions in `loaddefs.el' as well
6283 as in the various mode libraries, such as `cc-mode.el' and
6284 `lisp-mode.el'.
6285
6286 *Note Customizing Key Bindings: (emacs)Key Bindings, and *Note Keymaps:
6287 (elisp)Keymaps, for more information about keymaps.
6288
6289 
6290 File: eintr, Node: Loading Files, Next: Autoload, Prev: Keymaps, Up: Emacs Initialization
6291
6292 16.9 Loading Files
6293 ==================
6294
6295 Many people in the GNU Emacs community have written extensions to
6296 Emacs. As time goes by, these extensions are often included in new
6297 releases. For example, the Calendar and Diary packages are now part of
6298 the standard GNU Emacs, as is Calc.
6299
6300 You can use a `load' command to evaluate a complete file and thereby
6301 install all the functions and variables in the file into Emacs. For
6302 example:
6303
6304 (load "~/emacs/slowsplit")
6305
6306 This evaluates, i.e. loads, the `slowsplit.el' file or, if it exists,
6307 the faster, byte compiled `slowsplit.elc' file from the `emacs'
6308 sub-directory of your home directory. The file contains the function
6309 `split-window-quietly', which John Robinson wrote in 1989.
6310
6311 The `split-window-quietly' function splits a window with the minimum of
6312 redisplay. I installed it in 1989 because it worked well with the slow
6313 1200 baud terminals I was then using. Nowadays, I only occasionally
6314 come across such a slow connection, but I continue to use the function
6315 because I like the way it leaves the bottom half of a buffer in the
6316 lower of the new windows and the top half in the upper window.
6317
6318 To replace the key binding for the default `split-window-vertically',
6319 you must also unset that key and bind the keys to
6320 `split-window-quietly', like this:
6321
6322 (global-unset-key "\C-x2")
6323 (global-set-key "\C-x2" 'split-window-quietly)
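
If the same `.emacs' file is used on a machine where the extension is not installed, the plain `load' signals an error. A minimal sketch of a guard, assuming you would rather keep the default binding in that case; `my-demo-load-if-present' is a hypothetical helper.

```elisp
(defun my-demo-load-if-present (file)
  "Load FILE and return non-nil, or return nil if it cannot be found."
  ;; The second argument, NOERROR, suppresses the missing-file error;
  ;; the third, NOMESSAGE, suppresses the echo-area message.
  (load file t t))

;; Rebind the key only when the library was actually loaded.
(when (my-demo-load-if-present "~/emacs/slowsplit")
  (global-set-key "\C-x2" 'split-window-quietly))
```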
6324
6325 If you load many extensions, as I do, then instead of specifying the
6326 exact location of the extension file, as shown above, you can specify
6327 that directory as part of Emacs' `load-path'. Then, when Emacs loads a
6328 file, it will search that directory as well as its default list of
6329 directories. (The default list is specified in `paths.h' when Emacs is
6330 built.)
6331
6332 The following command adds your `~/emacs' directory to the existing
6333 load path:
6334
6335 ;;; Emacs Load Path
6336 (setq load-path (cons "~/emacs" load-path))
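
An alternative sketch uses `add-to-list', which leaves the list alone when the directory is already on it. `my-demo-count-in-load-path' is a hypothetical helper that only checks the result.

```elisp
;; `add-to-list' adds the directory only if it is not already present,
;; so evaluating the init file twice does not lengthen `load-path'.
(add-to-list 'load-path (expand-file-name "~/emacs"))
(add-to-list 'load-path (expand-file-name "~/emacs"))

(defun my-demo-count-in-load-path (dir)
  "Return how many times DIR appears in `load-path'."
  (let ((count 0))
    (dolist (entry load-path count)
      (when (equal entry dir)
        (setq count (1+ count))))))

(my-demo-count-in-load-path (expand-file-name "~/emacs"))
```

With the `cons' form, by contrast, each evaluation puts another copy of the directory at the front of the list.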
6337
6338 Incidentally, `load-library' is an interactive interface to the `load'
6339 function. The complete function looks like this:
6340
6341 (defun load-library (library)
6342 "Load the library named LIBRARY.
6343 This is an interface to the function `load'."
6344 (interactive
6345 (list (completing-read "Load library: "
6346 'locate-file-completion
6347 (cons load-path (get-load-suffixes)))))
6348 (load library))
6349
6350 The name of the function, `load-library', comes from the use of
6351 `library' as a conventional synonym for `file'. The source for the
6352 `load-library' command is in the `files.el' library.
6353
6354 Another interactive command that does a slightly different job is
6355 `load-file'. *Note Libraries of Lisp Code for Emacs: (emacs)Lisp
6356 Libraries, for information on the distinction between `load-library'
6357 and this command.
6358
6359 
6360 File: eintr, Node: Autoload, Next: Simple Extension, Prev: Loading Files, Up: Emacs Initialization
6361
6362 16.10 Autoloading
6363 =================
6364
6365 Instead of installing a function by loading the file that contains it,
6366 or by evaluating the function definition, you can make the function
6367 available but not actually install it until it is first called. This
6368 is called "autoloading".
6369
6370 When you execute an autoloaded function, Emacs automatically evaluates
6371 the file that contains the definition, and then calls the function.
6372
6373 Emacs starts quicker with autoloaded functions, since their libraries
6374 are not loaded right away; but you need to wait a moment when you first
6375 use such a function, while its containing file is evaluated.
6376
6377 Rarely used functions are frequently autoloaded. The `loaddefs.el'
6378 library contains hundreds of autoloaded functions, from `bookmark-set'
6379 to `wordstar-mode'. Of course, you may come to use a `rare' function
6380 frequently. When you do, you should load that function's file with a
6381 `load' expression in your `.emacs' file.
6382
6383 In my `.emacs' file for Emacs version 22, I load 14 libraries that
6384 contain functions that would otherwise be autoloaded. (Actually, it
6385 would have been better to include these files in my `dumped' Emacs, but
6386 I forgot. *Note Building Emacs: (elisp)Building Emacs, and the
6387 `INSTALL' file for more about dumping.)
6388
6389 You may also want to include autoloaded expressions in your `.emacs'
6390 file. `autoload' is a built-in function that takes up to five
6391 arguments, the final three of which are optional. The first argument
6392 is the name of the function to be autoloaded; the second is the name of
6393 the file to be loaded. The third argument is documentation for the
6394 function, and the fourth tells whether the function can be called
6395 interactively. The fifth argument tells what type of
6396 object--`autoload' can handle a keymap or macro as well as a function
6397 (the default is a function).
6398
6399 Here is a typical example:
6400
6401 (autoload 'html-helper-mode
6402 "html-helper-mode" "Edit HTML documents" t)
6403
6404 (`html-helper-mode' is an alternative to `html-mode', which is a
6405 standard part of the distribution).
6406
6407 This expression autoloads the `html-helper-mode' function. It takes it
6408 from the `html-helper-mode.el' file (or from the byte compiled file
6409 `html-helper-mode.elc', if it exists). The file must be located in a
6410 directory specified by `load-path'. The documentation says that this
6411 is a mode to help you edit documents written in the HyperText Markup
6412 Language. You can call this mode interactively by typing `M-x
6413 html-helper-mode'. (You need to duplicate the function's regular
6414 documentation in the autoload expression because the regular function
6415 is not yet loaded, so its documentation is not available.)
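
What an `autoload' expression sets up can be inspected directly. In this sketch the function and library names are hypothetical, and `autoloadp' assumes a more recent Emacs than the version 22 discussed here.

```elisp
;; Hypothetical function and library names, for illustration only;
;; no file named `my-demo-lib' needs to exist to evaluate this.
(autoload 'my-demo-command "my-demo-lib"
  "Run the demonstration command." t)

(fboundp 'my-demo-command)                     ; non-nil: the name is known
(autoloadp (symbol-function 'my-demo-command)) ; non-nil: not yet loaded
(commandp 'my-demo-command)                    ; non-nil: fourth argument is t
```

Emacs looks for the file only when the function is first called, which is why the expression can be evaluated before the library is installed.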
6416
6417 *Note Autoload: (elisp)Autoload, for more information.
6418
6419 
6420 File: eintr, Node: Simple Extension, Next: X11 Colors, Prev: Autoload, Up: Emacs Initialization
6421
6422 16.11 A Simple Extension: `line-to-top-of-window'
6423 =================================================
6424
6425 Here is a simple extension to Emacs that moves the line point is on to
6426 the top of the window. I use this all the time, to make text easier to
6427 read.
6428
6429 You can put the following code into a separate file and then load it
6430 from your `.emacs' file, or you can include it within your `.emacs'
6431 file.
6432
6433 Here is the definition:
6434
6435 ;;; Line to top of window;
6436 ;;; replace three keystroke sequence C-u 0 C-l
6437 (defun line-to-top-of-window ()
6438 "Move the line point is on to top of window."
6439 (interactive)
6440 (recenter 0))
6441
6442 Now for the keybinding.
6443
6444 Nowadays, function keys as well as mouse button events and non-ASCII
6445 characters are written within square brackets, without quotation marks.
6446 (In Emacs version 18 and before, you had to write different function
6447 key bindings for each different make of terminal.)
6448
6449 I bind `line-to-top-of-window' to my <F6> function key like this:
6450
6451 (global-set-key [f6] 'line-to-top-of-window)
6452
6453 For more information, see *Note Rebinding Keys in Your Init File:
6454 (emacs)Init Rebinding.
6455
6456 If you run two versions of GNU Emacs, such as versions 21 and 22, and
6457 use one `.emacs' file, you can select which code to evaluate with the
6458 following conditional:
6459
6460 (cond
6461 ((string-equal (number-to-string 21) (substring (emacs-version) 10 12))
6462 ;; evaluate version 21 code
6463 ( ... ))
6464 ((string-equal (number-to-string 22) (substring (emacs-version) 10 12))
6465 ;; evaluate version 22 code
6466 ( ... )))
6467
6468 For example, in contrast to version 20, version 21 blinks its cursor by
6469 default. I hate such blinking, as well as some other features in
6470 version 21, so I placed the following in my `.emacs' file(1):
6471
6472 (if (string-equal "21" (substring (emacs-version) 10 12))
6473 (progn
6474 (blink-cursor-mode 0)
6475 ;; Insert newline when you press `C-n' (next-line)
6476 ;; at the end of the buffer
6477 (setq next-line-add-newlines t)
6478 ;; Turn on image viewing
6479 (auto-image-file-mode t)
6480 ;; Turn on menu bar (this bar has text)
6481 ;; (Use numeric argument to turn on)
6482 (menu-bar-mode 1)
6483 ;; Turn off tool bar (this bar has icons)
6484 ;; (Use numeric argument to turn on)
6485 (tool-bar-mode -1)
6486 ;; Turn off tooltip mode for tool bar
6487 ;; (This mode causes icon explanations to pop up)
6488 ;; (Use numeric argument to turn on)
6489 (tooltip-mode -1)
6490 ;; If tooltips turned on, make tips appear promptly
6491 (setq tooltip-delay 0.1) ; default is one second
6492 ))
6493
6494 (You will note that instead of typing `(number-to-string 21)', I
6495 decided to save typing and wrote `21' as a string, `"21"', rather than
6496 convert it from an integer to a string. In this instance, this
6497 expression is better than the longer, but more general
6498 `(number-to-string 21)'. However, if you do not know ahead of time
6499 what type of information will be returned, then the `number-to-string'
6500 function will be needed.)
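
The `substring' test depends on the exact layout of the string that `emacs-version' returns. A hedged alternative is to compare the integer in the variable `emacs-major-version'. `my-demo-version-branch' and the symbols it returns are hypothetical.

```elisp
(defun my-demo-version-branch (major)
  "Return a symbol describing which settings suit Emacs version MAJOR."
  (cond
   ((= major 21) 'version-21-settings)
   ((= major 22) 'version-22-settings)
   (t 'default-settings)))

;; `emacs-major-version' holds the major version as an integer.
(my-demo-version-branch emacs-major-version)
```

In a real `.emacs' file, each branch of the `cond' would hold the expressions to evaluate rather than return a symbol.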
6501
6502 ---------- Footnotes ----------
6503
6504 (1) When I start instances of Emacs that do not load my `.emacs' file
6505 or any site file, I also turn off blinking:
6506
6507 emacs -q --no-site-file -eval '(blink-cursor-mode nil)'
6508
6509 Or nowadays, using an even more sophisticated set of options,
6510
6511 emacs -Q -D
6512
6513 
6514 File: eintr, Node: X11 Colors, Next: Miscellaneous, Prev: Simple Extension, Up: Emacs Initialization
6515
6516 16.12 X11 Colors
6517 ================
6518
6519 You can specify colors when you use Emacs with the MIT X Windowing
6520 system.
6521
6522 I dislike the default colors and specify my own.
6523
6524 Here are the expressions in my `.emacs' file that set values:
6525
6526 ;; Set cursor color
6527 (set-cursor-color "white")
6528
6529 ;; Set mouse color
6530 (set-mouse-color "white")
6531
6532 ;; Set foreground and background
6533 (set-foreground-color "white")
6534 (set-background-color "darkblue")
6535
6536 ;;; Set highlighting colors for isearch and drag
6537 (set-face-foreground 'highlight "white")
6538 (set-face-background 'highlight "blue")
6539
6540 (set-face-foreground 'region "cyan")
6541 (set-face-background 'region "blue")
6542
6543 (set-face-foreground 'secondary-selection "skyblue")
6544 (set-face-background 'secondary-selection "darkblue")
6545
6546 ;; Set calendar highlighting colors
6547 (setq calendar-load-hook
6548 '(lambda ()
6549 (set-face-foreground 'diary-face "skyblue")
6550 (set-face-background 'holiday-face "slate blue")
6551 (set-face-foreground 'holiday-face "white")))
6552
6553 The various shades of blue soothe my eye and prevent me from seeing the
6554 screen flicker.
6555
6556 Alternatively, I could have set my specifications in various X
6557 initialization files. For example, I could set the foreground,
6558 background, cursor, and pointer (i.e., mouse) colors in my
6559 `~/.Xresources' file like this:
6560
6561 Emacs*foreground: white
6562 Emacs*background: darkblue
6563 Emacs*cursorColor: white
6564 Emacs*pointerColor: white
6565
6566 In any event, since it is not part of Emacs, I set the root color of my
6567 X window in my `~/.xinitrc' file, like this(1):
6568
6569 xsetroot -solid Navy -fg white &
6570
6571 ---------- Footnotes ----------
6572
6573 (1) I also run more modern window managers, such as Enlightenment,
6574 Gnome, or KDE; in those cases, I often specify an image rather than a
6575 plain color.
6576
6577 
6578 File: eintr, Node: Miscellaneous, Next: Mode Line, Prev: X11 Colors, Up: Emacs Initialization
6579
6580 16.13 Miscellaneous Settings for a `.emacs' File
6581 ================================================
6582
6583 Here are a few miscellaneous settings:
6584
6585 - Set the shape and color of the mouse cursor:
6586
6587 ; Cursor shapes are defined in
6588 ; `/usr/include/X11/cursorfont.h';
6589 ; for example, the `target' cursor is number 128;
6590 ; the `top_left_arrow' cursor is number 132.
6591
6592 (let ((mpointer (x-get-resource "*mpointer"
6593 "*emacs*mpointer")))
6594 ;; If you have not set your mouse pointer
6595 ;; then set it, otherwise leave as is:
6596 (if (eq mpointer nil)
6597 (setq mpointer "132")) ; top_left_arrow
6598 (setq x-pointer-shape (string-to-number mpointer))
6599 (set-mouse-color "white"))
6600
6601 - Or you can set the values of a variety of features in an alist,
6602 like this:
6603
6604 (setq-default
6605 default-frame-alist
6606 '((cursor-color . "white")
6607 (mouse-color . "white")
6608 (foreground-color . "white")
6609 (background-color . "DodgerBlue4")
6610 ;; (cursor-type . bar)
6611 (cursor-type . box)
6612 (tool-bar-lines . 0)
6613 (menu-bar-lines . 1)
6614 (width . 80)
6615 (height . 58)
6616 (font .
6617 "-Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-1")
6618 ))
6619
6620 - Convert `<CTRL>-h' into <DEL> and <DEL> into `<CTRL>-h'.
6621 (Some older keyboards needed this, although I have not seen the
6622 problem recently.)
6623
6624 ;; Translate `C-h' to <DEL>.
6625 ; (keyboard-translate ?\C-h ?\C-?)
6626
6627 ;; Translate <DEL> to `C-h'.
6628 (keyboard-translate ?\C-? ?\C-h)
6629
6630 - Turn off a blinking cursor!
6631
6632 (if (fboundp 'blink-cursor-mode)
6633 (blink-cursor-mode -1))
6634
6635 or start GNU Emacs with the command `emacs -nbc'.
6636
6637 - Ignore case when using `grep'
6638 `-n' Prefix each line of output with line number
6639 `-i' Ignore case distinctions
6640 `-e' Protect patterns beginning with a hyphen character, `-'
6641
6642 (setq grep-command "grep -n -i -e ")
6643
6644 - Find an existing buffer, even if it has a different name
6645 This avoids problems with symbolic links.
6646
6647 (setq find-file-existing-other-name t)
6648
6649 - Set your language environment and default input method
6650
6651 (set-language-environment "latin-1")
6652 ;; Remember you can enable or disable multilingual text input
6653 ;; with the `toggle-input-method' (C-\) command
6654 (setq default-input-method "latin-1-prefix")
6655
6656 If you want to write with Chinese `GB' characters, set this
6657 instead:
6658
6659 (set-language-environment "Chinese-GB")
6660 (setq default-input-method "chinese-tonepy")
6661
6662 Fixing Unpleasant Key Bindings
6663 ..............................
6664
6665 Some systems bind keys unpleasantly. Sometimes, for example, the
6666 <CTRL> key appears in an awkward spot rather than at the far left of
6667 the home row.
6668
6669 Usually, when people fix these sorts of keybindings, they do not change
6670 their `~/.emacs' file. Instead, they bind the proper keys on their
6671 consoles with the `loadkeys' or `install-keymap' commands in their boot
6672 script and then include `xmodmap' commands in their `.xinitrc' or
6673 `.Xsession' file for X Windows.
6674
6675 For a boot script:
6676
6677 loadkeys /usr/share/keymaps/i386/qwerty/emacs2.kmap.gz
6678
6679 or
6680
6681 install-keymap emacs2
6682
6683 For a `.xinitrc' or `.Xsession' file when the <Caps Lock> key is at the
6684 far left of the home row:
6685
6686 # Bind the key labeled `Caps Lock' to `Control'
6687 # (Such a broken user interface suggests that keyboard manufacturers
6688 # think that computers are typewriters from 1885.)
6689
6690 xmodmap -e "clear Lock"
6691 xmodmap -e "add Control = Caps_Lock"
6692
6693 In a `.xinitrc' or `.Xsession' file, to convert an <ALT> key to a
6694 <META> key:
6695
6696 # Some ill designed keyboards have a key labeled ALT and no Meta
6697 xmodmap -e "keysym Alt_L = Meta_L Alt_L"
6698
6699 
6700 File: eintr, Node: Mode Line, Prev: Miscellaneous, Up: Emacs Initialization
6701
6702 16.14 A Modified Mode Line
6703 ==========================
6704
6705 Finally, a feature I really like: a modified mode line.
6706
6707 When I work over a network, I forget which machine I am using. Also, I
6708 tend to lose track of where I am, and which line point is on.
6709
6710 So I reset my mode line to look like this:
6711
6712 -:-- foo.texi rattlesnake:/home/bob/ Line 1 (Texinfo Fill) Top
6713
6714 I am visiting a file called `foo.texi', on my machine `rattlesnake' in
6715 my `/home/bob' directory. I am on line 1, in Texinfo mode, and am at the
6716 top of the buffer.
6717
6718 My `.emacs' file has a section that looks like this:
6719
6720 ;; Set a Mode Line that tells me which machine, which directory,
6721 ;; and which line I am on, plus the other customary information.
6722 (setq default-mode-line-format
6723 (quote
6724 (#("-" 0 1
6725 (help-echo
6726 "mouse-1: select window, mouse-2: delete others ..."))
6727 mode-line-mule-info
6728 mode-line-modified
6729 mode-line-frame-identification
6730 " "
6731 mode-line-buffer-identification
6732 " "
6733 (:eval (substring
6734 (system-name) 0 (string-match "\\..+" (system-name))))
6735 ":"
6736 default-directory
6737 #(" " 0 1
6738 (help-echo
6739 "mouse-1: select window, mouse-2: delete others ..."))
6740 (line-number-mode " Line %l ")
6741 global-mode-string
6742 #(" %[(" 0 6
6743 (help-echo
6744 "mouse-1: select window, mouse-2: delete others ..."))
6745 (:eval (mode-line-mode-name))
6746 mode-line-process
6747 minor-mode-alist
6748 #("%n" 0 2 (help-echo "mouse-2: widen" local-map (keymap ...)))
6749 ")%] "
6750 (-3 . "%P")
6751 ;; "-%-"
6752 )))
6753
6754 Here, I redefine the default mode line. Most of the parts are from the
6755 original; but I make a few changes. I set the _default_ mode line
6756 format so as to permit various modes, such as Info, to override it.
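
In Emacs 23.2 and later, `default-mode-line-format' is marked obsolete, and the default is set with `setq-default' on `mode-line-format' instead. A much shortened sketch, not the full format above; `my-demo-local-mode-line' is a hypothetical helper.

```elisp
;; A much shorter format than the one above, for illustration only.
(setq-default mode-line-format
              '("-"
                mode-line-modified
                " "
                mode-line-buffer-identification
                " "
                (line-number-mode " Line %l ")
                (-3 . "%P")))

(defun my-demo-local-mode-line ()
  "Return a buffer's local format and whether it equals the default."
  (with-temp-buffer
    ;; A buffer can still override the default with a local value.
    (setq mode-line-format '("local: %b"))
    (list mode-line-format
          (equal mode-line-format (default-value 'mode-line-format)))))

(my-demo-local-mode-line)
```

Setting only the default is what lets a mode such as Info install its own mode line.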
6757
6758 Many elements in the list are self-explanatory: `mode-line-modified' is
6759 a variable that tells whether the buffer has been modified, `mode-name'
6760 tells the name of the mode, and so on. However, the format looks
6761 complicated because of two features we have not discussed.
6762
6763 The first string in the mode line is a dash or hyphen, `-'. In the old
6764 days, it would have been specified simply as `"-"'. But nowadays,
6765 Emacs can add properties to a string, such as highlighting or, as in
6766 this case, a help feature. If you place your mouse cursor over the
6767 hyphen, some help information appears. (By default, you must wait
6768 seven-tenths of a second before the information appears. You can
6769 change that timing by changing the value of `tooltip-delay'.)
6770
6771 The new string format has a special syntax:
6772
6773 #("-" 0 1 (help-echo "mouse-1: select window, ..."))
6774
6775 The `#(' begins a list. The first element of the list is the string
6776 itself, just one `-'. The second and third elements specify the range
6777 over which the fourth element applies. A range starts _after_ a
6778 character, so a zero means the range starts just before the first
6779 character; a 1 means that the range ends just after the first
6780 character. The fourth element is the property for the range. It
6781 consists of a property list, a property name, in this case,
6782 `help-echo', followed by a value, in this case, a string. The second,
6783 third, and fourth elements of this new string format can be repeated.
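
The same kind of string can be built with functions rather than written in the `#(' syntax by hand. A minimal sketch; the variable names are hypothetical.

```elisp
;; Build the string with `propertize' instead of the `#(...)' syntax.
(defvar my-demo-dash
  (propertize "-" 'help-echo "mouse-1: select window")
  "A hyphen carrying a `help-echo' property.")

;; Position 0 is the first character, the start of the range 0 to 1.
(get-text-property 0 'help-echo my-demo-dash)

;; Properties can also be put on part of a longer string:
;; here only on the range 1 to 3, that is, on the characters "bc".
(defvar my-demo-partial
  (let ((s (copy-sequence "abcd")))
    (put-text-property 1 3 'help-echo "middle" s)
    s)
  "A string with a property on its second and third characters.")
```

As in the read syntax, a range is given by the position just before its first character and the position just after its last.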
6784
6785 *Note Text Properties: (elisp)Text Properties, and see *Note Mode Line
6786 Format: (elisp)Mode Line Format, for more information.
6787
6788 `mode-line-buffer-identification' displays the current buffer name. It
6789 is a list beginning `(#("%12b" 0 4 ...'. The `#(' begins the list.
6790
6791 The `"%12b"' displays the current buffer name, using the `buffer-name'
6792 function with which we are familiar; the `12' specifies the minimum
6793 number of characters that will be displayed. When a name has fewer
6794 characters, whitespace is added to fill out to this number. (Buffer
6795 names can and often should be longer than 12 characters; this length
6796 works well in a typical 80 column wide window.)
6797
6798 `:eval' was a new feature in GNU Emacs version 21. It says to evaluate
6799 the following form and use the result as a string to display. In this
6800 case, the expression displays the first component of the full system
6801 name. The end of the first component is a `.' (`period'), so I use the
6802 `string-match' function to tell me the length of the first component.
6803 The substring from the zeroth character to that length is the name of
6804 the machine.
6805
6806 This is the expression:
6807
6808 (:eval (substring
6809 (system-name) 0 (string-match "\\..+" (system-name))))
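
To see what the expression computes without depending on your own
machine's name, here is a small sketch that applies the same
`substring' and `string-match' combination to a string you pass in.
The function name `my-short-host-name' and the host name in the
example are invented for the illustration.

```elisp
;; `my-short-host-name' is a hypothetical helper, not part of Emacs.
(defun my-short-host-name (full-name)
  "Return the part of FULL-NAME that comes before its first period.
If FULL-NAME contains no period, return all of it."
  ;; `string-match' returns the index of the first period that is
  ;; followed by at least one character, or nil if there is none.
  ;; A nil third argument tells `substring' to go to the end.
  (substring full-name 0 (string-match "\\..+" full-name)))

(my-short-host-name "rattlesnake.example.org")
```

The call returns `"rattlesnake"', the first component of the name.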
6810
6811 `%[' and `%]' cause a pair of square brackets to appear for each
6812 recursive editing level. `%n' says `Narrow' when narrowing is in
6813 effect. `%P' tells you the percentage of the buffer that is above the
6814 bottom of the window, or `Top', `Bottom', or `All'. (A lower case `p'
6815 tells you the percentage above the _top_ of the window.) `%-' inserts
6816 enough dashes to fill out the line.
6817
6818 Remember, "You don't have to like Emacs to like it" -- your own Emacs
6819 can have different colors, different commands, and different keys than
6820 a default Emacs.
6821
6822 On the other hand, if you want to bring up a plain `out of the box'
6823 Emacs, with no customization, type:
6824
6825 emacs -q
6826
6827 This will start an Emacs that does _not_ load your `~/.emacs'
6828 initialization file. A plain, default Emacs. Nothing more.
6829
6830 
6831 File: eintr, Node: Debugging, Next: Conclusion, Prev: Emacs Initialization, Up: Top
6832
6833 17 Debugging
6834 ************
6835
6836 GNU Emacs has two debuggers, `debug' and `edebug'. The first is built
6837 into the internals of Emacs and is always with you; the second requires
6838 that you instrument a function before you can use it.
6839
6840 Both debuggers are described extensively in *Note Debugging Lisp
6841 Programs: (elisp)Debugging. In this chapter, I will walk through a
6842 short example of each.
6843
6844 * Menu:
6845
6846 * debug::
6847 * debug-on-entry::
6848 * debug-on-quit::
6849 * edebug::
6850 * Debugging Exercises::
6851
6852 
6853 File: eintr, Node: debug, Next: debug-on-entry, Prev: Debugging, Up: Debugging
6854
6855 17.1 `debug'
6856 ============
6857
6858 Suppose you have written a function definition that is intended to
6859 return the sum of the numbers 1 through a given number. (This is the
6860 `triangle' function discussed earlier. *Note Example with Decrementing
6861 Counter: Decrementing Example, for a discussion.)
6862
6863 However, your function definition has a bug. You have mistyped `1='
6864 for `1-'. Here is the broken definition:
6865
6866 (defun triangle-bugged (number)
6867 "Return sum of numbers 1 through NUMBER inclusive."
6868 (let ((total 0))
6869 (while (> number 0)
6870 (setq total (+ total number))
6871 (setq number (1= number))) ; Error here.
6872 total))
6873
6874 If you are reading this in Info, you can evaluate this definition in
6875 the normal fashion. You will see `triangle-bugged' appear in the echo
6876 area.
6877
6878 Now evaluate the `triangle-bugged' function with an argument of 4:
6879
6880 (triangle-bugged 4)
6881
6882 In GNU Emacs version 21, you will create and enter a `*Backtrace*'
6883 buffer that says:
6884
6885
6886 ---------- Buffer: *Backtrace* ----------
6887 Debugger entered--Lisp error: (void-function 1=)
6888 (1= number)
6889 (setq number (1= number))
6890 (while (> number 0) (setq total (+ total number))
6891 (setq number (1= number)))
6892 (let ((total 0)) (while (> number 0) (setq total ...)
6893 (setq number ...)) total)
6894 triangle-bugged(4)
6895 eval((triangle-bugged 4))
6896 eval-last-sexp-1(nil)
6897 eval-last-sexp(nil)
6898 call-interactively(eval-last-sexp)
6899 ---------- Buffer: *Backtrace* ----------
6900
6901 (I have reformatted this example slightly; the debugger does not fold
6902 long lines. As usual, you can quit the debugger by typing `q' in the
6903 `*Backtrace*' buffer.)
6904
6905 In practice, for a bug as simple as this, the `Lisp error' line will
6906 tell you what you need to know to correct the definition. The function
6907 `1=' is `void'.
6908
6909 However, suppose you are not quite certain what is going on? You can
6910 read the complete backtrace.
6911
6912 In this case, you need to run GNU Emacs 22, which automatically starts
6913 the debugger that puts you in the `*Backtrace*' buffer; or else, you
6914 need to start the debugger manually as described below.
6915
6916 Read the `*Backtrace*' buffer from the bottom up; it tells you what
6917 Emacs did that led to the error. Emacs made an interactive call to
6918 `C-x C-e' (`eval-last-sexp'), which led to the evaluation of the
6919 `triangle-bugged' expression. Each line above tells you what the Lisp
6920 interpreter evaluated next.
6921
6922 The third line from the top of the buffer is
6923
6924 (setq number (1= number))
6925
6926 Emacs tried to evaluate this expression; in order to do so, it tried to
6927 evaluate the inner expression shown on the second line from the top:
6928
6929 (1= number)
6930
6931 This is where the error occurred; as the top line says:
6932
6933 Debugger entered--Lisp error: (void-function 1=)
6934
6935 You can correct the mistake, re-evaluate the function definition, and
6936 then run your test again.
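
For comparison, here is a sketch of the corrected definition. I have
given it the made-up name `triangle-fixed' so that it does not replace
the bugged version you may still want for the next section; the only
change in the body is `1-' in place of `1='.

```elisp
;; `triangle-fixed' is a hypothetical name chosen for this sketch.
(defun triangle-fixed (number)
  "Return sum of numbers 1 through NUMBER inclusive."
  (let ((total 0))
    (while (> number 0)
      (setq total (+ total number))
      (setq number (1- number)))   ; `1-' decrements; `1=' is void.
    total))

(triangle-fixed 4)
```

The test now returns 10, the sum of 4, 3, 2, and 1.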
6937
6938 
6939 File: eintr, Node: debug-on-entry, Next: debug-on-quit, Prev: debug, Up: Debugging
6940
6941 17.2 `debug-on-entry'
6942 =====================
6943
6944 GNU Emacs 22 starts the debugger automatically when your function has
6945 an error.
6946
6947 Incidentally, you can start the debugger manually for all versions of
6948 Emacs; the advantage is that the debugger runs even if you do not have
6949 a bug in your code. Sometimes your code will be free of bugs!
6950
6951 You can enter the debugger when you call the function by calling
6952 `debug-on-entry'.
6953
6954 Type:
6955
6956 M-x debug-on-entry RET triangle-bugged RET
6957
6958 Now, evaluate the following:
6959
6960 (triangle-bugged 5)
6961
6962 All versions of Emacs will create a `*Backtrace*' buffer and tell you
6963 that it is beginning to evaluate the `triangle-bugged' function:
6964
6965 ---------- Buffer: *Backtrace* ----------
6966 Debugger entered--entering a function:
6967 * triangle-bugged(5)
6968 eval((triangle-bugged 5))
6969 eval-last-sexp-1(nil)
6970 eval-last-sexp(nil)
6971 call-interactively(eval-last-sexp)
6972 ---------- Buffer: *Backtrace* ----------
6973
6974 In the `*Backtrace*' buffer, type `d'. Emacs will evaluate the first
6975 expression in `triangle-bugged'; the buffer will look like this:
6976
6977 ---------- Buffer: *Backtrace* ----------
6978 Debugger entered--beginning evaluation of function call form:
6979 * (let ((total 0)) (while (> number 0) (setq total ...)
6980 (setq number ...)) total)
6981 * triangle-bugged(5)
6982 eval((triangle-bugged 5))
6983 eval-last-sexp-1(nil)
6984 eval-last-sexp(nil)
6985 call-interactively(eval-last-sexp)
6986 ---------- Buffer: *Backtrace* ----------
6987
6988 Now, type `d' again, eight times, slowly. Each time you type `d',
6989 Emacs will evaluate another expression in the function definition.
6990
6991 Eventually, the buffer will look like this:
6992
6993 ---------- Buffer: *Backtrace* ----------
6994 Debugger entered--beginning evaluation of function call form:
6995 * (setq number (1= number))
6996 * (while (> number 0) (setq total (+ total number))
6997 (setq number (1= number)))
6998 * (let ((total 0)) (while (> number 0) (setq total ...)
6999 (setq number ...)) total)
7000 * triangle-bugged(5)
7001 eval((triangle-bugged 5))
7002 eval-last-sexp-1(nil)
7003 eval-last-sexp(nil)
7004 call-interactively(eval-last-sexp)
7005 ---------- Buffer: *Backtrace* ----------
7006
7007 Finally, after you type `d' two more times, Emacs will reach the error,
7008 and the top two lines of the `*Backtrace*' buffer will look like this:
7009
7010 ---------- Buffer: *Backtrace* ----------
7011 Debugger entered--Lisp error: (void-function 1=)
7012 * (1= number)
7013 ...
7014 ---------- Buffer: *Backtrace* ----------
7015
7016 By typing `d', you were able to step through the function.
7017
7018 You can quit a `*Backtrace*' buffer by typing `q' in it; this quits the
7019 trace, but does not cancel `debug-on-entry'.
7020
7021 To cancel the effect of `debug-on-entry', call `cancel-debug-on-entry'
7022 with the name of the function, like this:
7023
7024 M-x cancel-debug-on-entry RET triangle-bugged RET
7025
7026 (If you are reading this in Info, cancel `debug-on-entry' now.)
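
Both commands can also be called from Lisp with the function's name as
a quoted symbol. The following is only a sketch: `my-triangle' is a
made-up, bug-free stand-in so that the example is self-contained.

```elisp
;; `my-triangle' is a hypothetical function used only in this sketch.
(defun my-triangle (number)
  "Return sum of numbers 1 through NUMBER inclusive."
  (let ((total 0))
    (while (> number 0)
      (setq total (+ total number))
      (setq number (1- number)))
    total))

;; Same effect as M-x debug-on-entry RET my-triangle RET
(debug-on-entry 'my-triangle)

;; Same effect as M-x cancel-debug-on-entry RET my-triangle RET
(cancel-debug-on-entry 'my-triangle)
```

After the second form is evaluated, calling `my-triangle' no longer
enters the debugger.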
7027
7028 
7029 File: eintr, Node: debug-on-quit, Next: edebug, Prev: debug-on-entry, Up: Debugging
7030
7031 17.3 `debug-on-quit' and `(debug)'
7032 ==================================
7033
7034 In addition to setting `debug-on-error' or calling `debug-on-entry',
7035 there are two other ways to start `debug'.
7036
7037 You can start `debug' whenever you type `C-g' (`keyboard-quit') by
7038 setting the variable `debug-on-quit' to `t'. This is useful for
7039 debugging infinite loops.
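
Here is a sketch of how you might try this. The function
`my-endless-loop' is invented for the illustration, and the call to it
is left commented out so that nothing hangs until you choose to run it.

```elisp
;; Enter the debugger whenever C-g is typed.
(setq debug-on-quit t)

;; `my-endless-loop' is a hypothetical example of a loop that never ends.
(defun my-endless-loop ()
  "Count upwards forever; interrupt with C-g."
  (let ((count 0))
    (while t
      (setq count (1+ count)))))

;; (my-endless-loop)   ; Remove the leading semicolons, evaluate,
;;                     ; then type C-g to land in the debugger.

;; Go back to the default when you are done.
(setq debug-on-quit nil)
```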
7040
7041 Or, you can insert a line that says `(debug)' into your code where you
7042 want the debugger to start, like this:
7043
7044 (defun triangle-bugged (number)
7045 "Return sum of numbers 1 through NUMBER inclusive."
7046 (let ((total 0))
7047 (while (> number 0)
7048 (setq total (+ total number))
7049 (debug) ; Start debugger.
7050 (setq number (1= number))) ; Error here.
7051 total))
7052
7053 The `debug' function is described in detail in *Note The Lisp Debugger:
7054 (elisp)Debugger.
7055
7056 
7057 File: eintr, Node: edebug, Next: Debugging Exercises, Prev: debug-on-quit, Up: Debugging
7058
7059 17.4 The `edebug' Source Level Debugger
7060 =======================================
7061
7062 Edebug is a source level debugger. Edebug normally displays the source
7063 of the code you are debugging, with an arrow at the left that shows
7064 which line you are currently executing.
7065
7066 You can walk through the execution of a function, line by line, or run
7067 quickly until reaching a "breakpoint" where execution stops.
7068
7069 Edebug is described in *Note Edebug: (elisp)edebug.
7070
7071 Here is a bugged function definition for `triangle-recursively'. *Note
7072 Recursion in place of a counter: Recursive triangle function, for a
7073 review of it.
7074
7075 (defun triangle-recursively-bugged (number)
7076 "Return sum of numbers 1 through NUMBER inclusive.
7077 Uses recursion."
7078 (if (= number 1)
7079 1
7080 (+ number
7081 (triangle-recursively-bugged
7082 (1= number))))) ; Error here.
7083
7084 Normally, you would install this definition by positioning your cursor
7085 after the function's closing parenthesis and typing `C-x C-e'
7086 (`eval-last-sexp') or else by positioning your cursor within the
7087 definition and typing `C-M-x' (`eval-defun'). (By default, the
7088 `eval-defun' command works only in Emacs Lisp mode or in Lisp
7089 Interactive mode.)
7090
7091 However, to prepare this function definition for Edebug, you must first
7092 "instrument" the code using a different command. You can do this by
7093 positioning your cursor within the definition and typing
7094
7095 M-x edebug-defun RET
7096
7097 This will cause Emacs to load Edebug automatically if it is not already
7098 loaded, and properly instrument the function.
7099
7100 After instrumenting the function, place your cursor after the following
7101 expression and type `C-x C-e' (`eval-last-sexp'):
7102
7103 (triangle-recursively-bugged 3)
7104
7105 You will be jumped back to the source for `triangle-recursively-bugged'
7106 and the cursor positioned at the beginning of the `if' line of the
7107 function. Also, you will see an arrowhead at the left hand side of
7108 that line. The arrowhead marks the line where the function is
7109 executing. (In the following examples, we show the arrowhead with
7110 `=>'; in a windowing system, you may see the arrowhead as a solid
7111 triangle in the window `fringe'.)
7112
7113 =>-!-(if (= number 1)
7114
7115 In the example, the location of point is displayed as `-!-' (in a
7116 printed book, it is displayed with a five pointed star).
7117
7118 If you now press <SPC>, point will move to the next expression to be
7119 executed; the line will look like this:
7120
7121 =>(if -!-(= number 1)
7122
7123 As you continue to press <SPC>, point will move from expression to
7124 expression. At the same time, whenever an expression returns a value,
7125 that value will be displayed in the echo area. For example, after you
7126 move point past `number', you will see the following:
7127
7128 Result: 3 (#o3, #x3, ?\C-c)
7129
7130 This means the value of `number' is 3, which is octal three,
7131 hexadecimal three, and ASCII `control-c' (the third letter of the
7132 alphabet, in case you need to know this information).
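
You can check for yourself that these are simply different ways of
writing the same integer by evaluating a list of them:

```elisp
;; Decimal, octal, hexadecimal, and control-character
;; read syntax for the integer three.
(list 3 #o3 #x3 ?\C-c)
```

The value is a list of four threes.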
7133
7134 You can continue moving through the code until you reach the line with
7135 the error. Before evaluation, that line looks like this:
7136
7137 => -!-(1= number))))) ; Error here.
7138
7139 When you press <SPC> once again, you will produce an error message that
7140 says:
7141
7142 Symbol's function definition is void: 1=
7143
7144 This is the bug.
7145
7146 Press `q' to quit Edebug.
7147
7148 To remove instrumentation from a function definition, simply
7149 re-evaluate it with a command that does not instrument it. For
7150 example, you could place your cursor after the definition's closing
7151 parenthesis and type `C-x C-e'.
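
If you would like to see the function work, here is a sketch of a
corrected definition. The name `triangle-recursively-fixed' is made up
for this illustration; the only change in the body is `1-' in place of
`1='.

```elisp
;; `triangle-recursively-fixed' is a hypothetical name for this sketch.
(defun triangle-recursively-fixed (number)
  "Return sum of numbers 1 through NUMBER inclusive.
Uses recursion."
  (if (= number 1)
      1
    (+ number
       (triangle-recursively-fixed
        (1- number)))))          ; `1-' decrements; `1=' is void.

(triangle-recursively-fixed 3)
```

The test returns 6, the sum of 3, 2, and 1.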
7152
7153 Edebug does a great deal more than walk with you through a function.
7154 You can set it so it races through on its own, stopping only at an
7155 error or at specified stopping points; you can cause it to display the
7156 changing values of various expressions; you can find out how many times
7157 a function is called, and more.
7158
7159 Edebug is described in *Note Edebug: (elisp)edebug.
7160
7161 
7162 File: eintr, Node: Debugging Exercises, Prev: edebug, Up: Debugging
7163
7164 17.5 Debugging Exercises
7165 ========================
7166
7167 * Install the `count-words-region' function and then cause it to
7168 enter the built-in debugger when you call it. Run the command on a
7169 region containing two words. You will need to press `d' a
7170 remarkable number of times. On your system, is a `hook' called
7171 after the command finishes? (For information on hooks, see *Note
7172 Command Loop Overview: (elisp)Command Overview.)
7173
7174 * Copy `count-words-region' into the `*scratch*' buffer, instrument
7175 the function for Edebug, and walk through its execution. The
7176 function does not need to have a bug, although you can introduce
7177 one if you wish. If the function lacks a bug, the walk-through
7178 completes without problems.
7179
7180 * While running Edebug, type `?' to see a list of all the Edebug
7181 commands. (The `global-edebug-prefix' is usually `C-x X', i.e.
7182 `<CTRL>-x' followed by an upper case `X'; use this prefix for
7183 commands made outside of the Edebug debugging buffer.)
7184
7185 * In the Edebug debugging buffer, use the `p'
7186 (`edebug-bounce-point') command to see where in the region the
7187 `count-words-region' is working.
7188
7189 * Move point to some spot further down the function and then type the
7190 `h' (`edebug-goto-here') command to jump to that location.
7191
7192 * Use the `t' (`edebug-trace-mode') command to cause Edebug to walk
7193 through the function on its own; use an upper case `T' for
7194 `edebug-Trace-fast-mode'.
7195
7196 * Set a breakpoint, then run Edebug in Trace mode until it reaches
7197 the stopping point.
7198
7199 
7200 File: eintr, Node: Conclusion, Next: the-the, Prev: Debugging, Up: Top
7201
7202 18 Conclusion
7203 *************
7204
7205 We have now reached the end of this Introduction. You have now learned
7206 enough about programming in Emacs Lisp to set values, to write simple
7207 `.emacs' files for yourself and your friends, and write simple
7208 customizations and extensions to Emacs.
7209
7210 This is a place to stop. Or, if you wish, you can now go onward, and
7211 teach yourself.
7212
7213 You have learned some of the basic nuts and bolts of programming. But
7214 only some. There are a great many more brackets and hinges that are
7215 easy to use that we have not touched.
7216
7217 A path you can follow right now lies among the sources to GNU Emacs and
7218 in *Note The GNU Emacs Lisp Reference Manual: (elisp)Top.
7219
7220 The Emacs Lisp sources are an adventure. When you read the sources and
7221 come across a function or expression that is unfamiliar, you need to
7222 figure out or find out what it does.
7223
7224 Go to the Reference Manual. It is a thorough, complete, and fairly
7225 easy-to-read description of Emacs Lisp. It is written not only for
7226 experts, but for people who know what you know. (The `Reference
7227 Manual' comes with the standard GNU Emacs distribution. Like this
7228 introduction, it comes as a Texinfo source file, so you can read it
7229 on-line and as a typeset, printed book.)
7230
7231 Go to the other on-line help that is part of GNU Emacs: the on-line
7232 documentation for all functions and variables, and `find-tags', the
7233 program that takes you to sources.
7234
7235 Here is an example of how I explore the sources. Because of its name,
7236 `simple.el' is the file I looked at first, a long time ago. As it
7237 happens, some of the functions in `simple.el' are complicated, or at
7238 least look complicated at first sight. The `open-line' function, for
7239 example, looks complicated.
7240
7241 You may want to walk through this function slowly, as we did with the
7242 `forward-sentence' function. (*Note The `forward-sentence' function:
7243 forward-sentence.) Or you may want to skip that function and look at
7244 another, such as `split-line'. You don't need to read all the
7245 functions. According to `count-words-in-defun', the `split-line'
7246 function contains 102 words and symbols.
7247
7248 Even though it is short, `split-line' contains expressions we have not
7249 studied: `skip-chars-forward', `indent-to', `current-column' and
7250 `insert-and-inherit'.
7251
7252 Consider the `skip-chars-forward' function. (It is part of the
7253 function definition for `back-to-indentation', which is shown in *Note
7254 Review: Review.)
7255
7256 In GNU Emacs, you can find out more about `skip-chars-forward' by
7257 typing `C-h f' (`describe-function') and the name of the function.
7258 This gives you the function documentation.
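
After reading the documentation, a quick experiment often helps. The
following sketch tries `skip-chars-forward' in a temporary buffer; the
function name `my-first-non-blank-position' is invented for the
illustration.

```elisp
;; `my-first-non-blank-position' is a hypothetical helper.
(defun my-first-non-blank-position (line)
  "Return the buffer position of the first non-blank character in LINE.
Positions count from 1, as they do in a buffer."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    ;; Move point forward over any spaces and tabs.
    (skip-chars-forward " \t")
    (point)))

(my-first-non-blank-position "   indented text")
```

With three leading spaces, point stops at position 4, just before the
`i'.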
7259
7260 You may be able to guess what is done by a well named function such as
7261 `indent-to'; or you can look it up, too. Incidentally, the
7262 `describe-function' function itself is in `help.el'; it is one of those
7263 long, but decipherable functions. You can look up `describe-function'
7264 using the `C-h f' command!
7265
7266 In this instance, since the code is Lisp, the `*Help*' buffer contains
7267 the name of the library containing the function's source. You can put
7268 point over the name of the library and press the RET key, which in this
7269 situation is bound to `help-follow', and be taken directly to the
7270 source, in the same way as `M-.' (`find-tag').
7271
7272 The definition for `describe-function' illustrates how to customize the
7273 `interactive' expression without using the standard character codes;
7274 and it shows how to create a temporary buffer.
7275
7276 (The `indent-to' function is written in C rather than Emacs Lisp; it is
7277 a `built-in' function. `help-follow' takes you to its source as does
7278 `find-tag', when properly set up.)
7279
7280 You can look at a function's source using `find-tag', which is bound to
7281 `M-.'. Finally, you can find out what the Reference Manual has to say
7282 by visiting the manual in Info, and typing `i' (`Info-index') and the
7283 name of the function, or by looking up the function in the index to a
7284 printed copy of the manual.
7285
7286 Similarly, you can find out what is meant by `insert-and-inherit'.
7287
7288 Other interesting source files include `paragraphs.el', `loaddefs.el',
7289 and `loadup.el'. The `paragraphs.el' file includes short, easily
7290 understood functions as well as longer ones. The `loaddefs.el' file
7291 contains the many standard autoloads and many keymaps. I have never
7292 looked at it all; only at parts. `loadup.el' is the file that loads
7293 the standard parts of Emacs; it tells you a great deal about how Emacs
7294 is built. (*Note Building Emacs: (elisp)Building Emacs, for more about
7295 building.)
7296
7297 As I said, you have learned some nuts and bolts; however, and very
7298 importantly, we have hardly touched major aspects of programming; I
7299 have said nothing about how to sort information, except to use the
7300 predefined `sort' function; I have said nothing about how to store
7301 information, except to use variables and lists; I have said nothing
7302 about how to write programs that write programs. These are topics for
7303 another, and different kind of book, a different kind of learning.
7304
7305 What you have done is learn enough for much practical work with GNU
7306 Emacs. What you have done is get started. This is the end of a
7307 beginning.
7308
7309 
7310 File: eintr, Node: the-the, Next: Kill Ring, Prev: Conclusion, Up: Top
7311
7312 Appendix A The `the-the' Function
7313 *********************************
7314
7315 Sometimes when you you write text, you duplicate words--as with "you
7316 you" near the beginning of this sentence. I find that most frequently,
7317 I duplicate "the"; hence, I call the function for detecting duplicated
7318 words, `the-the'.
7319
7320 As a first step, you could use the following regular expression to
7321 search for duplicates:
7322
7323 \\(\\w+[ \t\n]+\\)\\1
7324
7325 This regexp matches one or more word-constituent characters followed by
7326 one or more spaces, tabs, or newlines. However, it does not detect
7327 duplicated words on different lines, since the ending of the first
7328 word, the end of the line, is different from the ending of the second
7329 word, a space. (For more information about regular expressions, see
7330 *Note Regular Expression Searches: Regexp Search, as well as *Note
7331 Syntax of Regular Expressions: (emacs)Regexps, and *Note Regular
7332 Expressions: (elisp)Regular Expressions.)
7333
7334 You might try searching just for duplicated word-constituent characters
7335 but that does not work since the pattern detects doubles such as the
7336 two occurrences of `th' in `with the'.
7337
7338 Another possible regexp searches for word-constituent characters
7339 followed by non-word-constituent characters, reduplicated. Here,
7340 `\\w+' matches one or more word-constituent characters and `\\W*'
7341 matches zero or more non-word-constituent characters.
7342
7343 \\(\\(\\w+\\)\\W*\\)\\1
7344
7345 Again, not useful.
7346
7347 Here is the pattern that I use. It is not perfect, but good enough.
7348 `\\b' matches the empty string, provided it is at the beginning or end
7349 of a word; `[^@ \n\t]+' matches one or more occurrences of any
7350 characters that are _not_ an @-sign, space, newline, or tab.
7351
7352 \\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b
7353
7354 One can write more complicated expressions, but I found that this
7355 expression is good enough, so I use it.
7356
7357 Here is the `the-the' function, as I include it in my `.emacs' file,
7358 along with a handy global key binding:
7359
7360 (defun the-the ()
7361 "Search forward for for a duplicated word."
7362 (interactive)
7363 (message "Searching for for duplicated words ...")
7364 (push-mark)
7365 ;; This regexp is not perfect
7366 ;; but is fairly good over all:
7367 (if (re-search-forward
7368 "\\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b" nil 'move)
7369 (message "Found duplicated word.")
7370 (message "End of buffer")))
7371
7372 ;; Bind `the-the' to C-c \
7373 (global-set-key "\C-c\\" 'the-the)
7374
7375
7376 Here is test text:
7377
7378 one two two three four five
7379 five six seven
7380
7381 You can substitute the other regular expressions shown above in the
7382 function definition and try each of them on this list.
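
If you prefer to compare the patterns without moving point around a
buffer, here is a sketch that runs a regexp over a string and collects
whatever the first parenthesized group matched. The function name
`my-duplicated-words' and the variable `my-test-text' are made up for
this illustration.

```elisp
;; `my-duplicated-words' and `my-test-text' are hypothetical names.
(defun my-duplicated-words (regexp text)
  "Return a list of what group 1 of REGEXP matches in TEXT, in order."
  (let ((start 0)
        (found nil))
    (while (string-match regexp text start)
      (push (match-string 1 text) found)
      ;; Resume the search just after the whole match.
      (setq start (match-end 0)))
    (nreverse found)))

(defvar my-test-text "one two two three four five\nfive six seven"
  "The test text from above, with its line break written as \\n.")

;; The pattern used in `the-the':
(my-duplicated-words "\\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b" my-test-text)

;; The first pattern tried:
(my-duplicated-words "\\(\\w+[ \t\n]+\\)\\1" my-test-text)
```

The `the-the' pattern finds both `two' and `five'. The first pattern
finds only `two' (with its trailing space), since the first `five' ends
with a newline and the second with a space.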
7383
7384 
7385 File: eintr, Node: Kill Ring, Next: Full Graph, Prev: the-the, Up: Top
7386
7387 Appendix B Handling the Kill Ring
7388 *********************************
7389
7390 The kill ring is a list that is transformed into a ring by the workings
7391 of the `current-kill' function. The `yank' and `yank-pop' commands use
7392 the `current-kill' function.
7393
7394 This appendix describes the `current-kill' function as well as both the
7395 `yank' and the `yank-pop' commands, but first, consider the workings of
7396 the kill ring.
7397
7398 The kill ring has a default maximum length of sixty items; this number
7399 is too large for an explanation. Instead, set it to four. Please
7400 evaluate the following:
7401
7402 (setq old-kill-ring-max kill-ring-max)
7403 (setq kill-ring-max 4)
7404
7405 Then, please copy each line of the following indented example into the
7406 kill ring. You may kill each line with `C-k' or mark it and copy it
7407 with `M-w'.
7408
7409 (In a read-only buffer, such as the `*info*' buffer, the kill command,
7410 `C-k' (`kill-line'), will not remove the text, merely copy it to the
7411 kill ring. However, your machine may beep at you. (`kill-line' calls
7412 `kill-region'.) Alternatively, for silence, you may copy the region of
7413 each line with the `M-w' (`kill-ring-save') command. You must mark
7414 each line for this command to succeed, but it does not matter at which
7415 end you put point or mark.)
7416
7417 Please invoke the calls in order, so that five elements attempt to fill
7418 the kill ring:
7419
7420 first some text
7421 second piece of text
7422 third line
7423 fourth line of text
7424 fifth bit of text
7425
7426 Then find the value of `kill-ring' by evaluating
7427
7428 kill-ring
7429
7430 It is:
7431
7432 ("fifth bit of text" "fourth line of text"
7433 "third line" "second piece of text")
7434
7435 The first element, `first some text', was dropped.
7436
7437 To return to the old value for the length of the kill ring, evaluate:
7438
7439 (setq kill-ring-max old-kill-ring-max)
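
The same experiment can be sketched in Lisp without touching your real
kill ring, by binding the relevant variables temporarily with `let' and
calling `kill-new' directly. The function name `my-kill-ring-demo' is
invented for the illustration, and `interprogram-cut-function' is bound
to `nil' on the assumption that you do not want the window system's
clipboard changed.

```elisp
;; `my-kill-ring-demo' is a hypothetical name used only in this sketch.
(defun my-kill-ring-demo ()
  "Return the kill ring left after five kills when `kill-ring-max' is 4."
  (let ((kill-ring nil)
        (kill-ring-yank-pointer nil)
        (kill-ring-max 4)
        (interprogram-cut-function nil)) ; Leave the clipboard alone.
    (dolist (text '("first some text"
                    "second piece of text"
                    "third line"
                    "fourth line of text"
                    "fifth bit of text"))
      (kill-new text))
    kill-ring))

(my-kill-ring-demo)
```

The value is the same four-element list shown above: the newest kill
comes first and `first some text' has been dropped.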
7440
7441 * Menu:
7442
7443 * current-kill::
7444 * yank::
7445 * yank-pop::
7446 * ring file::
7447
7448 
7449 File: eintr, Node: current-kill, Next: yank, Prev: Kill Ring, Up: Kill Ring
7450
7451 B.1 The `current-kill' Function
7452 ===============================
7453
7454 The `current-kill' function changes the element in the kill ring to
7455 which `kill-ring-yank-pointer' points. (Also, the `kill-new' function
7456 sets `kill-ring-yank-pointer' to point to the latest element of the
7457 kill ring.)
7458
7459 The `current-kill' function is used by `yank' and by `yank-pop'. Here
7460 is the code for `current-kill':
7461
7462 (defun current-kill (n &optional do-not-move)
7463 "Rotate the yanking point by N places, and then return that kill.
7464 If N is zero, `interprogram-paste-function' is set, and calling it
7465 returns a string, then that string is added to the front of the
7466 kill ring and returned as the latest kill.
7467 If optional arg DO-NOT-MOVE is non-nil, then don't actually move the
7468 yanking point; just return the Nth kill forward."
7469 (let ((interprogram-paste (and (= n 0)
7470 interprogram-paste-function
7471 (funcall interprogram-paste-function))))
7472 (if interprogram-paste
7473 (progn
7474 ;; Disable the interprogram cut function when we add the new
7475 ;; text to the kill ring, so Emacs doesn't try to own the
7476 ;; selection, with identical text.
7477 (let ((interprogram-cut-function nil))
7478 (kill-new interprogram-paste))
7479 interprogram-paste)
7480 (or kill-ring (error "Kill ring is empty"))
7481 (let ((ARGth-kill-element
7482 (nthcdr (mod (- n (length kill-ring-yank-pointer))
7483 (length kill-ring))
7484 kill-ring)))
7485 (or do-not-move
7486 (setq kill-ring-yank-pointer ARGth-kill-element))
7487 (car ARGth-kill-element)))))
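
The arithmetic inside the `nthcdr' expression can be tried on its own.
In this sketch, the made-up function `my-rotated-index' takes plain
numbers in place of the two `length' calls: POINTER-LENGTH stands for
`(length kill-ring-yank-pointer)' and RING-LENGTH for
`(length kill-ring)'.

```elisp
;; `my-rotated-index' is a hypothetical helper, not part of Emacs.
(defun my-rotated-index (n pointer-length ring-length)
  "Return the index into the kill ring that `current-kill' would use.
N is the number of places to rotate.  POINTER-LENGTH is the length of
the list that the yank pointer points to, and RING-LENGTH is the
length of the whole kill ring."
  (mod (- n pointer-length) ring-length))

;; A ring of four elements, with the pointer at the first element,
;; so the pointer's list is also four elements long:
(my-rotated-index 0 4 4)     ; stay on the first element
(my-rotated-index 1 4 4)     ; move to the second element
(my-rotated-index -1 4 4)    ; wrap around to the last element
```

The three calls return 0, 1, and 3. Because `mod' gives a result
between zero and one less than the ring's length, the index wraps
around, and that wrapping is what makes the list behave as a ring.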
7488
7489 In addition, the `kill-new' function sets `kill-ring-yank-pointer' to
7490 the latest element of the kill ring. And indirectly so does
7491 `kill-append', since it calls `kill-new'. In addition, `kill-region'
7492 and `kill-line' call the `kill-new' function.
7493
7494 Here is the line in `kill-new', which is explained in *Note The
7495 `kill-new' function: kill-new function.
7496
7497 (setq kill-ring-yank-pointer kill-ring)
7498
7499 * Menu:
7500
7501 * Understanding current-kill::
7502