comparison info/eintr-2 @ 73591:b214bd8be620
info/eintr-2: Updated Info file to Third Edition for
`Introduction to Programming in Emacs Lisp'
author | Robert J. Chassell <bob@rattlesnake.com> |
---|---|
date | Tue, 31 Oct 2006 17:00:32 +0000 |
parents | |
children | f93366072a0b |
73590:dcc218a536a8 | 73591:b214bd8be620 |
---|---|
1 This is ../info/eintr, produced by makeinfo version 4.8 from | |
2 emacs-lisp-intro.texi. | |
3 | |
4 INFO-DIR-SECTION Emacs | |
5 START-INFO-DIR-ENTRY | |
6 * Emacs Lisp Intro: (eintr). | |
7 A simple introduction to Emacs Lisp programming. | |
8 END-INFO-DIR-ENTRY | |
9 | |
10 This is an `Introduction to Programming in Emacs Lisp', for people who | |
11 are not programmers. | |
12 | |
13 Edition 3.00, 2006 Oct 31 | |
14 | |
15 Copyright (C) 1990, 1991, 1992, 1993, 1994, 1995, 1997, 2001, 2002, | |
16 2003, 2004, 2005, 2006 Free Software Foundation, Inc. | |
17 | |
18 Published by the: | |
19 | |
20 GNU Press,                        Website: http://www.gnupress.org | |
21 a division of the                 General: press@gnu.org | |
22 Free Software Foundation, Inc.    Orders:  sales@gnu.org | |
23 51 Franklin Street, Fifth Floor   Tel: +1 (617) 542-5942 | |
24 Boston, MA 02110-1301 USA         Fax: +1 (617) 542-2652 | |
25 | |
26 | |
27 ISBN 1-882114-43-4 | |
28 | |
29 Permission is granted to copy, distribute and/or modify this document | |
30 under the terms of the GNU Free Documentation License, Version 1.2 or | |
31 any later version published by the Free Software Foundation; there | |
32 being no Invariant Section, with the Front-Cover Texts being "A GNU | |
33 Manual", and with the Back-Cover Texts as in (a) below. A copy of the | |
34 license is included in the section entitled "GNU Free Documentation | |
35 License". | |
36 | |
37 (a) The FSF's Back-Cover Text is: "You have freedom to copy and modify | |
38 this GNU Manual, like GNU software. Copies published by the Free | |
39 Software Foundation raise funds for GNU development." | |
40 | |
41 | |
42 File: eintr, Node: defvar and asterisk, Prev: See variable current value, Up: defvar | |
43 | |
44 8.5.1 `defvar' and an asterisk | |
45 ------------------------------ | |
46 | |
47 In the past, Emacs used the `defvar' special form both for internal | |
48 variables that you would not expect a user to change and for variables | |
49 that you do expect a user to change. Although you can still use | |
50 `defvar' for user customizable variables, please use `defcustom' | |
51 instead, since that special form provides a path into the Customization | |
52 commands. (*Note Specifying Variables using `defcustom': defcustom.) | |
53 | |
54 When you specified a variable using the `defvar' special form, you | |
55 could distinguish a readily settable variable from others by typing an | |
56 asterisk, `*', in the first column of its documentation string. For | |
57 example: | |
58 | |
59 (defvar shell-command-default-error-buffer nil | |
60 "*Buffer name for `shell-command' ... error output. | |
61 ... ") | |
62 | |
63 You could (and still can) use the `set-variable' command to change the | |
64 value of `shell-command-default-error-buffer' temporarily. However, | |
65 options set using `set-variable' are set only for the duration of your | |
66 editing session. The new values are not saved between sessions. Each | |
67 time Emacs starts, it reads the original value, unless you change the | |
68 value within your `.emacs' file, either by setting it manually or by | |
69 using `customize'. *Note Your `.emacs' File: Emacs Initialization. | |
70 | |
71 For me, the major use of the `set-variable' command is to suggest | |
72 variables that I might want to set in my `.emacs' file. There are now | |
73 more than 700 such variables -- far too many to remember readily. | |
74 Fortunately, you can press <TAB> after calling the `M-x set-variable' | |
75 command to see the list of variables. (*Note Examining and Setting | |
76 Variables: (emacs)Examining.) | |
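
As a minimal sketch of the `defcustom' form recommended above, the following defines a user option that appears in the Customization commands. The group name `my-example' and the option name `my-example-error-buffer' are hypothetical, chosen only for illustration; they are not part of Emacs.

```elisp
;; Hypothetical group and option names, for illustration only.
(defgroup my-example nil
  "Settings for a made-up example package."
  :group 'convenience)

(defcustom my-example-error-buffer nil
  "Buffer name to use for error output, or nil for none."
  :type '(choice (const :tag "None" nil) string)
  :group 'my-example)
```

After evaluating both forms, `M-x customize-option' should offer `my-example-error-buffer', which a plain `defvar' without the asterisk would not do.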
77 | |
78 | |
79 File: eintr, Node: cons & search-fwd Review, Next: search Exercises, Prev: defvar, Up: Cutting & Storing Text | |
80 | |
81 8.6 Review | |
82 ========== | |
83 | |
84 Here is a brief summary of some recently introduced functions. | |
85 | |
86 `car' | |
87 `cdr' | |
88 `car' returns the first element of a list; `cdr' returns the | |
89 second and subsequent elements of a list. | |
90 | |
91 For example: | |
92 | |
93 (car '(1 2 3 4 5 6 7)) | |
94 => 1 | |
95 (cdr '(1 2 3 4 5 6 7)) | |
96 => (2 3 4 5 6 7) | |
97 | |
98 `cons' | |
99 `cons' constructs a list by prepending its first argument to its | |
100 second argument. | |
101 | |
102 For example: | |
103 | |
104 (cons 1 '(2 3 4)) | |
105 => (1 2 3 4) | |
106 | |
107 `nthcdr' | |
108 Return the result of taking CDR `n' times on a list. The `rest of | |
109 the rest', as it were. | |
110 | |
111 For example: | |
112 | |
113 (nthcdr 3 '(1 2 3 4 5 6 7)) | |
114 => (4 5 6 7) | |
115 | |
116 `setcar' | |
117 `setcdr' | |
118 `setcar' changes the first element of a list; `setcdr' changes the | |
119 second and subsequent elements of a list. | |
120 | |
121 For example: | |
122 | |
123 (setq triple '(1 2 3)) | |
124 | |
125 (setcar triple '37) | |
126 | |
127 triple | |
128 => (37 2 3) | |
129 | |
130 (setcdr triple '("foo" "bar")) | |
131 | |
132 triple | |
133 => (37 "foo" "bar") | |
134 | |
135 `progn' | |
136 Evaluate each argument in sequence and then return the value of the | |
137 last. | |
138 | |
139 For example: | |
140 | |
141 (progn 1 2 3 4) | |
142 => 4 | |
143 | |
144 `save-restriction' | |
145 Record whatever narrowing is in effect in the current buffer, if | |
146 any, and restore that narrowing after evaluating the arguments. | |
147 | |
148 `search-forward' | |
149 Search for a string, and if the string is found, move point. | |
150 | |
151 Takes four arguments: | |
152 | |
153 1. The string to search for. | |
154 | |
155 2. Optionally, the limit of the search. | |
156 | |
157 3. Optionally, what to do if the search fails: return `nil' or | |
158 signal an error. | |
159 | |
160 4. Optionally, how many times to repeat the search; if negative, | |
161 the search goes backwards. | |
162 | |
163 `kill-region' | |
164 `delete-and-extract-region' | |
165 `copy-region-as-kill' | |
166 `kill-region' cuts the text between point and mark from the buffer | |
167 and stores that text in the kill ring, so you can get it back by | |
168 yanking. | |
169 | |
170 `copy-region-as-kill' copies the text between point and mark into | |
171 the kill ring, from which you can get it by yanking. The function | |
172 does not cut or remove the text from the buffer. | |
173 | |
174 `delete-and-extract-region' removes the text between two positions | |
175 from the buffer and returns it, without saving it in the kill ring, so | |
176 you cannot yank it back. (This is not an interactive command.) | |
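
The optional arguments to `search-forward' are easiest to see in a small experiment. The sketch below works in a temporary buffer, so it does not disturb any real buffer; the function name `my-search-demo' is made up for this example.

```elisp
(defun my-search-demo ()
  "Return the results of a successful and a failed search."
  ;; Work in a temporary buffer so no real buffer is changed.
  (with-temp-buffer
    (insert "one two three two")
    (goto-char (point-min))
    (list
     ;; Found: point moves to just after the first "two";
     ;; the new position of point is returned.
     (search-forward "two" nil t)
     ;; Not found: with a third argument of t, return nil
     ;; rather than signaling an error.
     (search-forward "four" nil t)
     ;; Point has not moved after the failed search.
     (point))))

(my-search-demo)
;; => (8 nil 8)
```

The second argument is `nil' in both calls, which means the search is not limited and may run to the end of the buffer.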
177 | |
178 | |
179 File: eintr, Node: search Exercises, Prev: cons & search-fwd Review, Up: Cutting & Storing Text | |
180 | |
181 8.7 Searching Exercises | |
182 ======================= | |
183 | |
184 * Write an interactive function that searches for a string. If the | |
185 search finds the string, leave point after it and display a message | |
186 that says "Found!". (Do not use `search-forward' for the name of | |
187 this function; if you do, you will overwrite the existing version | |
188 of `search-forward' that comes with Emacs. Use a name such as | |
189 `test-search' instead.) | |
190 | |
191 * Write a function that prints the third element of the kill ring in | |
192 the echo area, if any; if the kill ring does not contain a third | |
193 element, print an appropriate message. | |
194 | |
195 | |
196 File: eintr, Node: List Implementation, Next: Yanking, Prev: Cutting & Storing Text, Up: Top | |
197 | |
198 9 How Lists are Implemented | |
199 *************************** | |
200 | |
201 In Lisp, atoms are recorded in a straightforward fashion; if the | |
202 implementation is not straightforward in practice, it is, nonetheless, | |
203 straightforward in theory. The atom `rose', for example, is recorded | |
204 as the four contiguous letters `r', `o', `s', `e'. A list, on the | |
205 other hand, is kept differently. The mechanism is equally simple, but | |
206 it takes a moment to get used to the idea. A list is kept using a | |
207 series of pairs of pointers. In the series, the first pointer in each | |
208 pair points to an atom or to another list, and the second pointer in | |
209 each pair points to the next pair, or to the symbol `nil', which marks | |
210 the end of the list. | |
211 | |
212 A pointer itself is quite simply the electronic address of what is | |
213 pointed to. Hence, a list is kept as a series of electronic addresses. | |
214 | |
215 * Menu: | |
216 | |
217 * Lists diagrammed:: | |
218 * Symbols as Chest:: | |
219 * List Exercise:: | |
220 | |
221 | |
222 File: eintr, Node: Lists diagrammed, Next: Symbols as Chest, Prev: List Implementation, Up: List Implementation | |
223 | |
224 Lists diagrammed | |
225 ================ | |
226 | |
227 For example, the list `(rose violet buttercup)' has three elements, | |
228 `rose', `violet', and `buttercup'. In the computer, the electronic | |
229 address of `rose' is recorded in a segment of computer memory along | |
230 with the address that gives the electronic address of where the atom | |
231 `violet' is located; and that address (the one that tells where | |
232 `violet' is located) is kept along with an address that tells where the | |
233 address for the atom `buttercup' is located. | |
234 | |
235 This sounds more complicated than it is and is easier seen in a diagram: | |
236 | |
237     ___ ___      ___ ___      ___ ___ | |
238    |___|___|--> |___|___|--> |___|___|--> nil | |
239      |            |            | | |
240      |            |            | | |
241       --> rose     --> violet   --> buttercup | |
242 | |
243 | |
244 | |
245 In the diagram, each box represents a word of computer memory that | |
246 holds a Lisp object, usually in the form of a memory address. The | |
247 boxes, i.e. the addresses, are in pairs. Each arrow points to what the | |
248 address is the address of, either an atom or another pair of addresses. | |
249 The first box is the electronic address of `rose' and the arrow points | |
250 to `rose'; the second box is the address of the next pair of boxes, the | |
251 first part of which is the address of `violet' and the second part of | |
252 which is the address of the next pair. The very last box points to the | |
253 symbol `nil', which marks the end of the list. | |
254 | |
255 When a variable is set to a list with a function such as `setq', it | |
256 stores the address of the first box in the variable. Thus, evaluation | |
257 of the expression | |
258 | |
259 (setq bouquet '(rose violet buttercup)) | |
260 | |
261 creates a situation like this: | |
262 | |
263 bouquet | |
264      | | |
265      |     ___ ___      ___ ___      ___ ___ | |
266       --> |___|___|--> |___|___|--> |___|___|--> nil | |
267             |            |            | | |
268             |            |            | | |
269              --> rose     --> violet   --> buttercup | |
270 | |
271 | |
272 | |
273 In this example, the symbol `bouquet' holds the address of the first | |
274 pair of boxes. | |
275 | |
276 This same list can be illustrated in a different sort of box notation | |
277 like this: | |
278 | |
279 bouquet | |
280  | | |
281  |    --------------       ---------------       ---------------- | |
282  |   | car   | cdr  |     | car    | cdr  |     | car     | cdr  | | |
283   -->| rose  |   o------->| violet |   o------->| butter- |  nil | | |
284      |       |      |     |        |      |     | cup     |      | | |
285       --------------       ---------------       ---------------- | |
286 | |
287 | |
288 | |
289 (Symbols consist of more than pairs of addresses, but the structure of | |
290 a symbol is made up of addresses. Indeed, the symbol `bouquet' | |
291 consists of a group of address-boxes, one of which is the address of | |
292 the printed word `bouquet', a second of which is the address of a | |
293 function definition attached to the symbol, if any, a third of which is | |
294 the address of the first pair of address-boxes for the list `(rose | |
295 violet buttercup)', and so on. Here we are showing that the symbol's | |
296 third address-box points to the first pair of address-boxes for the | |
297 list.) | |
298 | |
299 If a symbol is set to the CDR of a list, the list itself is not | |
300 changed; the symbol simply has an address further down the list. (In | |
301 the jargon, CAR and CDR are `non-destructive'.) Thus, evaluation of | |
302 the following expression | |
303 | |
304 (setq flowers (cdr bouquet)) | |
305 | |
306 produces this: | |
307 | |
308 | |
309 bouquet        flowers | |
310   |              | | |
311   |     ___ ___  |     ___ ___      ___ ___ | |
312    --> |   |   |  --> |   |   |    |   |   | | |
313        |___|___|----> |___|___|--> |___|___|--> nil | |
314          |              |            | | |
315          |              |            | | |
316           --> rose       --> violet   --> buttercup | |
317 | |
318 | |
319 | |
320 | |
321 The value of `flowers' is `(violet buttercup)', which is to say, the | |
322 symbol `flowers' holds the address of the pair of address-boxes, the | |
323 first of which holds the address of `violet', and the second of which | |
324 holds the address of `buttercup'. | |
325 | |
326 A pair of address-boxes is called a "cons cell" or "dotted pair". | |
327 *Note Cons Cell and List Types: (elisp)Cons Cell Type, and *Note Dotted | |
328 Pair Notation: (elisp)Dotted Pair Notation, for more information about | |
329 cons cells and dotted pairs. | |
330 | |
331 The function `cons' adds a new pair of addresses to the front of a | |
332 series of addresses like that shown above. For example, evaluating the | |
333 expression | |
334 | |
335 (setq bouquet (cons 'lily bouquet)) | |
336 | |
337 produces: | |
338 | |
339 | |
340 bouquet                       flowers | |
341   |                             | | |
342   |     ___ ___        ___ ___  |     ___ ___       ___ ___ | |
343    --> |   |   |      |   |   |  --> |   |   |     |   |   | | |
344        |___|___|----> |___|___|----> |___|___|---->|___|___|--> nil | |
345          |              |              |             | | |
346          |              |              |             | | |
347           --> lily       --> rose       --> violet    --> buttercup | |
348 | |
349 | |
350 | |
351 | |
352 However, this does not change the value of the symbol `flowers', as you | |
353 can see by evaluating the following, | |
354 | |
355 (eq (cdr (cdr bouquet)) flowers) | |
356 | |
357 which returns `t' for true. | |
358 | |
359 Until it is reset, `flowers' still has the value `(violet buttercup)'; | |
360 that is, it has the address of the cons cell whose first address is of | |
361 `violet'. Also, this does not alter any of the pre-existing cons | |
362 cells; they are all still there. | |
363 | |
364 Thus, in Lisp, to get the CDR of a list, you just get the address of | |
365 the next cons cell in the series; to get the CAR of a list, you get the | |
366 address of the first element of the list; to `cons' a new element on a | |
367 list, you add a new cons cell to the front of the list. That is all | |
368 there is to it! The underlying structure of Lisp is brilliantly simple! | |
369 | |
370 And what does the last address in a series of cons cells refer to? It | |
371 is the address of the empty list, of `nil'. | |
372 | |
373 In summary, when a Lisp variable is set to a value, it is provided with | |
374 the address of the list to which the variable refers. | |
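
The sharing of cons cells described in this section can be checked directly. The sketch below repeats the `bouquet' and `flowers' steps inside a `let*', so that it does not change any global variables; the function name `my-bouquet-demo' is hypothetical.

```elisp
(defun my-bouquet-demo ()
  "Show that `cons' adds a cell without copying the existing ones."
  (let* ((bouquet (list 'rose 'violet 'buttercup))
         ;; `flowers' holds the address of the second cons cell.
         (flowers (cdr bouquet)))
    ;; `cons' builds one new cons cell in front of the old ones.
    (setq bouquet (cons 'lily bouquet))
    (list bouquet
          flowers
          ;; `eq' tests for the very same cons cell, not merely
          ;; for a list that looks the same.
          (eq (cdr (cdr bouquet)) flowers))))

(my-bouquet-demo)
;; => ((lily rose violet buttercup) (violet buttercup) t)
```

The final `t' is the point of the example: after the `cons', the third cell of `bouquet' and the first cell of `flowers' are one and the same object.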
375 | |
376 | |
377 File: eintr, Node: Symbols as Chest, Next: List Exercise, Prev: Lists diagrammed, Up: List Implementation | |
378 | |
379 9.1 Symbols as a Chest of Drawers | |
380 ================================= | |
381 | |
382 In an earlier section, I suggested that you might imagine a symbol as | |
383 being a chest of drawers. The function definition is put in one | |
384 drawer, the value in another, and so on. What is put in the drawer | |
385 holding the value can be changed without affecting the contents of the | |
386 drawer holding the function definition, and vice versa. | |
387 | |
388 Actually, what is put in each drawer is the address of the value or | |
389 function definition. It is as if you found an old chest in the attic, | |
390 and in one of its drawers you found a map giving you directions to | |
391 where the buried treasure lies. | |
392 | |
393 (In addition to its name, symbol definition, and variable value, a | |
394 symbol has a `drawer' for a "property list" which can be used to record | |
395 other information. Property lists are not discussed here; see *Note | |
396 Property Lists: (elisp)Property Lists.) | |
397 | |
398 Here is a fanciful representation: | |
399 | |
400 | |
401         Chest of Drawers            Contents of Drawers | |
402 | |
403         __   o0O0o   __ | |
404       /                 \ | |
405      --------------------- | |
406     |    directions to    |            [map to] | |
407     |     symbol name     |             bouquet | |
408     |                     | | |
409     +---------------------+ | |
410     |    directions to    | | |
411     |  symbol definition  |             [none] | |
412     |                     | | |
413     +---------------------+ | |
414     |    directions to    |            [map to] | |
415     |    variable value   |             (rose violet buttercup) | |
416     |                     | | |
417     +---------------------+ | |
418     |    directions to    | | |
419     |    property list    |             [not described here] | |
420     |                     | | |
421     +---------------------+ | |
422     |/                   \| | |
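
You can look into each drawer with ordinary Emacs Lisp functions. The following is a rough sketch; the names `my-describe-drawers' and `my-bouquet' are made up for this example, and the markers `no-function' and `no-value' are simply symbols chosen to stand for an empty drawer.

```elisp
(defun my-describe-drawers (symbol)
  "Return a list of what is in the four `drawers' of SYMBOL."
  (list (symbol-name symbol)                 ; the printed name
        (if (fboundp symbol)                 ; the function definition
            (symbol-function symbol)
          'no-function)
        (if (boundp symbol)                  ; the variable value
            (symbol-value symbol)
          'no-value)
        (symbol-plist symbol)))              ; the property list

(setq my-bouquet '(rose violet buttercup))
(my-describe-drawers 'my-bouquet)
```

For `my-bouquet', the first three items should be the string "my-bouquet", the marker `no-function', and the list `(rose violet buttercup)'. What the property list holds depends on your Emacs session.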
423 | |
424 | |
425 | |
426 | |
427 | |
428 File: eintr, Node: List Exercise, Prev: Symbols as Chest, Up: List Implementation | |
429 | |
430 9.2 Exercise | |
431 ============ | |
432 | |
433 Set `flowers' to `violet' and `buttercup'. Cons two more flowers on to | |
434 this list and set this new list to `more-flowers'. Set the CAR of | |
435 `flowers' to a fish. What does the `more-flowers' list now contain? | |
436 | |
437 | |
438 File: eintr, Node: Yanking, Next: Loops & Recursion, Prev: List Implementation, Up: Top | |
439 | |
440 10 Yanking Text Back | |
441 ******************** | |
442 | |
443 Whenever you cut text out of a buffer with a `kill' command in GNU | |
444 Emacs, you can bring it back with a `yank' command. The text that is | |
445 cut out of the buffer is put in the kill ring and the yank commands | |
446 insert the appropriate contents of the kill ring back into a buffer | |
447 (not necessarily the original buffer). | |
448 | |
449 A simple `C-y' (`yank') command inserts the first item from the kill | |
450 ring into the current buffer. If the `C-y' command is followed | |
451 immediately by `M-y', the first element is replaced by the second | |
452 element. Successive `M-y' commands replace the second element with the | |
453 third, fourth, or fifth element, and so on. When the last element in | |
454 the kill ring is reached, it is replaced by the first element and the | |
455 cycle is repeated. (Thus the kill ring is called a `ring' rather than | |
456 just a `list'. However, the actual data structure that holds the text | |
457 is a list. *Note Handling the Kill Ring: Kill Ring, for the details of | |
458 how the list is handled as a ring.) | |
459 | |
460 * Menu: | |
461 | |
462 * Kill Ring Overview:: | |
463 * kill-ring-yank-pointer:: | |
464 * yank nthcdr Exercises:: | |
465 | |
466 | |
467 File: eintr, Node: Kill Ring Overview, Next: kill-ring-yank-pointer, Prev: Yanking, Up: Yanking | |
468 | |
469 10.1 Kill Ring Overview | |
470 ======================= | |
471 | |
472 The kill ring is a list of textual strings. This is what it looks like: | |
473 | |
474 ("some text" "a different piece of text" "yet more text") | |
475 | |
476 If this were the contents of my kill ring and I pressed `C-y', the | |
477 string of characters saying `some text' would be inserted in this | |
478 buffer where my cursor is located. | |
479 | |
480 The `yank' command is also used for duplicating text by copying it. | |
481 The copied text is not cut from the buffer, but a copy of it is put on | |
482 the kill ring and is inserted by yanking it back. | |
483 | |
484 Three functions are used for bringing text back from the kill ring: | |
485 `yank', which is usually bound to `C-y'; `yank-pop', which is usually | |
486 bound to `M-y'; and `rotate-yank-pointer', which is used by the two | |
487 other functions. | |
488 | |
489 These functions refer to the kill ring through a variable called the | |
490 `kill-ring-yank-pointer'. Indeed, the insertion code for both the | |
491 `yank' and `yank-pop' functions is: | |
492 | |
493 (insert (car kill-ring-yank-pointer)) | |
494 | |
495 (Well, no more. In GNU Emacs 22, the function has been replaced by | |
496 `insert-for-yank' which calls `insert-for-yank-1' repeatedly for each | |
497 `yank-handler' segment. In turn, `insert-for-yank-1' strips text | |
498 properties from the inserted text according to | |
499 `yank-excluded-properties'. Otherwise, it is just like `insert'. We | |
500 will stick with plain `insert' since it is easier to understand.) | |
501 | |
502 To begin to understand how `yank' and `yank-pop' work, it is first | |
503 necessary to look at the `kill-ring-yank-pointer' variable and the | |
504 `rotate-yank-pointer' function. | |
505 | |
506 | |
507 File: eintr, Node: kill-ring-yank-pointer, Next: yank nthcdr Exercises, Prev: Kill Ring Overview, Up: Yanking | |
508 | |
509 10.2 The `kill-ring-yank-pointer' Variable | |
510 ========================================== | |
511 | |
512 `kill-ring-yank-pointer' is a variable, just as `kill-ring' is a | |
513 variable. It points to something by being bound to the value of what | |
514 it points to, like any other Lisp variable. | |
515 | |
516 Thus, if the value of the kill ring is: | |
517 | |
518 ("some text" "a different piece of text" "yet more text") | |
519 | |
520 and the `kill-ring-yank-pointer' points to the second element, the value | |
521 of `kill-ring-yank-pointer' is: | |
522 | |
523 ("a different piece of text" "yet more text") | |
524 | |
525 As explained in the previous chapter (*note List Implementation::), the | |
526 computer does not keep two different copies of the text being pointed to | |
527 by both the `kill-ring' and the `kill-ring-yank-pointer'. The words "a | |
528 different piece of text" and "yet more text" are not duplicated. | |
529 Instead, the two Lisp variables point to the same pieces of text. Here | |
530 is a diagram: | |
531 | |
532 kill-ring     kill-ring-yank-pointer | |
533     |               | | |
534     |      ___ ___  |     ___ ___      ___ ___ | |
535      ---> |   |   |  --> |   |   |    |   |   | | |
536           |___|___|----> |___|___|--> |___|___|--> nil | |
537             |              |            | | |
538             |              |            | | |
539             |              |             --> "yet more text" | |
540             |              | | |
541             |               --> "a different piece of text" | |
542             | | |
543              --> "some text" | |
544 | |
545 | |
546 | |
547 | |
548 Both the variable `kill-ring' and the variable `kill-ring-yank-pointer' | |
549 are pointers. But the kill ring itself is usually described as if it | |
550 were actually what it is composed of. The `kill-ring' is spoken of as | |
551 if it were the list rather than that it points to the list. | |
552 Conversely, the `kill-ring-yank-pointer' is spoken of as pointing to a | |
553 list. | |
554 | |
555 These two ways of talking about the same thing sound confusing at first | |
556 but make sense on reflection. The kill ring is generally thought of as | |
557 the complete structure of data that holds the information of what has | |
558 recently been cut out of the Emacs buffers. The | |
559 `kill-ring-yank-pointer', on the other hand, serves to indicate--that | |
560 is, to `point to'--that part of the kill ring of which the first | |
561 element (the CAR) will be inserted. | |
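
Here is a small sketch of the same arrangement that leaves your real kill ring alone. The local variables `ring' and `pointer' are stand-ins for `kill-ring' and `kill-ring-yank-pointer', and the function name `my-yank-pointer-demo' is hypothetical.

```elisp
(defun my-yank-pointer-demo ()
  "Show that a pointer into a list shares the list's cons cells."
  (let* ((ring (list "some text"
                     "a different piece of text"
                     "yet more text"))
         ;; Stand-in for `kill-ring-yank-pointer': the list from
         ;; its second cons cell onward.
         (pointer (nthcdr 1 ring)))
    (list (car pointer)                  ; what a yank would insert
          (eq pointer (cdr ring))        ; the same cons cell
          (eq (car pointer) (nth 1 ring))))) ; the same string object

(my-yank-pointer-demo)
;; => ("a different piece of text" t t)
```

Both `eq' tests return `t', which shows that neither the cons cells nor the strings are duplicated.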
562 | |
563 | |
564 File: eintr, Node: yank nthcdr Exercises, Prev: kill-ring-yank-pointer, Up: Yanking | |
565 | |
566 10.3 Exercises with `yank' and `nthcdr' | |
567 ======================================= | |
568 | |
569 * Using `C-h v' (`describe-variable'), look at the value of your | |
570 kill ring. Add several items to your kill ring; look at its value | |
571 again. Using `M-y' (`yank-pop'), move all the way around the kill | |
572 ring. How many items were in your kill ring? Find the value of | |
573 `kill-ring-max'. Was your kill ring full, or could you have kept | |
574 more blocks of text within it? | |
575 | |
576 * Using `nthcdr' and `car', construct a series of expressions to | |
577 return the first, second, third, and fourth elements of a list. | |
578 | |
579 | |
580 File: eintr, Node: Loops & Recursion, Next: Regexp Search, Prev: Yanking, Up: Top | |
581 | |
582 11 Loops and Recursion | |
583 ********************** | |
584 | |
585 Emacs Lisp has two primary ways to cause an expression, or a series of | |
586 expressions, to be evaluated repeatedly: one uses a `while' loop, and | |
587 the other uses "recursion". | |
588 | |
589 Repetition can be very valuable. For example, to move forward four | |
590 sentences, you need only write a program that will move forward one | |
591 sentence and then repeat the process four times. Since a computer does | |
592 not get bored or tired, such repetitive action does not have the | |
593 deleterious effects that excessive or the wrong kinds of repetition can | |
594 have on humans. | |
595 | |
596 People mostly write Emacs Lisp functions using `while' loops and their | |
597 kin; but you can use recursion, which provides a very powerful way to | |
598 think about and then to solve problems(1). | |
599 | |
600 * Menu: | |
601 | |
602 * while:: | |
603 * dolist dotimes:: | |
604 * Recursion:: | |
605 * Looping exercise:: | |
606 | |
607 ---------- Footnotes ---------- | |
608 | |
609 (1) You can write recursive functions to be frugal or wasteful of | |
610 mental or computer resources; as it happens, methods that people find | |
611 easy--that are frugal of `mental resources'--sometimes use considerable | |
612 computer resources. Emacs was designed to run on machines that we now | |
613 consider limited and its default settings are conservative. You may | |
614 want to increase the values of `max-specpdl-size' and | |
615 `max-lisp-eval-depth'. In my `.emacs' file, I set them to 15 and 30 | |
616 times their default value. | |
617 | |
618 | |
619 File: eintr, Node: while, Next: dolist dotimes, Prev: Loops & Recursion, Up: Loops & Recursion | |
620 | |
621 11.1 `while' | |
622 ============ | |
623 | |
624 The `while' special form tests whether the value returned by evaluating | |
625 its first argument is true or false. This is similar to what the Lisp | |
626 interpreter does with an `if'; what the interpreter does next, however, | |
627 is different. | |
628 | |
629 In a `while' expression, if the value returned by evaluating the first | |
630 argument is false, the Lisp interpreter skips the rest of the | |
631 expression (the "body" of the expression) and does not evaluate it. | |
632 However, if the value is true, the Lisp interpreter evaluates the body | |
633 of the expression and then again tests whether the first argument to | |
634 `while' is true or false. If the value returned by evaluating the | |
635 first argument is again true, the Lisp interpreter again evaluates the | |
636 body of the expression. | |
637 | |
638 The template for a `while' expression looks like this: | |
639 | |
640 (while TRUE-OR-FALSE-TEST | |
641 BODY...) | |
642 | |
643 * Menu: | |
644 | |
645 * Looping with while:: | |
646 * Loop Example:: | |
647 * print-elements-of-list:: | |
648 * Incrementing Loop:: | |
649 * Decrementing Loop:: | |
650 | |
651 | |
652 File: eintr, Node: Looping with while, Next: Loop Example, Prev: while, Up: while | |
653 | |
654 Looping with `while' | |
655 -------------------- | |
656 | |
657 So long as the true-or-false-test of the `while' expression returns a | |
658 true value when it is evaluated, the body is repeatedly evaluated. | |
659 This process is called a loop since the Lisp interpreter repeats the | |
660 same thing again and again, like an airplane doing a loop. When the | |
661 result of evaluating the true-or-false-test is false, the Lisp | |
662 interpreter does not evaluate the rest of the `while' expression and | |
663 `exits the loop'. | |
664 | |
665 Clearly, if the value returned by evaluating the first argument to | |
666 `while' is always true, the body following will be evaluated again and | |
667 again ... and again ... forever. Conversely, if the value returned is | |
668 never true, the expressions in the body will never be evaluated. The | |
669 craft of writing a `while' loop consists of choosing a mechanism such | |
670 that the true-or-false-test returns true just the number of times that | |
671 you want the subsequent expressions to be evaluated, and then have the | |
672 test return false. | |
673 | |
674 The value returned by evaluating a `while' is the value of the | |
675 true-or-false-test. An interesting consequence of this is that a | |
676 `while' loop that evaluates without error will return `nil' or false | |
677 regardless of whether it has looped 1 or 100 times or none at all. A | |
678 `while' expression that evaluates successfully never returns a true | |
679 value! What this means is that `while' is always evaluated for its | |
680 side effects, which is to say, the consequences of evaluating the | |
681 expressions within the body of the `while' loop. This makes sense. It | |
682 is not the mere act of looping that is desired, but the consequences of | |
683 what happens when the expressions in the loop are repeatedly evaluated. | |
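
A short experiment shows both halves of this: the `while' itself returns `nil', and the useful work is the side effect on a counter. This is only a sketch; the function name `my-count-to' is made up for the example.

```elisp
(defun my-count-to (limit)
  "Loop LIMIT times; return the value of the `while' and the counter."
  (let ((count 0))
    (list
     ;; The value of the `while' expression itself.
     (while (< count limit)
       (setq count (1+ count)))
     ;; The side effect: COUNT was incremented on each pass.
     count)))

(my-count-to 3)
;; => (nil 3)
```

Evaluating `(my-count-to 0)' returns `(nil 0)': the body is never evaluated, yet the value of the `while' is the same `nil'.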
684 | |
685 | |
686 File: eintr, Node: Loop Example, Next: print-elements-of-list, Prev: Looping with while, Up: while | |
687 | |
688 11.1.1 A `while' Loop and a List | |
689 -------------------------------- | |
690 | |
691 A common way to control a `while' loop is to test whether a list has | |
692 any elements. If it does, the loop is repeated; but if it does not, | |
693 the repetition is ended. Since this is an important technique, we will | |
694 create a short example to illustrate it. | |
695 | |
696 A simple way to test whether a list has elements is to evaluate the | |
697 list: if it has no elements, it is an empty list and will return the | |
698 empty list, `()', which is a synonym for `nil' or false. On the other | |
699 hand, a list with elements will return those elements when it is | |
700 evaluated. Since Emacs Lisp considers as true any value that is not | |
701 `nil', a list that returns elements will test true in a `while' loop. | |
702 | |
703 For example, you can set the variable `empty-list' to `nil' by | |
704 evaluating the following `setq' expression: | |
705 | |
706 (setq empty-list ()) | |
707 | |
708 After evaluating the `setq' expression, you can evaluate the variable | |
709 `empty-list' in the usual way, by placing the cursor after the symbol | |
710 and typing `C-x C-e'; `nil' will appear in your echo area: | |
711 | |
712 empty-list | |
713 | |
714 On the other hand, if you set a variable to be a list with elements, the | |
715 list will appear when you evaluate the variable, as you can see by | |
716 evaluating the following two expressions: | |
717 | |
718 (setq animals '(gazelle giraffe lion tiger)) | |
719 | |
720 animals | |
721 | |
722 Thus, to create a `while' loop that tests whether there are any items | |
723 in the list `animals', the first part of the loop will be written like | |
724 this: | |
725 | |
726 (while animals | |
727 ... | |
728 | |
729 When the `while' tests its first argument, the variable `animals' is | |
730 evaluated. It returns a list. So long as the list has elements, the | |
731 `while' considers the results of the test to be true; but when the list | |
732 is empty, it considers the results of the test to be false. | |
733 | |
734 To prevent the `while' loop from running forever, some mechanism needs | |
735 to be provided to empty the list eventually. An oft-used technique is | |
736 to have one of the subsequent forms in the `while' expression set the | |
737 value of the list to be the CDR of the list. Each time the `cdr' | |
738 function is evaluated, the list will be made shorter, until eventually | |
739 only the empty list will be left. At this point, the test of the | |
740 `while' loop will return false, and the arguments to the `while' will | |
741 no longer be evaluated. | |
742 | |
743 For example, the list of animals bound to the variable `animals' can be | |
744 set to be the CDR of the original list with the following expression: | |
745 | |
746 (setq animals (cdr animals)) | |
747 | |
748 If you have evaluated the previous expressions and then evaluate this | |
749 expression, you will see `(giraffe lion tiger)' appear in the echo | |
750 area. If you evaluate the expression again, `(lion tiger)' will appear | |
751 in the echo area. If you evaluate it again and yet again, `(tiger)' | |
752 appears and then the empty list, shown by `nil'. | |
753 | |
754 A template for a `while' loop that uses the `cdr' function repeatedly | |
755 to cause the true-or-false-test eventually to test false looks like | |
756 this: | |
757 | |
758 (while TEST-WHETHER-LIST-IS-EMPTY | |
759 BODY... | |
760 SET-LIST-TO-CDR-OF-LIST) | |
761 | |
762 This test and use of `cdr' can be put together in a function that goes | |
763 through a list and prints each element of the list on a line of its own. | |
764 | |
765 | |
766 File: eintr, Node: print-elements-of-list, Next: Incrementing Loop, Prev: Loop Example, Up: while | |
767 | |
768 11.1.2 An Example: `print-elements-of-list' | |
769 ------------------------------------------- | |
770 | |
771 The `print-elements-of-list' function illustrates a `while' loop with a | |
772 list. | |
773 | |
774 The function requires several lines for its output. If you are reading | |
775 this in a recent instance of GNU Emacs, you can evaluate the following | |
776 expression inside of Info, as usual. | |
777 | |
778 If you are using an earlier version of Emacs, you need to copy the | |
779 necessary expressions to your `*scratch*' buffer and evaluate them | |
780 there. This is because the echo area had only one line in the earlier | |
781 versions. | |
782 | |
783 You can copy the expressions by marking the beginning of the region | |
784 with `C-<SPC>' (`set-mark-command'), moving the cursor to the end of | |
785 the region and then copying the region using `M-w' (`kill-ring-save', | |
786 which calls `copy-region-as-kill' and then provides visual feedback). | |
787 In the `*scratch*' buffer, you can yank the expressions back by typing | |
788 `C-y' (`yank'). | |
789 | |
790 After you have copied the expressions to the `*scratch*' buffer, | |
791 evaluate each expression in turn. Be sure to evaluate the last | |
792 expression, `(print-elements-of-list animals)', by typing `C-u C-x | |
793 C-e', that is, by giving an argument to `eval-last-sexp'. This will | |
794 cause the result of the evaluation to be printed in the `*scratch*' | |
795 buffer instead of being printed in the echo area. (Otherwise you will | |
796 see something like this in your echo area: | |
797 `^Jgazelle^J^Jgiraffe^J^Jlion^J^Jtiger^Jnil', in which each `^J' stands | |
798 for a `newline'.) | |
799 | |
800 In a recent instance of GNU Emacs, you can evaluate these expressions | |
801 directly in the Info buffer, and the echo area will grow to show the | |
802 results. | |
803 | |
804 (setq animals '(gazelle giraffe lion tiger)) | |
805 | |
806 (defun print-elements-of-list (list) | |
807 "Print each element of LIST on a line of its own." | |
808 (while list | |
809 (print (car list)) | |
810 (setq list (cdr list)))) | |
811 | |
812 (print-elements-of-list animals) | |
813 | |
814 When you evaluate the three expressions in sequence, you will see this: | |
815 | |
816 gazelle | |
817 | |
818 giraffe | |
819 | |
820 lion | |
821 | |
822 tiger | |
823 nil | |
824 | |
825 Each element of the list is printed on a line of its own (that is what | |
826 the function `print' does) and then the value returned by the function | |
827 is printed. Since the last expression in the function is the `while' | |
828 loop, and since `while' loops always return `nil', a `nil' is printed | |
829 after the last element of the list. | |
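The `^J' characters mentioned earlier can be seen directly by capturing what `print' writes. The following is a minimal sketch, assuming a GNU Emacs that provides `with-output-to-string', which collects everything sent to `standard-output' into a string. It repeats the definition from above so that it stands alone:

```elisp
(defun print-elements-of-list (list)
  "Print each element of LIST on a line of its own."
  (while list
    (print (car list))
    (setq list (cdr list))))

;; Capture the printed text in a string instead of displaying it.
;; `print' writes a newline, then the element, then another newline.
(with-output-to-string
  (print-elements-of-list '(gazelle giraffe)))
;; => "\ngazelle\n\ngiraffe\n"

;; The function itself returns nil, the value of the `while' loop.
;; (Evaluating this form also prints `lion' as a side effect.)
(print-elements-of-list '(lion))
;; => nil
```

Each `\n' in the string is one of the newlines that the echo area shows as `^J'.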
830 | |
831 | |
832 File: eintr, Node: Incrementing Loop, Next: Decrementing Loop, Prev: print-elements-of-list, Up: while | |
833 | |
834 11.1.3 A Loop with an Incrementing Counter | |
835 ------------------------------------------ | |
836 | |
837 A loop is not useful unless it stops when it ought. Besides | |
838 controlling a loop with a list, a common way of stopping a loop is to | |
839 write the first argument as a test that returns false when the correct | |
840 number of repetitions are complete. This means that the loop must have | |
841 a counter--an expression that counts how many times the loop repeats | |
842 itself. | |
843 | |
844 The test can be an expression such as `(< count desired-number)' which | |
845 returns `t' for true if the value of `count' is less than the | |
846 `desired-number' of repetitions and `nil' for false if the value of | |
847 `count' is equal to or is greater than the `desired-number'. The | |
848 expression that increments the count can be a simple `setq' such as | |
849 `(setq count (1+ count))', where `1+' is a built-in function in Emacs | |
850 Lisp that adds 1 to its argument. (The expression `(1+ count)' has the | |
851 same result as `(+ count 1)', but is easier for a human to read.) | |
852 | |
853 The template for a `while' loop controlled by an incrementing counter | |
854 looks like this: | |
855 | |
856 SET-COUNT-TO-INITIAL-VALUE | |
857 (while (< count desired-number) ; true-or-false-test | |
858 BODY... | |
859 (setq count (1+ count))) ; incrementer | |
860 | |
861 Note that you need to set the initial value of `count'; usually it is | |
862 set to 1. | |
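How many times the loop repeats depends on both the initial value of `count' and the test. The following sketch counts the repetitions of the template above for a given starting value. The function name `count-repetitions' and its arguments are made up for this illustration:

```elisp
(defun count-repetitions (initial-value desired-number)
  "Return how many times the template loop repeats.
The counter starts at INITIAL-VALUE and the test is `<'."
  (let ((count initial-value)
        (repetitions 0))
    (while (< count desired-number)          ; true-or-false-test
      (setq repetitions (1+ repetitions))    ; body
      (setq count (1+ count)))               ; incrementer
    repetitions))

(count-repetitions 0 3)   ; => 3
(count-repetitions 1 3)   ; => 2
```

With `<' as the test, a counter that starts at 1 gives one repetition fewer than `desired-number'. The `triangle' example that follows starts its counter at 1 and uses `<=' so that the last row is included.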
863 | |
864 * Menu: | |
865 | |
866 * Incrementing Example:: | |
867 * Inc Example parts:: | |
868 * Inc Example altogether:: | |
869 | |
870 | |
871 File: eintr, Node: Incrementing Example, Next: Inc Example parts, Prev: Incrementing Loop, Up: Incrementing Loop | |
872 | |
873 Example with incrementing counter | |
874 ................................. | |
875 | |
876 Suppose you are playing on the beach and decide to make a triangle of | |
877 pebbles, putting one pebble in the first row, two in the second row, | |
878 three in the third row and so on, like this: | |
879 | |
880 | |
881 * | |
882 * * | |
883 * * * | |
884 * * * * | |
885 | |
886 | |
887 (About 2500 years ago, Pythagoras and others developed the beginnings of | |
888 number theory by considering questions such as this.) | |
889 | |
890 Suppose you want to know how many pebbles you will need to make a | |
891 triangle with 7 rows. | |
892 | |
893 Clearly, what you need to do is add up the numbers from 1 to 7. There | |
894 are two ways to do this: start with the smallest number, one, and add up | |
895 the list in sequence, 1, 2, 3, 4 and so on; or start with the largest | |
896 number and add the list going down: 7, 6, 5, 4 and so on. Because both | |
897 mechanisms illustrate common ways of writing `while' loops, we will | |
898 create two examples, one counting up and the other counting down. In | |
899 this first example, we will start with 1 and add 2, 3, 4 and so on. | |
900 | |
901 If you are just adding up a short list of numbers, the easiest way to do | |
902 it is to add up all the numbers at once. However, if you do not know | |
903 ahead of time how many numbers your list will have, or if you want to be | |
904 prepared for a very long list, then you need to design your addition so | |
905 that what you do is repeat a simple process many times instead of doing | |
906 a more complex process once. | |
907 | |
908 For example, instead of adding up all the pebbles all at once, what you | |
909 can do is add the number of pebbles in the first row, 1, to the number | |
910 in the second row, 2, and then add the total of those two rows to the | |
911 third row, 3. Then you can add the number in the fourth row, 4, to the | |
912 total of the first three rows; and so on. | |
913 | |
914 The critical characteristic of the process is that each repetitive | |
915 action is simple. In this case, at each step we add only two numbers, | |
916 the number of pebbles in the row and the total already found. This | |
917 process of adding two numbers is repeated again and again until the last | |
918 row has been added to the total of all the preceding rows. In a more | |
919 complex loop the repetitive action might not be so simple, but it will | |
920 be simpler than doing everything all at once. | |
921 | |
922 | |
923 File: eintr, Node: Inc Example parts, Next: Inc Example altogether, Prev: Incrementing Example, Up: Incrementing Loop | |
924 | |
925 The parts of the function definition | |
926 .................................... | |
927 | |
928 The preceding analysis gives us the bones of our function definition: | |
929 first, we will need a variable that we can call `total' that will be | |
930 the total number of pebbles. This will be the value returned by the | |
931 function. | |
932 | |
933 Second, we know that the function will require an argument: this | |
934 argument will be the total number of rows in the triangle. It can be | |
935 called `number-of-rows'. | |
936 | |
937 Finally, we need a variable to use as a counter. We could call this | |
938 variable `counter', but a better name is `row-number'. That is because | |
939 what the counter does in this function is count rows, and a program | |
940 should be written to be as understandable as possible. | |
941 | |
942 When the Lisp interpreter first starts evaluating the expressions in the | |
943 function, the value of `total' should be set to zero, since we have not | |
944 added anything to it. Then the function should add the number of | |
945 pebbles in the first row to the total, and then add the number of | |
946 pebbles in the second to the total, and then add the number of pebbles | |
947 in the third row to the total, and so on, until there are no more rows | |
948 left to add. | |
949 | |
950 Both `total' and `row-number' are used only inside the function, so | |
951 they can be declared as local variables with `let' and given initial | |
952 values. Clearly, the initial value for `total' should be 0. The | |
953 initial value of `row-number' should be 1, since we start with the | |
954 first row. This means that the `let' statement will look like this: | |
955 | |
956 (let ((total 0) | |
957 (row-number 1)) | |
958 BODY...) | |
959 | |
960 After the internal variables are declared and bound to their initial | |
961 values, we can begin the `while' loop. The expression that serves as | |
962 the test should return a value of `t' for true so long as the | |
963 `row-number' is less than or equal to the `number-of-rows'. (If the | |
964 expression tests true only so long as the row number is less than the | |
965 number of rows in the triangle, the last row will never be added to the | |
966 total; hence the row number has to be either less than or equal to the | |
967 number of rows.) | |
968 | |
969 Lisp provides the `<=' function that returns true if the value of its | |
970 first argument is less than or equal to the value of its second | |
971 argument and false otherwise. So the expression that the `while' will | |
972 evaluate as its test should look like this: | |
973 | |
974 (<= row-number number-of-rows) | |
975 | |
976 The total number of pebbles can be found by repeatedly adding the number | |
977 of pebbles in a row to the total already found. Since the number of | |
978 pebbles in the row is equal to the row number, the total can be found by | |
979 adding the row number to the total. (Clearly, in a more complex | |
980 situation, the number of pebbles in the row might be related to the row | |
981 number in a more complicated way; if this were the case, the row number | |
982 would be replaced by the appropriate expression.) | |
983 | |
984 (setq total (+ total row-number)) | |
985 | |
986 This sets the new value of `total' to the sum of the number of pebbles | |
987 in the row and the previous total. | |
988 | |
989 After setting the value of `total', the conditions need to be | |
990 established for the next repetition of the loop, if there is one. This | |
991 is done by incrementing the value of the `row-number' variable, which | |
992 serves as a counter. After the `row-number' variable has been | |
993 incremented, the true-or-false-test at the beginning of the `while' | |
994 loop tests whether its value is still less than or equal to the value | |
995 of the `number-of-rows' and if it is, adds the new value of the | |
996 `row-number' variable to the `total' of the previous repetition of the | |
997 loop. | |
998 | |
999 The built-in Emacs Lisp function `1+' adds 1 to a number, so the | |
1000 `row-number' variable can be incremented with this expression: | |
1001 | |
1002 (setq row-number (1+ row-number)) | |
1003 | |
1004 | |
1005 File: eintr, Node: Inc Example altogether, Prev: Inc Example parts, Up: Incrementing Loop | |
1006 | |
1007 Putting the function definition together | |
1008 ........................................ | |
1009 | |
1010 We have created the parts for the function definition; now we need to | |
1011 put them together. | |
1012 | |
1013 First, the contents of the `while' expression: | |
1014 | |
1015 (while (<= row-number number-of-rows) ; true-or-false-test | |
1016 (setq total (+ total row-number)) | |
1017 (setq row-number (1+ row-number))) ; incrementer | |
1018 | |
1019 Along with the `let' expression varlist, this very nearly completes the | |
1020 body of the function definition. However, it requires one final | |
1021 element, the need for which is somewhat subtle. | |
1022 | |
1023 The final touch is to place the variable `total' on a line by itself | |
1024 after the `while' expression. Otherwise, the value returned by the | |
1025 whole function is the value of the last expression that is evaluated in | |
1026 the body of the `let', and this is the value returned by the `while', | |
1027 which is always `nil'. | |
1028 | |
1029 This may not be evident at first sight. It almost looks as if the | |
1030 incrementing expression is the last expression of the whole function. | |
1031 But that expression is part of the body of the `while'; it is the last | |
1032 element of the list that starts with the symbol `while'. Moreover, the | |
1033 whole of the `while' loop is a list within the body of the `let'. | |
1034 | |
1035 In outline, the function will look like this: | |
1036 | |
1037 (defun NAME-OF-FUNCTION (ARGUMENT-LIST) | |
1038 "DOCUMENTATION..." | |
1039 (let (VARLIST) | |
1040 (while (TRUE-OR-FALSE-TEST) | |
1041 BODY-OF-WHILE... ) | |
1042 ... )) ; Need final expression here. | |
1043 | |
1044 The result of evaluating the `let' is what is going to be returned by | |
1045 the `defun' since the `let' is not embedded within any containing list, | |
1046 except for the `defun' as a whole. However, if the `while' is the last | |
1047 element of the `let' expression, the function will always return `nil'. | |
1048 This is not what we want! Instead, what we want is the value of the | |
1049 variable `total'. This is returned by simply placing the symbol as the | |
1050 last element of the list starting with `let'. It gets evaluated after | |
1051 the preceding elements of the list are evaluated, which means it gets | |
1052 evaluated after it has been assigned the correct value for the total. | |
1053 | |
1054 It may be easier to see this by printing the list starting with `let' | |
1055 all on one line. This format makes it evident that the VARLIST and | |
1056 `while' expressions are the second and third elements of the list | |
1057 starting with `let', and the `total' is the last element: | |
1058 | |
1059 (let (VARLIST) (while (TRUE-OR-FALSE-TEST) BODY-OF-WHILE... ) total) | |
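The difference is easy to check. In this sketch the two function names are hypothetical and exist only for the comparison. The first omits the final `total' and the second includes it:

```elisp
(defun triangle-without-final-total (number-of-rows)
  "Add up the pebbles, but forget to return the total."
  (let ((total 0)
        (row-number 1))
    (while (<= row-number number-of-rows)
      (setq total (+ total row-number))
      (setq row-number (1+ row-number)))))   ; `while' is last: returns nil

(defun triangle-with-final-total (number-of-rows)
  "Add up the pebbles and return the total."
  (let ((total 0)
        (row-number 1))
    (while (<= row-number number-of-rows)
      (setq total (+ total row-number))
      (setq row-number (1+ row-number)))
    total))                                  ; `total' is last: returned

(triangle-without-final-total 4)   ; => nil
(triangle-with-final-total 4)      ; => 10
```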
1060 | |
1061 Putting everything together, the `triangle' function definition looks | |
1062 like this: | |
1063 | |
1064 (defun triangle (number-of-rows) ; Version with | |
1065 ; incrementing counter. | |
1066 "Add up the number of pebbles in a triangle. | |
1067 The first row has one pebble, the second row two pebbles, | |
1068 the third row three pebbles, and so on. | |
1069 The argument is NUMBER-OF-ROWS." | |
1070 (let ((total 0) | |
1071 (row-number 1)) | |
1072 (while (<= row-number number-of-rows) | |
1073 (setq total (+ total row-number)) | |
1074 (setq row-number (1+ row-number))) | |
1075 total)) | |
1076 | |
1077 After you have installed `triangle' by evaluating the function, you can | |
1078 try it out. Here are two examples: | |
1079 | |
1080 (triangle 4) | |
1081 | |
1082 (triangle 7) | |
1083 | |
1084 The sum of the first four numbers is 10 and the sum of the first seven | |
1085 numbers is 28. | |
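To watch the loop at work, it can help to record the total after each row is added. This is a sketch only. The names `triangle-running-totals' and `totals' are invented here and are not part of the original function:

```elisp
(defun triangle-running-totals (number-of-rows)
  "Return a list of the running total after each row is added."
  (let ((total 0)
        (row-number 1)
        (totals nil))
    (while (<= row-number number-of-rows)
      (setq total (+ total row-number))
      (setq totals (cons total totals))      ; newest total goes in front
      (setq row-number (1+ row-number)))
    (reverse totals)))                       ; put the totals in row order

(triangle-running-totals 4)   ; => (1 3 6 10)
(triangle-running-totals 7)   ; => (1 3 6 10 15 21 28)
```

The last number in each list is the value that `triangle' returns for the same argument.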
1086 | |
1087 | |
1088 File: eintr, Node: Decrementing Loop, Prev: Incrementing Loop, Up: while | |
1089 | |
1090 11.1.4 Loop with a Decrementing Counter | |
1091 --------------------------------------- | |
1092 | |
1093 Another common way to write a `while' loop is to write the test so that | |
1094 it determines whether a counter is greater than zero. So long as the | |
1095 counter is greater than zero, the loop is repeated. But when the | |
1096 counter is equal to or less than zero, the loop is stopped. For this | |
1097 to work, the counter has to start out greater than zero and then be | |
1098 made smaller and smaller by a form that is evaluated repeatedly. | |
1099 | |
1100 The test will be an expression such as `(> counter 0)' which returns | |
1101 `t' for true if the value of `counter' is greater than zero, and `nil' | |
1102 for false if the value of `counter' is equal to or less than zero. The | |
1103 expression that makes the number smaller and smaller can be a simple | |
1104 `setq' such as `(setq counter (1- counter))', where `1-' is a built-in | |
1105 function in Emacs Lisp that subtracts 1 from its argument. | |
1106 | |
1107 The template for a decrementing `while' loop looks like this: | |
1108 | |
1109 (while (> counter 0) ; true-or-false-test | |
1110 BODY... | |
1111 (setq counter (1- counter))) ; decrementer | |
1112 | |
1113 * Menu: | |
1114 | |
1115 * Decrementing Example:: | |
1116 * Dec Example parts:: | |
1117 * Dec Example altogether:: | |
1118 | |
1119 | |
1120 File: eintr, Node: Decrementing Example, Next: Dec Example parts, Prev: Decrementing Loop, Up: Decrementing Loop | |
1121 | |
1122 Example with decrementing counter | |
1123 ................................. | |
1124 | |
1125 To illustrate a loop with a decrementing counter, we will rewrite the | |
1126 `triangle' function so the counter decreases to zero. | |
1127 | |
1128 This is the reverse of the earlier version of the function. In this | |
1129 case, to find out how many pebbles are needed to make a triangle with 3 | |
1130 rows, add the number of pebbles in the third row, 3, to the number in | |
1131 the preceding row, 2, and then add the total of those two rows to the | |
1132 row that precedes them, which is 1. | |
1133 | |
1134 Likewise, to find the number of pebbles in a triangle with 7 rows, add | |
1135 the number of pebbles in the seventh row, 7, to the number in the | |
1136 preceding row, which is 6, and then add the total of those two rows to | |
1137 the row that precedes them, which is 5, and so on. As in the previous | |
1138 example, each addition only involves adding two numbers, the total of | |
1139 the rows already added up and the number of pebbles in the row that is | |
1140 being added to the total. This process of adding two numbers is | |
1141 repeated again and again until there are no more pebbles to add. | |
1142 | |
1143 We know how many pebbles to start with: the number of pebbles in the | |
1144 last row is equal to the number of rows. If the triangle has seven | |
1145 rows, the number of pebbles in the last row is 7. Likewise, we know how | |
1146 many pebbles are in the preceding row: it is one less than the number in | |
1147 the row. | |
1148 | |
1149 | |
1150 File: eintr, Node: Dec Example parts, Next: Dec Example altogether, Prev: Decrementing Example, Up: Decrementing Loop | |
1151 | |
1152 The parts of the function definition | |
1153 .................................... | |
1154 | |
1155 We start with three variables: the total number of rows in the | |
1156 triangle; the number of pebbles in a row; and the total number of | |
1157 pebbles, which is what we want to calculate. These variables can be | |
1158 named `number-of-rows', `number-of-pebbles-in-row', and `total', | |
1159 respectively. | |
1160 | |
1161 Both `total' and `number-of-pebbles-in-row' are used only inside the | |
1162 function and are declared with `let'. The initial value of `total' | |
1163 should, of course, be zero. However, the initial value of | |
1164 `number-of-pebbles-in-row' should be equal to the number of rows in the | |
1165 triangle, since the addition will start with the longest row. | |
1166 | |
1167 This means that the beginning of the `let' expression will look like | |
1168 this: | |
1169 | |
1170 (let ((total 0) | |
1171 (number-of-pebbles-in-row number-of-rows)) | |
1172 BODY...) | |
1173 | |
1174 The total number of pebbles can be found by repeatedly adding the number | |
1175 of pebbles in a row to the total already found, that is, by repeatedly | |
1176 evaluating the following expression: | |
1177 | |
1178 (setq total (+ total number-of-pebbles-in-row)) | |
1179 | |
1180 After the `number-of-pebbles-in-row' is added to the `total', the | |
1181 `number-of-pebbles-in-row' should be decremented by one, since the next | |
1182 time the loop repeats, the preceding row will be added to the total. | |
1183 | |
1184 The number of pebbles in a preceding row is one less than the number of | |
1185 pebbles in a row, so the built-in Emacs Lisp function `1-' can be used | |
1186 to compute the number of pebbles in the preceding row. This can be | |
1187 done with the following expression: | |
1188 | |
1189 (setq number-of-pebbles-in-row | |
1190 (1- number-of-pebbles-in-row)) | |
1191 | |
1192 Finally, we know that the `while' loop should stop making repeated | |
1193 additions when there are no pebbles in a row. So the test for the | |
1194 `while' loop is simply: | |
1195 | |
1196 (while (> number-of-pebbles-in-row 0) | |
1197 | |
1198 | |
1199 File: eintr, Node: Dec Example altogether, Prev: Dec Example parts, Up: Decrementing Loop | |
1200 | |
1201 Putting the function definition together | |
1202 ........................................ | |
1203 | |
1204 We can put these expressions together to create a function definition | |
1205 that works. However, on examination, we find that one of the local | |
1206 variables is unneeded! | |
1207 | |
1208 The function definition looks like this: | |
1209 | |
1210 ;;; First subtractive version. | |
1211 (defun triangle (number-of-rows) | |
1212 "Add up the number of pebbles in a triangle." | |
1213 (let ((total 0) | |
1214 (number-of-pebbles-in-row number-of-rows)) | |
1215 (while (> number-of-pebbles-in-row 0) | |
1216 (setq total (+ total number-of-pebbles-in-row)) | |
1217 (setq number-of-pebbles-in-row | |
1218 (1- number-of-pebbles-in-row))) | |
1219 total)) | |
1220 | |
1221 As written, this function works. | |
1222 | |
1223 However, we do not need `number-of-pebbles-in-row'. | |
1224 | |
1225 When the `triangle' function is evaluated, the symbol `number-of-rows' | |
1226 will be bound to a number, giving it an initial value. That number can | |
1227 be changed in the body of the function as if it were a local variable, | |
1228 without any fear that such a change will affect the value of the | |
1229 variable outside of the function. This is a very useful characteristic | |
1230 of Lisp; it means that the variable `number-of-rows' can be used | |
1231 anywhere in the function where `number-of-pebbles-in-row' is used. | |
1232 | |
1233 Here is a second version of the function written a bit more cleanly: | |
1234 | |
1235 (defun triangle (number) ; Second version. | |
1236 "Return sum of numbers 1 through NUMBER inclusive." | |
1237 (let ((total 0)) | |
1238 (while (> number 0) | |
1239 (setq total (+ total number)) | |
1240 (setq number (1- number))) | |
1241 total)) | |
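That changing the argument inside the function leaves the caller's variable alone can be checked with a small sketch. The variable name `rows' is an assumption made for this illustration. The definition repeats the second version so that the example stands alone:

```elisp
(defun triangle (number)                ; Second version, repeated.
  "Return sum of numbers 1 through NUMBER inclusive."
  (let ((total 0))
    (while (> number 0)
      (setq total (+ total number))
      (setq number (1- number)))
    total))

(let ((rows 4))
  ;; Inside `triangle', `number' counts down to 0,
  ;; but `rows' out here still holds 4 afterwards.
  (list (triangle rows) rows))   ; => (10 4)
```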
1242 | |
1243 In brief, a properly written `while' loop will consist of three parts: | |
1244 | |
1245 1. A test that will return false after the loop has repeated itself | |
1246 the correct number of times. | |
1247 | |
1248 2. An expression that, evaluated repeatedly, builds up the desired | |
1249 value. | |
1250 | |
1251 3. An expression to change the value passed to the true-or-false-test | |
1252 so that the test returns false after the loop has repeated itself | |
1253 the right number of times. | |
1254 | |
1255 | |
1256 File: eintr, Node: dolist dotimes, Next: Recursion, Prev: while, Up: Loops & Recursion | |
1257 | |
1258 11.2 Save your time: `dolist' and `dotimes' | |
1259 =========================================== | |
1260 | |
1261 In addition to `while', both `dolist' and `dotimes' provide for | |
1262 looping. Sometimes these are quicker to write than the equivalent | |
1263 `while' loop. Both are Lisp macros. (*Note Macros: (elisp)Macros. ) | |
1264 | |
1265 `dolist' works like a `while' loop that `CDRs down a list': `dolist' | |
1266 automatically shortens the list each time it loops--takes the CDR of | |
1267 the list--and binds the CAR of each shorter version of the list to the | |
1268 first of its arguments. | |
1269 | |
1270 `dotimes' loops a specific number of times: you specify the number. | |
1271 | |
1272 * Menu: | |
1273 | |
1274 * dolist:: | |
1275 * dotimes:: | |
1276 | |
1277 | |
1278 File: eintr, Node: dolist, Next: dotimes, Prev: dolist dotimes, Up: dolist dotimes | |
1279 | |
1280 The `dolist' Macro | |
1281 .................. | |
1282 | |
1283 Suppose, for example, you want to reverse a list, so that "first" | |
1284 "second" "third" becomes "third" "second" "first". | |
1285 | |
1286 In practice, you would use the `reverse' function, like this: | |
1287 | |
1288 (setq animals '(gazelle giraffe lion tiger)) | |
1289 | |
1290 (reverse animals) | |
1291 | |
1292 Here is how you could reverse the list using a `while' loop: | |
1293 | |
1294 (setq animals '(gazelle giraffe lion tiger)) | |
1295 | |
1296 (defun reverse-list-with-while (list) | |
1297 "Using while, reverse the order of LIST." | |
1298 (let (value) ; make sure list starts empty | |
1299 (while list | |
1300 (setq value (cons (car list) value)) | |
1301 (setq list (cdr list))) | |
1302 value)) | |
1303 | |
1304 (reverse-list-with-while animals) | |
1305 | |
1306 And here is how you could use the `dolist' macro: | |
1307 | |
1308 (setq animals '(gazelle giraffe lion tiger)) | |
1309 | |
1310 (defun reverse-list-with-dolist (list) | |
1311 "Using dolist, reverse the order of LIST." | |
1312 (let (value) ; make sure list starts empty | |
1313 (dolist (element list value) | |
1314 (setq value (cons element value))))) | |
1315 | |
1316 (reverse-list-with-dolist animals) | |
1317 | |
1318 In Info, you can place your cursor after the closing parenthesis of | |
1319 each expression and type `C-x C-e'; in each case, you should see | |
1320 | |
1321 (tiger lion giraffe gazelle) | |
1322 | |
1323 in the echo area. | |
1324 | |
1325 For this example, the existing `reverse' function is obviously best. | |
1326 The `while' loop is just like our first example (*note A `while' Loop | |
1327 and a List: Loop Example.). The `while' first checks whether the list | |
1328 has elements; if so, it constructs a new list by adding the first | |
1329 element of the list to the existing list (which in the first iteration | |
1330 of the loop is `nil'). Since the second element is prepended in front | |
1331 of the first element, and the third element is prepended in front of | |
1332 the second element, the list is reversed. | |
1333 | |
1334 In the expression using a `while' loop, the `(setq list (cdr list))' | |
1335 expression shortens the list, so the `while' loop eventually stops. In | |
1336 addition, it provides the `cons' expression with a new first element by | |
1337 creating a new and shorter list at each repetition of the loop. | |
1338 | |
1339 The `dolist' expression does very much the same as the `while' | |
1340 expression, except that the `dolist' macro does some of the work you | |
1341 have to do when writing a `while' expression. | |
1342 | |
1343 Like a `while' loop, a `dolist' loops. What is different is that it | |
1344 automatically shortens the list each time it loops -- it `CDRs down the | |
1345 list' on its own -- and it automatically binds the CAR of each shorter | |
1346 version of the list to the first of its arguments. | |
1347 | |
1348 In the example, the CAR of each shorter version of the list is referred | |
1349 to using the symbol `element', the list itself is called `list', and | |
1350 the value returned is called `value'. The remainder of the `dolist' | |
1351 expression is the body. | |
1352 | |
1353 The `dolist' expression binds the CAR of each shorter version of the | |
1354 list to `element' and then evaluates the body of the expression; and | |
1355 repeats the loop. The result is returned in `value'. | |
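The same pattern works for any job that visits each element once. As a further sketch, here is a function that adds up a list of numbers. The name `sum-list-with-dolist' is hypothetical and chosen only for this illustration:

```elisp
(defun sum-list-with-dolist (numbers)
  "Using dolist, add up the elements of NUMBERS."
  (let ((total 0))                   ; start the sum at zero
    (dolist (number numbers total)   ; bind each element to `number'
      (setq total (+ total number)))))

(sum-list-with-dolist '(1 2 3 4))   ; => 10
(sum-list-with-dolist '())          ; => 0
```

As in the reversing example, the third item in the `dolist' specification, here `total', is what the whole expression returns.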
1356 | |
1357 | |
1358 File: eintr, Node: dotimes, Prev: dolist, Up: dolist dotimes | |
1359 | |
1360 The `dotimes' Macro | |
1361 ................... | |
1362 | |
1363 The `dotimes' macro is similar to `dolist', except that it loops a | |
1364 specific number of times. | |
1365 | |
1366 The first argument to `dotimes' is assigned the numbers 0, 1, 2 and so | |
1367 forth each time around the loop, and the value of the third argument is | |
1368 returned. You need to provide the value of the second argument, which | |
1369 is how many times the macro loops. | |
1370 | |
1371 For example, the following binds the numbers from 0 up to, but not | |
1372 including, the number 3 to the first argument, NUMBER, and then | |
1373 constructs a list of the three numbers. (The first number is 0, the | |
1374 second number is 1, and the third number is 2; this makes a total of | |
1375 three numbers in all, starting with zero as the first number.) | |
1376 | |
1377 (let (value) ; otherwise `value' is a void variable | |
1378 (dotimes (number 3 value) | |
1379 (setq value (cons number value)))) | |
1380 | |
1381 => (2 1 0) | |
1382 | |
1383 `dotimes' returns the value of its third argument, here `value', so the | |
1384 way to use `dotimes' is to evaluate its body a set number of times and | |
1385 then return the result, either as a list or an atom. | |
1386 | |
1387 Here is an example of a `defun' that uses `dotimes' to add up the | |
1388 number of pebbles in a triangle. | |
1389 | |
1390 (defun triangle-using-dotimes (number-of-rows) | |
1391 "Using dotimes, add up the number of pebbles in a triangle." | |
1392 (let ((total 0)) ; otherwise `total' is a void variable | |
1393 (dotimes (number number-of-rows total) | |
1394 (setq total (+ total (1+ number)))))) | |
1395 | |
1396 (triangle-using-dotimes 4) | |
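Evaluating `(triangle-using-dotimes 4)' returns 10. For comparison, here is roughly what the same computation looks like as a hand-written `while' loop. This is a sketch of the idea, not the actual expansion of the macro, and the function name is made up for the illustration:

```elisp
(defun triangle-using-while-from-zero (number-of-rows)
  "Add up the pebbles with a counter that starts at 0, as `dotimes' does."
  (let ((total 0)
        (number 0))
    (while (< number number-of-rows)          ; loops NUMBER-OF-ROWS times
      (setq total (+ total (1+ number)))      ; the first row has index 0
      (setq number (1+ number)))
    total))

(triangle-using-while-from-zero 4)   ; => 10
```

Because the counter starts at 0 rather than 1, the body adds `(1+ number)' to the total, just as the `dotimes' version does.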
1397 | |
1398 | |
1399 File: eintr, Node: Recursion, Next: Looping exercise, Prev: dolist dotimes, Up: Loops & Recursion | |
1400 | |
1401 11.3 Recursion | |
1402 ============== | |
1403 | |
1404 A recursive function contains code that tells the Lisp interpreter to | |
1405 call a program that runs exactly like itself, but with slightly | |
1406 different arguments. The code runs exactly the same because it has the | |
1407 same name. However, even though the program has the same name, it is | |
1408 not the same entity. It is different. In the jargon, it is a | |
1409 different `instance'. | |
1410 | |
1411 Eventually, if the program is written correctly, the `slightly | |
1412 different arguments' will become sufficiently different from the first | |
1413 arguments that the final instance will stop. | |
1414 | |
1415 * Menu: | |
1416 | |
1417 * Building Robots:: | |
1418 * Recursive Definition Parts:: | |
1419 * Recursion with list:: | |
1420 * Recursive triangle function:: | |
1421 * Recursion with cond:: | |
1422 * Recursive Patterns:: | |
1423 * No Deferment:: | |
1424 * No deferment solution:: | |
1425 | |
1426 | |
1427 File: eintr, Node: Building Robots, Next: Recursive Definition Parts, Prev: Recursion, Up: Recursion | |
1428 | |
1429 11.3.1 Building Robots: Extending the Metaphor | |
1430 ---------------------------------------------- | |
1431 | |
1432 It is sometimes helpful to think of a running program as a robot that | |
1433 does a job. In doing its job, a recursive function calls on a second | |
1434 robot to help it. The second robot is identical to the first in every | |
1435 way, except that the second robot helps the first and has been passed | |
1436 different arguments than the first. | |
1437 | |
1438 In a recursive function, the second robot may call a third; and the | |
1439 third may call a fourth, and so on. Each of these is a different | |
1440 entity; but all are clones. | |
1441 | |
1442 Since each robot has slightly different instructions--the arguments | |
1443 will differ from one robot to the next--the last robot should know when | |
1444 to stop. | |
1445 | |
1446 Let's expand on the metaphor in which a computer program is a robot. | |
1447 | |
1448 A function definition provides the blueprints for a robot. When you | |
1449 install a function definition, that is, when you evaluate a `defun' | |
1450 special form, you install the necessary equipment to build robots. It | |
1451 is as if you were in a factory, setting up an assembly line. Robots | |
1452 with the same name are built according to the same blueprints. So they | |
1453 have, as it were, the same `model number', but a different `serial | |
1454 number'. | |
1455 | |
1456 We often say that a recursive function `calls itself'. What we mean is | |
1457 that the instructions in a recursive function cause the Lisp | |
1458 interpreter to run a different function that has the same name and does | |
1459 the same job as the first, but with different arguments. | |
1460 | |
1461 It is important that the arguments differ from one instance to the | |
1462 next; otherwise, the process will never stop. | |
1463 | |
1464 | |
1465 File: eintr, Node: Recursive Definition Parts, Next: Recursion with list, Prev: Building Robots, Up: Recursion | |
1466 | |
1467 11.3.2 The Parts of a Recursive Definition | |
1468 ------------------------------------------ | |
1469 | |
1470 A recursive function typically contains a conditional expression which | |
1471 has three parts: | |
1472 | |
1473 1. A true-or-false-test that determines whether the function is called | |
1474 again, here called the "do-again-test". | |
1475 | |
1476 2. The name of the function. When this name is called, a new | |
1477 instance of the function--a new robot, as it were--is created and | |
1478 told what to do. | |
1479 | |
1480 3. An expression that returns a different value each time the | |
1481 function is called, here called the "next-step-expression". | |
1482 Consequently, the argument (or arguments) passed to the new | |
1483 instance of the function will be different from that passed to the | |
1484 previous instance. This causes the conditional expression, the | |
1485 "do-again-test", to test false after the correct number of | |
1486 repetitions. | |
1487 | |
1488 Recursive functions can be much simpler than any other kind of | |
1489 function. Indeed, when people first start to use them, they often look | |
1490 so mysteriously simple as to be incomprehensible. Like riding a | |
1491 bicycle, reading a recursive function definition takes a certain knack | |
1492 which is hard at first but then seems simple. | |
1493 | |
1494 There are several different common recursive patterns. A very simple | |
1495 pattern looks like this: | |
1496 | |
1497 (defun NAME-OF-RECURSIVE-FUNCTION (ARGUMENT-LIST) | |
1498 "DOCUMENTATION..." | |
1499 (if DO-AGAIN-TEST | |
1500 BODY... | |
1501 (NAME-OF-RECURSIVE-FUNCTION | |
1502 NEXT-STEP-EXPRESSION))) | |
1503 | |
1504 Each time a recursive function is evaluated, a new instance of it is | |
1505 created and told what to do. The arguments tell the instance what to | |
1506 do. | |
1507 | |
1508 An argument is bound to the value of the next-step-expression. Each | |
1509 instance runs with a different value of the next-step-expression. | |
1510 | |
1511 The value in the next-step-expression is used in the do-again-test. | |
1512 | |
1513 The value returned by the next-step-expression is passed to the new | |
1514 instance of the function, which evaluates it (or some | |
1515 transmogrification of it) to determine whether to continue or stop. | |
1516 The next-step-expression is designed so that the do-again-test returns | |
1517 false when the function should no longer be repeated. | |
1518 | |
1519 The do-again-test is sometimes called the "stop condition", since it | |
1520 stops the repetitions when it tests false. | |
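
To make the template concrete, here is a minimal sketch that fills in each part. The function name `count-down-recursively' is hypothetical and is used only for illustration; it is not part of Emacs.

```elisp
(defun count-down-recursively (number)
  "Return a list of the integers from NUMBER down to 1.
Hypothetical example; uses recursion."
  (if (> number 0)                      ; do-again-test
      (cons number                      ; body
            (count-down-recursively     ; recursive call
             (1- number)))              ; next-step-expression
    nil))                               ; stop: nothing left to count

(count-down-recursively 3)
     ;; => (3 2 1)
```

Each instance receives a smaller argument than the one before it, so the do-again-test eventually tests false and the repetitions stop.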
1521 | |
1522 | |
1523 File: eintr, Node: Recursion with list, Next: Recursive triangle function, Prev: Recursive Definition Parts, Up: Recursion | |
1524 | |
1525 11.3.3 Recursion with a List | |
1526 ---------------------------- | |
1527 | |
1528 The example of a `while' loop that printed the elements of a list of | |
1529 animals can be written recursively. Here is the code, including an | |
1530 expression to set the value of the variable `animals' to a list. | |
1531 | |
1532 If you are using GNU Emacs 20 or before, this example must be copied to | |
1533 the `*scratch*' buffer and each expression must be evaluated there. | |
1534 Use `C-u C-x C-e' to evaluate the `(print-elements-recursively | |
1535 animals)' expression so that the results are printed in the buffer; | |
1536 otherwise the Lisp interpreter will try to squeeze the results into the | |
1537 one line of the echo area. | |
1538 | |
1539 Also, place your cursor immediately after the last closing parenthesis | |
1540 of the `print-elements-recursively' function, before the comment. | |
1541 Otherwise, the Lisp interpreter will try to evaluate the comment. | |
1542 | |
1543 If you are using a more recent version, you can evaluate this | |
1544 expression directly in Info. | |
1545 | |
1546 (setq animals '(gazelle giraffe lion tiger)) | |
1547 | |
1548 (defun print-elements-recursively (list) | |
1549 "Print each element of LIST on a line of its own. | |
1550 Uses recursion." | |
1551 (if list ; do-again-test | |
1552 (progn | |
1553 (print (car list)) ; body | |
1554 (print-elements-recursively ; recursive call | |
1555 (cdr list))))) ; next-step-expression | |
1556 | |
1557 (print-elements-recursively animals) | |
1558 | |
1559 The `print-elements-recursively' function first tests whether there is | |
1560 any content in the list; if there is, the function prints the first | |
1561 element of the list, the CAR of the list. Then the function `invokes | |
1562 itself', but gives itself as its argument, not the whole list, but the | |
1563 second and subsequent elements of the list, the CDR of the list. | |
1564 | |
1565 Put another way, if the list is not empty, the function invokes another | |
1566 instance of code that is similar to the initial code, but is a | |
1567 different thread of execution, with different arguments than the first | |
1568 instance. | |
1569 | |
1570 Put in yet another way, if the list is not empty, the first robot | |
1571 assembles a second robot and tells it what to do; the second robot is | |
1572 a different individual from the first, but is the same model. | |
1573 | |
1574 When the second evaluation occurs, the `if' expression is evaluated and | |
1575 if true, prints the first element of the list it receives as its | |
1576 argument (which is the second element of the original list). Then the | |
1577 function `calls itself' with the CDR of the list it is invoked with, | |
1578 which (the second time around) is the CDR of the CDR of the original | |
1579 list. | |
1580 | |
1581 Note that although we say that the function `calls itself', what we | |
1582 mean is that the Lisp interpreter assembles and instructs a new | |
1583 instance of the program. The new instance is a clone of the first, but | |
1584 is a separate individual. | |
1585 | |
1586 Each time the function `invokes itself', it invokes itself on a shorter | |
1587 version of the original list. It creates a new instance that works on | |
1588 a shorter list. | |
1589 | |
1590 Eventually, the function invokes itself on an empty list. It creates a | |
1591 new instance whose argument is `nil'. The conditional expression tests | |
1592 the value of `list'. Since the value of `list' is `nil', the `if' | |
1593 expression tests false so the then-part is not evaluated. The function | |
1594 as a whole then returns `nil'. | |
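
One way to watch the list shrink is to collect the argument that each instance receives instead of printing an element. The sketch below assumes a hypothetical function name, `arguments-seen-recursively'; it follows the same shape as `print-elements-recursively'.

```elisp
(defun arguments-seen-recursively (elements)
  "Return a list of the argument received by each successive instance.
Hypothetical example; ELEMENTS is any list."
  (if elements                                      ; do-again-test
      (cons elements                                ; record this argument
            (arguments-seen-recursively             ; recursive call
             (cdr elements)))                       ; next-step-expression
    (list nil)))                ; the final instance receives nil

(arguments-seen-recursively '(gazelle giraffe lion))
     ;; => ((gazelle giraffe lion) (giraffe lion) (lion) nil)
```

The last element of the result is `nil': that is the empty list passed to the final instance, the one whose `if' expression tests false.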
1595 | |
1596 When you evaluate `(print-elements-recursively animals)' in the | |
1597 `*scratch*' buffer, you see this result: | |
1598 | |
1599 gazelle | |
1600 | |
1601 giraffe | |
1602 | |
1603 lion | |
1604 | |
1605 tiger | |
1606 nil | |
1607 | |
1608 | |
1609 File: eintr, Node: Recursive triangle function, Next: Recursion with cond, Prev: Recursion with list, Up: Recursion | |
1610 | |
1611 11.3.4 Recursion in Place of a Counter | |
1612 -------------------------------------- | |
1613 | |
1614 The `triangle' function described in a previous section can also be | |
1615 written recursively. It looks like this: | |
1616 | |
1617 (defun triangle-recursively (number) | |
1618 "Return the sum of the numbers 1 through NUMBER inclusive. | |
1619 Uses recursion." | |
1620 (if (= number 1) ; do-again-test | |
1621 1 ; then-part | |
1622 (+ number ; else-part | |
1623 (triangle-recursively ; recursive call | |
1624 (1- number))))) ; next-step-expression | |
1625 | |
1626 (triangle-recursively 7) | |
1627 | |
1628 You can install this function by evaluating it and then try it by | |
1629 evaluating `(triangle-recursively 7)'. (Remember to put your cursor | |
1630 immediately after the last parenthesis of the function definition, | |
1631 before the comment.) The function evaluates to 28. | |
1632 | |
1633 To understand how this function works, let's consider what happens in | |
1634 the various cases when the function is passed 1, 2, 3, or 4 as the | |
1635 value of its argument. | |
1636 | |
1637 * Menu: | |
1638 | |
1639 * Recursive Example arg of 1 or 2:: | |
1640 * Recursive Example arg of 3 or 4:: | |
1641 | |
1642 | |
1643 File: eintr, Node: Recursive Example arg of 1 or 2, Next: Recursive Example arg of 3 or 4, Prev: Recursive triangle function, Up: Recursive triangle function | |
1644 | |
1645 An argument of 1 or 2 | |
1646 ..................... | |
1647 | |
1648 First, what happens if the value of the argument is 1? | |
1649 | |
1650 The function has an `if' expression after the documentation string. It | |
1651 tests whether the value of `number' is equal to 1; if so, Emacs | |
1652 evaluates the then-part of the `if' expression, which returns the | |
1653 number 1 as the value of the function. (A triangle with one row has | |
1654 one pebble in it.) | |
1655 | |
1656 Suppose, however, that the value of the argument is 2. In this case, | |
1657 Emacs evaluates the else-part of the `if' expression. | |
1658 | |
1659 The else-part consists of an addition, the recursive call to | |
1660 `triangle-recursively' and a decrementing action; and it looks like | |
1661 this: | |
1662 | |
1663 (+ number (triangle-recursively (1- number))) | |
1664 | |
1665 When Emacs evaluates this expression, the innermost expression is | |
1666 evaluated first; then the other parts in sequence. Here are the steps | |
1667 in detail: | |
1668 | |
1669 Step 1 Evaluate the innermost expression. | |
1670 The innermost expression is `(1- number)', which returns 1, one less | |
1671 than the value of `number', 2. (The variable itself is unchanged.) | |
1672 | |
1673 Step 2 Evaluate the `triangle-recursively' function. | |
1674 The Lisp interpreter creates an individual instance of | |
1675 `triangle-recursively'. It does not matter that this function is | |
1676 contained within itself. Emacs passes the result of Step 1 as the | |
1677 argument used by this instance of the `triangle-recursively' | |
1678 function. | |
1679 | |
1680 In this case, Emacs evaluates `triangle-recursively' with an | |
1681 argument of 1. This means that this evaluation of | |
1682 `triangle-recursively' returns 1. | |
1683 | |
1684 Step 3 Evaluate the value of `number'. | |
1685 The variable `number' is the second element of the list that | |
1686 starts with `+'; its value is 2. | |
1687 | |
1688 Step 4 Evaluate the `+' expression. | |
1689 The `+' expression receives two arguments, the first from the | |
1690 evaluation of `number' (Step 3) and the second from the evaluation | |
1691 of `triangle-recursively' (Step 2). | |
1692 | |
1693 The result of the addition is the sum of 2 plus 1, and the number | |
1694 3 is returned, which is correct. A triangle with two rows has | |
1695 three pebbles in it. | |
1696 | |
1697 | |
1698 File: eintr, Node: Recursive Example arg of 3 or 4, Prev: Recursive Example arg of 1 or 2, Up: Recursive triangle function | |
1699 | |
1700 An argument of 3 or 4 | |
1701 ..................... | |
1702 | |
1703 Suppose that `triangle-recursively' is called with an argument of 3. | |
1704 | |
1705 Step 1 Evaluate the do-again-test. | |
1706 The `if' expression is evaluated first. This is the do-again-test | |
1707 and returns false, so the else-part of the `if' expression is | |
1708 evaluated. (Note that in this example, the do-again-test causes | |
1709 the function to call itself when it tests false, not when it tests | |
1710 true.) | |
1711 | |
1712 Step 2 Evaluate the innermost expression of the else-part. | |
1713 The innermost expression of the else-part is evaluated, which | |
1714 decrements 3 to 2. This is the next-step-expression. | |
1715 | |
1716 Step 3 Evaluate the `triangle-recursively' function. | |
1717 The number 2 is passed to the `triangle-recursively' function. | |
1718 | |
1719 We know what happens when Emacs evaluates `triangle-recursively' | |
1720 with an argument of 2. After going through the sequence of | |
1721 actions described earlier, it returns a value of 3. So that is | |
1722 what will happen here. | |
1723 | |
1724 Step 4 Evaluate the addition. | |
1725 3 will be passed as an argument to the addition and will be added | |
1726 to the number with which the function was called, which is 3. | |
1727 | |
1728 The value returned by the function as a whole will be 6. | |
1729 | |
1730 Now that we know what will happen when `triangle-recursively' is called | |
1731 with an argument of 3, it is evident what will happen if it is called | |
1732 with an argument of 4: | |
1733 | |
1734 In the recursive call, the evaluation of | |
1735 | |
1736 (triangle-recursively (1- 4)) | |
1737 | |
1738 will return the value of evaluating | |
1739 | |
1740 (triangle-recursively 3) | |
1741 | |
1742 which is 6 and this value will be added to 4 by the addition in the | |
1743 else-part. | |
1744 | |
1745 The value returned by the function as a whole will be 10. | |
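
The four cases just described can be checked in one step. The sketch below repeats the definition of `triangle-recursively' and applies it to each argument in turn with `mapcar'.

```elisp
(defun triangle-recursively (number)
  "Return the sum of the numbers 1 through NUMBER inclusive.
Uses recursion."
  (if (= number 1)                    ; do-again-test
      1                               ; then-part
    (+ number                         ; else-part
       (triangle-recursively          ; recursive call
        (1- number)))))               ; next-step-expression

;; Call the function once for each of the arguments 1, 2, 3 and 4.
(mapcar #'triangle-recursively '(1 2 3 4))
     ;; => (1 3 6 10)
```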
1746 | |
1747 Each time `triangle-recursively' is evaluated, it evaluates a version | |
1748 of itself--a different instance of itself--with a smaller argument, | |
1749 until the argument is small enough so that it does not evaluate itself. | |
1750 | |
1751 Note that this particular design for a recursive function requires that | |
1752 operations be deferred. | |
1753 | |
1754 Before `(triangle-recursively 7)' can calculate its answer, it must | |
1755 call `(triangle-recursively 6)'; and before `(triangle-recursively 6)' | |
1756 can calculate its answer, it must call `(triangle-recursively 5)'; and | |
1757 so on. That is to say, the calculation that `(triangle-recursively 7)' | |
1758 makes must be deferred until `(triangle-recursively 6)' makes its | |
1759 calculation; and `(triangle-recursively 6)' must defer until | |
1760 `(triangle-recursively 5)' completes; and so on. | |
1761 | |
1762 If each of these instances of `triangle-recursively' is thought of as | |
1763 a different robot, the first robot must wait for the second to complete | |
1764 its job, which must wait until the third completes, and so on. | |
1765 | |
1766 There is a way around this kind of waiting, which we will discuss in | |
1767 *Note Recursion without Deferments: No Deferment. | |
1768 | |
1769 | |
1770 File: eintr, Node: Recursion with cond, Next: Recursive Patterns, Prev: Recursive triangle function, Up: Recursion | |
1771 | |
1772 11.3.5 Recursion Example Using `cond' | |
1773 ------------------------------------- | |
1774 | |
1775 The version of `triangle-recursively' described earlier is written with | |
1776 the `if' special form. It can also be written using another special | |
1777 form called `cond'. The name of the special form `cond' is an | |
1778 abbreviation of the word `conditional'. | |
1779 | |
1780 Although the `cond' special form is not used as often in the Emacs Lisp | |
1781 sources as `if', it is used often enough to justify explaining it. | |
1782 | |
1783 The template for a `cond' expression looks like this: | |
1784 | |
1785 (cond | |
1786 BODY...) | |
1787 | |
1788 where the BODY is a series of lists. | |
1789 | |
1790 Written out more fully, the template looks like this: | |
1791 | |
1792 (cond | |
1793 (FIRST-TRUE-OR-FALSE-TEST FIRST-CONSEQUENT) | |
1794 (SECOND-TRUE-OR-FALSE-TEST SECOND-CONSEQUENT) | |
1795 (THIRD-TRUE-OR-FALSE-TEST THIRD-CONSEQUENT) | |
1796 ...) | |
1797 | |
1798 When the Lisp interpreter evaluates the `cond' expression, it evaluates | |
1799 the first element (the CAR or true-or-false-test) of the first | |
1800 expression in a series of expressions within the body of the `cond'. | |
1801 | |
1802 If the true-or-false-test returns `nil' the rest of that expression, | |
1803 the consequent, is skipped and the true-or-false-test of the next | |
1804 expression is evaluated. When an expression is found whose | |
1805 true-or-false-test returns a value that is not `nil', the consequent of | |
1806 that expression is evaluated. The consequent can be one or more | |
1807 expressions. If the consequent consists of more than one expression, | |
1808 the expressions are evaluated in sequence and the value of the last one | |
1809 is returned. If the expression does not have a consequent, the value | |
1810 of the true-or-false-test is returned. | |
1811 | |
1812 If none of the true-or-false-tests test true, the `cond' expression | |
1813 returns `nil'. | |
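
The three rules above can be seen in a few small expressions. These are illustrative sketches only; the symbols `skipped', `first', `second' and `never' are arbitrary.

```elisp
;; The first test returns nil, so its consequent is skipped.  The second
;; consequent has two expressions; the value of the last one is returned.
(cond (nil 'skipped)
      ((> 2 1) 'first 'second))
     ;; => second

;; A clause with no consequent returns the value of its test.
(cond ((member 'b '(a b c))))
     ;; => (b c)

;; None of the tests is true, so `cond' returns nil.
(cond ((> 1 2) 'never))
     ;; => nil
```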
1814 | |
1815 Written using `cond', the `triangle' function looks like this: | |
1816 | |
1817 (defun triangle-using-cond (number) | |
1818 (cond ((<= number 0) 0) | |
1819 ((= number 1) 1) | |
1820 ((> number 1) | |
1821 (+ number (triangle-using-cond (1- number)))))) | |
1822 | |
1823 In this example, the `cond' returns 0 if the number is less than or | |
1824 equal to 0, it returns 1 if the number is 1 and it evaluates `(+ number | |
1825 (triangle-using-cond (1- number)))' if the number is greater than 1. | |
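
Here is the same definition again, followed by a call for each of the three clauses. Note the first clause: with an argument of zero or less, the `if' version shown earlier would never reach its stop condition, whereas this version returns 0 at once.

```elisp
(defun triangle-using-cond (number)
  "Return the sum of the numbers 1 through NUMBER inclusive.
Return 0 if NUMBER is zero or negative."
  (cond ((<= number 0) 0)
        ((= number 1) 1)
        ((> number 1)
         (+ number (triangle-using-cond (1- number))))))

(triangle-using-cond 7)
     ;; => 28

(triangle-using-cond 1)
     ;; => 1

(triangle-using-cond -3)
     ;; => 0
```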
1826 | |
1827 | |
1828 File: eintr, Node: Recursive Patterns, Next: No Deferment, Prev: Recursion with cond, Up: Recursion | |
1829 | |
1830 11.3.6 Recursive Patterns | |
1831 ------------------------- | |
1832 | |
1833 Here are three common recursive patterns. Each involves a list. | |
1834 Recursion does not need to involve lists, but Lisp is designed for lists | |
1835 and this provides a sense of its primal capabilities. | |
1836 | |
1837 * Menu: | |
1838 | |
1839 * Every:: | |
1840 * Accumulate:: | |
1841 * Keep:: | |
1842 | |
1843 | |
1844 File: eintr, Node: Every, Next: Accumulate, Prev: Recursive Patterns, Up: Recursive Patterns | |
1845 | |
1846 Recursive Pattern: _every_ | |
1847 .......................... | |
1848 | |
1849 In the `every' recursive pattern, an action is performed on every | |
1850 element of a list. | |
1851 | |
1852 The basic pattern is: | |
1853 | |
1854 * If a list be empty, return `nil'. | |
1855 | |
1856 * Else, act on the beginning of the list (the CAR of the list) | |
1857 - through a recursive call by the function on the rest (the | |
1858 CDR) of the list, | |
1859 | |
1860 - and, optionally, combine the acted-on element, using | |
1861 `cons', with the results of acting on the rest. | |
1862 | |
1863 Here is an example: | |
1864 | |
1865 (defun square-each (numbers-list) | |
1866 "Square each of a NUMBERS-LIST, recursively." | |
1867 (if (not numbers-list) ; do-again-test | |
1868 nil | |
1869 (cons | |
1870 (* (car numbers-list) (car numbers-list)) | |
1871 (square-each (cdr numbers-list))))) ; next-step-expression | |
1872 | |
1873 (square-each '(1 2 3)) | |
1874 => (1 4 9) | |
1875 | |
1876 If `numbers-list' is empty, do nothing. But if it has content, | |
1877 construct a list combining the square of the first number in the list | |
1878 with the result of the recursive call. | |
1879 | |
1880 (The example follows the pattern exactly: `nil' is returned if the | |
1881 list of numbers is empty. In practice, you would write the conditional | |
1882 so it carries out the action when the list of numbers is not empty.) | |
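
A sketch of that more practical form follows. The name `square-each-if-content' is hypothetical; the only change from `square-each' is that the test is true when there is work to do, and the `if' has no else-part, so it returns `nil' for an empty list.

```elisp
(defun square-each-if-content (numbers-list)
  "Square each number in NUMBERS-LIST, recursively.
Hypothetical variant of `square-each' that tests for content."
  (if numbers-list                                  ; do-again-test
      (cons
       (* (car numbers-list) (car numbers-list))
       (square-each-if-content                      ; recursive call
        (cdr numbers-list)))))                      ; next-step-expression

(square-each-if-content '(1 2 3))
     ;; => (1 4 9)
```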
1883 | |
1884 The `print-elements-recursively' function (*note Recursion with a List: | |
1885 Recursion with list.) is another example of an `every' pattern, except | |
1886 in this case, rather than bring the results together using `cons', we | |
1887 print each element of output. | |
1888 | |
1889 The `print-elements-recursively' function looks like this: | |
1890 | |
1891 (setq animals '(gazelle giraffe lion tiger)) | |
1892 | |
1893 (defun print-elements-recursively (list) | |
1894 "Print each element of LIST on a line of its own. | |
1895 Uses recursion." | |
1896 (if list ; do-again-test | |
1897 (progn | |
1898 (print (car list)) ; body | |
1899 (print-elements-recursively ; recursive call | |
1900 (cdr list))))) ; next-step-expression | |
1901 | |
1902 (print-elements-recursively animals) | |
1903 | |
1904 The pattern for `print-elements-recursively' is: | |
1905 | |
1906 * If the list be empty, do nothing. | |
1907 | |
1908 * But if the list has at least one element, | |
1909 - act on the beginning of the list (the CAR of the list), | |
1910 | |
1911 - and make a recursive call on the rest (the CDR) of the | |
1912 list. | |
1913 | |
1914 | |
1915 File: eintr, Node: Accumulate, Next: Keep, Prev: Every, Up: Recursive Patterns | |
1916 | |
1917 Recursive Pattern: _accumulate_ | |
1918 ............................... | |
1919 | |
1920 Another recursive pattern is called the `accumulate' pattern. In the | |
1921 `accumulate' recursive pattern, an action is performed on every element | |
1922 of a list and the result of that action is accumulated with the results | |
1923 of performing the action on the other elements. | |
1924 | |
1925 This is very like the `every' pattern using `cons', except that `cons' | |
1926 is not used, but some other combiner. | |
1927 | |
1928 The pattern is: | |
1929 | |
1930 * If a list be empty, return zero or some other constant. | |
1931 | |
1932 * Else, act on the beginning of the list (the CAR of the list), | |
1933 - and combine that acted-on element, using `+' or some | |
1934 other combining function, with | |
1935 | |
1936 - a recursive call by the function on the rest (the CDR) of | |
1937 the list. | |
1938 | |
1939 Here is an example: | |
1940 | |
1941 (defun add-elements (numbers-list) | |
1942 "Add the elements of NUMBERS-LIST together." | |
1943 (if (not numbers-list) | |
1944 0 | |
1945 (+ (car numbers-list) (add-elements (cdr numbers-list))))) | |
1946 | |
1947 (add-elements '(1 2 3 4)) | |
1948 => 10 | |
1949 | |
1950 *Note Making a List of Files: Files List, for an example of the | |
1951 accumulate pattern. | |
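
The combiner need not be `+', and the constant need not be zero. As a sketch, the hypothetical function `multiply-elements' below uses `*' as the combiner and returns 1 for the empty list, so that the final multiplication leaves the product unchanged.

```elisp
(defun multiply-elements (numbers-list)
  "Multiply the elements of NUMBERS-LIST together.
Hypothetical example of the accumulate pattern."
  (if (not numbers-list)
      1                                 ; constant for the empty list
    (* (car numbers-list)               ; combine with `*'
       (multiply-elements (cdr numbers-list)))))

(multiply-elements '(1 2 3 4))
     ;; => 24
```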
1952 | |
1953 | |
1954 File: eintr, Node: Keep, Prev: Accumulate, Up: Recursive Patterns | |
1955 | |
1956 Recursive Pattern: _keep_ | |
1957 ......................... | |
1958 | |
1959 A third recursive pattern is called the `keep' pattern. In the `keep' | |
1960 recursive pattern, each element of a list is tested; the element is | |
1961 acted on and the results are kept only if the element meets a criterion. | |
1962 | |
1963 Again, this is very like the `every' pattern, except the element is | |
1964 skipped unless it meets a criterion. | |
1965 | |
1966 The pattern has three parts: | |
1967 | |
1968 * If a list be empty, return `nil'. | |
1969 | |
1970 * Else, if the beginning of the list (the CAR of the list) passes | |
1971 a test | |
1972 - act on that element and combine it, using `cons' with | |
1973 | |
1974 - a recursive call by the function on the rest (the CDR) of | |
1975 the list. | |
1976 | |
1977 * Otherwise, if the beginning of the list (the CAR of the list) fails | |
1978 the test | |
1979 - skip that element, | |
1980 | |
1981 - and, recursively call the function on the rest (the CDR) | |
1982 of the list. | |
1983 | |
1984 Here is an example that uses `cond': | |
1985 | |
1986 (defun keep-three-letter-words (word-list) | |
1987 "Keep three letter words in WORD-LIST." | |
1988 (cond | |
1989 ;; First do-again-test: stop-condition | |
1990 ((not word-list) nil) | |
1991 | |
1992 ;; Second do-again-test: when to act | |
1993 ((eq 3 (length (symbol-name (car word-list)))) | |
1994 ;; combine acted-on element with recursive call on shorter list | |
1995 (cons (car word-list) (keep-three-letter-words (cdr word-list)))) | |
1996 | |
1997 ;; Third do-again-test: when to skip element; | |
1998 ;; recursively call shorter list with next-step expression | |
1999 (t (keep-three-letter-words (cdr word-list))))) | |
2000 | |
2001 (keep-three-letter-words '(one two three four five six)) | |
2002 => (one two six) | |
2003 | |
2004 It goes without saying that you need not use `nil' as the test for when | |
2005 to stop; and you can, of course, combine these patterns. | |
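
As one sketch of combining patterns, the hypothetical function `add-odd-elements' below joins `keep' and `accumulate': it tests each element, and adds the element to the running sum only if it is odd.

```elisp
(defun add-odd-elements (numbers-list)
  "Add together the odd numbers in NUMBERS-LIST.
Hypothetical example combining the keep and accumulate patterns."
  (cond
   ;; Stop condition: an empty list contributes zero.
   ((not numbers-list) 0)

   ;; Keep: the element is odd, so combine it with the rest using `+'.
   ((/= 0 (% (car numbers-list) 2))
    (+ (car numbers-list) (add-odd-elements (cdr numbers-list))))

   ;; Skip: the element is even, so go on to the rest of the list.
   (t (add-odd-elements (cdr numbers-list)))))

(add-odd-elements '(1 2 3 4 5))
     ;; => 9
```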
2006 | |
2007 | |
2008 File: eintr, Node: No Deferment, Next: No deferment solution, Prev: Recursive Patterns, Up: Recursion | |
2009 | |
2010 11.3.7 Recursion without Deferments | |
2011 ----------------------------------- | |
2012 | |
2013 Let's consider again what happens with the `triangle-recursively' | |
2014 function. We will find that the intermediate calculations are deferred | |
2015 until all can be done. | |
2016 | |
2017 Here is the function definition: | |
2018 | |
2019 (defun triangle-recursively (number) | |
2020 "Return the sum of the numbers 1 through NUMBER inclusive. | |
2021 Uses recursion." | |
2022 (if (= number 1) ; do-again-test | |
2023 1 ; then-part | |
2024 (+ number ; else-part | |
2025 (triangle-recursively ; recursive call | |
2026 (1- number))))) ; next-step-expression | |
2027 | |
2028 What happens when we call this function with an argument of 7? | |
2029 | |
2030 The first instance of the `triangle-recursively' function adds the | |
2031 number 7 to the value returned by a second instance of | |
2032 `triangle-recursively', an instance that has been passed an argument of | |
2033 6. That is to say, the first calculation is: | |
2034 | |
2035 (+ 7 (triangle-recursively 6)) | |
2036 | |
2037 The first instance of `triangle-recursively'--you may want to think of | |
2038 it as a little robot--cannot complete its job. It must hand off the | |
2039 calculation for `(triangle-recursively 6)' to a second instance of the | |
2040 program, to a second robot. This second individual is completely | |
2041 different from the first one; it is, in the jargon, a `different | |
2042 instantiation'. Or, put another way, it is a different robot. It is | |
2043 the same model as the first; it calculates triangle numbers | |
2044 recursively; but it has a different serial number. | |
2045 | |
2046 And what does `(triangle-recursively 6)' return? It returns the number | |
2047 6 added to the value returned by evaluating `triangle-recursively' with | |
2048 an argument of 5. Using the robot metaphor, it asks yet another robot | |
2049 to help it. | |
2050 | |
2051 Now the total is: | |
2052 | |
2053 (+ 7 6 (triangle-recursively 5)) | |
2054 | |
2055 And what happens next? | |
2056 | |
2057 (+ 7 6 5 (triangle-recursively 4)) | |
2058 | |
2059 Each time `triangle-recursively' is called, except for the last time, | |
2060 it creates another instance of the program--another robot--and asks it | |
2061 to make a calculation. | |
2062 | |
2063 Eventually, the full addition is set up and performed: | |
2064 | |
2065 (+ 7 6 5 4 3 2 1) | |
2066 | |
2067 This design for the function defers the calculation of the first step | |
2068 until the second can be done, and defers that until the third can be | |
2069 done, and so on. Each deferment means the computer must remember what | |
2070 is being waited on. This is not a problem when there are only a few | |
2071 steps, as in this example. But it can be a problem when there are more | |
2072 steps. | |
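
The flat additions shown above are a shorthand. Since each instance performs its own addition, the deferred work is really a nest of additions, one inside the other, and the outermost cannot finish until every inner one has returned. The sketch below writes out both forms; they produce the same value.

```elisp
;; The additions as they are actually deferred, one per instance:
(+ 7 (+ 6 (+ 5 (+ 4 (+ 3 (+ 2 1))))))
     ;; => 28

;; The flattened shorthand used in the text:
(+ 7 6 5 4 3 2 1)
     ;; => 28
```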
2073 | |
2074 | |
2075 File: eintr, Node: No deferment solution, Prev: No Deferment, Up: Recursion | |
2076 | |
2077 11.3.8 No Deferment Solution | |
2078 ---------------------------- | |
2079 | |
2080 The solution to the problem of deferred operations is to write in a | |
2081 manner that does not defer operations(1). This requires writing to a | |
2082 different pattern, often one that involves writing two function | |
2083 definitions, an `initialization' function and a `helper' function. | |
2084 | |
2085 The `initialization' function sets up the job; the `helper' function | |
2086 does the work. | |
2087 | |
2088 Here are the two function definitions for adding up numbers. They are | |
2089 so simple, I find them hard to understand. | |
2090 | |
2091 (defun triangle-initialization (number) | |
2092 "Return the sum of the numbers 1 through NUMBER inclusive. | |
2093 This is the `initialization' component of a two function | |
2094 duo that uses recursion." | |
2095 (triangle-recursive-helper 0 0 number)) | |
2096 | |
2097 (defun triangle-recursive-helper (sum counter number) | |
2098 "Return SUM, using COUNTER, through NUMBER inclusive. | |
2099 This is the `helper' component of a two function duo | |
2100 that uses recursion." | |
2101 (if (> counter number) | |
2102 sum | |
2103 (triangle-recursive-helper (+ sum counter) ; sum | |
2104 (1+ counter) ; counter | |
2105 number))) ; number | |
2106 | |
2107 Install both function definitions by evaluating them, then call | |
2108 `triangle-initialization' with 2 rows: | |
2109 | |
2110 (triangle-initialization 2) | |
2111 => 3 | |
2112 | |
2113 The `initialization' function calls the first instance of the `helper' | |
2114 function with three arguments: zero, zero, and a number which is the | |
2115 number of rows in the triangle. | |
2116 | |
2117 The first two arguments passed to the `helper' function are | |
2118 initialization values. These values are changed when | |
2119 `triangle-recursive-helper' invokes new instances.(2) | |
2120 | |
2121 Let's see what happens when we have a triangle that has one row. (This | |
2122 triangle will have one pebble in it!) | |
2123 | |
2124 `triangle-initialization' will call its helper with the arguments | |
2125 `0 0 1'. That function will run the conditional test whether `(> | |
2126 counter number)': | |
2127 | |
2128 (> 0 1) | |
2129 | |
2130 and find that the result is false, so it will invoke the else-part of | |
2131 the `if' clause: | |
2132 | |
2133 (triangle-recursive-helper | |
2134 (+ sum counter) ; sum plus counter => sum | |
2135 (1+ counter) ; increment counter => counter | |
2136 number) ; number stays the same | |
2137 | |
2138 which will first compute: | |
2139 | |
2140 (triangle-recursive-helper (+ 0 0) ; sum | |
2141 (1+ 0) ; counter | |
2142 1) ; number | |
2143 which is: | |
2144 | |
2145 (triangle-recursive-helper 0 1 1) | |
2146 | |
2147 Again, `(> counter number)' will be false, so again, the Lisp | |
2148 interpreter will evaluate `triangle-recursive-helper', creating a new | |
2149 instance with new arguments. | |
2150 | |
2151 This new instance will be: | |
2152 | |
2153 (triangle-recursive-helper | |
2154 (+ sum counter) ; sum plus counter => sum | |
2155 (1+ counter) ; increment counter => counter | |
2156 number) ; number stays the same | |
2157 | |
2158 which is: | |
2159 | |
2160 (triangle-recursive-helper 1 2 1) | |
2161 | |
2162 In this case, the `(> counter number)' test will be true! So the | |
2163 instance will return the value of the sum, which will be 1, as expected. | |
2164 | |
2165 Now, let's pass `triangle-initialization' an argument of 2, to find out | |
2166 how many pebbles there are in a triangle with two rows. | |
2167 | |
2168 That function calls `(triangle-recursive-helper 0 0 2)'. | |
2169 | |
2170 In stages, the instances called will be: | |
2171 | |
2172                            sum counter number | |
2173 (triangle-recursive-helper 0   1       2) | |
2174 | |
2175 (triangle-recursive-helper 1   2       2) | |
2176 | |
2177 (triangle-recursive-helper 3   3       2) | |
2178 | |
2179 When the last instance is called, the `(> counter number)' test will be | |
2180 true, so the instance will return the value of `sum', which will be 3. | |
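
To watch these instances for yourself, you can evaluate a tracing
variant of the two functions.  This is only a sketch: the names
`my-triangle', `my-triangle-helper', and `my-triangle-trace' are
invented for this illustration, and the helper is assumed to have the
same shape as `triangle-recursive-helper', with one added line that
records the arguments of each instance.

```elisp
;; Hypothetical tracing variant; the `my-' names are not from the book.
(defvar my-triangle-trace nil
  "List of (SUM COUNTER NUMBER) argument lists, most recent first.")

(defun my-triangle-helper (sum counter number)
  "Return SUM once COUNTER exceeds NUMBER, recording each instance."
  ;; Record the arguments this instance was called with.
  (push (list sum counter number) my-triangle-trace)
  (if (> counter number)
      sum
    (my-triangle-helper (+ sum counter)  ; sum plus counter => sum
                        (1+ counter)     ; increment counter => counter
                        number)))        ; number stays the same

(defun my-triangle (number)
  "Return the number of pebbles in a triangle of NUMBER rows."
  (setq my-triangle-trace nil)           ; start with an empty trace
  (my-triangle-helper 0 0 number))

(my-triangle 2)              ; the value of `sum' in the last instance
(reverse my-triangle-trace)  ; the instances, oldest first
```

After `(my-triangle 2)' has been evaluated, the reversed trace lists
the initial call with `0 0 2' followed by the three instances shown in
the table above.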
2181 | |
2182 This kind of pattern helps when you are writing functions that can use | |
2183 many resources in a computer. | |
2184 | |
2185 ---------- Footnotes ---------- | |
2186 | |
2187 (1) The phrase "tail recursive" is used to describe such a process, one | |
2188 that uses `constant space'. | |
2189 | |
2190 (2) The jargon is mildly confusing: `triangle-recursive-helper' uses a | |
2191 process that is iterative in a procedure that is recursive. The | |
2192 process is called iterative because the computer need only record the | |
2193 three values, `sum', `counter', and `number'; the procedure is | |
2194 recursive because the function `calls itself'. On the other hand, both | |
2195 the process and the procedure used by `triangle-recursively' are called | |
2196 recursive. The word `recursive' has different meanings in the two | |
2197 contexts. | |
2198 | |
2199 | |
2200 File: eintr, Node: Looping exercise, Prev: Recursion, Up: Loops & Recursion | |
2201 | |
2202 11.4 Looping Exercise | |
2203 ===================== | |
2204 | |
2205 * Write a function similar to `triangle' in which each row has a | |
2206 value which is the square of the row number. Use a `while' loop. | |
2207 | |
2208 * Write a function similar to `triangle' that multiplies instead of | |
2209 adds the values. | |
2210 | |
2211 * Rewrite these two functions recursively. Rewrite these functions | |
2212 using `cond'. | |
2213 | |
2214 * Write a function for Texinfo mode that creates an index entry at | |
2215 the beginning of a paragraph for every `@dfn' within the paragraph. | |
2216 (In a Texinfo file, `@dfn' marks a definition. This book is | |
2217 written in Texinfo.) | |
2218 | |
2219 Many of the functions you will need are described in two of the | |
2220 previous chapters, *Note Cutting and Storing Text: Cutting & | |
2221 Storing Text, and *Note Yanking Text Back: Yanking. If you use | |
2222 `forward-paragraph' to put the index entry at the beginning of the | |
2223 paragraph, you will have to use `C-h f' (`describe-function') to | |
2224 find out how to make the command go backwards. | |
2225 | |
2226 For more information, see *Note Indicating Definitions: | |
2227 (texinfo)Indicating. | |
2228 | |
2229 | |
2230 File: eintr, Node: Regexp Search, Next: Counting Words, Prev: Loops & Recursion, Up: Top | |
2231 | |
2232 12 Regular Expression Searches | |
2233 ****************************** | |
2234 | |
2235 Regular expression searches are used extensively in GNU Emacs. The two | |
2236 functions, `forward-sentence' and `forward-paragraph', illustrate these | |
2237 searches well. They use regular expressions to find where to move | |
2238 point. The phrase `regular expression' is often written as `regexp'. | |
2239 | |
2240 Regular expression searches are described in *Note Regular Expression | |
2241 Search: (emacs)Regexp Search, as well as in *Note Regular Expressions: | |
2242 (elisp)Regular Expressions. In writing this chapter, I am presuming | |
2243 that you have at least a mild acquaintance with them. The major point | |
2244 to remember is that regular expressions permit you to search for | |
2245 patterns as well as for literal strings of characters. For example, | |
2246 the code in `forward-sentence' searches for the pattern of possible | |
2247 characters that could mark the end of a sentence, and moves point to | |
2248 that spot. | |
2249 | |
2250 Before looking at the code for the `forward-sentence' function, it is | |
2251 worth considering what the pattern that marks the end of a sentence | |
2252 must be. The pattern is discussed in the next section; following that | |
2253 is a description of the regular expression search function, | |
2254 `re-search-forward'. The `forward-sentence' function is described in | |
2255 the section following. Finally, the `forward-paragraph' function is | |
2256 described in the last section of this chapter. `forward-paragraph' is | |
2257 a complex function that introduces several new features. | |
2258 | |
2259 * Menu: | |
2260 | |
2261 * sentence-end:: | |
2262 * re-search-forward:: | |
2263 * forward-sentence:: | |
2264 * forward-paragraph:: | |
2265 * etags:: | |
2266 * Regexp Review:: | |
2267 * re-search Exercises:: | |
2268 | |
2269 | |
2270 File: eintr, Node: sentence-end, Next: re-search-forward, Prev: Regexp Search, Up: Regexp Search | |
2271 | |
2272 12.1 The Regular Expression for `sentence-end' | |
2273 ============================================== | |
2274 | |
2275 The symbol `sentence-end' is bound to the pattern that marks the end of | |
2276 a sentence. What should this regular expression be? | |
2277 | |
2278 Clearly, a sentence may be ended by a period, a question mark, or an | |
2279 exclamation mark. Indeed, only clauses that end with one of those three | |
2280 characters should be considered the end of a sentence. This means that | |
2281 the pattern should include the character set: | |
2282 | |
2283 [.?!] | |
2284 | |
2285 However, we do not want `forward-sentence' merely to jump to a period, | |
2286 a question mark, or an exclamation mark, because such a character might | |
2287 be used in the middle of a sentence. A period, for example, is used | |
2288 after abbreviations. So other information is needed. | |
2289 | |
2290 According to convention, you type two spaces after every sentence, but | |
2291 only one space after a period, a question mark, or an exclamation mark | |
2292 in the body of a sentence. So a period, a question mark, or an | |
2293 exclamation mark followed by two spaces is a good indicator of an end | |
2294 of sentence. However, in a file, the two spaces may instead be a tab | |
2295 or the end of a line. This means that the regular expression should | |
2296 include these three items as alternatives. | |
2297 | |
2298 This group of alternatives will look like this: | |
2299 | |
2300 \\($\\| \\|  \\) | |
2301        ^   ^^ | |
2302       TAB  SPC | |
2303 | |
2304 Here, `$' indicates the end of the line, and I have pointed out where | |
2305 the tab and two spaces are inserted in the expression. Both are | |
2306 inserted by putting the actual characters into the expression. | |
2307 | |
2308 Two backslashes, `\\', are required before the parentheses and vertical | |
2309 bars: the first backslash quotes the following backslash in Emacs; and | |
2310 the second indicates that the following character, the parenthesis or | |
2311 the vertical bar, is special. | |
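
A small experiment may make the double backslashes less mysterious.
The sketch below uses `string-match', which searches a string rather
than a buffer; the function names `my-find-a-or-b' and
`my-find-literal' are invented for this illustration.

```elisp
;; The Lisp reader turns each "\\" in a string into one backslash,
;; so the regexp engine receives \( \| \) in the first function.
(defun my-find-a-or-b (string)
  "Return the index of the first `a' or `b' in STRING, or nil."
  (string-match "\\(a\\|b\\)" string))

;; Without backslashes, parentheses and the vertical bar are ordinary
;; characters that match only themselves.
(defun my-find-literal (string)
  "Return the index of the literal text (a|b) in STRING, or nil."
  (string-match "(a|b)" string))

(length "\\(")              ; two characters: a backslash and a paren
(my-find-a-or-b "xxb")      ; matches the `b' at index 2
(my-find-literal "xxb")     ; finds nothing and returns nil
(my-find-literal "x(a|b)")  ; matches the literal text at index 1
```

Note that `string-match' counts positions in a string from zero,
whereas positions in a buffer start at 1.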
2312 | |
2313 Also, a sentence may be followed by one or more carriage returns, like | |
2314 this: | |
2315 | |
2316 [ | |
2317 ]* | |
2318 | |
2319 Like tabs and spaces, a carriage return is inserted into a regular | |
2320 expression by inserting it literally. The asterisk indicates that the | |
2321 <RET> is repeated zero or more times. | |
2322 | |
2323 But a sentence end does not consist only of a period, a question mark or | |
2324 an exclamation mark followed by appropriate space: a closing quotation | |
2325 mark or a closing brace of some kind may precede the space. Indeed more | |
2326 than one such mark or brace may precede the space. These require an | |
2327 expression that looks like this: | |
2328 | |
2329 []\"')}]* | |
2330 | |
2331 In this expression, the first `]' is the first character in the | |
2332 expression; the second character is `"', which is preceded by a `\' to | |
2333 tell Emacs the `"' is _not_ special. The last three characters are | |
2334 `'', `)', and `}'. | |
2335 | |
2336 All this suggests what the regular expression pattern for matching the | |
2337 end of a sentence should be; and, indeed, if we evaluate `sentence-end' | |
2338 we find that it returns the following value: | |
2339 | |
2340 sentence-end | |
2341 => "[.?!][]\"')}]*\\($\\| \\|  \\)[ | |
2342 ]*" | |
2343 | |
2344 (Well, not in GNU Emacs 22; that is because of an effort to make the | |
2345 process simpler. When the variable's value is `nil', Emacs uses the | |
2346 value returned by the function `sentence-end', which constructs it | |
2347 from the variables `sentence-end-base', `sentence-end-double-space', | |
2348 `sentence-end-without-period', and `sentence-end-without-space'. The | |
2349 critical variable is `sentence-end-base'; its global value is similar | |
2350 to the one described above but it also contains two additional | |
2351 quotation marks. These have differing degrees of curliness. The | |
2352 `sentence-end-without-period' variable, when true, tells Emacs that a | |
2353 sentence may end without a period, such as text in Thai.) | |
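
You can try the simple form of the pattern on a few strings.  The
following is a sketch under stated assumptions: the names
`my-sentence-end-regexp' and `my-sentence-end-index' are invented
here, and the literal tab and newline of the pattern are written with
the escapes `\t' and `\n' so that they stay visible.

```elisp
;; Hypothetical names; the regexp is the simple form described above,
;; with \t and \n standing in for the literal tab and newline.
(defconst my-sentence-end-regexp
  "[.?!][]\"')}]*\\($\\|\t\\|  \\)[\n]*"
  "A simplified sentence-end pattern, for experimenting only.")

(defun my-sentence-end-index (string)
  "Return the index in STRING of the punctuation that ends the
first sentence, or nil if no sentence end is found."
  (string-match my-sentence-end-regexp string))

;; The period after `Mr' is followed by one space, so it is skipped;
;; the period after `left' is followed by two spaces.
(my-sentence-end-index "Mr. Smith left.  He returned.")

;; Closing quotation marks may come between the `!' and the spaces.
(my-sentence-end-index "He said \"Stop!\"  Then he left.")

;; Neither period here is followed by two spaces, a tab, or a line end.
(my-sentence-end-index "e.g. this")
```

The first two calls return the index of the period and of the
exclamation mark respectively; the last returns `nil'.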
2354 | |
2355 | |
2356 File: eintr, Node: re-search-forward, Next: forward-sentence, Prev: sentence-end, Up: Regexp Search | |
2357 | |
2358 12.2 The `re-search-forward' Function | |
2359 ===================================== | |
2360 | |
2361 The `re-search-forward' function is very like the `search-forward' | |
2362 function. (*Note The `search-forward' Function: search-forward.) | |
2363 | |
2364 `re-search-forward' searches for a regular expression. If the search | |
2365 is successful, it leaves point immediately after the last character in | |
2366 the target. If the search is backwards, it leaves point just before | |
2367 the first character in the target. You may tell `re-search-forward' to | |
2368 return `t' for true. (Moving point is therefore a `side effect'.) | |
2369 | |
2370 Like `search-forward', the `re-search-forward' function takes four | |
2371 arguments: | |
2372 | |
2373 1. The first argument is the regular expression that the function | |
2374 searches for. The regular expression will be a string between | |
2375 quotation marks. | |
2376 | |
2377 2. The optional second argument limits how far the function will | |
2378 search; it is a bound, which is specified as a position in the | |
2379 buffer. | |
2380 | |
2381 3. The optional third argument specifies how the function responds to | |
2382 failure: `nil' as the third argument causes the function to signal | |
2383 an error (and print a message) when the search fails; any other | |
2384 value causes it to return `nil' if the search fails and a true | |
2385 (non-`nil') value if the search succeeds. | |
2386 | |
2387 4. The optional fourth argument is the repeat count. A negative | |
2388 repeat count causes `re-search-forward' to search backwards. | |
2389 | |
2390 The template for `re-search-forward' looks like this: | |
2391 | |
2392 (re-search-forward "REGULAR-EXPRESSION" | |
2393 LIMIT-OF-SEARCH | |
2394 WHAT-TO-DO-IF-SEARCH-FAILS | |
2395 REPEAT-COUNT) | |
2396 | |
2397 The second, third, and fourth arguments are optional. However, if you | |
2398 want to pass a value to either or both of the last two arguments, you | |
2399 must also pass a value to all the preceding arguments. Otherwise, the | |
2400 Lisp interpreter will mistake which argument you are passing the value | |
2401 to. | |
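
The four arguments can be tried out safely in a temporary buffer.
This is a minimal sketch: `my-search-demo' is a name invented for this
illustration, and it always passes `t' as the third argument so that a
failed search returns `nil' instead of signaling an error.

```elisp
(defun my-search-demo (text regexp &optional bound count)
  "Search for REGEXP in a temporary buffer that contains TEXT.
BOUND and COUNT are passed to `re-search-forward' as its second
and fourth arguments.  Return the position of point after a
successful search, or nil if the search fails."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; Third argument `t': return nil on failure, do not signal.
    (when (re-search-forward regexp bound t count)
      (point))))

(my-search-demo "one two three" "t[a-z]+")        ; just after `two'
(my-search-demo "one two three" "t[a-z]+" nil 2)  ; just after `three'
(my-search-demo "one two three" "three" 8)        ; bound too small: nil
(my-search-demo "one two three" "z+")             ; no match: nil
```

In the second call, `nil' has to be passed as the bound so that the
repeat count lands in the fourth position.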
2402 | |
2403 In the `forward-sentence' function, the regular expression will be the | |
2404 value of the variable `sentence-end'. In simple form, that is: | |
2405 | |
2406 "[.?!][]\"')}]*\\($\\| \\|  \\)[ | |
2407 ]*" | |
2408 | |
2409 The limit of the search will be the end of the paragraph (since a | |
2410 sentence cannot go beyond a paragraph). If the search fails, the | |
2411 function will return `nil'; and the repeat count will be provided by | |
2412 the argument to the `forward-sentence' function. | |
2413 | |
2414 | |
2415 File: eintr, Node: forward-sentence, Next: forward-paragraph, Prev: re-search-forward, Up: Regexp Search | |
2416 | |
2417 12.3 `forward-sentence' | |
2418 ======================= | |
2419 | |
2420 The command to move the cursor forward a sentence is a straightforward | |
2421 illustration of how to use regular expression searches in Emacs Lisp. | |
2422 Indeed, the function looks longer and more complicated than it is; this | |
2423 is because the function is designed to go backwards as well as forwards; | |
2424 and, optionally, over more than one sentence. The function is usually | |
2425 bound to the key command `M-e'. | |
2426 | |
2427 * Menu: | |
2428 | |
2429 * Complete forward-sentence:: | |
2430 * fwd-sentence while loops:: | |
2431 * fwd-sentence re-search:: | |
2432 | |
2433 | |
2434 File: eintr, Node: Complete forward-sentence, Next: fwd-sentence while loops, Prev: forward-sentence, Up: forward-sentence | |
2435 | |
2436 Complete `forward-sentence' function definition | |
2437 ----------------------------------------------- | |
2438 | |
2439 Here is the code for `forward-sentence': | |
2440 | |
2441 (defun forward-sentence (&optional arg) | |
2442   "Move forward to next `sentence-end'. With argument, repeat. | |
2443 With negative argument, move backward repeatedly to `sentence-beginning'. | |
2444 | |
2445 The variable `sentence-end' is a regular expression that matches ends of | |
2446 sentences. Also, every paragraph boundary terminates sentences as well." | |
2447   (interactive "p") | |
2448   (or arg (setq arg 1)) | |
2449   (let ((opoint (point)) | |
2450         (sentence-end (sentence-end))) | |
2451     (while (< arg 0) | |
2452       (let ((pos (point)) | |
2453             (par-beg (save-excursion (start-of-paragraph-text) (point)))) | |
2454         (if (and (re-search-backward sentence-end par-beg t) | |
2455                  (or (< (match-end 0) pos) | |
2456                      (re-search-backward sentence-end par-beg t))) | |
2457             (goto-char (match-end 0)) | |
2458           (goto-char par-beg))) | |
2459       (setq arg (1+ arg))) | |
2460     (while (> arg 0) | |
2461       (let ((par-end (save-excursion (end-of-paragraph-text) (point)))) | |
2462         (if (re-search-forward sentence-end par-end t) | |
2463             (skip-chars-backward " \t\n") | |
2464           (goto-char par-end))) | |
2465       (setq arg (1- arg))) | |
2466     (constrain-to-field nil opoint t))) | |
2467 | |
2468 The function looks long at first sight and it is best to look at its | |
2469 skeleton first, and then its muscle. The way to see the skeleton is to | |
2470 look at the expressions that start in the left-most columns: | |
2471 | |
2472 (defun forward-sentence (&optional arg) | |
2473   "DOCUMENTATION..." | |
2474   (interactive "p") | |
2475   (or arg (setq arg 1)) | |
2476   (let ((opoint (point)) (sentence-end (sentence-end))) | |
2477     (while (< arg 0) | |
2478       (let ((pos (point)) | |
2479             (par-beg (save-excursion (start-of-paragraph-text) (point)))) | |
2480         REST-OF-BODY-OF-WHILE-LOOP-WHEN-GOING-BACKWARDS | |
2481     (while (> arg 0) | |
2482       (let ((par-end (save-excursion (end-of-paragraph-text) (point)))) | |
2483         REST-OF-BODY-OF-WHILE-LOOP-WHEN-GOING-FORWARDS | |
2484     HANDLE-FORMS-AND-EQUIVALENT | |
2485 | |
2486 This looks much simpler! The function definition consists of | |
2487 documentation, an `interactive' expression, an `or' expression, a `let' | |
2488 expression, and `while' loops. | |
2489 | |
2490 Let's look at each of these parts in turn. | |
2491 | |
2492 We note that the documentation is thorough and understandable. | |
2493 | |
2494 The function has an `interactive "p"' declaration. This means that the | |
2495 processed prefix argument, if any, is passed to the function as its | |
2496 argument. (This will be a number.) If the function is not passed an | |
2497 argument (it is optional) then the argument `arg' will be bound to 1. | |
2498 | |
2499 When `forward-sentence' is called non-interactively without an | |
2500 argument, `arg' is bound to `nil'. The `or' expression handles this. | |
2501 What it does is either leave the value of `arg' as it is, but only if | |
2502 `arg' is bound to a value; or it sets the value of `arg' to 1, in the | |
2503 case when `arg' is bound to `nil'. | |
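
The effect of the `or' expression is easy to check in isolation.  In
this sketch, `my-repeat-count' is a name invented for the purpose; it
does nothing except apply the same idiom and return the result.

```elisp
(defun my-repeat-count (&optional arg)
  "Return ARG, or 1 if ARG is nil or was not supplied."
  ;; If `arg' is non-nil, `or' stops at once and leaves it alone;
  ;; otherwise `or' goes on to evaluate the `setq'.
  (or arg (setq arg 1))
  arg)

(my-repeat-count)     ; no argument: `arg' becomes 1
(my-repeat-count 3)   ; `arg' stays 3
(my-repeat-count -2)  ; `arg' stays -2
```

A negative value passes through unchanged, which is how
`forward-sentence' can later tell that it should move backwards.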
2504 | |
2505 Next is a `let'. That specifies the values of two local variables, | |
2506 `opoint' and `sentence-end'. The local value of point, from before the | |
2507 search, is used in the `constrain-to-field' function which handles | |
2508 forms and equivalents. The `sentence-end' variable is set by the | |
2509 `sentence-end' function. | |
2510 | |
2511 | |
2512 File: eintr, Node: fwd-sentence while loops, Next: fwd-sentence re-search, Prev: Complete forward-sentence, Up: forward-sentence | |
2513 | |
2514 The `while' loops | |
2515 ----------------- | |
2516 | |
2517 Two `while' loops follow. The first `while' has a true-or-false-test | |
2518 that tests true if the prefix argument for `forward-sentence' is a | |
2519 negative number. This is for going backwards. The body of this loop | |
2520 is similar to the body of the second `while' clause, but it is not | |
2521 exactly the same. We will skip this `while' loop and concentrate on | |
2522 the second `while' loop. | |
2523 | |
2524 The second `while' loop is for moving point forward. Its skeleton | |
2525 looks like this: | |
2526 | |
2527 (while (> arg 0)           ; true-or-false-test | |
2528   (let VARLIST | |
2529     (if (TRUE-OR-FALSE-TEST) | |
2530         THEN-PART | |
2531       ELSE-PART)) | |
2532   (setq arg (1- arg)))     ; `while' loop decrementer | |
2533 | |
2534 The `while' loop is of the decrementing kind. (*Note A Loop with a | |
2535 Decrementing Counter: Decrementing Loop.) It has a true-or-false-test | |
2536 that tests true so long as the counter (in this case, the variable | |
2537 `arg') is greater than zero; and it has a decrementer that subtracts 1 | |
2538 from the value of the counter every time the loop repeats. | |
2539 | |
2540 If no prefix argument is given to `forward-sentence', which is the most | |
2541 common way the command is used, this `while' loop will run once, since | |
2542 the value of `arg' will be 1. | |
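
To see how many times such a loop runs, you can replace the body with
a simple tally.  This is a sketch only; `my-count-iterations' is an
invented name, and the tally stands in for the real work of the loop.

```elisp
(defun my-count-iterations (arg)
  "Return how many times a decrementing `while' loop runs for ARG."
  (let ((iterations 0))
    (while (> arg 0)                    ; true-or-false-test
      (setq iterations (1+ iterations)) ; stand-in for the loop body
      (setq arg (1- arg)))              ; decrementer
    iterations))

(my-count-iterations 1)   ; the common case: the loop runs once
(my-count-iterations 3)   ; a prefix argument of 3: three times
(my-count-iterations -2)  ; a negative argument: not at all
```

With a negative argument this loop does not run at all, which is why
`forward-sentence' needs its separate first loop for going backwards.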
2543 | |
2544 The body of the `while' loop consists of a `let' expression, which | |
2545 creates and binds a local variable, and has, as its body, an `if' | |
2546 expression. | |
2547 | |
2548 The body of the `while' loop looks like this: | |
2549 | |
2550 (let ((par-end | |
2551        (save-excursion (end-of-paragraph-text) (point)))) | |
2552   (if (re-search-forward sentence-end par-end t) | |
2553       (skip-chars-backward " \t\n") | |
2554     (goto-char par-end))) | |
2555 | |
2556 The `let' expression creates and binds the local variable `par-end'. | |
2557 As we shall see, this local variable is designed to provide a bound or | |
2558 limit to the regular expression search. If the search fails to find a | |
2559 proper sentence ending in the paragraph, it will stop on reaching the | |
2560 end of the paragraph. | |
2561 | |
2562 But first, let us examine how `par-end' is bound to the value of the | |
2563 end of the paragraph. What happens is that the `let' sets the value of | |
2564 `par-end' to the value returned when the Lisp interpreter evaluates the | |
2565 expression | |
2566 | |
2567 (save-excursion (end-of-paragraph-text) (point)) | |
2568 | |
2569 In this expression, `(end-of-paragraph-text)' moves point to the end of | |
2570 the paragraph, `(point)' returns the value of point, and then | |
2571 `save-excursion' restores point to its original position. Thus, the | |
2572 `let' binds `par-end' to the value returned by the `save-excursion' | |
2573 expression, which is the position of the end of the paragraph. (The | |
2574 `(end-of-paragraph-text)' function uses `forward-paragraph', which we | |
2575 will discuss shortly.) | |
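
The same technique of finding a position without moving point can be
wrapped up in a small function.  In this sketch, `my-position-after'
is an invented name, and `end-of-line' merely stands in for
`end-of-paragraph-text' to keep the example easy to check.

```elisp
(defun my-position-after (motion-function)
  "Return the position MOTION-FUNCTION would move point to.
Point itself is left where it was."
  (save-excursion
    (funcall motion-function)  ; move point...
    (point)))                  ; ...note where it landed; then
                               ; `save-excursion' restores point

(with-temp-buffer
  (insert "abc\ndef")
  (goto-char (point-min))
  (let ((line-end (my-position-after #'end-of-line)))
    ;; `line-end' is the position just after `abc';
    ;; point is still at the beginning of the buffer.
    (list line-end (point))))
```

The value of the `save-excursion' expression is the value of the last
expression in its body, which is why `(point)' must come last.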
2576 | |
2577 Emacs next evaluates the body of the `let', which is an `if' expression | |
2578 that looks like this: | |
2579 | |
2580 (if (re-search-forward sentence-end par-end t) ; if-part | |
2581     (skip-chars-backward " \t\n")              ; then-part | |
2582   (goto-char par-end))                         ; else-part | |
2583 | |
2584 The `if' tests whether its first argument is true and if so, evaluates | |
2585 its then-part; otherwise, the Emacs Lisp interpreter evaluates the | |
2586 else-part. The true-or-false-test of the `if' expression is the | |
2587 regular expression search. | |
2588 | |
2589 It may seem odd to have what looks like the `real work' of the | |
2590 `forward-sentence' function buried here, but this is a common way this | |
2591 kind of operation is carried out in Lisp. | |
2592 | |
2593 | |
2594 File: eintr, Node: fwd-sentence re-search, Prev: fwd-sentence while loops, Up: forward-sentence | |
2595 | |
2596 The regular expression search | |
2597 ----------------------------- | |
2598 | |
2599 The `re-search-forward' function searches for the end of the sentence, | |
2600 that is, for the pattern defined by the `sentence-end' regular | |
2601 expression. If the pattern is found--if the end of the sentence is | |
2602 found--then the `re-search-forward' function does two things: | |
2603 | |
2604 1. The `re-search-forward' function carries out a side effect, which | |
2605 is to move point to the end of the occurrence found. | |
2606 | |
2607 2. The `re-search-forward' function returns a value of true. This is | |
2608 the value received by the `if', and means that the search was | |
2609 successful. | |
2610 | |
2611 The side effect, the movement of point, is completed before the `if' | |
2612 function is handed the value returned by the successful conclusion of | |
2613 the search. | |
2614 | |
2615 When the `if' function receives the value of true from a successful | |
2616 call to `re-search-forward', the `if' evaluates the then-part, which is | |
2617 the expression `(skip-chars-backward " \t\n")'. This expression moves | |
2618 backwards over any blank spaces, tabs or carriage returns until a | |
2619 printed character is found and then leaves point after the character. | |
2620 Since point has already been moved to the end of the pattern that marks | |
2621 the end of the sentence, this action leaves point right after the | |
2622 closing printed character of the sentence, which is usually a period. | |
2623 | |
2624 On the other hand, if the `re-search-forward' function fails to find a | |
2625 pattern marking the end of the sentence, the function returns false. | |
2626 The false then causes the `if' to evaluate its third argument, which is | |
2627 `(goto-char par-end)': it moves point to the end of the paragraph. | |
2628 | |
2629 (And if the text is in a form or equivalent, and point may not move | |
2630 fully, then the `constrain-to-field' function comes into play.) | |
2631 | |
2632 Regular expression searches are exceptionally useful and the pattern | |
2633 illustrated by `re-search-forward', in which the search is the test of | |
2634 an `if' expression, is handy. You will see or write code incorporating | |
2635 this pattern often. | |
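
Here is the pattern on its own, cut down to the bare minimum.  This is
a sketch, not the code of `forward-sentence': the name
`my-first-sentence' is invented, the regexp is the simple form of
`sentence-end' with `\t' and `\n' written as escapes, and the end of
the text stands in for the end of the paragraph.

```elisp
(defun my-first-sentence (text)
  "Return the first sentence of TEXT, or all of TEXT if none ends."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    ;; The search is the true-or-false-test of the `if'.
    (if (re-search-forward "[.?!][]\"')}]*\\($\\|\t\\|  \\)[\n]*" nil t)
        ;; Found: back up over the whitespace after the sentence.
        (skip-chars-backward " \t\n")
      ;; Not found: go to the end, as if to the end of the paragraph.
      (goto-char (point-max)))
    (buffer-substring (point-min) (point))))

(my-first-sentence "It works.  Really.")  ; stops after the first period
(my-first-sentence "One!\nTwo.")          ; a line end also ends a sentence
(my-first-sentence "No ending here")      ; search fails: whole text
```

In each case the movement of point is the side effect that does the
work; the value returned by the search only decides which branch of
the `if' is evaluated.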
2636 | |
2637 | |
2638 File: eintr, Node: forward-paragraph, Next: etags, Prev: forward-sentence, Up: Regexp Search | |
2639 | |
2640 12.4 `forward-paragraph': a Goldmine of Functions | |
2641 ================================================= | |
2642 | |
2643 The `forward-paragraph' function moves point forward to the end of the | |
2644 paragraph. It is usually bound to `M-}' and makes use of a number of | |
2645 functions that are important in themselves, including `let*', | |
2646 `match-beginning', and `looking-at'. | |
2647 | |
2648 The function definition for `forward-paragraph' is considerably longer | |
2649 than the function definition for `forward-sentence' because it works | |
2650 with a paragraph, each line of which may begin with a fill prefix. | |
2651 | |
2652 A fill prefix consists of a string of characters that are repeated at | |
2653 the beginning of each line. For example, in Lisp code, it is a | |
2654 convention to start each line of a paragraph-long comment with `;;; '. | |
2655 In Text mode, four blank spaces make up another common fill prefix, | |
2656 creating an indented paragraph. (*Note Fill Prefix: (emacs)Fill | |
2657 Prefix, for more information about fill prefixes.) | |
2658 | |
2659 The existence of a fill prefix means that in addition to being able to | |
2660 find the end of a paragraph whose lines begin on the left-most column, | |
2661 the `forward-paragraph' function must be able to find the end of a | |
2662 paragraph when all or many of the lines in the buffer begin with the | |
2663 fill prefix. | |
2664 | |
2665 Moreover, it is sometimes practical to ignore a fill prefix that | |
2666 exists, especially when blank lines separate paragraphs. This is an | |
2667 added complication. | |
2668 | |
2669 * Menu: | |
2670 | |
2671 * forward-paragraph in brief:: | |
2672 * fwd-para let:: | |
2673 * fwd-para while:: | |
2674 | |
2675 | |
2676 File: eintr, Node: forward-paragraph in brief, Next: fwd-para let, Prev: forward-paragraph, Up: forward-paragraph | |
2677 | |
2678 Shortened `forward-paragraph' function definition | |
2679 ------------------------------------------------- | |
2680 | |
2681 Rather than print all of the `forward-paragraph' function, we will only | |
2682 print parts of it. Read without preparation, the function can be | |
2683 daunting! | |
2684 | |
2685 In outline, the function looks like this: | |
2686 | |
2687 (defun forward-paragraph (&optional arg) | |
2688   "DOCUMENTATION..." | |
2689   (interactive "p") | |
2690   (or arg (setq arg 1)) | |
2691   (let* | |
2692       VARLIST | |
2693     (while (and (< arg 0) (not (bobp)))     ; backward-moving-code | |
2694       ... | |
2695     (while (and (> arg 0) (not (eobp)))     ; forward-moving-code | |
2696       ... | |
2697 | |
2698 The first parts of the function are routine: the function's argument | |
2699 list consists of one optional argument. Documentation follows. | |
2700 | |
2701 The lower case `p' in the `interactive' declaration means that the | |
2702 processed prefix argument, if any, is passed to the function. This | |
2703 will be a number, and is the repeat count of how many paragraphs point | |
2704 will move. The `or' expression in the next line handles the common | |
2705 case when no argument is passed to the function, which occurs if the | |
2706 function is called from other code rather than interactively. This | |
2707 case was described earlier. (*Note The `forward-sentence' function: | |
2708 forward-sentence.) Now we reach the end of the familiar part of this | |
2709 function. | |
2710 | |
2711 | |
2712 File: eintr, Node: fwd-para let, Next: fwd-para while, Prev: forward-paragraph in brief, Up: forward-paragraph | |
2713 | |
2714 The `let*' expression | |
2715 --------------------- | |
2716 | |
2717 The next line of the `forward-paragraph' function begins a `let*' | |
2718 expression. This is different from `let'. The symbol is `let*' not | |
2719 `let'. | |
2720 | |
2721 The `let*' special form is like `let' except that Emacs sets each | |
2722 variable in sequence, one after another, and variables in the latter | |
2723 part of the varlist can make use of the values to which Emacs set | |
2724 variables in the earlier part of the varlist. | |
2725 | |
2726 (*Note `save-excursion' in `append-to-buffer': append save-excursion.) | |
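
A short example shows the sequential binding at work.  The function
name `my-sequential-bindings' and its variables are invented for this
illustration.

```elisp
(defun my-sequential-bindings (width)
  "Return a list of WIDTH, twice WIDTH, and twice WIDTH plus one."
  (let* ((double (* 2 width))           ; bound first
         (double-plus-one (1+ double))) ; may refer to `double'
    (list width double double-plus-one)))

(my-sequential-bindings 3)
```

With a plain `let', the second binding could not refer to `double',
because `let' evaluates all the value expressions before it binds any
of the variables.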
2727 | |
2728 In the `let*' expression in this function, Emacs binds a total of seven | |
2729 variables: `opoint', `fill-prefix-regexp', `parstart', `parsep', | |
2730 `sp-parstart', `start', and `found-start'. | |
2731 | |
2732 The variable `parsep' appears twice, first, to remove instances of `^', | |
2733 and second, to handle fill prefixes. | |
2734 | |
2735 The variable `opoint' is just the value of `point'. As you can guess, | |
2736 it is used in a `constrain-to-field' expression, just as in | |
2737 `forward-sentence'. | |
2738 | |
2739 The variable `fill-prefix-regexp' is set to the value returned by | |
2740 evaluating the following list: | |
2741 | |
2742 (and fill-prefix | |
2743      (not (equal fill-prefix "")) | |
2744      (not paragraph-ignore-fill-prefix) | |
2745      (regexp-quote fill-prefix)) | |
2746 | |
2747 This is an expression whose first element is the `and' special form. | |
2748 | |
2749 As we learned earlier (*note The `kill-new' function: kill-new | |
2750 function.), the `and' special form evaluates each of its arguments | |
2751 until one of the arguments returns a value of `nil', in which case the | |
2752 `and' expression returns `nil'; however, if none of the arguments | |
2753 returns a value of `nil', the value resulting from evaluating the last | |
2754 argument is returned. (Since such a value is not `nil', it is | |
2755 considered true in Lisp.) In other words, an `and' expression returns | |
2756 a true value only if all its arguments are true. | |
2757 | |
2758 In this case, the variable `fill-prefix-regexp' is bound to a non-`nil' | |
2759 value only if the following four expressions produce a true (i.e., a | |
2760 non-`nil') value when they are evaluated; otherwise, | |
2761 `fill-prefix-regexp' is bound to `nil'. | |
2762 | |
2763 `fill-prefix' | |
2764 When this variable is evaluated, the value of the fill prefix, if | |
2765 any, is returned. If there is no fill prefix, this variable | |
2766 returns `nil'. | |
2767 | |
2768 `(not (equal fill-prefix ""))' | |
2769 This expression checks whether an existing fill prefix is an empty | |
2770 string, that is, a string with no characters in it. An empty | |
2771 string is not a useful fill prefix. | |
2772 | |
2773 `(not paragraph-ignore-fill-prefix)' | |
2774 This expression returns `nil' if the variable | |
2775 `paragraph-ignore-fill-prefix' has been turned on by being set to a | |
2776 true value such as `t'. | |
2777 | |
2778 `(regexp-quote fill-prefix)' | |
2779 This is the last argument to the `and' special form. If all the | |
2780 arguments to the `and' are true, the value resulting from | |
2781 evaluating this expression will be returned by the `and' expression | |
2782 and bound to the variable `fill-prefix-regexp'. | |
2783 | |
2784 The result of evaluating this `and' expression successfully is that | |
2785 `fill-prefix-regexp' will be bound to the value of `fill-prefix' as | |
2786 modified by the `regexp-quote' function. What `regexp-quote' does is | |
2787 read a string and return a regular expression that will exactly match | |
2788 the string and match nothing else. This means that | |
2789 `fill-prefix-regexp' will be set to a value that will exactly match the | |
2790 fill prefix if the fill prefix exists. Otherwise, the variable will be | |
2791 set to `nil'. | |
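
The same `and' expression can be tried with the two variables turned
into ordinary arguments.  This is a sketch for experimenting:
`my-fill-prefix-regexp' and its argument names are invented, and they
stand in for the variables `fill-prefix' and
`paragraph-ignore-fill-prefix'.

```elisp
(defun my-fill-prefix-regexp (prefix ignore-prefix)
  "Return a regexp that matches PREFIX exactly, or nil.
Return nil if PREFIX is nil or empty, or if IGNORE-PREFIX is true."
  (and prefix
       (not (equal prefix ""))
       (not ignore-prefix)
       (regexp-quote prefix)))

(my-fill-prefix-regexp ";;; " nil)  ; nothing special to quote
(my-fill-prefix-regexp "* " nil)    ; the `*' gets a backslash
(my-fill-prefix-regexp "" nil)      ; empty prefix: nil
(my-fill-prefix-regexp "> " t)      ; prefix ignored: nil
```

The second call shows why `regexp-quote' is needed: an asterisk is
special in a regular expression, so it must be preceded by a backslash
to match a literal asterisk.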
2792 | |
2793 The next two local variables in the `let*' expression are designed to | |
2794 remove instances of `^' from `parstart' and `parsep', the local | |
2795 variables that indicate the paragraph start and the paragraph separator. | |
2796 The next expression sets `parsep' again. That is to handle fill | |
2797 prefixes. | |
2798 | |
2799 This is the setting that requires the definition to call `let*' rather | |
2800 than `let'. The true-or-false-test for the `if' depends on whether the | |
2801 variable `fill-prefix-regexp' evaluates to `nil' or some other value. | |
2802 | |
2803 If `fill-prefix-regexp' does not have a value, Emacs evaluates the | |
2804 else-part of the `if' expression and binds `parsep' to its local value. | |
2805 (`parsep' is a regular expression that matches what separates | |
2806 paragraphs.) | |
2807 | |
2808 But if `fill-prefix-regexp' does have a value, Emacs evaluates the | |
2809 then-part of the `if' expression and binds `parsep' to a regular | |
2810 expression that includes the `fill-prefix-regexp' as part of the | |
2811 pattern. | |
2812 | |
2813 Specifically, `parsep' is set to the original value of the paragraph | |
2814 separate regular expression concatenated with an alternative expression | |
2815 that consists of the `fill-prefix-regexp' followed by optional | |
2816 whitespace to the end of the line. (The whitespace is defined by | |
2817 `"[ \t]*$"'.) The `\\|' defines this portion of the regexp as an | |
2818 alternative to `parsep'. | |
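
The concatenation can be tried out with sample values. In this sketch
the separator regexp, the fill prefix, and the names
`my-separator-line-p' and `my-sample-parsep' are all assumptions chosen
for illustration; the real values depend on the buffer.

```elisp
(defun my-separator-line-p (regexp line)
  "Return t if LINE, alone in a temporary buffer, matches REGEXP at its start.
Hypothetical helper, for illustration only."
  (with-temp-buffer
    (insert line)
    (goto-char (point-min))
    (looking-at regexp)))

(defvar my-sample-parsep
  (let* ((parsep "[ \t\f]*$")                        ; assumed sample value
         (fill-prefix-regexp (regexp-quote "> ")))   ; assumed fill prefix
    (concat parsep "\\|" fill-prefix-regexp "[ \t]*$"))
  "Sample combined separator regexp; hypothetical, for illustration only.")

;; A blank line is still a separator (the original alternative).
(my-separator-line-p my-sample-parsep "")               ; => t
;; A line holding only the fill prefix is now a separator too.
(my-separator-line-p my-sample-parsep "> ")             ; => t
;; A line with text after the fill prefix is not.
(my-separator-line-p my-sample-parsep "> quoted text")  ; => nil
```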
2819 | |
2820 According to a comment in the code, the next local variable, | |
2821 `sp-parstart', is used for searching, and then the final two, `start' | |
2822 and `found-start', are set to `nil'. | |
2823 | |
2824 Now we get into the body of the `let*'. The first part of the body of | |
2825 the `let*' deals with the case when the function is given a negative | |
2826 argument and is therefore moving backwards. We will skip this section. | |
2827 | |
2828 | |
2829 File: eintr, Node: fwd-para while, Prev: fwd-para let, Up: forward-paragraph | |
2830 | |
2831 The forward motion `while' loop | |
2832 ------------------------------- | |
2833 | |
2834 The second part of the body of the `let*' deals with forward motion. | |
2835 It is a `while' loop that repeats itself so long as the value of `arg' | |
2836 is greater than zero. In the most common use of the function, the | |
2837 value of the argument is 1, so the body of the `while' loop is | |
2838 evaluated exactly once, and the cursor moves forward one paragraph. | |
2839 | |
2840 This part handles three situations: when point is between paragraphs, | |
2841 when there is a fill prefix and when there is no fill prefix. | |
2842 | |
2843 The `while' loop looks like this: | |
2844 | |
2845 ;; going forwards and not at the end of the buffer | |
2846 (while (and (> arg 0) (not (eobp))) | |
2847 | |
2848 ;; between paragraphs | |
2849 ;; Move forward over separator lines... | |
2850 (while (and (not (eobp)) | |
2851 (progn (move-to-left-margin) (not (eobp))) | |
2852 (looking-at parsep)) | |
2853 (forward-line 1)) | |
2854 ;; This decrements the loop | |
2855 (unless (eobp) (setq arg (1- arg))) | |
2856 ;; ... and one more line. | |
2857 (forward-line 1) | |
2858 | |
2859 (if fill-prefix-regexp | |
2860 ;; There is a fill prefix; it overrides parstart; | |
2861 ;; we go forward line by line | |
2862 (while (and (not (eobp)) | |
2863 (progn (move-to-left-margin) (not (eobp))) | |
2864 (not (looking-at parsep)) | |
2865 (looking-at fill-prefix-regexp)) | |
2866 (forward-line 1)) | |
2867 | |
2868 ;; There is no fill prefix; | |
2869 ;; we go forward character by character | |
2870 (while (and (re-search-forward sp-parstart nil 1) | |
2871 (progn (setq start (match-beginning 0)) | |
2872 (goto-char start) | |
2873 (not (eobp))) | |
2874 (progn (move-to-left-margin) | |
2875 (not (looking-at parsep))) | |
2876 (or (not (looking-at parstart)) | |
2877 (and use-hard-newlines | |
2878 (not (get-text-property (1- start) 'hard))))) | |
2879 (forward-char 1)) | |
2880 | |
2881 ;; and if there is no fill prefix and if we are not at the end, | |
2882 ;; go to whatever was found in the regular expression search | |
2883 ;; for sp-parstart | |
2884 (if (< (point) (point-max)) | |
2885 (goto-char start)))) | |
2886 | |
2887 We can see that this is a decrementing counter `while' loop, using the | |
2888 expression `(setq arg (1- arg))' as the decrementer. That expression | |
2889 is not far from the `while', but is hidden in another Lisp macro, an | |
2890 `unless' macro. Unless we are at the end of the buffer -- that is what | |
2891 the `eobp' function determines; it is an abbreviation of `End Of Buffer | |
2892 P' -- we decrease the value of `arg' by one. | |
2893 | |
2894 (If we are at the end of the buffer, we cannot go forward any more and | |
2895 the next loop of the `while' expression will test false since the test | |
2896 is an `and' with `(not (eobp))'. The `not' function means exactly what | |
2897 you expect; it is another name for `null', a function that returns true | |
2898 when its argument is false.) | |
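
A quick way to watch `eobp' and `not' at work is to evaluate them in a
temporary buffer. This is only a sketch; the inserted text is
arbitrary.

```elisp
(with-temp-buffer
  (insert "some text")
  (list (eobp)                       ; point is at the end after inserting
        (progn (goto-char (point-min))
               (eobp))               ; point is now at the beginning
        (not (eobp))                 ; so `(not (eobp))' is true
        (eq (not nil) (null nil))))  ; `not' and `null' agree
;; => (t nil t t)
```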
2899 | |
2900 Interestingly, the loop count is not decremented until we leave the | |
2901 space between paragraphs, unless we come to the end of the buffer or stop | |
2902 seeing the local value of the paragraph separator. | |
2903 | |
2904 That second `while' also has a `(move-to-left-margin)' expression. The | |
2905 function is self-explanatory. It is inside a `progn' expression and | |
2906 not the last element of its body, so it is only invoked for its side | |
2907 effect, which is to move point to the left margin of the current line. | |
2908 | |
2909 The `looking-at' function is also self-explanatory; it returns true if | |
2910 the text after point matches the regular expression given as its | |
2911 argument. | |
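
For instance, in the following sketch (the sample line is made up),
`looking-at' tests only the text that starts exactly at point and
leaves point where it was:

```elisp
(with-temp-buffer
  (insert "   indented line\n")
  (goto-char (point-min))
  (list (looking-at "[ \t]+")     ; whitespace follows point
        (looking-at "indented")   ; the word is not directly at point
        (point)))                 ; `looking-at' does not move point
;; => (t nil 1)
```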
2912 | |
2913 The rest of the body of the loop looks difficult at first, but makes | |
2914 sense as you come to understand it. | |
2915 | |
2916 First consider what happens if there is a fill prefix: | |
2917 | |
2918 (if fill-prefix-regexp | |
2919 ;; There is a fill prefix; it overrides parstart; | |
2920 ;; we go forward line by line | |
2921 (while (and (not (eobp)) | |
2922 (progn (move-to-left-margin) (not (eobp))) | |
2923 (not (looking-at parsep)) | |
2924 (looking-at fill-prefix-regexp)) | |
2925 (forward-line 1)) | |
2926 | |
2927 This expression moves point forward line by line so long as four | |
2928 conditions are true: | |
2929 | |
2930 1. Point is not at the end of the buffer. | |
2931 | |
2932 2. We can move to the left margin of the text and are not at the end | |
2933 of the buffer. | |
2934 | |
2935 3. The text following point does not separate paragraphs. | |
2936 | |
2937 4. The pattern following point is the fill prefix regular expression. | |
2938 | |
2939 The last condition may be puzzling, until you remember that point was | |
2940 moved to the beginning of the line early in the `forward-paragraph' | |
2941 function. This means that if the text has a fill prefix, the | |
2942 `looking-at' function will see it. | |
2943 | |
2944 Consider what happens when there is no fill prefix. | |
2945 | |
2946 (while (and (re-search-forward sp-parstart nil 1) | |
2947 (progn (setq start (match-beginning 0)) | |
2948 (goto-char start) | |
2949 (not (eobp))) | |
2950 (progn (move-to-left-margin) | |
2951 (not (looking-at parsep))) | |
2952 (or (not (looking-at parstart)) | |
2953 (and use-hard-newlines | |
2954 (not (get-text-property (1- start) 'hard))))) | |
2955 (forward-char 1)) | |
2956 | |
2957 This `while' loop has us searching forward for `sp-parstart', which is | |
2958 the combination of possible whitespace with the local value of the | |
2959 start of a paragraph or of a paragraph separator. (The latter two are | |
2960 within an expression starting `\(?:' so that they are not referenced by | |
2961 the `match-beginning' function.) | |
2962 | |
2963 The two expressions, | |
2964 | |
2965 (setq start (match-beginning 0)) | |
2966 (goto-char start) | |
2967 | |
2968 mean go to the start of the text matched by the regular expression | |
2969 search. | |
2970 | |
2971 The `(match-beginning 0)' expression is new. It returns a number | |
2972 specifying the location of the start of the text that was matched by | |
2973 the last search. | |
2974 | |
2975 The `match-beginning' function is used here because of a characteristic | |
2976 of a forward search: a successful forward search, regardless of whether | |
2977 it is a plain search or a regular expression search, moves point to the | |
2978 end of the text that is found. In this case, a successful search moves | |
2979 point to the end of the pattern for `sp-parstart'. | |
2980 | |
2981 However, we want to put point at the end of the current paragraph, not | |
2982 somewhere else. Indeed, since the search possibly includes the | |
2983 paragraph separator, point may end up at the beginning of the next one | |
2984 unless we use an expression that includes `match-beginning'. | |
2985 | |
2986 When given an argument of 0, `match-beginning' returns the position | |
2987 that is the start of the text matched by the most recent search. In | |
2988 this case, the most recent search looks for `sp-parstart'. The | |
2989 `(match-beginning 0)' expression returns the beginning position of that | |
2990 pattern, rather than the end position of that pattern. | |
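
The difference between where a search leaves point and what
`(match-beginning 0)' returns can be seen in a small sketch. The buffer
contents and the pattern are invented for the example; they are not
the values used by `forward-paragraph'.

```elisp
(with-temp-buffer
  (insert "one\n\ntwo")
  (goto-char (point-min))
  (re-search-forward "\n\n")       ; matches the two newlines after "one"
  (list (point)                    ; point sits after the match
        (match-beginning 0)        ; where the match starts
        (match-end 0)))            ; where the match ends
;; => (6 4 6)
```

Going to the value of `(match-beginning 0)' would put point just after
`one', before the blank line, rather than at the start of `two'.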
2991 | |
2992 (Incidentally, when passed a positive number as an argument, the | |
2993 `match-beginning' function returns the start of the text matched by | |
2994 that parenthesized expression in the last search; expressions that | |
2995 begin with `\(?:' are not counted. I don't know why `\(?:' appears | |
2996 here since the argument is 0.) | |
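
Here is a sketch of how the numbering works, using `string-match' on a
made-up string so that the positions are easy to read (in a string,
positions count from 0):

```elisp
(let ((text "abc"))
  ;; Group 1 is `\(a\)'; `\(?:b\)' is not numbered; group 2 is `\(c\)'.
  (string-match "\\(a\\)\\(?:b\\)\\(c\\)" text)
  (list (match-beginning 0)     ; start of the whole match
        (match-beginning 1)     ; start of "a"
        (match-beginning 2)))   ; start of "c", not of "b"
;; => (0 0 2)
```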
2997 | |
2998 The last expression when there is no fill prefix is | |
2999 | |
3000 (if (< (point) (point-max)) | |
3001 (goto-char start)))) | |
3002 | |
3003 This says that if there is no fill prefix and if we are not at the end, | |
3004 point should move to the beginning of whatever was found by the regular | |
3005 expression search for `sp-parstart'. | |
3006 | |
3007 The full definition for the `forward-paragraph' function not only | |
3008 includes code for going forwards, but also code for going backwards. | |
3009 | |
3010 If you are reading this inside of GNU Emacs and you want to see the | |
3011 whole function, you can type `C-h f' (`describe-function') and the name | |
3012 of the function. This gives you the function documentation and the | |
3013 name of the library containing the function's source. Place point over | |
3014 the name of the library and press the RET key; you will be taken | |
3015 directly to the source. (Be sure to install your sources! Without | |
3016 them, you are like a person who tries to drive a car with his eyes | |
3017 shut!) | |
3018 | |
3019 | |
3020 File: eintr, Node: etags, Next: Regexp Review, Prev: forward-paragraph, Up: Regexp Search | |
3021 | |
3022 12.5 Create Your Own `TAGS' File | |
3023 ================================ | |
3024 | |
3025 Besides `C-h f' (`describe-function'), another way to see the source of | |
3026 a function is to type `M-.' (`find-tag') and the name of the function | |
3027 when prompted for it. This takes you directly to the source and is | |
3028 a good habit to get into. If the `find-tag' function first asks | |
3029 you for the name of a `TAGS' table, give it the name of a `TAGS' file | |
3030 such as `/usr/local/src/emacs/src/TAGS'. (The exact path to your | |
3031 `TAGS' file depends on how your copy of Emacs was installed. I just | |
3032 told you the location that provides both my C and my Emacs Lisp | |
3033 sources.) | |
3034 | |
3035 You can also create your own `TAGS' file for directories that lack one. | |
3036 | |
3037 The `M-.' (`find-tag') command takes you directly to the source for a | |
3038 function, variable, node, or other source. The function depends on | |
3039 tags tables to tell it where to go. | |
3040 | |
3041 You often need to build and install tags tables yourself. They are not | |
3042 built automatically. A tags table is called a `TAGS' file; the name is | |
3043 in upper case letters. | |
3044 | |
3045 You can create a `TAGS' file by calling the `etags' program that comes | |
3046 as a part of the Emacs distribution. Usually, `etags' is compiled and | |
3047 installed when Emacs is built. (`etags' is not an Emacs Lisp function | |
3048 or a part of Emacs; it is a C program.) | |
3049 | |
3050 To create a `TAGS' file, first switch to the directory in which you | |
3051 want to create the file. In Emacs you can do this with the `M-x cd' | |
3052 command, or by visiting a file in the directory, or by listing the | |
3053 directory with `C-x d' (`dired'). Then run the compile command, with | |
3054 `etags *.el' as the command to execute: | |
3055 | |
3056 M-x compile RET etags *.el RET | |
3057 | |
3058 to create a `TAGS' file. | |
3059 | |
3060 For example, if you have a large number of files in your `~/emacs' | |
3061 directory, as I do--I have 137 `.el' files in it, of which I load | |
3062 12--you can create a `TAGS' file for the Emacs Lisp files in that | |
3063 directory. | |
3064 | |
3065 The `etags' program takes all the usual shell `wildcards'. For | |
3066 example, if you have two directories for which you want a single `TAGS' | |
3067 file, type `etags *.el ../elisp/*.el', where `../elisp/' is the second | |
3068 directory: | |
3069 | |
3070 M-x compile RET etags *.el ../elisp/*.el RET | |
3071 | |
3072 Type | |
3073 | |
3074 M-x compile RET etags --help RET | |
3075 | |
3076 to see a list of the options accepted by `etags' as well as a list of | |
3077 supported languages. | |
3078 | |
3079 The `etags' program handles more than 20 languages, including Emacs | |
3080 Lisp, Common Lisp, Scheme, C, C++, Ada, Fortran, Java, LaTeX, Pascal, | |
3081 Perl, Python, Texinfo, makefiles, and most assemblers. By default, the | |
3082 program recognizes the language in an input file according to its file | |
3083 name and contents; the `--language' option overrides this guess. | |
3084 | |
3085 `etags' is very helpful when you are writing code yourself and want to | |
3086 refer back to functions you have already written. Just run `etags' | |
3087 again at intervals as you write new functions, so they become part of | |
3088 the `TAGS' file. | |
3089 | |
3090 If you think an appropriate `TAGS' file already exists for what you | |
3091 want, but do not know where it is, you can use the `locate' program to | |
3092 attempt to find it. | |
3093 | |
3094 Type `M-x locate <RET> TAGS <RET>' and Emacs will list for you the full | |
3095 path names of all your `TAGS' files. On my system, this command lists | |
3096 34 `TAGS' files. On the other hand, a `plain vanilla' system I | |
3097 recently installed did not contain any `TAGS' files. | |
3098 | |
3099 If the tags table you want has been created, you can use the `M-x | |
3100 visit-tags-table' command to specify it. Otherwise, you will need to | |
3101 create the tags table yourself and then use `M-x visit-tags-table'. | |
3102 | |
3103 Building Tags in the Emacs sources | |
3104 .................................. | |
3105 | |
3106 The GNU Emacs sources come with a `Makefile' that contains a | |
3107 sophisticated `etags' command that creates, collects, and merges tags | |
3108 tables from all over the Emacs sources and puts the information into | |
3109 one `TAGS' file in the `src/' directory below the top level of your | |
3110 Emacs source directory. | |
3111 | |
3112 To build this `TAGS' file, go to the top level of your Emacs source | |
3113 directory and run the compile command `make tags': | |
3114 | |
3115 M-x compile RET make tags RET | |
3116 | |
3117 (The `make tags' command works well with the GNU Emacs sources, as well | |
3118 as with some other source packages.) | |
3119 | |
3120 For more information, see *Note Tag Tables: (emacs)Tags. | |
3121 | |
3122 | |
3123 File: eintr, Node: Regexp Review, Next: re-search Exercises, Prev: etags, Up: Regexp Search | |
3124 | |
3125 12.6 Review | |
3126 =========== | |
3127 | |
3128 Here is a brief summary of some recently introduced functions. | |
3129 | |
3130 `while' | |
3131 Repeatedly evaluate the body of the expression so long as the first | |
3132 element of the body tests true. Then return `nil'. (The | |
3133 expression is evaluated only for its side effects.) | |
3134 | |
3135 For example: | |
3136 | |
3137 (let ((foo 2)) | |
3138 (while (> foo 0) | |
3139 (insert (format "foo is %d.\n" foo)) | |
3140 (setq foo (1- foo)))) | |
3141 | |
3142 => foo is 2. | |
3143 foo is 1. | |
3144 nil | |
3145 | |
3146 (The `insert' function inserts its arguments at point; the | |
3147 `format' function returns a string formatted from its arguments | |
3148 the way `message' formats its arguments; `\n' produces a new line.) | |
3149 | |
3150 `re-search-forward' | |
3151 Search for a pattern, and if the pattern is found, move point to | |
3152 rest just after it. | |
3153 | |
3154 Takes four arguments, like `search-forward': | |
3155 | |
3156 1. A regular expression that specifies the pattern to search for. | |
3157 (Remember to put quotation marks around this argument!) | |
3158 | |
3159 2. Optionally, the limit of the search. | |
3160 | |
3161 3. Optionally, what to do if the search fails, return `nil' or an | |
3162 error message. | |
3163 | |
3164 4. Optionally, how many times to repeat the search; if negative, | |
3165 the search goes backwards. | |
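
As a sketch of the optional arguments (the buffer text and patterns
are invented for the example), a third argument of `t' makes a failed
search return `nil' instead of signaling an error, and a second
argument bounds the search:

```elisp
(with-temp-buffer
  (insert "alpha beta gamma")
  (goto-char (point-min))
  (list
   ;; Found: returns the position just after the match and moves point.
   (re-search-forward "b[a-z]+" nil t)
   ;; Not found, third argument t: returns nil, no error.
   (re-search-forward "delta" nil t)
   ;; "gamma" exists, but it ends beyond the limit of 13: returns nil.
   (re-search-forward "gamma" 13 t)
   ;; Point stays where the successful search left it.
   (point)))
;; => (11 nil nil 11)
```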
3166 | |
3167 `let*' | |
3168 Bind some variables locally to particular values, and then | |
3169 evaluate the remaining arguments, returning the value of the last | |
3170 one. While binding the local variables, use the local values of | |
3171 variables bound earlier, if any. | |
3172 | |
3173 For example: | |
3174 | |
3175 (let* ((foo 7) | |
3176 (bar (* 3 foo))) | |
3177 (message "`bar' is %d." bar)) | |
3178 => `bar' is 21. | |
3179 | |
3180 `match-beginning' | |
3181 Return the position of the start of the text found by the last | |
3182 regular expression search. | |
3183 | |
3184 `looking-at' | |
3185 Return `t' for true if the text after point matches the argument, | |
3186 which should be a regular expression. | |
3187 | |
3188 `eobp' | |
3189 Return `t' for true if point is at the end of the accessible part | |
3190 of a buffer. The end of the accessible part is the end of the | |
3191 buffer if the buffer is not narrowed; it is the end of the | |
3192 narrowed part if the buffer is narrowed. | |
3193 | |
3194 | |
3195 File: eintr, Node: re-search Exercises, Prev: Regexp Review, Up: Regexp Search | |
3196 | |
3197 12.7 Exercises with `re-search-forward' | |
3198 ======================================= | |
3199 | |
3200 * Write a function to search for a regular expression that matches | |
3201 two or more blank lines in sequence. | |
3202 | |
3203 * Write a function to search for duplicated words, such as `the the'. | |
3204 *Note Syntax of Regular Expressions: (emacs)Regexps, for | |
3205 information on how to write a regexp (a regular expression) to | |
3206 match a string that is composed of two identical halves. You can | |
3207 devise several regexps; some are better than others. The function | |
3208 I use is described in an appendix, along with several regexps. | |
3209 *Note `the-the' Duplicated Words Function: the-the. | |
3210 | |
3211 | |
3212 File: eintr, Node: Counting Words, Next: Words in a defun, Prev: Regexp Search, Up: Top | |
3213 | |
3214 13 Counting: Repetition and Regexps | |
3215 *********************************** | |
3216 | |
3217 Repetition and regular expression searches are powerful tools that you | |
3218 often use when you write code in Emacs Lisp. This chapter illustrates | |
3219 the use of regular expression searches through the construction of word | |
3220 count commands using `while' loops and recursion. | |
3221 | |
3222 * Menu: | |
3223 | |
3224 * Why Count Words:: | |
3225 * count-words-region:: | |
3226 * recursive-count-words:: | |
3227 * Counting Exercise:: | |
3228 | |
3229 | |
3230 File: eintr, Node: Why Count Words, Next: count-words-region, Prev: Counting Words, Up: Counting Words | |
3231 | |
3232 Counting words | |
3233 ============== | |
3234 | |
3235 The standard Emacs distribution contains a function for counting the | |
3236 number of lines within a region. However, there is no corresponding | |
3237 function for counting words. | |
3238 | |
3239 Certain types of writing ask you to count words. Thus, if you write an | |
3240 essay, you may be limited to 800 words; if you write a novel, you may | |
3241 discipline yourself to write 1000 words a day. It seems odd to me that | |
3242 Emacs lacks a word count command. Perhaps people use Emacs mostly for | |
3243 code or types of documentation that do not require word counts; or | |
3244 perhaps they restrict themselves to the operating system word count | |
3245 command, `wc'. Alternatively, people may follow the publishers' | |
3246 convention and compute a word count by dividing the number of | |
3247 characters in a document by five. In any event, here are commands to | |
3248 count words. | |
3249 | |
3250 | |
3251 File: eintr, Node: count-words-region, Next: recursive-count-words, Prev: Why Count Words, Up: Counting Words | |
3252 | |
3253 13.1 The `count-words-region' Function | |
3254 ====================================== | |
3255 | |
3256 A word count command could count words in a line, paragraph, region, or | |
3257 buffer. What should the command cover? You could design the command | |
3258 to count the number of words in a complete buffer. However, the Emacs | |
3259 tradition encourages flexibility--you may want to count words in just a | |
3260 section, rather than all of a buffer. So it makes more sense to design | |
3261 the command to count the number of words in a region. Once you have a | |
3262 `count-words-region' command, you can, if you wish, count words in a | |
3263 whole buffer by marking it with `C-x h' (`mark-whole-buffer'). | |
3264 | |
3265 Clearly, counting words is a repetitive act: starting from the | |
3266 beginning of the region, you count the first word, then the second | |
3267 word, then the third word, and so on, until you reach the end of the | |
3268 region. This means that word counting is ideally suited to recursion | |
3269 or to a `while' loop. | |
3270 | |
3271 * Menu: | |
3272 | |
3273 * Design count-words-region:: | |
3274 * Whitespace Bug:: | |
3275 | |
3276 | |
3277 File: eintr, Node: Design count-words-region, Next: Whitespace Bug, Prev: count-words-region, Up: count-words-region | |
3278 | |
3279 Designing `count-words-region' | |
3280 ------------------------------ | |
3281 | |
3282 First, we will implement the word count command with a `while' loop, | |
3283 then with recursion. The command will, of course, be interactive. | |
3284 | |
3285 The template for an interactive function definition is, as always: | |
3286 | |
3287 (defun NAME-OF-FUNCTION (ARGUMENT-LIST) | |
3288 "DOCUMENTATION..." | |
3289 (INTERACTIVE-EXPRESSION...) | |
3290 BODY...) | |
3291 | |
3292 What we need to do is fill in the slots. | |
3293 | |
3294 The name of the function should be self-explanatory and similar to the | |
3295 existing `count-lines-region' name. This makes the name easier to | |
3296 remember. `count-words-region' is a good choice. | |
3297 | |
3298 The function counts words within a region. This means that the | |
3299 argument list must contain symbols that are bound to the two positions, | |
3300 the beginning and end of the region. These two positions can be called | |
3301 `beginning' and `end' respectively. The first line of the | |
3302 documentation should be a single sentence, since that is all that is | |
3303 printed as documentation by a command such as `apropos'. The | |
3304 interactive expression will be of the form `(interactive "r")', since | |
3305 that will cause Emacs to pass the beginning and end of the region to | |
3306 the function's argument list. All this is routine. | |
3307 | |
3308 The body of the function needs to be written to do three tasks: first, | |
3309 to set up conditions under which the `while' loop can count words, | |
3310 second, to run the `while' loop, and third, to send a message to the | |
3311 user. | |
3312 | |
3313 When a user calls `count-words-region', point may be at the beginning | |
3314 or the end of the region. However, the counting process must start at | |
3315 the beginning of the region. This means we will want to put point | |
3316 there if it is not already there. Executing `(goto-char beginning)' | |
3317 ensures this. Of course, we will want to return point to its expected | |
3318 position when the function finishes its work. For this reason, the | |
3319 body must be enclosed in a `save-excursion' expression. | |
3320 | |
3321 The central part of the body of the function consists of a `while' loop | |
3322 in which one expression jumps point forward word by word, and another | |
3323 expression counts those jumps. The true-or-false-test of the `while' | |
3324 loop should test true so long as point should jump forward, and false | |
3325 when point is at the end of the region. | |
3326 | |
3327 We could use `(forward-word 1)' as the expression for moving point | |
3328 forward word by word, but it is easier to see what Emacs identifies as a | |
3329 `word' if we use a regular expression search. | |
3330 | |
3331 A regular expression search that finds the pattern for which it is | |
3332 searching leaves point after the last character matched. This means | |
3333 that a succession of successful word searches will move point forward | |
3334 word by word. | |
3335 | |
3336 As a practical matter, we want the regular expression search to jump | |
3337 over whitespace and punctuation between words as well as over the words | |
3338 themselves. A regexp that refuses to jump over interword whitespace | |
3339 would never jump more than one word! This means that the regexp should | |
3340 include the whitespace and punctuation that follows a word, if any, as | |
3341 well as the word itself. (A word may end a buffer and not have any | |
3342 following whitespace or punctuation, so that part of the regexp must be | |
3343 optional.) | |
3344 | |
3345 Thus, what we want for the regexp is a pattern defining one or more | |
3346 word constituent characters followed, optionally, by one or more | |
3347 characters that are not word constituents. The regular expression for | |
3348 this is: | |
3349 | |
3350 \w+\W* | |
3351 | |
3352 The buffer's syntax table determines which characters are and are not | |
3353 word constituents. (*Note What Constitutes a Word or Symbol?: Syntax, | |
3354 for more about syntax. Also, see *Note Syntax: (emacs)Syntax, and | |
3355 *Note Syntax Tables: (elisp)Syntax Tables.) | |
3356 | |
3357 The search expression looks like this: | |
3358 | |
3359 (re-search-forward "\\w+\\W*") | |
3360 | |
3361 (Note that paired backslashes precede the `w' and `W'. A single | |
3362 backslash has special meaning to the Emacs Lisp interpreter. It | |
3363 indicates that the following character is interpreted differently than | |
3364 usual. For example, the two characters, `\n', stand for `newline', | |
3365 rather than for a backslash followed by `n'. Two backslashes in a row | |
3366 stand for an ordinary, `unspecial' backslash, which in this case is | |
3367 followed by a letter, the combination of which is important to | |
3368 `re-search-forward'.) | |
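
You can check the effect of the doubled backslashes by looking at
string lengths, and try the pattern on a sample string with
`string-match'. This is only a sketch; the sample string is made up,
and positions in a string count from 0.

```elisp
(list (length "\n")      ; one character: a newline
      (length "\\w")     ; two characters: a backslash and `w'
      ;; Index at which the first word and its trailing non-word
      ;; characters begin ...
      (string-match "\\w+\\W*" "hello, world")
      ;; ... and the index just past "hello, ".
      (match-end 0))
;; => (1 2 0 7)
```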
3369 | |
3370 We need a counter to count how many words there are; this variable must | |
3371 first be set to 0 and then incremented each time Emacs goes around the | |
3372 `while' loop. The incrementing expression is simply: | |
3373 | |
3374 (setq count (1+ count)) | |
3375 | |
3376 Finally, we want to tell the user how many words there are in the | |
3377 region. The `message' function is intended for presenting this kind of | |
3378 information to the user. The message has to be phrased so that it | |
3379 reads properly regardless of how many words there are in the region: we | |
3380 don't want to say that "there are 1 words in the region". The conflict | |
3381 between singular and plural is ungrammatical. We can solve this | |
3382 problem by using a conditional expression that evaluates different | |
3383 messages depending on the number of words in the region. There are | |
3384 three possibilities: no words in the region, one word in the region, | |
3385 and more than one word. This means that the `cond' special form is | |
3386 appropriate. | |
3387 | |
3388 All this leads to the following function definition: | |
3389 | |
3390 ;;; First version; has bugs! | |
3391 (defun count-words-region (beginning end) | |
3392 "Print number of words in the region. | |
3393 Words are defined as at least one word-constituent | |
3394 character followed by zero or more characters that | |
3395 are not word-constituents. The buffer's syntax | |
3396 table determines which characters these are." | |
3397 (interactive "r") | |
3398 (message "Counting words in region ... ") | |
3399 | |
3400 ;;; 1. Set up appropriate conditions. | |
3401 (save-excursion | |
3402 (goto-char beginning) | |
3403 (let ((count 0)) | |
3404 | |
3405 ;;; 2. Run the while loop. | |
3406 (while (< (point) end) | |
3407 (re-search-forward "\\w+\\W*") | |
3408 (setq count (1+ count))) | |
3409 | |
3410 ;;; 3. Send a message to the user. | |
3411 (cond ((zerop count) | |
3412 (message | |
3413 "The region does NOT have any words.")) | |
3414 ((= 1 count) | |
3415 (message | |
3416 "The region has 1 word.")) | |
3417 (t | |
3418 (message | |
3419 "The region has %d words." count)))))) | |
3420 | |
3421 As written, the function works, but not in all circumstances. | |
3422 | |
3423 | |
3424 File: eintr, Node: Whitespace Bug, Prev: Design count-words-region, Up: count-words-region | |
3425 | |
3426 13.1.1 The Whitespace Bug in `count-words-region' | |
3427 ------------------------------------------------- | |
3428 | |
3429 The `count-words-region' command described in the preceding section has | |
3430 two bugs, or rather, one bug with two manifestations. First, if you | |
3431 mark a region containing only whitespace in the middle of some text, | |
3432 the `count-words-region' command tells you that the region contains one | |
3433 word! Second, if you mark a region containing only whitespace at the | |
3434 end of the buffer or the accessible portion of a narrowed buffer, the | |
3435 command displays an error message that looks like this: | |
3436 | |
3437 Search failed: "\\w+\\W*" | |
3438 | |
3439 If you are reading this in Info in GNU Emacs, you can test for these | |
3440 bugs yourself. | |
3441 | |
3442 First, evaluate the function in the usual manner to install it. Here | |
3443 is a copy of the definition. Place your cursor after the closing | |
3444 parenthesis and type `C-x C-e' to install it. | |
3445 | |
3446 ;; First version; has bugs! | |
3447 (defun count-words-region (beginning end) | |
3448 "Print number of words in the region. | |
3449 Words are defined as at least one word-constituent character followed | |
3450 by zero or more characters that are not word-constituents. The buffer's | |
3451 syntax table determines which characters these are." | |
3452 (interactive "r") | |
3453 (message "Counting words in region ... ") | |
3454 | |
3455 ;;; 1. Set up appropriate conditions. | |
3456 (save-excursion | |
3457 (goto-char beginning) | |
3458 (let ((count 0)) | |
3459 | |
3460 ;;; 2. Run the while loop. | |
3461 (while (< (point) end) | |
3462 (re-search-forward "\\w+\\W*") | |
3463 (setq count (1+ count))) | |
3464 | |
3465 ;;; 3. Send a message to the user. | |
3466 (cond ((zerop count) | |
3467 (message "The region does NOT have any words.")) | |
3468 ((= 1 count) (message "The region has 1 word.")) | |
3469 (t (message "The region has %d words." count)))))) | |
3470 | |
3471 If you wish, you can also install this keybinding by evaluating it: | |
3472 | |
3473 (global-set-key "\C-c=" 'count-words-region) | |
3474 | |
3475 To conduct the first test, set mark and point to the beginning and end | |
3476 of the following line and then type `C-c =' (or `M-x | |
3477 count-words-region' if you have not bound `C-c ='): | |
3478 | |
3479 one two three | |
3480 | |
3481 Emacs will tell you, correctly, that the region has three words. | |
3482 | |
3483 Repeat the test, but place mark at the beginning of the line and place | |
3484 point just _before_ the word `one'. Again type the command `C-c =' (or | |
3485 `M-x count-words-region'). Emacs should tell you that the region has | |
3486 no words, since it is composed only of the whitespace at the beginning | |
3487 of the line. But instead Emacs tells you that the region has one word! | |
3488 | |
3489 For the third test, copy the sample line to the end of the `*scratch*' | |
3490 buffer and then type several spaces at the end of the line. Place mark | |
3491 right after the word `three' and point at the end of line. (The end of | |
3492 the line will be the end of the buffer.) Type `C-c =' (or `M-x | |
3493 count-words-region') as you did before. Again, Emacs should tell you | |
3494 that the region has no words, since it is composed only of the | |
3495 whitespace at the end of the line. Instead, Emacs displays an error | |
3496 message saying `Search failed'. | |
3497 | |
3498 The two bugs stem from the same problem. | |
3499 | |
3500 Consider the first manifestation of the bug, in which the command tells | |
3501 you that the whitespace at the beginning of the line contains one word. | |
3502 What happens is this: The `M-x count-words-region' command moves point | |
3503 to the beginning of the region. The `while' tests whether the value of | |
3504 point is smaller than the value of `end', which it is. Consequently, | |
3505 the regular expression search looks for and finds the first word. It | |
3506 leaves point after the word. `count' is set to one. The `while' loop | |
3507 repeats; but this time the value of point is larger than the value of | |
3508 `end', so the loop exits; and the function displays a message saying | |
3509 the number of words in the region is one. In brief, the regular | |
3510 expression search looks for and finds the word even though it is outside | |
3511 the marked region. | |
3512 | |
3513 In the second manifestation of the bug, the region is whitespace at the | |
3514 end of the buffer. Emacs says `Search failed'. What happens is that | |
3515 the true-or-false-test in the `while' loop tests true, so the search | |
3516 expression is executed. But since there are no more words in the | |
3517 buffer, the search fails. | |
3518 | |
3519 In both manifestations of the bug, the search extends or attempts to | |
3520 extend outside of the region. | |
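
The first failure can be reproduced outside the interactive command.
The following is only a sketch: the function name
`sketch-first-test-overshoot' and the hard-coded sample text are
illustrative assumptions, not part of `count-words-region'. It runs a
single unbounded search from the start of a region that holds nothing
but three leading spaces, and reports where point ends up.

```elisp
(defun sketch-first-test-overshoot ()
  "Return (POINT-AFTER-SEARCH REGION-END) for a whitespace-only region.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert "   one two three")
    (let ((region-end 4))             ; the region is just the three spaces
      (goto-char (point-min))
      ;; No bound is given, so the search is free to match text
      ;; that lies beyond REGION-END.
      (re-search-forward "\\w+\\W*")
      (list (point) region-end))))
```

If the first number in the returned list is larger than the second,
the search has left the region, which is the behavior described above.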
3521 | |
3522 The solution is to limit the search to the region--this is a fairly | |
3523 simple action, but as you may have come to expect, it is not quite as | |
3524 simple as you might think. | |
3525 | |
3526 As we have seen, the `re-search-forward' function takes a search | |
3527 pattern as its first argument. But in addition to this first, | |
3528 mandatory argument, it accepts three optional arguments. The optional | |
3529 second argument bounds the search. The optional third argument, if | |
3530 `t', causes the function to return `nil' rather than signal an error if | |
3531 the search fails. The optional fourth argument is a repeat count. (In | |
3532 Emacs, you can see a function's documentation by typing `C-h f', the | |
3533 name of the function, and then <RET>.) | |
3534 | |
3535 In the `count-words-region' definition, the value of the end of the | |
3536 region is held by the variable `end' which is passed as an argument to | |
3537 the function. Thus, we can add `end' as an argument to the regular | |
3538 expression search expression: | |
3539 | |
3540 (re-search-forward "\\w+\\W*" end) | |
3541 | |
3542 However, if you make only this change to the `count-words-region' | |
3543 definition and then test the new version of the definition on a stretch | |
3544 of whitespace, you will receive an error message saying `Search failed'. | |
3545 | |
3546 What happens is this: the search is limited to the region, and fails as | |
3547 you expect because there are no word-constituent characters in the | |
3548 region. Since it fails, we receive an error message. But we do not | |
3549 want to receive an error message in this case; we want to receive the | |
3550 message that "The region does NOT have any words." | |
3551 | |
3552 The solution to this problem is to provide `re-search-forward' with a | |
3553 third argument of `t', which causes the function to return `nil' rather | |
3554 than signal an error if the search fails. | |
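
The difference between the two forms of the bounded search can be
checked in a temporary buffer. This is a minimal sketch under stated
assumptions: the helper name `sketch-bounded-search-outcomes' and the
sample text are invented for the illustration. It tries the search
once with only a bound and once with a bound plus a third argument of
`t', over a region that contains nothing but whitespace.

```elisp
(defun sketch-bounded-search-outcomes ()
  "Return a list describing two bounded searches over whitespace.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert "   one")
    (goto-char (point-min))
    (let ((region-end 4))             ; the region is just the three spaces
      (list
       ;; Bound only: a failed search signals `search-failed'.
       (condition-case nil
           (re-search-forward "\\w+\\W*" region-end)
         (search-failed 'signaled))
       ;; Bound plus a third argument of t: a failed search returns nil.
       (re-search-forward "\\w+\\W*" region-end t)
       ;; Position of point after both failed searches.
       (point)))))
```

The last element of the list shows where point is after both
attempts; a failed search of either kind should leave it where it
started.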
3555 | |
3556 However, if you make this change and try it, you will see the message | |
3557 "Counting words in region ... " and ... you will keep on seeing that | |
3558 message ..., until you type `C-g' (`keyboard-quit'). | |
3559 | |
3560 Here is what happens: the search is limited to the region, as before, | |
3561 and it fails because there are no word-constituent characters in the | |
3562 region, as expected. Consequently, the `re-search-forward' expression | |
3563 returns `nil'. It does nothing else. In particular, it does not move | |
3564 point, which it does as a side effect if it finds the search target. | |
3565 After the `re-search-forward' expression returns `nil', the next | |
3566 expression in the `while' loop is evaluated. This expression | |
3567 increments the count. Then the loop repeats. The true-or-false-test | |
3568 tests true because the value of point is still less than the value of | |
3569 `end', since the `re-search-forward' expression did not move point. ... | |
3570 and the cycle repeats ... | |
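
The runaway loop can be observed safely by giving it an artificial
ceiling. The sketch below is an illustration only: the name
`sketch-stuck-loop-iterations' and the LIMIT argument are assumptions
added so that the loop is guaranteed to stop. Apart from that safety
test, the loop body is the faulty one just described.

```elisp
(defun sketch-stuck-loop-iterations (limit)
  "Run the faulty loop over whitespace for at most LIMIT passes.
Return the count reached.  Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert "     ")
    (goto-char (point-min))
    (let ((end (point-max))
          (count 0))
      (while (and (< (point) end)
                  (< count limit))    ; safety cap, not in the original
        ;; The search fails quietly and does not move point ...
        (re-search-forward "\\w+\\W*" end t)
        ;; ... yet the count is incremented on every pass anyway.
        (setq count (1+ count)))
      count)))
```

Whatever ceiling is passed in, the count climbs until it reaches that
ceiling, even though the buffer contains no words at all.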
3571 | |
3572 The `count-words-region' definition requires yet another modification, | |
3573 to cause the true-or-false-test of the `while' loop to test false if | |
3574 the search fails. Put another way, there are two conditions that must | |
3575 be satisfied in the true-or-false-test before the word count variable | |
3576 is incremented: point must still be within the region and the search | |
3577 expression must have found a word to count. | |
3578 | |
3579 Since both the first condition and the second condition must be true | |
3580 together, the two expressions, the region test and the search | |
3581 expression, can be joined with an `and' special form and embedded in | |
3582 the `while' loop as the true-or-false-test, like this: | |
3583 | |
3584 (and (< (point) end) (re-search-forward "\\w+\\W*" end t)) | |
3585 | |
3586 (*Note The `kill-new' function: kill-new function, for information | |
3587 about `and'.) | |
3588 | |
3589 The `re-search-forward' expression returns a non-`nil' value if the | |
3590 search succeeds and, as a side effect, moves point. Consequently, as | |
3591 words are found, point is moved through the region. When the search | |
3592 expression fails to find another word, or when point reaches the end | |
3593 of the region, the true-or-false-test tests false, the `while' loop | |
3594 exits, and the `count-words-region' function displays one of its messages. | |
3595 | |
3596 After incorporating these final changes, the `count-words-region' | |
3597 function works without bugs (or at least, without bugs that I have | |
3598 found!). Here is what it looks like: | |
3599 | |
3600 ;;; Final version: `while' | |
3601 (defun count-words-region (beginning end) | |
3602   "Print number of words in the region." | |
3603   (interactive "r") | |
3604   (message "Counting words in region ... ") | |
3605 | |
3606 ;;; 1. Set up appropriate conditions. | |
3607   (save-excursion | |
3608     (let ((count 0)) | |
3609       (goto-char beginning) | |
3610 | |
3611 ;;; 2. Run the while loop. | |
3612       (while (and (< (point) end) | |
3613                   (re-search-forward "\\w+\\W*" end t)) | |
3614         (setq count (1+ count))) | |
3615 | |
3616 ;;; 3. Send a message to the user. | |
3617       (cond ((zerop count) | |
3618              (message | |
3619               "The region does NOT have any words.")) | |
3620             ((= 1 count) | |
3621              (message | |
3622               "The region has 1 word.")) | |
3623             (t | |
3624              (message | |
3625               "The region has %d words." count)))))) | |
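
The three tests described earlier can also be run without marking a
region by hand. What follows is a sketch, not a replacement for the
command: `sketch-count-words-in-string' is a hypothetical helper that
copies a string into a temporary buffer, treats the whole buffer as
the region, and runs the same corrected `while' loop, returning the
count instead of displaying a message.

```elisp
(defun sketch-count-words-in-string (string)
  "Return the number of words in STRING, using the corrected loop.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((end (point-max))
          (count 0))
      ;; Same test as the final version: stay inside the region
      ;; and stop as soon as the search finds nothing more.
      (while (and (< (point) end)
                  (re-search-forward "\\w+\\W*" end t))
        (setq count (1+ count)))
      count)))
```

Evaluating `(sketch-count-words-in-string "one two three")' should
give three, and a string of nothing but spaces should give zero
rather than an error or an endless loop.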
3626 | |
3627 | |
3628 File: eintr, Node: recursive-count-words, Next: Counting Exercise, Prev: count-words-region, Up: Counting Words | |
3629 | |
3630 13.2 Count Words Recursively | |
3631 ============================ | |
3632 | |
3633 You can write the function for counting words recursively as well as | |
3634 with a `while' loop. Let's see how this is done. | |
3635 | |
3636 First, we need to recognize that the `count-words-region' function has | |
3637 three jobs: it sets up the appropriate conditions for counting to | |
3638 occur; it counts the words in the region; and it sends a message to the | |
3639 user telling how many words there are. | |
3640 | |
3641 If we write a single recursive function to do everything, we will | |
3642 receive a message for every recursive call. If the region contains 13 | |
3643 words, we will receive thirteen messages, one right after the other. | |
3644 We don't want this! Instead, we must write two functions to do the | |
3645 job, one of which (the recursive function) will be used inside of the | |
3646 other. One function will set up the conditions and display the | |
3647 message; the other will return the word count. | |
3648 | |
3649 Let us start with the function that causes the message to be displayed. | |
3650 We can continue to call this `count-words-region'. | |
3651 | |
3652 This is the function that the user will call. It will be interactive. | |
3653 Indeed, it will be similar to our previous versions of this function, | |
3654 except that it will call `recursive-count-words' to determine how many | |
3655 words are in the region. | |
3656 | |
3657 We can readily construct a template for this function, based on our | |
3658 previous versions: | |
3659 | |
3660 ;; Recursive version; uses regular expression search | |
3661 (defun count-words-region (beginning end) | |
3662 "DOCUMENTATION..." | |
3663 (INTERACTIVE-EXPRESSION...) | |
3664 | |
3665 ;;; 1. Set up appropriate conditions. | |
3666 (EXPLANATORY MESSAGE) | |
3667 (SET-UP FUNCTIONS... | |
3668 | |
3669 ;;; 2. Count the words. | |
3670 RECURSIVE CALL | |
3671 | |
3672 ;;; 3. Send a message to the user. | |
3673 MESSAGE PROVIDING WORD COUNT)) | |
3674 | |
3675 The definition looks straightforward, except that somehow the count | |
3676 returned by the recursive call must be passed to the message displaying | |
3677 the word count. A little thought suggests that this can be done by | |
3678 making use of a `let' expression: we can bind a variable in the varlist | |
3679 of a `let' expression to the number of words in the region, as returned | |
3680 by the recursive call; and then the `cond' expression, using that | |
3681 binding, can display the value to the user. | |
3682 | |
3683 Often, one thinks of the binding within a `let' expression as somehow | |
3684 secondary to the `primary' work of a function. But in this case, what | |
3685 you might consider the `primary' job of the function, counting words, | |
3686 is done within the `let' expression. | |
3687 | |
3688 Using `let', the function definition looks like this: | |
3689 | |
3690 (defun count-words-region (beginning end) | |
3691   "Print number of words in the region." | |
3692   (interactive "r") | |
3693 | |
3694 ;;; 1. Set up appropriate conditions. | |
3695   (message "Counting words in region ... ") | |
3696   (save-excursion | |
3697     (goto-char beginning) | |
3698 | |
3699 ;;; 2. Count the words. | |
3700     (let ((count (recursive-count-words end))) | |
3701 | |
3702 ;;; 3. Send a message to the user. | |
3703       (cond ((zerop count) | |
3704              (message | |
3705               "The region does NOT have any words.")) | |
3706             ((= 1 count) | |
3707              (message | |
3708               "The region has 1 word.")) | |
3709             (t | |
3710              (message | |
3711               "The region has %d words." count)))))) | |
3712 | |
3713 Next, we need to write the recursive counting function. | |
3714 | |
3715 A recursive function has at least three parts: the `do-again-test', the | |
3716 `next-step-expression', and the recursive call. | |
3717 | |
3718 The do-again-test determines whether the function will or will not be | |
3719 called again. Since we are counting words in a region and can use a | |
3720 function that moves point forward for every word, the do-again-test can | |
3721 check whether point is still within the region. The do-again-test | |
3722 should find the value of point and determine whether point is before, | |
3723 at, or after the value of the end of the region. We can use the | |
3724 `point' function to locate point. Clearly, we must pass the value of | |
3725 the end of the region to the recursive counting function as an argument. | |
3726 | |
3727 In addition, the do-again-test should also test whether the search | |
3728 finds a word. If it does not, the function should not call itself | |
3729 again. | |
3730 | |
3731 The next-step-expression changes a value so that when the recursive | |
3732 function is supposed to stop calling itself, it stops. More precisely, | |
3733 the next-step-expression changes a value so that at the right time, the | |
3734 do-again-test stops the recursive function from calling itself again. | |
3735 In this case, the next-step-expression can be the expression that moves | |
3736 point forward, word by word. | |
3737 | |
3738 The third part of a recursive function is the recursive call. | |
3739 | |
3740 Somewhere, we also need a part that does the `work' of the | |
3741 function, a part that does the counting. A vital part! | |
3742 | |
3743 But already, we have an outline of the recursive counting function: | |
3744 | |
3745 (defun recursive-count-words (region-end) | |
3746 "DOCUMENTATION..." | |
3747 DO-AGAIN-TEST | |
3748 NEXT-STEP-EXPRESSION | |
3749 RECURSIVE CALL) | |
3750 | |
3751 Now we need to fill in the slots. Let's start with the simplest cases | |
3752 first: if point is at or beyond the end of the region, there cannot be | |
3753 any words in the region, so the function should return zero. Likewise, | |
3754 if the search fails, there are no words to count, so the function | |
3755 should return zero. | |
3756 | |
3757 On the other hand, if point is within the region and the search | |
3758 succeeds, the function should call itself again. | |
3759 | |
3760 Thus, the do-again-test should look like this: | |
3761 | |
3762 (and (< (point) region-end) | |
3763      (re-search-forward "\\w+\\W*" region-end t)) | |
3764 | |
3765 Note that the search expression is part of the do-again-test--the | |
3766 function returns non-`nil' if its search succeeds and `nil' if it fails. | |
3767 (*Note The Whitespace Bug in `count-words-region': Whitespace Bug, for | |
3768 an explanation of how `re-search-forward' works.) | |
3769 | |
3770 The do-again-test is the true-or-false test of an `if' clause. | |
3771 Clearly, if the do-again-test succeeds, the then-part of the `if' | |
3772 clause should call the function again; but if it fails, the else-part | |
3773 should return zero since either point is outside the region or the | |
3774 search failed because there were no words to find. | |
3775 | |
3776 But before considering the recursive call, we need to consider the | |
3777 next-step-expression. What is it? Interestingly, it is the search | |
3778 part of the do-again-test. | |
3779 | |
3780 In addition to returning non-`nil' or `nil' for the do-again-test, | |
3781 `re-search-forward' moves point forward as a side effect of a | |
3782 successful search. This is the action that changes the value of point | |
3783 so that the recursive function stops calling itself when point | |
3784 completes its movement through the region. Consequently, the | |
3785 `re-search-forward' expression is the next-step-expression. | |
3786 | |
3787 In outline, then, the body of the `recursive-count-words' function | |
3788 looks like this: | |
3789 | |
3790 (if DO-AGAIN-TEST-AND-NEXT-STEP-COMBINED | |
3791 ;; then | |
3792 RECURSIVE-CALL-RETURNING-COUNT | |
3793 ;; else | |
3794 RETURN-ZERO) | |
3795 | |
3796 How to incorporate the mechanism that counts? | |
3797 | |
3798 If you are not used to writing recursive functions, a question like | |
3799 this can be troublesome. But it can and should be approached | |
3800 systematically. | |
3801 | |
3802 We know that the counting mechanism should be associated in some way | |
3803 with the recursive call. Indeed, since the next-step-expression moves | |
3804 point forward by one word, and since a recursive call is made for each | |
3805 word, the counting mechanism must be an expression that adds one to the | |
3806 value returned by a call to `recursive-count-words'. | |
3807 | |
3808 Consider several cases: | |
3809 | |
3810 * If there are two words in the region, the function should return a | |
3811 value resulting from adding one, for the first word, to the number | |
3812 returned when it counts the remaining words in the region, which | |
3813 in this case is one. | |
3814 | |
3815 * If there is one word in the region, the function should return a | |
3816 value resulting from adding one, for that word, to the number | |
3817 returned when it counts the remaining words in the region, which | |
3818 in this case is zero. | |
3819 | |
3820 * If there are no words in the region, the function should return | |
3821 zero. | |
3822 | |
3823 From the sketch we can see that the else-part of the `if' returns zero | |
3824 for the case of no words. This means that the then-part of the `if' | |
3825 must return a value resulting from adding one to the value returned | |
3826 from a count of the remaining words. | |
3827 | |
3828 The expression will look like this, where `1+' is a function that adds | |
3829 one to its argument. | |
3830 | |
3831 (1+ (recursive-count-words region-end)) | |
3832 | |
3833 The whole `recursive-count-words' function will then look like this: | |
3834 | |
3835 (defun recursive-count-words (region-end) | |
3836   "DOCUMENTATION..." | |
3837 | |
3838 ;;; 1. do-again-test | |
3839   (if (and (< (point) region-end) | |
3840            (re-search-forward "\\w+\\W*" region-end t)) | |
3841 | |
3842 ;;; 2. then-part: the recursive call | |
3843       (1+ (recursive-count-words region-end)) | |
3844 | |
3845 ;;; 3. else-part | |
3846     0)) | |
3847 | |
3848 Let's examine how this works: | |
3849 | |
3850 If there are no words in the region, the else part of the `if' | |
3851 expression is evaluated and consequently the function returns zero. | |
3852 | |
3853 If there is one word in the region, the value of point is less than the | |
3854 value of `region-end' and the search succeeds. In this case, the | |
3855 true-or-false-test of the `if' expression tests true, and the then-part | |
3856 of the `if' expression is evaluated. The counting expression is | |
3857 evaluated. This expression returns a value (which will be the value | |
3858 returned by the whole function) that is the sum of one added to the | |
3859 value returned by a recursive call. | |
3860 | |
3861 Meanwhile, the next-step-expression has caused point to jump over the | |
3862 first (and in this case only) word in the region. This means that when | |
3863 `(recursive-count-words region-end)' is evaluated a second time, as a | |
3864 result of the recursive call, the value of point will be equal to or | |
3865 greater than the value of `region-end'. So this time, | |
3866 `recursive-count-words' will return zero. The zero will be added to | |
3867 one, and the original evaluation of `recursive-count-words' will return | |
3868 one plus zero, which is one, which is the correct amount. | |
3869 | |
3870 Clearly, if there are two words in the region, the first call to | |
3871 `recursive-count-words' returns one added to the value returned by | |
3872 calling `recursive-count-words' on a region containing the remaining | |
3873 word--that is, it adds one to one, producing two, which is the correct | |
3874 amount. | |
3875 | |
3876 Similarly, if there are three words in the region, the first call to | |
3877 `recursive-count-words' returns one added to the value returned by | |
3878 calling `recursive-count-words' on a region containing the remaining | |
3879 two words--and so on and so on. | |
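
The way the additions nest can be written out by hand. The expression
below is only an illustration of the three-word case; the function
name `sketch-unfolded-count-of-three' is hypothetical and exists
solely to hold the expression. Each successful search contributes one
`1+', and the last call, whose test fails, contributes the zero.

```elisp
(defun sketch-unfolded-count-of-three ()
  "Return the value of the nested additions for a three-word region.
Hypothetical helper for illustration only."
  ;; Reading outward to inward: first call, second call, third call,
  ;; and finally the fourth call, whose do-again-test fails.
  (1+ (1+ (1+ 0))))
```

No addition is carried out until the innermost call has returned its
zero; the pending `1+' expressions are then evaluated from the inside
out.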
3880 | |
3881 With full documentation the two functions look like this: | |
3882 | |
3883 The recursive function: | |
3884 | |
3885 (defun recursive-count-words (region-end) | |
3886   "Number of words between point and REGION-END." | |
3887 | |
3888 ;;; 1. do-again-test | |
3889   (if (and (< (point) region-end) | |
3890            (re-search-forward "\\w+\\W*" region-end t)) | |
3891 | |
3892 ;;; 2. then-part: the recursive call | |
3893       (1+ (recursive-count-words region-end)) | |
3894 | |
3895 ;;; 3. else-part | |
3896     0)) | |
3897 | |
3898 The wrapper: | |
3899 | |
3900 ;;; Recursive version | |
3901 (defun count-words-region (beginning end) | |
3902   "Print number of words in the region. | |
3903 | |
3904 Words are defined as at least one word-constituent | |
3905 character, optionally followed by characters that are | |
3906 not word-constituents. The buffer's syntax table | |
3907 determines which characters these are." | |
3908   (interactive "r") | |
3909   (message "Counting words in region ... ") | |
3910   (save-excursion | |
3911     (goto-char beginning) | |
3912     (let ((count (recursive-count-words end))) | |
3913       (cond ((zerop count) | |
3914              (message | |
3915               "The region does NOT have any words.")) | |
3916             ((= 1 count) | |
3917              (message "The region has 1 word.")) | |
3918             (t | |
3919              (message | |
3920               "The region has %d words." count)))))) | |
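
As with the `while' version, the recursive counter can be tried on a
string in a temporary buffer. This is a sketch under stated
assumptions: `sketch-recursive-count' repeats the logic of
`recursive-count-words' under a hypothetical name so that the example
stands alone, and `sketch-recursive-count-in-string' is an invented
wrapper that returns the count rather than displaying it.

```elisp
(defun sketch-recursive-count (region-end)
  "Count words between point and REGION-END recursively.
Same logic as `recursive-count-words', under a hypothetical name."
  (if (and (< (point) region-end)
           (re-search-forward "\\w+\\W*" region-end t))
      ;; A word was found: count it and go on with the rest.
      (1+ (sketch-recursive-count region-end))
    ;; Outside the region, or nothing left to find.
    0))

(defun sketch-recursive-count-in-string (string)
  "Return the number of words in STRING, counted recursively.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (sketch-recursive-count (point-max))))
```

One design point is worth keeping in mind: every word adds a level of
recursion, so on a very long region this approach may run into the
limit set by `max-lisp-eval-depth', which the `while' version does
not.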
3921 | |
3922 | |
3923 File: eintr, Node: Counting Exercise, Prev: recursive-count-words, Up: Counting Words | |
3924 | |
3925 13.3 Exercise: Counting Punctuation | |
3926 =================================== | |
3927 | |
3928 Using a `while' loop, write a function to count the number of | |
3929 punctuation marks in a region--period, comma, semicolon, colon, | |
3930 exclamation mark, and question mark. Do the same using recursion. | |
3931 | |
3932 | |
3933 File: eintr, Node: Words in a defun, Next: Readying a Graph, Prev: Counting Words, Up: Top | |
3934 | |
3935 14 Counting Words in a `defun' | |
3936 ****************************** | |
3937 | |
3938 Our next project is to count the number of words in a function | |
3939 definition. Clearly, this can be done using some variant of | |
3940 `count-words-region'. *Note Counting Words: Repetition and Regexps: | |
3941 Counting Words. If we are just going to count the words in one | |
3942 definition, it is easy enough to mark the definition with the `C-M-h' | |
3943 (`mark-defun') command, and then call `count-words-region'. | |
3944 | |
3945 However, I am more ambitious: I want to count the words and symbols in | |
3946 every definition in the Emacs sources and then print a graph that shows | |
3947 how many functions there are of each length: how many contain 40 to 49 | |
3948 words or symbols, how many contain 50 to 59 words or symbols, and so | |
3949 on. I have often been curious how long a typical function is, and this | |
3950 will tell. | |
3951 | |
3952 * Menu: | |
3953 | |
3954 * Divide and Conquer:: | |
3955 * Words and Symbols:: | |
3956 * Syntax:: | |
3957 * count-words-in-defun:: | |
3958 * Several defuns:: | |
3959 * Find a File:: | |
3960 * lengths-list-file:: | |
3961 * Several files:: | |
3962 * Several files recursively:: | |
3963 * Prepare the data:: | |
3964 | |
3965 | |
3966 File: eintr, Node: Divide and Conquer, Next: Words and Symbols, Prev: Words in a defun, Up: Words in a defun | |
3967 | |
3968 Divide and Conquer | |
3969 ================== | |
3970 | |
3971 Described in one phrase, the histogram project is daunting; but divided | |
3972 into numerous small steps, each of which we can take one at a time, the | |
3973 project becomes less fearsome. Let us consider what the steps must be: | |
3974 | |
3975 * First, write a function to count the words in one definition. This | |
3976 includes the problem of handling symbols as well as words. | |
3977 | |
3978 * Second, write a function to list the numbers of words in each | |
3979 function in a file. This function can use the | |
3980 `count-words-in-defun' function. | |
3981 | |
3982 * Third, write a function to list the numbers of words in each | |
3983 function in each of several files. This entails automatically | |
3984 finding the various files, switching to them, and counting the | |
3985 words in the definitions within them. | |
3986 | |
3987 * Fourth, write a function to convert the list of numbers that we | |
3988 created in step three to a form that will be suitable for printing | |
3989 as a graph. | |
3990 | |
3991 * Fifth, write a function to print the results as a graph. | |
3992 | |
3993 This is quite a project! But if we take each step slowly, it will not | |
3994 be difficult. | |
3995 | |
3996 | |
3997 File: eintr, Node: Words and Symbols, Next: Syntax, Prev: Divide and Conquer, Up: Words in a defun | |
3998 | |
3999 14.1 What to Count? | |
4000 =================== | |
4001 | |
4002 When we first start thinking about how to count the words in a function | |
4003 definition, the first question is (or ought to be) what are we going to | |
4004 count? When we speak of `words' with respect to a Lisp function | |
4005 definition, we are actually speaking, in large part, of `symbols'. For | |
4006 example, the following `multiply-by-seven' function contains the five | |
4007 symbols `defun', `multiply-by-seven', `number', `*', and `7'. In | |
4008 addition, in the documentation string, it contains the four words | |
4009 `Multiply', `NUMBER', `by', and `seven'. The symbol `number' is | |
4010 repeated, so the definition contains a total of ten words and symbols. | |
4011 | |
4012 (defun multiply-by-seven (number) | |
4013   "Multiply NUMBER by seven." | |
4014   (* 7 number)) | |
4015 | |
4016 However, if we mark the `multiply-by-seven' definition with `C-M-h' | |
4017 (`mark-defun'), and then call `count-words-region' on it, we will find | |
4018 that `count-words-region' claims the definition has eleven words, not | |
4019 ten! Something is wrong! | |
4020 | |
4021 The problem is twofold: `count-words-region' does not count the `*' as | |
4022 a word, and it counts the single symbol, `multiply-by-seven', as | |
4023 containing three words. The hyphens are treated as if they were | |
4024 interword spaces rather than intraword connectors: `multiply-by-seven' | |
4025 is counted as if it were written `multiply by seven'. | |
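
The miscount can be reproduced without marking the definition by
hand. The sketch below rests on two assumptions that are not in the
text: the helper name `sketch-count-matches' is invented, and the
count is taken under `emacs-lisp-mode-syntax-table', on the grounds
that the definition would normally be viewed in an Emacs Lisp buffer.

```elisp
(defun sketch-count-matches (regexp source)
  "Count successive matches of REGEXP in SOURCE under Emacs Lisp syntax.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert source)
    (goto-char (point-min))
    (let ((count 0))
      (with-syntax-table emacs-lisp-mode-syntax-table
        (while (and (< (point) (point-max))
                    (re-search-forward regexp nil t))
          (setq count (1+ count))))
      count)))

(defvar sketch-multiply-by-seven-source
  "(defun multiply-by-seven (number)
  \"Multiply NUMBER by seven.\"
  (* 7 number))"
  "Text of the sample definition, used only for this illustration.")
```

Evaluating `(sketch-count-matches "\\w+\\W*"
sketch-multiply-by-seven-source)' should reproduce the count of
eleven: `multiply-by-seven' is split into three words and the `*' is
skipped.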
4026 | |
4027 The cause of this confusion is the regular expression search within the | |
4028 `count-words-region' definition that moves point forward word by word. | |
4029 In the canonical version of `count-words-region', the regexp is: | |
4030 | |
4031 "\\w+\\W*" | |
4032 | |
4033 This regular expression is a pattern defining one or more word | |
4034 constituent characters possibly followed by one or more characters that | |
4035 are not word constituents. What is meant by `word constituent | |
4036 characters' brings us to the issue of syntax, which is worth a section | |
4037 of its own. | |
4038 | |
4039 | |
4040 File: eintr, Node: Syntax, Next: count-words-in-defun, Prev: Words and Symbols, Up: Words in a defun | |
4041 | |
4042 14.2 What Constitutes a Word or Symbol? | |
4043 ======================================= | |
4044 | |
4045 Emacs treats different characters as belonging to different "syntax | |
4046 categories". For example, the regular expression, `\\w+', is a pattern | |
4047 specifying one or more _word constituent_ characters. Word constituent | |
4048 characters are members of one syntax category. Other syntax categories | |
4049 include the class of punctuation characters, such as the period and the | |
4050 comma, and the class of whitespace characters, such as the blank space | |
4051 and the tab character. (For more information, see *Note Syntax: | |
4052 (emacs)Syntax, and *Note Syntax Tables: (elisp)Syntax Tables.) | |
4053 | |
4054 Syntax tables specify which characters belong to which categories. | |
4055 Usually, a hyphen is not specified as a `word constituent character'. | |
4056 Instead, it is specified as being in the `class of characters that are | |
4057 part of symbol names but not words.' This means that the | |
4058 `count-words-region' function treats it in the same way it treats an | |
4059 interword white space, which is why `count-words-region' counts | |
4060 `multiply-by-seven' as three words. | |
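
The class assigned to a particular character can be inspected with
the `char-syntax' function, which returns the character that
designates the class. The following sketch assumes we are interested
in the table used by Emacs Lisp mode; the helper name
`sketch-elisp-syntax-class' is hypothetical.

```elisp
(defun sketch-elisp-syntax-class (char)
  "Return the syntax class designator of CHAR in Emacs Lisp mode's table.
Hypothetical helper for illustration only."
  (with-syntax-table emacs-lisp-mode-syntax-table
    (char-syntax char)))
```

For a letter such as `?a' the result should be `?w', the designator
for word constituents, while for the hyphen, `?-', it should be `?_',
the designator for characters that are part of symbol names but not
of words.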
4061 | |
4062 There are two ways to cause Emacs to count `multiply-by-seven' as one | |
4063 symbol: modify the syntax table or modify the regular expression. | |
4064 | |
4065 We could redefine a hyphen as a word constituent character by modifying | |
4066 the syntax table that Emacs keeps for each mode. This action would | |
4067 serve our purpose, except that a hyphen is merely the most common | |
4068 character within symbols that is not typically a word constituent | |
4069 character; there are others, too. | |
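
For comparison, here is a sketch of the syntax table approach. It is
an illustration rather than a recommendation, and it makes several
assumptions: the helper name is invented, only the hyphen is
reclassified, and the change is made to a copy of
`emacs-lisp-mode-syntax-table' so that the real table is left
untouched.

```elisp
(defun sketch-count-words-hyphen-as-word (string)
  "Count words in STRING with the hyphen treated as a word constituent.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert string)
    (goto-char (point-min))
    (let ((table (copy-syntax-table emacs-lisp-mode-syntax-table))
          (count 0))
      ;; Work on a copy so the real Emacs Lisp table is left alone.
      (modify-syntax-entry ?- "w" table)
      (with-syntax-table table
        (while (re-search-forward "\\w+\\W*" nil t)
          (setq count (1+ count))))
      count)))
```

With this table, `multiply-by-seven' counts as a single word. A
symbol such as `*', however, is still not counted, which illustrates
why reclassifying the hyphen alone is not enough.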
4070 | |
4071 Alternatively, we can redefine the regular expression used in the | |
4072 `count-words-region' definition so as to include symbols. This | |
4073 procedure has the merit of clarity, but the task is a little tricky. | |
4074 | |
4075 The first part is simple enough: the pattern must match "at least one | |
4076 character that is a word or symbol constituent". Thus: | |
4077 | |
4078 "\\(\\w\\|\\s_\\)+" | |
4079 | |
4080 The `\\(' is the first part of the grouping construct that includes the | |
4081 `\\w' and the `\\s_' as alternatives, separated by the `\\|'. The | |
4082 `\\w' matches any word-constituent character and the `\\s_' matches any | |
4083 character that is part of a symbol name but not a word-constituent | |
4084 character. The `+' following the group indicates that the word or | |
4085 symbol constituent characters must be matched at least once. | |
4086 | |
4087 However, the second part of the regexp is more difficult to design. | |
4088 What we want is to follow the first part with "optionally one or more | |
4089 characters that are not constituents of a word or symbol". At first, I | |
4090 thought I could define this with the following: | |
4091 | |
4092 "\\(\\W\\|\\S_\\)*" | |
4093 | |
4094 The upper case `W' and `S' match characters that are _not_ word or | |
4095 symbol constituents. Unfortunately, this expression matches any | |
4096 character that is either not a word constituent or not a symbol | |
4097 constituent. This matches any character! | |
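
The claim is easy to check. In the sketch below, the helper name is
hypothetical and the test is made under
`emacs-lisp-mode-syntax-table', which is an assumption about where
the pattern would be used. The pattern is anchored at both ends of
the string, so a successful match means that every character in the
string was accepted, word constituents and symbol constituents
included.

```elisp
(defun sketch-first-attempt-matches-all-p (string)
  "Return non-nil if the first-attempt pattern matches all of STRING.
Hypothetical helper for illustration only."
  (with-syntax-table emacs-lisp-mode-syntax-table
    ;; \\` and \\' anchor the match to the start and end of STRING.
    (string-match-p "\\`\\(\\W\\|\\S_\\)*\\'" string)))
```

A letter is not a symbol constituent, so `\\S_' accepts it; a hyphen
is not a word constituent, so `\\W' accepts it. Between them the two
alternatives let everything through.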
4098 | |
4099 I then noticed that every word or symbol in my test region was followed | |
4100 by white space (blank space, tab, or newline). So I tried placing a | |
4101 pattern to match one or more blank spaces after the pattern for one or | |
4102 more word or symbol constituents. This failed, too. Words and symbols | |
4103 are often separated by whitespace, but in actual code parentheses may | |
4104 follow symbols and punctuation may follow words. So finally, I | |
4105 designed a pattern in which the word or symbol constituents are | |
4106 followed optionally by characters that are not white space and then | |
4107 followed optionally by white space. | |
4108 | |
4109 Here is the full regular expression: | |
4110 | |
4111 "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" | |
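
To see whether the new pattern gives the expected total, it can be
tried on the `multiply-by-seven' definition. The sketch below makes
the same assumptions as before: the helper name is invented, the text
is placed in a temporary buffer, and matching is done under
`emacs-lisp-mode-syntax-table'.

```elisp
(defun sketch-count-words-and-symbols (source)
  "Count words and symbols in SOURCE with the full regular expression.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (insert source)
    (goto-char (point-min))
    (let ((count 0))
      (with-syntax-table emacs-lisp-mode-syntax-table
        (while (and (< (point) (point-max))
                    (re-search-forward
                     "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" nil t))
          (setq count (1+ count))))
      count)))
```

Applied to the three lines of the `multiply-by-seven' definition,
this should return ten, the total arrived at by hand: the hyphenated
name is counted once and the `*' is counted as a symbol.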
4112 | |
4113 | |
4114 File: eintr, Node: count-words-in-defun, Next: Several defuns, Prev: Syntax, Up: Words in a defun | |
4115 | |
4116 14.3 The `count-words-in-defun' Function | |
4117 ======================================== | |
4118 | |
4119 We have seen that there are several ways to write a `count-words-region' | |
4120 function. To write `count-words-in-defun', we need merely adapt one | |
4121 of these versions. | |
4122 | |
4123 The version that uses a `while' loop is easy to understand, so I am | |
4124 going to adapt that. Because `count-words-in-defun' will be part of a | |
4125 more complex program, it need not be interactive and it need not | |
4126 display a message but just return the count. These considerations | |
4127 simplify the definition a little. | |
4128 | |
4129 On the other hand, `count-words-in-defun' will be used within a buffer | |
4130 that contains function definitions. Consequently, it is reasonable to | |
4131 ask that the function determine whether it is called when point is | |
4132 within a function definition, and if it is, to return the count for | |
4133 that definition. This adds complexity to the definition, but saves us | |
4134 from needing to pass arguments to the function. | |
4135 | |
4136 These considerations lead us to prepare the following template: | |
4137 | |
4138 (defun count-words-in-defun () | |
4139 "DOCUMENTATION..." | |
4140 (SET UP... | |
4141 (WHILE LOOP...) | |
4142 RETURN COUNT) | |
4143 | |
4144 As usual, our job is to fill in the slots. | |
4145 | |
4146 First, the set up. | |
4147 | |
4148 We are presuming that this function will be called within a buffer | |
4149 containing function definitions. Point will either be within a | |
4150 function definition or not. For `count-words-in-defun' to work, point | |
4151 must move to the beginning of the definition, a counter must start at | |
4152 zero, and the counting loop must stop when point reaches the end of the | |
4153 definition. | |
4154 | |
4155 The `beginning-of-defun' function searches backwards for an opening | |
4156 delimiter such as a `(' at the beginning of a line, and moves point to | |
4157 that position, or else to the limit of the search. In practice, this | |
4158 means that `beginning-of-defun' moves point to the beginning of an | |
4159 enclosing or preceding function definition, or else to the beginning of | |
4160 the buffer. We can use `beginning-of-defun' to place point where we | |
4161 wish to start. | |
4162 | |
4163 The `while' loop requires a counter to keep track of the words or | |
4164 symbols being counted. A `let' expression can be used to create a | |
4165 local variable for this purpose, and bind it to an initial value of | |
4166 zero. | |
4167 | |
4168 The `end-of-defun' function works like `beginning-of-defun' except that | |
4169 it moves point to the end of the definition. `end-of-defun' can be | |
4170 used as part of an expression that determines the position of the end | |
4171 of the definition. | |
4172 | |
4173 The set up for `count-words-in-defun' takes shape rapidly: first we | |
4174 move point to the beginning of the definition, then we create a local | |
4175 variable to hold the count, and finally, we record the position of the | |
4176 end of the definition so the `while' loop will know when to stop | |
4177 looping. | |
4178 | |
4179 The code looks like this: | |
4180 | |
4181 (beginning-of-defun) | |
4182 (let ((count 0) | |
4183 (end (save-excursion (end-of-defun) (point)))) | |
4184 | |
4185 The code is simple. The only slight complication is likely to concern | |
4186 `end': it is bound to the position of the end of the definition by a | |
4187 `save-excursion' expression that returns the value of point after | |
4188 `end-of-defun' temporarily moves it to the end of the definition. | |
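As a rough check of this set up, the following sketch evaluates the same `let' binding in a temporary buffer and confirms that point stays put while `end' lies further on. The helper name `eintr-demo-end-position' and the use of a temporary buffer are assumptions made for illustration.

```elisp
(defun eintr-demo-end-position ()
  "Return (POINT-UNMOVED-P END-IS-LATER-P) for a sample definition.
Hypothetical helper for illustration only."
  (with-temp-buffer
    (emacs-lisp-mode)
    (insert "(defun multiply-by-seven (number)\n"
            "  \"Multiply NUMBER by seven.\"\n"
            "  (* 7 number))\n")
    (goto-char (point-min))
    (let ((start (point))
          ;; `save-excursion' restores point after `end-of-defun' moves it.
          (end (save-excursion (end-of-defun) (point))))
      (list (= (point) start) (> end start)))))

(eintr-demo-end-position)
;; => (t t)
```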
4189 | |
4190 The second part of the `count-words-in-defun', after the set up, is the | |
4191 `while' loop. | |
4192 | |
4193 The loop must contain an expression that jumps point forward word by | |
4194 word and symbol by symbol, and another expression that counts the | |
4195 jumps. The true-or-false-test for the `while' loop should test true so | |
4196 long as point should jump forward, and false when point is at the end | |
4197 of the definition. We have already redefined the regular expression | |
4198 for this (*note Syntax::), so the loop is straightforward: | |
4199 | |
4200 (while (and (< (point) end) | |
4201 (re-search-forward | |
4202 "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" end t)) | |
4203 (setq count (1+ count))) | |
4204 | |
4205 The third part of the function definition returns the count of words | |
4206 and symbols. This part is the last expression within the body of the | |
4207 `let' expression, and can be, very simply, the local variable `count', | |
4208 which when evaluated returns the count. | |
4209 | |
4210 Put together, the `count-words-in-defun' definition looks like this: | |
4211 | |
4212 (defun count-words-in-defun () | |
4213 "Return the number of words and symbols in a defun." | |
4214 (beginning-of-defun) | |
4215 (let ((count 0) | |
4216 (end (save-excursion (end-of-defun) (point)))) | |
4217 (while | |
4218 (and (< (point) end) | |
4219 (re-search-forward | |
4220 "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*" | |
4221 end t)) | |
4222 (setq count (1+ count))) | |
4223 count)) | |
4224 | |
4225 How to test this? The function is not interactive, but it is easy to | |
4226 put a wrapper around the function to make it interactive; we can use | |
4227 almost the same code as for the recursive version of | |
4228 `count-words-region': | |
4229 | |
4230 ;;; Interactive version. | |
4231 (defun count-words-defun () | |
4232 "Number of words and symbols in a function definition." | |
4233 (interactive) | |
4234 (message | |
4235 "Counting words and symbols in function definition ... ") | |
4236 (let ((count (count-words-in-defun))) | |
4237 (cond | |
4238 ((zerop count) | |
4239 (message | |
4240 "The definition does NOT have any words or symbols.")) | |
4241 ((= 1 count) | |
4242 (message | |
4243 "The definition has 1 word or symbol.")) | |
4244 (t | |
4245 (message | |
4246 "The definition has %d words or symbols." count))))) | |
4247 | |
4248 Let's re-use `C-c =' as a convenient keybinding: | |
4249 | |
4250 (global-set-key "\C-c=" 'count-words-defun) | |
4251 | |
4252 Now we can try out `count-words-defun': install both | |
4253 `count-words-in-defun' and `count-words-defun', and set the keybinding, | |
4254 and then place the cursor within the following definition: | |
4255 | |
4256 (defun multiply-by-seven (number) | |
4257 "Multiply NUMBER by seven." | |
4258 (* 7 number)) | |
4259 => 10 | |
4260 | |
4261 Success! The definition has 10 words and symbols. | |
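The same check can be made without a keybinding. The sketch below repeats the definition of `count-words-in-defun' so that it stands alone, and runs it on the sample definition in a temporary buffer; the helper name `eintr-demo-count-sample' is hypothetical.

```elisp
(defun count-words-in-defun ()
  "Return the number of words and symbols in a defun."
  (beginning-of-defun)
  (let ((count 0)
        (end (save-excursion (end-of-defun) (point))))
    (while
        (and (< (point) end)
             (re-search-forward
              "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*"
              end t))
      (setq count (1+ count)))
    count))

(defun eintr-demo-count-sample ()
  "Count words and symbols in a sample definition in a temporary buffer.
Hypothetical test helper; not part of the original text."
  (with-temp-buffer
    (emacs-lisp-mode)
    (insert "(defun multiply-by-seven (number)\n"
            "  \"Multiply NUMBER by seven.\"\n"
            "  (* 7 number))\n")
    ;; Put point inside the definition, as the text instructs.
    (search-backward "7")
    (count-words-in-defun)))

(eintr-demo-count-sample)
;; => 10
```

Because `count-words-in-defun' returns its count rather than displaying a message, it can be called from other code in this way.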
4262 | |
4263 The next problem is to count the number of words and symbols in | |
4264 several definitions within a single file. | |
4265 | |
4266 | |
4267 File: eintr, Node: Several defuns, Next: Find a File, Prev: count-words-in-defun, Up: Words in a defun | |
4268 | |
4269 14.4 Count Several `defuns' Within a File | |
4270 ========================================= | |
4271 | |
4272 A file such as `simple.el' may have a hundred or more function | |
4273 definitions within it. Our long term goal is to collect statistics on | |
4274 many files, but as a first step, our immediate goal is to collect | |
4275 statistics on one file. | |
4276 | |
4277 The information will be a series of numbers, each number being the | |
4278 length of a function definition. We can store the numbers in a list. | |
4279 | |
4280 We know that we will want to incorporate the information regarding one | |
4281 file with information about many other files; this means that the | |
4282 function for counting definition lengths within one file need only | |
4283 return the list of lengths. It need not and should not display any | |
4284 messages. | |
4285 | |
4286 The word count commands contain one expression to jump point forward | |
4287 word by word and another expression to count the jumps. The function | |
4288 to return the lengths of definitions can be designed to work the same | |
4289 way, with one expression to jump point forward definition by definition | |
4290 and another expression to construct the lengths' list. | |
4291 | |
4292 This statement of the problem makes it elementary to write the function | |
4293 definition. Clearly, we will start the count at the beginning of the | |
4294 file, so the first command will be `(goto-char (point-min))'. Next, we | |
4295 start the `while' loop; and the true-or-false test of the loop can be a | |
4296 regular expression search for the next function definition--so long as | |
4297 the search succeeds, point is moved forward and then the body of the | |
4298 loop is evaluated. The body needs an expression that constructs the | |
4299 lengths' list. `cons', the list construction command, can be used to | |
4300 create the list. That is almost all there is to it. | |
4301 | |
4302 Here is what this fragment of code looks like: | |
4303 | |
4304 (goto-char (point-min)) | |
4305 (while (re-search-forward "^(defun" nil t) | |
4306 (setq lengths-list | |
4307 (cons (count-words-in-defun) lengths-list))) | |
4308 | |
4309 What we have left out is the mechanism for finding the file that | |
4310 contains the function definitions. | |
4311 | |
4312 In previous examples, we either used this, the Info file, or we | |
4313 switched back and forth to some other buffer, such as the `*scratch*' | |
4314 buffer. | |
4315 | |
4316 Finding a file is a new process that we have not yet discussed. | |
4317 | |
4318 | |
4319 File: eintr, Node: Find a File, Next: lengths-list-file, Prev: Several defuns, Up: Words in a defun | |
4320 | |
4321 14.5 Find a File | |
4322 ================ | |
4323 | |
4324 To find a file in Emacs, you use the `C-x C-f' (`find-file') command. | |
4325 This command is almost, but not quite right for the lengths problem. | |
4326 | |
4327 Let's look at the source for `find-file': | |
4328 | |
4329 (defun find-file (filename) | |
4330 "Edit file FILENAME. | |
4331 Switch to a buffer visiting file FILENAME, | |
4332 creating one if none already exists." | |
4333 (interactive "FFind file: ") | |
4334 (switch-to-buffer (find-file-noselect filename))) | |
4335 | |
4336 (The most recent version of the `find-file' function definition permits | |
4337 you to specify optional wildcards to visit multiple files; that makes the | |
4338 definition more complex and we will not discuss it here, since it is | |
4339 not relevant. You can see its source using either `M-.' (`find-tag') | |
4340 or `C-h f' (`describe-function').) | |
4341 | |
4342 The definition I am showing possesses short but complete documentation | |
4343 and an interactive specification that prompts you for a file name when | |
4344 you use the command interactively. The body of the definition contains | |
4345 two functions, `find-file-noselect' and `switch-to-buffer'. | |
4346 | |
4347 According to its documentation as shown by `C-h f' (the | |
4348 `describe-function' command), the `find-file-noselect' function reads | |
4349 the named file into a buffer and returns the buffer. (Its most recent | |
4350 version includes an optional wildcards argument, too, as well as | |
4351 another to read a file literally and another to suppress warning | |
4352 messages. These optional arguments are irrelevant.) | |
4353 | |
4354 However, the `find-file-noselect' function does not select the buffer | |
4355 in which it puts the file. Emacs does not switch its attention (or | |
4356 yours if you are using `find-file-noselect') to the named buffer. That | |
4357 is what `switch-to-buffer' does: it switches the buffer to which Emacs' | |
4358 attention is directed; and it switches the buffer displayed in the | |
4359 window to the new buffer. We have discussed buffer switching | |
4360 elsewhere. (*Note Switching Buffers::.) | |
4361 | |
4362 In this histogram project, we do not need to display each file on the | |
4363 screen as the program determines the length of each definition within | |
4364 it. Instead of employing `switch-to-buffer', we can work with | |
4365 `set-buffer', which redirects the attention of the computer program to | |
4366 a different buffer but does not redisplay it on the screen. So instead | |
4367 of calling on `find-file' to do the job, we must write our own | |
4368 expression. | |
4369 | |
4370 The task is easy: use `find-file-noselect' and `set-buffer'. | |
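A minimal sketch of that combination follows. It assumes a throwaway file created with `make-temp-file'; the helper name `eintr-demo-visit-quietly' is hypothetical. The file is read into a buffer and examined, yet the buffer that was current beforehand is current again afterwards.

```elisp
(defun eintr-demo-visit-quietly ()
  "Visit a scratch file without displaying it; return (SIZE SAME-BUFFER-P).
Hypothetical helper for illustration only."
  (let ((file (make-temp-file "eintr-demo" nil ".el"))
        (original (current-buffer))
        size)
    (unwind-protect
        (progn
          ;; Write a one-line definition into the scratch file.
          (with-temp-file file
            (insert "(defun one () 1)\n"))
          (save-excursion
            (let ((buffer (find-file-noselect file)))
              (set-buffer buffer)        ; redirect attention, no redisplay
              (setq size (buffer-size))
              (kill-buffer buffer)))
          (list size (eq (current-buffer) original)))
      (delete-file file))))

(eintr-demo-visit-quietly)
;; => (17 t)
```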
4371 | |
4372 | |
4373 File: eintr, Node: lengths-list-file, Next: Several files, Prev: Find a File, Up: Words in a defun | |
4374 | |
4375 14.6 `lengths-list-file' in Detail | |
4376 ================================== | |
4377 | |
4378 The core of the `lengths-list-file' function is a `while' loop | |
4379 containing a function to move point forward `defun by defun' and a | |
4380 function to count the number of words and symbols in each defun. This | |
4381 core must be surrounded by functions that do various other tasks, | |
4382 including finding the file, and ensuring that point starts out at the | |
4383 beginning of the file. The function definition looks like this: | |
4384 | |
4385 (defun lengths-list-file (filename) | |
4386 "Return list of definitions' lengths within FILENAME. | |
4387 The returned list is a list of numbers. | |
4388 Each number is the number of words or | |
4389 symbols in one function definition." | |
4390 (message "Working on `%s' ... " filename) | |
4391 (save-excursion | |
4392 (let ((buffer (find-file-noselect filename)) | |
4393 (lengths-list)) | |
4394 (set-buffer buffer) | |
4395 (setq buffer-read-only t) | |
4396 (widen) | |
4397 (goto-char (point-min)) | |
4398 (while (re-search-forward "^(defun" nil t) | |
4399 (setq lengths-list | |
4400 (cons (count-words-in-defun) lengths-list))) | |
4401 (kill-buffer buffer) | |
4402 lengths-list))) | |
4403 | |
4404 The function is passed one argument, the name of the file on which it | |
4405 will work. It has four lines of documentation, but no interactive | |
4406 specification. Since people worry that a computer is broken if they | |
4407 don't see anything going on, the first line of the body is a message. | |
4408 | |
4409 The next line contains a `save-excursion' that returns Emacs' attention | |
4410 to the current buffer when the function completes. This is useful in | |
4411 case you embed this function in another function that presumes point is | |
4412 restored to the original buffer. | |
4413 | |
4414 In the varlist of the `let' expression, Emacs finds the file and binds | |
4415 the local variable `buffer' to the buffer containing the file. At the | |
4416 same time, Emacs creates `lengths-list' as a local variable. | |
4417 | |
4418 Next, Emacs switches its attention to the buffer. | |
4419 | |
4420 In the following line, Emacs makes the buffer read-only. Ideally, this | |
4421 line is not necessary. None of the functions for counting words and | |
4422 symbols in a function definition should change the buffer. Besides, | |
4423 the buffer is not going to be saved, even if it were changed. This | |
4424 line is entirely the consequence of great, perhaps excessive, caution. | |
4425 The reason for the caution is that this function and those it calls | |
4426 work on the sources for Emacs and it is very inconvenient if they are | |
4427 inadvertently modified. It goes without saying that I did not realize | |
4428 a need for this line until an experiment went awry and started to | |
4429 modify my Emacs source files ... | |
4430 | |
4431 Next comes a call to widen the buffer if it is narrowed. This function | |
4432 is usually not needed--Emacs creates a fresh buffer if none already | |
4433 exists; but if a buffer visiting the file already exists Emacs returns | |
4434 that one. In this case, the buffer may be narrowed and must be | |
4435 widened. If we wanted to be fully `user-friendly', we would arrange to | |
4436 save the restriction and the location of point, but we won't. | |
4437 | |
4438 The `(goto-char (point-min))' expression moves point to the beginning | |
4439 of the buffer. | |
4440 | |
4441 Then comes a `while' loop in which the `work' of the function is | |
4442 carried out. In the loop, Emacs determines the length of each | |
4443 definition and constructs a lengths' list containing the information. | |
4444 | |
4445 Emacs kills the buffer after working through it. This is to save space | |
4446 inside of Emacs. My version of GNU Emacs 19 contained over 300 source | |
4447 files of interest; GNU Emacs 22 contains over a thousand source files. | |
4448 Another function will apply `lengths-list-file' to each of the files. | |
4449 | |
4450 Finally, the last expression within the `let' expression is the | |
4451 `lengths-list' variable; its value is returned as the value of the | |
4452 whole function. | |
4453 | |
4454 You can try this function by installing it in the usual fashion. Then | |
4455 place your cursor after the following expression and type `C-x C-e' | |
4456 (`eval-last-sexp'). | |
4457 | |
4458 (lengths-list-file | |
4459 "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/debug.el") | |
4460 | |
4461 (You may need to change the pathname of the file; the one here is for | |
4462 GNU Emacs version 22.0.100. To change the expression, copy it to the | |
4463 `*scratch*' buffer and edit it.) | |
4464 | |
4465 (Also, to see the full length of the list, rather than a truncated | |
4466 version, you may have to evaluate the following: | |
4467 | |
4468 (custom-set-variables '(eval-expression-print-length nil)) | |
4469 | |
4470 (*Note Specifying Variables using `defcustom': defcustom.) Then | |
4471 evaluate the `lengths-list-file' expression.) | |
4472 | |
4473 The lengths' list for `debug.el' takes less than a second to produce | |
4474 and looks like this in GNU Emacs 22: | |
4475 | |
4476 (83 113 105 144 289 22 30 97 48 89 25 52 52 88 28 29 77 49 43 290 232 587) | |
4477 | |
4478 Using my old machine, the version 19 lengths' list for `debug.el' took | |
4479 seven seconds to produce and looked like this: | |
4480 | |
4481 (75 41 80 62 20 45 44 68 45 12 34 235) | |
4482 | |
4483 (The newer version of `debug.el' contains more defuns than the earlier | |
4484 one; and my new machine is much faster than the old one.) | |
4485 | |
4486 Note that the length of the last definition in the file is first in the | |
4487 list. | |
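The ordering can be seen on a small scale. This sketch runs the same loop as `lengths-list-file' on a temporary buffer rather than on a visited file; that substitution, and the helper name `eintr-demo-lengths-in-string', are assumptions made so the example stands alone. The definition of `count-words-in-defun' is repeated for the same reason.

```elisp
(defun count-words-in-defun ()
  "Return the number of words and symbols in a defun."
  (beginning-of-defun)
  (let ((count 0)
        (end (save-excursion (end-of-defun) (point))))
    (while
        (and (< (point) end)
             (re-search-forward
              "\\(\\w\\|\\s_\\)+[^ \t\n]*[ \t\n]*"
              end t))
      (setq count (1+ count)))
    count))

(defun eintr-demo-lengths-in-string (text)
  "Return the lengths' list for the definitions in TEXT.
Hypothetical helper: it runs the loop from `lengths-list-file'
on a temporary buffer instead of a visited file."
  (with-temp-buffer
    (emacs-lisp-mode)
    (insert text)
    (goto-char (point-min))
    (let (lengths-list)
      (while (re-search-forward "^(defun" nil t)
        ;; `cons' puts each new length at the front of the list.
        (setq lengths-list
              (cons (count-words-in-defun) lengths-list)))
      lengths-list)))

(eintr-demo-lengths-in-string
 "(defun one ()\n  1)\n\n(defun two-words (x)\n  (+ x 2))\n")
;; => (6 3)
```

The first definition has 3 words and symbols and the second has 6; since `cons' adds to the front, the second definition's length comes first.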
4488 | |
4489 | |
4490 File: eintr, Node: Several files, Next: Several files recursively, Prev: lengths-list-file, Up: Words in a defun | |
4491 | |
4492 14.7 Count Words in `defuns' in Different Files | |
4493 =============================================== | |
4494 | |
4495 In the previous section, we created a function that returns a list of | |
4496 the lengths of each definition in a file. Now, we want to define a | |
4497 function to return a master list of the lengths of the definitions in a | |
4498 list of files. | |
4499 | |
4500 Working on each of a list of files is a repetitious act, so we can use | |
4501 either a `while' loop or recursion. | |
4502 | |
4503 * Menu: | |
4504 | |
4505 * lengths-list-many-files:: | |
4506 * append:: | |
4507 | |
4508 | |
4509 File: eintr, Node: lengths-list-many-files, Next: append, Prev: Several files, Up: Several files | |
4510 | |
4511 Determine the lengths of `defuns' | |
4512 --------------------------------- | |
4513 | |
4514 The design using a `while' loop is routine. The argument passed to the | |
4515 function is a list of files. As we saw earlier (*note Loop Example::), | |
4516 you can write a `while' loop so that the body of the loop is evaluated | |
4517 if such a list contains elements, but the loop exits if the list is | |
4518 empty. For this design to work, the body of the loop must contain an | |
4519 expression that shortens the list each time the body is evaluated, so | |
4520 that eventually the list is empty. The usual technique is to set the | |
4521 value of the list to the value of the CDR of the list each time the | |
4522 body is evaluated. | |
4523 | |
4524 The template looks like this: | |
4525 | |
4526 (while TEST-WHETHER-LIST-IS-EMPTY | |
4527 BODY... | |
4528 SET-LIST-TO-CDR-OF-LIST) | |
4529 | |
4530 Also, we remember that a `while' loop returns `nil' (the result of | |
4531 evaluating the true-or-false-test), not the result of any evaluation | |
4532 within its body. (The evaluations within the body of the loop are done | |
4533 for their side effects.) However, the expression that sets the | |
4534 lengths' list is part of the body--and that is the value that we want | |
4535 returned by the function as a whole. To do this, we enclose the | |
4536 `while' loop within a `let' expression, and arrange that the last | |
4537 element of the `let' expression contains the value of the lengths' | |
4538 list. (*Note Loop Example with an Incrementing Counter: Incrementing | |
4539 Example.) | |
4540 | |
4541 These considerations lead us directly to the function itself: | |
4542 | |
4543 ;;; Use `while' loop. | |
4544 (defun lengths-list-many-files (list-of-files) | |
4545 "Return list of lengths of defuns in LIST-OF-FILES." | |
4546 (let (lengths-list) | |
4547 | |
4548 ;;; true-or-false-test | |
4549 (while list-of-files | |
4550 (setq lengths-list | |
4551 (append | |
4552 lengths-list | |
4553 | |
4554 ;;; Generate a lengths' list. | |
4555 (lengths-list-file | |
4556 (expand-file-name (car list-of-files))))) | |
4557 | |
4558 ;;; Make files' list shorter. | |
4559 (setq list-of-files (cdr list-of-files))) | |
4560 | |
4561 ;;; Return final value of lengths' list. | |
4562 lengths-list)) | |
4563 | |
4564 `expand-file-name' is a built-in function that converts a file name to | |
4565 its absolute, long, path name form, relative to the directory in which | |
4566 the function is called. | |
4567 | |
4568 Thus, if `expand-file-name' is called on `debug.el' when Emacs is | |
4569 visiting the `/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/' | |
4570 directory, | |
4571 | |
4572 debug.el | |
4573 | |
4574 becomes | |
4575 | |
4576 /usr/local/share/emacs/22.0.100/lisp/emacs-lisp/debug.el | |
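The conversion can be tried without changing directories by binding `default-directory' temporarily. This is a sketch for a GNU/Linux system; the helper name `eintr-demo-expand' is hypothetical.

```elisp
(defun eintr-demo-expand (name directory)
  "Expand NAME as if Emacs were visiting DIRECTORY.
Hypothetical helper for illustration only."
  ;; `expand-file-name' resolves relative names against
  ;; the value of `default-directory'.
  (let ((default-directory directory))
    (expand-file-name name)))

(eintr-demo-expand "debug.el"
                   "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/")
;; => "/usr/local/share/emacs/22.0.100/lisp/emacs-lisp/debug.el"
```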
4577 | |
4578 The only other new element of this function definition is the as yet | |
4579 unstudied function `append', which merits a short section for itself. | |
4580 | |
4581 | |
4582 File: eintr, Node: append, Prev: lengths-list-many-files, Up: Several files | |
4583 | |
4584 14.7.1 The `append' Function | |
4585 ---------------------------- | |
4586 | |
4587 The `append' function attaches one list to another. Thus, | |
4588 | |
4589 (append '(1 2 3 4) '(5 6 7 8)) | |
4590 | |
4591 produces the list | |
4592 | |
4593 (1 2 3 4 5 6 7 8) | |
4594 | |
4595 This is exactly how we want to attach two lengths' lists produced by | |
4596 `lengths-list-file' to each other. The results contrast with `cons', | |
4597 | |
4598 (cons '(1 2 3 4) '(5 6 7 8)) | |
4599 | |
4600 which constructs a new list in which the first argument to `cons' | |
4601 becomes the first element of the new list: | |
4602 | |
4603 ((1 2 3 4) 5 6 7 8) | |
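To watch `append' build up a master list in the manner of the `while' loop above, here is a small sketch. Each inner list stands in for the value `lengths-list-file' would return for one file; the helper name `eintr-demo-join-lengths' is hypothetical.

```elisp
(defun eintr-demo-join-lengths (lists)
  "Join LISTS, a list of lengths' lists, into one flat list.
Hypothetical helper: each element of LISTS stands in for the
value `lengths-list-file' would return for one file."
  (let (lengths-list)
    (while lists
      ;; Attach the next file's lengths to the end of the master list.
      (setq lengths-list (append lengths-list (car lists)))
      ;; Shorten the list of lists so the loop eventually stops.
      (setq lists (cdr lists)))
    lengths-list))

(eintr-demo-join-lengths '((283 263 480 90) (85 181)))
;; => (283 263 480 90 85 181)
```

Had the loop used `cons' in place of `append', each file's list would have become a single nested element rather than being spliced in.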
4604 | |
4605 | |
4606 File: eintr, Node: Several files recursively, Next: Prepare the data, Prev: Several files, Up: Words in a defun | |
4607 | |
4608 14.8 Recursively Count Words in Different Files | |
4609 =============================================== | |
4610 | |
4611 Besides a `while' loop, you can work on each of a list of files with | |
4612 recursion. A recursive version of `lengths-list-many-files' is short | |
4613 and simple. | |
4614 | |
4615 The recursive function has the usual parts: the `do-again-test', the | |
4616 `next-step-expression', and the recursive call. The `do-again-test' | |
4617 determines whether the function should call itself again, which it will | |
4618 do if the `list-of-files' contains any remaining elements; the | |
4619 `next-step-expression' resets the `list-of-files' to the CDR of itself, | |
4620 so eventually the list will be empty; and the recursive call calls | |
4621 itself on the shorter list. The complete function is shorter than this | |
4622 description! | |
4623 | |
4624 (defun recursive-lengths-list-many-files (list-of-files) | |
4625 "Return list of lengths of each defun in LIST-OF-FILES." | |
4626 (if list-of-files ; do-again-test | |
4627 (append | |
4628 (lengths-list-file | |
4629 (expand-file-name (car list-of-files))) | |
4630 (recursive-lengths-list-many-files | |
4631 (cdr list-of-files))))) | |
4632 | |
4633 In a sentence, the function returns the lengths' list for the first of | |
4634 the `list-of-files' appended to the result of calling itself on the | |
4635 rest of the `list-of-files'. | |
4636 | |
4637 Here is a test of `recursive-lengths-list-many-files', along with the | |
4638 results of running `lengths-list-file' on each of the files | |
4639 individually. | |
4640 | |
4641 Install `recursive-lengths-list-many-files' and `lengths-list-file', if | |
4642 necessary, and then evaluate the following expressions. You may need | |
4643 to change the files' pathnames; those here work when this Info file and | |
4644 the Emacs sources are located in their customary places. To change the | |
4645 expressions, copy them to the `*scratch*' buffer, edit them, and then | |
4646 evaluate them. | |
4647 | |
4648 The results are shown after the `=>'. (These results are for files | |
4649 from Emacs Version 22.0.100; files from other versions of Emacs may | |
4650 produce different results.) | |
4651 | |
4652 (cd "/usr/local/share/emacs/22.0.100/") | |
4653 | |
4654 (lengths-list-file "./lisp/macros.el") | |
4655 => (283 263 480 90) | |
4656 | |
4657 (lengths-list-file "./lisp/mail/mailalias.el") | |
4658 => (38 32 29 95 178 180 321 218 324) | |
4659 | |
4660 (lengths-list-file "./lisp/makesum.el") | |
4661 => (85 181) | |
4662 | |
4663 (recursive-lengths-list-many-files | |
4664 '("./lisp/macros.el" | |
4665 "./lisp/mail/mailalias.el" | |
4666 "./lisp/makesum.el")) | |
4667 => (283 263 480 90 38 32 29 95 178 180 321 218 324 85 181) | |
4668 | |
4669 The `recursive-lengths-list-many-files' function produces the output we | |
4670 want. | |
4671 | |
4672 The next step is to prepare the data in the list for display in a graph. | |
4673 | |
4674 | |
4675 File: eintr, Node: Prepare the data, Prev: Several files recursively, Up: Words in a defun | |
4676 | |
4677 14.9 Prepare the Data for Display in a Graph | |
4678 ============================================ | |
4679 | |
4680 The `recursive-lengths-list-many-files' function returns a list of | |
4681 numbers. Each number records the length of a function definition. | |
4682 What we need to do now is transform this data into a list of numbers | |
4683 suitable for generating a graph. The new list will tell how many | |
4684 function definitions contain fewer than 10 words and symbols, how many | |
4685 contain between 10 and 19 words and symbols, how many contain between | |
4686 20 and 29 words and symbols, and so on. | |
4687 | |
4688 In brief, we need to go through the lengths' list produced by the | |
4689 `recursive-lengths-list-many-files' function and count the number of | |
4690 defuns within each range of lengths, and produce a list of those | |
4691 numbers. | |
4692 | |
4693 Based on what we have done before, we can readily foresee that it | |
4694 should not be too hard to write a function that `CDRs' down the | |
4695 lengths' list, looks at each element, determines which length range it | |
4696 is in, and increments a counter for that range. | |
4697 | |
4698 However, before beginning to write such a function, we should consider | |
4699 the advantages of sorting the lengths' list first, so the numbers are | |
4700 ordered from smallest to largest. First, sorting will make it easier | |
4701 to count the numbers in each range, since two adjacent numbers will | |
4702 either be in the same length range or in adjacent ranges. Second, by | |
4703 inspecting a sorted list, we can discover the highest and lowest | |
4704 number, and thereby determine the largest and smallest length range | |
4705 that we will need. | |
4706 | |
4707 * Menu: | |
4708 | |
4709 * Sorting:: | |
4710 * Files List:: | |
4711 * Counting function definitions:: | |
4712 | |
4713 | |
4714 File: eintr, Node: Sorting, Next: Files List, Prev: Prepare the data, Up: Prepare the data | |
4715 | |
4716 14.9.1 Sorting Lists | |
4717 -------------------- | |
4718 | |
4719 Emacs contains a function to sort lists, called (as you might guess) | |
4720 `sort'. The `sort' function takes two arguments, the list to be | |
4721 sorted, and a predicate that determines whether the first of two list | |
4722 elements is "less" than the second. | |
4723 | |
4724 As we saw earlier (*note Using the Wrong Type Object as an Argument: | |
4725 Wrong Type of Argument.), a predicate is a function that determines | |
4726 whether some property is true or false. The `sort' function will | |
4727 reorder a list according to whatever property the predicate uses; this | |
4728 means that `sort' can be used to sort non-numeric lists by non-numeric | |
4729 criteria--it can, for example, alphabetize a list. | |
4730 | |
4731 The `<' function is used when sorting a numeric list. For example, | |
4732 | |
4733 (sort '(4 8 21 17 33 7 21 7) '<) | |
4734 | |
4735 produces this: | |
4736 | |
4737 (4 7 7 8 17 21 21 33) | |
4738 | |
4739 (Note that in this example, both the arguments are quoted so that the | |
4740 symbols are not evaluated before being passed to `sort' as arguments.) | |
4741 | |
4742 Sorting the list returned by the `recursive-lengths-list-many-files' | |
4743 function is straightforward; it uses the `<' function: | |
4744 | |
4745 (sort | |
4746 (recursive-lengths-list-many-files | |
4747 '("./lisp/macros.el" | |
4748 "./lisp/mail/mailalias.el" | |
4749 "./lisp/makesum.el")) | |
4750 '<) | |
4751 | |
4752 which produces: | |
4753 | |
4754 (29 32 38 85 90 95 178 180 181 218 263 283 321 324 480) | |
4755 | |
4756 (Note that in this example, the first argument to `sort' is not quoted, | |
4757 since the expression must be evaluated so as to produce the list that | |
4758 is passed to `sort'.) | |
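As a sketch of the two points made about sorting, the following finds the smallest and largest lengths from a sorted copy of the list, and then alphabetizes a list of strings with a different predicate. The helper name `eintr-demo-length-extremes' is hypothetical.

```elisp
(defun eintr-demo-length-extremes (lengths)
  "Return (SMALLEST LARGEST) from LENGTHS, a list of numbers.
Hypothetical helper for illustration only."
  ;; `sort' may rearrange its argument destructively, so sort a copy.
  (let ((sorted (sort (copy-sequence lengths) '<)))
    (list (car sorted) (car (last sorted)))))

(eintr-demo-length-extremes
 '(283 263 480 90 38 32 29 95 178 180 321 218 324 85 181))
;; => (29 480)

;; A non-numeric predicate alphabetizes a list of strings.
(sort (list "makesum.el" "abbrev.el" "macros.el") 'string-lessp)
;; => ("abbrev.el" "macros.el" "makesum.el")
```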
4759 | |
4760 | |
4761 File: eintr, Node: Files List, Next: Counting function definitions, Prev: Sorting, Up: Prepare the data | |
4762 | |
4763 14.9.2 Making a List of Files | |
4764 ----------------------------- | |
4765 | |
4766 The `recursive-lengths-list-many-files' function requires a list of | |
4767 files as its argument. For our test examples, we constructed such a | |
4768 list by hand; but the Emacs Lisp source directory is too large for us | |
4769 to do that. Instead, we will write a function to do the job for | |
4770 us. In this function, we will use both a `while' loop and a recursive | |
4771 call. | |
4772 | |
4773 We did not have to write a function like this for older versions of GNU | |
4774 Emacs, since they placed all the `.el' files in one directory. | |
4775 Instead, we were able to use the `directory-files' function, which | |
4776 lists the names of files that match a specified pattern within a single | |
4777 directory. | |
4778 | |
4779 However, recent versions of Emacs place Emacs Lisp files in | |
4780 sub-directories of the top level `lisp' directory. This re-arrangement | |
4781 eases navigation. For example, all the mail related files are in a | |
4782 `lisp' sub-directory called `mail'. But at the same time, this | |
4783 arrangement forces us to create a file listing function that descends | |
4784 into the sub-directories. | |
4785 | |
4786 We can create this function, called `files-in-below-directory', using | |
4787 familiar functions such as `car', `nthcdr', and `substring' in | |
4788 conjunction with an existing function called | |
4789 `directory-files-and-attributes'. This latter function not only lists | |
4790 all the filenames in a directory, including the names of | |
4791 sub-directories, but also their attributes. | |
4792 | |
4793 To restate our goal: to create a function that will enable us to feed | |
4794 filenames to `recursive-lengths-list-many-files' as a list that looks | |
4795 like this (but with more elements): | |
4796 | |
4797 ("./lisp/macros.el" | |
4798 "./lisp/mail/rmail.el" | |
4799 "./lisp/makesum.el") | |
4800 | |
4801 The `directory-files-and-attributes' function returns a list of lists. | |
4802 Each of the lists within the main list consists of 13 elements. The | |
4803 first element is a string that contains the name of the file--which, | |
4804 in GNU/Linux, may be a `directory file', that is to say, a file with | |
4805 the special attributes of a directory. The second element of the list | |
4806 is `t' for a directory, a string for a symbolic link (the string is the | |
4807 name linked to), or `nil'. | |
4808 | |
4809 For example, the first `.el' file in the `lisp/' directory is | |
4810 `abbrev.el'. Its name is | |
4811 `/usr/local/share/emacs/22.0.100/lisp/abbrev.el' and it is not a | |
4812 directory or a symbolic link. | |
4813 | |
4814 This is how `directory-files-and-attributes' lists that file and its | |
4815 attributes: | |
4816 | |
4817 ("abbrev.el" | |
4818 nil | |
4819 1 | |
4820 1000 | |
4821 100 | |
4822 (17733 259) | |
4823 (17491 28834) | |
4824 (17596 62124) | |
4825 13157 | |
4826 "-rw-rw-r--" | |
4827 nil | |
4828 2971624 | |
4829 773) | |
4830 | |
4831 On the other hand, `mail/' is a directory within the `lisp/' directory. | |
4832 The beginning of its listing looks like this: | |
4833 | |
4834 ("mail" | |
4835 t | |
4836 ... | |
4837 ) | |
4838 | |
4839 (To learn about the different attributes, look at the documentation of | |
4840 `file-attributes'. Bear in mind that the `file-attributes' function | |
4841 does not list the filename, so its first element is | |
4842 `directory-files-and-attributes''s second element.) | |
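To make the shape of such an entry concrete, here is a small sketch that
picks apart a shortened, hand-typed copy of the `abbrev.el' entry shown
above. The variable name `sample-entry' is invented for illustration,
and the list is typed in rather than read from disk.

```elisp
;; A shortened, hand-typed entry in the shape described above.
;; `sample-entry' is a hypothetical name used only in this sketch.
(setq sample-entry '("abbrev.el" nil 1 1000 100))

;; The first element is the filename.
(car sample-entry)              ; => "abbrev.el"

;; The second element is t for a directory, a string for a
;; symbolic link, or nil for an ordinary file.
(car (cdr sample-entry))        ; => nil
```

These are the same `car' and `cdr' combinations that the function
definition in this section applies to each entry in turn.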
4843 | |
4844 We will want our new function, `files-in-below-directory', to list the | |
4845 `.el' files in the directory it is told to check, and in any | |
4846 directories below that directory. | |
4847 | |
4848 This gives us a hint on how to construct `files-in-below-directory': | |
4849 within a directory, the function should add `.el' filenames to a list; | |
4850 and if, within a directory, the function comes upon a sub-directory, it | |
4851 should go into that sub-directory and repeat its actions. | |
4852 | |
4853 However, we should note that every directory contains a name that | |
4854 refers to itself, called `.' ("dot"), and a name that refers to its | |
4855 parent directory, called `..' ("double dot"). (In `/', the root | |
4856 directory, `..' refers to itself, since `/' has no parent.) Clearly, | |
4857 we do not want our `files-in-below-directory' function to enter those | |
4858 directories, since they always lead us, directly or indirectly, to the | |
4859 current directory. | |
4860 | |
4861 Consequently, our `files-in-below-directory' function must do several | |
4862 tasks: | |
4863 | |
4864 * Check to see whether it is looking at a filename that ends in | |
4865 `.el'; and if so, add its name to a list. | |
4866 | |
4867 * Check to see whether it is looking at a filename that is the name | |
4868 of a directory; and if so, | |
4869 | |
4870 - Check to see whether it is looking at `.' or `..'; and if so | |
4871 skip it. | |
4872 | |
4873 - Or else, go into that directory and repeat the process. | |
4874 | |
4875 Let's write a function definition to do these tasks. We will use a | |
4876 `while' loop to move from one filename to another within a directory, | |
4877 checking what needs to be done; and we will use a recursive call to | |
4878 repeat the actions on each sub-directory. The recursive pattern is | |
4879 `accumulate' (*note Recursive Pattern: _accumulate_: Accumulate.), | |
4880 using `append' as the combiner. | |
4881 | |
4882 Here is the function: | |
4883 | |
4884 (defun files-in-below-directory (directory) | |
4885   "List the .el files in DIRECTORY and in its sub-directories." | |
4886   ;; Although the function will be used non-interactively, | |
4887   ;; it will be easier to test if we make it interactive. | |
4888   ;; The directory will have a name such as | |
4889   ;;  "/usr/local/share/emacs/22.0.100/lisp/" | |
4890   (interactive "DDirectory name: ") | |
4891   (let (el-files-list | |
4892         (current-directory-list | |
4893          (directory-files-and-attributes directory t))) | |
4894     ;; while we are in the current directory | |
4895     (while current-directory-list | |
4896       (cond | |
4897        ;; check to see whether filename ends in `.el' | |
4898        ;; and if so, append its name to a list. | |
4899        ((equal ".el" (substring (car (car current-directory-list)) -3)) | |
4900         (setq el-files-list | |
4901               (cons (car (car current-directory-list)) el-files-list))) | |
4902        ;; check whether filename is that of a directory | |
4903        ((eq t (car (cdr (car current-directory-list)))) | |
4904         ;; decide whether to skip or recurse | |
4905         (if | |
4906             (equal "." | |
4907                    (substring (car (car current-directory-list)) -1)) | |
4908             ;; then do nothing since filename is that of | |
4909             ;;   current directory or parent, "." or ".." | |
4910             () | |
4911           ;; else descend into the directory and repeat the process | |
4912           (setq el-files-list | |
4913                 (append | |
4914                  (files-in-below-directory | |
4915                   (car (car current-directory-list))) | |
4916                  el-files-list))))) | |
4917       ;; move to the next filename in the list; this also | |
4918       ;; shortens the list so the while loop eventually comes to an end | |
4919       (setq current-directory-list (cdr current-directory-list))) | |
4920     ;; return the filenames | |
4921     el-files-list)) | |
4922 | |
4923 The `files-in-below-directory' function takes one | |
4924 argument, the name of a directory. | |
4925 | |
4926 Thus, on my system, | |
4927 | |
4928 (length | |
4929 (files-in-below-directory "/usr/local/share/emacs/22.0.100/lisp/")) | |
4930 | |
4931 tells me that my Lisp sources directory contains 1031 `.el' files. | |
4932 | |
4933 `files-in-below-directory' returns a list in reverse alphabetical | |
4934 order. An expression to sort the list in alphabetical order looks like | |
4935 this: | |
4936 | |
4937 (sort | |
4938 (files-in-below-directory "/usr/local/share/emacs/22.0.100/lisp/") | |
4939 'string-lessp) | |
4940 | |
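As a quick check of how `sort' and `string-lessp' behave together, the
following sketch sorts the three sample filenames from the start of this
section, deliberately given out of order. The variable name
`sorted-sample' is hypothetical; the list is built with `list' because
`sort' may modify the list it is given.

```elisp
;; `sorted-sample' is a hypothetical name used only in this sketch.
(setq sorted-sample
      (sort (list "./lisp/makesum.el"
                  "./lisp/mail/rmail.el"
                  "./lisp/macros.el")
            'string-lessp))

;; sorted-sample =>
;;   ("./lisp/macros.el" "./lisp/mail/rmail.el" "./lisp/makesum.el")
```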
4941 | |
4942 File: eintr, Node: Counting function definitions, Prev: Files List, Up: Prepare the data | |
4943 | |
4944 14.9.3 Counting function definitions | |
4945 ------------------------------------ | |
4946 | |
4947 Our immediate goal is to generate a list that tells us how many | |
4948 function definitions contain fewer than 10 words and symbols, how many | |
4949 contain between 10 and 19 words and symbols, how many contain between | |
4950 20 and 29 words and symbols, and so on. | |
4951 | |
4952 With a sorted list of numbers, this is easy: count how many elements of | |
4953 the list are smaller than 10, then, after moving past the numbers just | |
4954 counted, count how many are smaller than 20, then, after moving past | |
4955 the numbers just counted, count how many are smaller than 30, and so | |
4956 on. Each of the numbers, 10, 20, 30, 40, and the like, is one larger | |
4957 than the top of that range. We can call the list of such numbers the | |
4958 `top-of-ranges' list. | |
4959 | |
4960 If we wished, we could generate this list automatically, but it is | |
4961 simpler to write a list manually. Here it is: | |
4962 | |
4963 (defvar top-of-ranges | |
4964   '(10 20 30 40 50 | |
4965     60 70 80 90 100 | |
4966     110 120 130 140 150 | |
4967     160 170 180 190 200 | |
4968     210 220 230 240 250 | |
4969     260 270 280 290 300) | |
4970   "List specifying ranges for `defuns-per-range'.") | |
4971 | |
4972 To change the ranges, we edit this list. | |
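If we did want to generate the list automatically, one possible sketch
uses the built-in `number-sequence' function, which returns the numbers
from its first argument up to its second, stepping by its third. The
variable name `generated-top-of-ranges' is hypothetical; the rest of
this chapter continues to use the hand-written `top-of-ranges' list.

```elisp
;; Build (10 20 30 ... 300) rather than typing it by hand.
;; `generated-top-of-ranges' is a hypothetical name for this sketch.
(setq generated-top-of-ranges (number-sequence 10 300 10))

(length generated-top-of-ranges)        ; => 30
(car generated-top-of-ranges)           ; => 10
```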
4973 | |
4974 Next, we need to write the function that creates the list of the number | |
4975 of definitions within each range. Clearly, this function must take the | |
4976 `sorted-lengths' and the `top-of-ranges' lists as arguments. | |
4977 | |
4978 The `defuns-per-range' function must do two things again and again: it | |
4979 must count the number of definitions within a range specified by the | |
4980 current top-of-range value; and it must shift to the next higher value | |
4981 in the `top-of-ranges' list after counting the number of definitions in | |
4982 the current range. Since each of these actions is repetitive, we can | |
4983 use `while' loops for the job. One loop counts the number of | |
4984 definitions in the range defined by the current top-of-range value, and | |
4985 the other loop selects each of the top-of-range values in turn. | |
4986 | |
4987 Several entries of the `sorted-lengths' list are counted for each | |
4988 range; this means that the loop for the `sorted-lengths' list will be | |
4989 inside the loop for the `top-of-ranges' list, like a small gear inside | |
4990 a big gear. | |
4991 | |
4992 The inner loop counts the number of definitions within the range. It | |
4993 is a simple counting loop of the type we have seen before. (*Note A | |
4994 loop with an incrementing counter: Incrementing Loop.) The | |
4995 true-or-false test of the loop tests whether the value from the | |
4996 `sorted-lengths' list is smaller than the current value of the top of | |
4997 the range. If it is, the function increments the counter and tests the | |
4998 next value from the `sorted-lengths' list. | |
4999 | |
5000 The inner loop looks like this: | |
5001 | |
5002 (while LENGTH-ELEMENT-SMALLER-THAN-TOP-OF-RANGE | |
5003   (setq number-within-range (1+ number-within-range)) | |
5004   (setq sorted-lengths (cdr sorted-lengths))) | |
5005 | |
5006 The outer loop must start with the lowest value of the `top-of-ranges' | |
5007 list, and then be set to each of the succeeding higher values in turn. | |
5008 This can be done with a loop like this: | |
5009 | |
5010 (while top-of-ranges | |
5011   BODY-OF-LOOP... | |
5012   (setq top-of-ranges (cdr top-of-ranges))) | |
5013 | |
5014 Put together, the two loops look like this: | |
5015 | |
5016 (while top-of-ranges | |
5017 | |
5018   ;; Count the number of elements within the current range. | |
5019   (while LENGTH-ELEMENT-SMALLER-THAN-TOP-OF-RANGE | |
5020     (setq number-within-range (1+ number-within-range)) | |
5021     (setq sorted-lengths (cdr sorted-lengths))) | |
5022 | |
5023   ;; Move to next range. | |
5024   (setq top-of-ranges (cdr top-of-ranges))) | |
5025 | |
5026 In addition, in each circuit of the outer loop, Emacs should record the | |
5027 number of definitions within that range (the value of | |
5028 `number-within-range') in a list. We can use `cons' for this purpose. | |
5029 (*Note `cons': cons.) | |
5030 | |
5031 The `cons' function works fine, except that the list it constructs will | |
5032 contain the number of definitions for the highest range at its | |
5033 beginning and the number of definitions for the lowest range at its | |
5034 end. This is because `cons' attaches new elements of the list to the | |
5035 beginning of the list, and since the two loops are working their way | |
5036 through the lengths' list from the lower end first, the | |
5037 `defuns-per-range-list' will end up largest number first. But we will | |
5038 want to print our graph with smallest values first and the larger | |
5039 later. The solution is to reverse the order of the | |
5040 `defuns-per-range-list'. We can do this using the `nreverse' function, | |
5041 which reverses the order of a list. | |
5042 | |
5043 For example, | |
5044 | |
5045 (nreverse '(1 2 3 4)) | |
5046 | |
5047 produces: | |
5048 | |
5049 (4 3 2 1) | |
5050 | |
5051 Note that the `nreverse' function is "destructive"--that is, it changes | |
5052 the list to which it is applied; this contrasts with the `car' and | |
5053 `cdr' functions, which are non-destructive. In this case, we do not | |
5054 want the original `defuns-per-range-list', so it does not matter that | |
5055 it is destroyed. (The `reverse' function provides a reversed copy of a | |
5056 list, leaving the original list as is.) | |
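The non-destructive behavior of `reverse' can be seen in a small sketch.
The variable names `original-list' and `reversed-copy' are invented for
this illustration.

```elisp
;; Hypothetical names, used only in this sketch.
(setq original-list (list 1 2 3 4))
(setq reversed-copy (reverse original-list))

;; `reverse' returns a new list and leaves its argument alone:
;; original-list => (1 2 3 4)
;; reversed-copy => (4 3 2 1)
```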
5057 | |
5058 Put all together, the `defuns-per-range' looks like this: | |
5059 | |
5060 (defun defuns-per-range (sorted-lengths top-of-ranges) | |
5061   "SORTED-LENGTHS defuns in each TOP-OF-RANGES range." | |
5062   (let ((top-of-range (car top-of-ranges)) | |
5063         (number-within-range 0) | |
5064         defuns-per-range-list) | |
5065 | |
5066     ;; Outer loop. | |
5067     (while top-of-ranges | |
5068 | |
5069       ;; Inner loop. | |
5070       (while (and | |
5071               ;; Need number for numeric test. | |
5072               (car sorted-lengths) | |
5073               (< (car sorted-lengths) top-of-range)) | |
5074 | |
5075         ;; Count number of definitions within current range. | |
5076         (setq number-within-range (1+ number-within-range)) | |
5077         (setq sorted-lengths (cdr sorted-lengths))) | |
5078 | |
5079       ;; Exit inner loop but remain within outer loop. | |
5080 | |
5081       (setq defuns-per-range-list | |
5082             (cons number-within-range defuns-per-range-list)) | |
5083       (setq number-within-range 0)      ; Reset count to zero. | |
5084 | |
5085       ;; Move to next range. | |
5086       (setq top-of-ranges (cdr top-of-ranges)) | |
5087       ;; Specify next top of range value. | |
5088       (setq top-of-range (car top-of-ranges))) | |
5089 | |
5090     ;; Exit outer loop and count the number of defuns larger than | |
5091     ;;   the largest top-of-range value. | |
5092     (setq defuns-per-range-list | |
5093           (cons | |
5094            (length sorted-lengths) | |
5095            defuns-per-range-list)) | |
5096 | |
5097     ;; Return a list of the number of definitions within each range, | |
5098     ;;   smallest to largest. | |
5099     (nreverse defuns-per-range-list))) | |
5100 | |
5101 The function is straightforward except for one subtle feature. The | |
5102 true-or-false test of the inner loop looks like this: | |
5103 | |
5104 (and (car sorted-lengths) | |
5105      (< (car sorted-lengths) top-of-range)) | |
5106 | |
5107 instead of like this: | |
5108 | |
5109 (< (car sorted-lengths) top-of-range) | |
5110 | |
5111 The purpose of the test is to determine whether the first item in the | |
5112 `sorted-lengths' list is less than the value of the top of the range. | |
5113 | |
5114 The simple version of the test works fine unless the `sorted-lengths' | |
5115 list has a `nil' value. In that case, the `(car sorted-lengths)' | |
5116 expression returns `nil'. The `<' function cannot compare a | |
5117 number to `nil', which is an empty list, so Emacs signals an error and | |
5118 stops the function from attempting to continue to execute. | |
5119 | |
5120 The `sorted-lengths' list always becomes `nil' when the counter reaches | |
5121 the end of the list. This means that any attempt to use the | |
5122 `defuns-per-range' function with the simple version of the test will | |
5123 fail. | |
5124 | |
5125 We solve the problem by using the `(car sorted-lengths)' expression in | |
5126 conjunction with the `and' expression. The `(car sorted-lengths)' | |
5127 expression returns a non-`nil' value so long as the list has at least | |
5128 one number within it, but returns `nil' if the list is empty. The | |
5129 `and' expression first evaluates the `(car sorted-lengths)' expression, | |
5130 and if it is `nil', returns false _without_ evaluating the `<' | |
5131 expression. But if the `(car sorted-lengths)' expression returns a | |
5132 non-`nil' value, the `and' expression evaluates the `<' expression, and | |
5133 returns that value as the value of the `and' expression. | |
5134 | |
5135 This way, we avoid an error. (*Note The `kill-new' function: kill-new | |
5136 function, for information about `and'.) | |
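The difference between the two tests can be checked directly on an empty
list. In the sketch below, `condition-case' (a special form not otherwise
used in this section) catches the error from the simple test so that both
results can be shown side by side; the variable name `and-test-demo' is
hypothetical.

```elisp
;; `and-test-demo' is a hypothetical name used only in this sketch.
(setq and-test-demo
      (let ((sorted-lengths nil)        ; an exhausted list
            (top-of-range 110))
        (list
         ;; Guarded test: `and' stops at the nil and returns nil.
         (and (car sorted-lengths)
              (< (car sorted-lengths) top-of-range))
         ;; Simple test: `<' signals an error, which we catch and
         ;; report by the name of its error symbol.
         (condition-case err
             (< (car sorted-lengths) top-of-range)
           (wrong-type-argument (car err))))))

;; and-test-demo => (nil wrong-type-argument)
```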
5137 | |
5138 Here is a short test of the `defuns-per-range' function. First, | |
5139 evaluate the expression that binds (a shortened) `top-of-ranges' list | |
5140 to the list of values, then evaluate the expression for binding the | |
5141 `sorted-lengths' list, and then evaluate the `defuns-per-range' | |
5142 function. | |
5143 | |
5144 ;; (Shorter list than we will use later.) | |
5145 (setq top-of-ranges | |
5146       '(110 120 130 140 150 | |
5147         160 170 180 190 200)) | |
5148 | |
5149 (setq sorted-lengths | |
5150       '(85 86 110 116 122 129 154 176 179 200 265 300 300)) | |
5151 | |
5152 (defuns-per-range sorted-lengths top-of-ranges) | |
5153 | |
5154 The list returned looks like this: | |
5155 | |
5156 (2 2 2 0 0 1 0 2 0 0 4) | |
5157 | |
5158 Indeed, there are two elements of the `sorted-lengths' list smaller | |
5159 than 110, two elements between 110 and 119, two elements between 120 | |
5160 and 129, and so on. There are four elements with a value of 200 or | |
5161 larger. | |
5162 | |
5163 | |
5164 File: eintr, Node: Readying a Graph, Next: Emacs Initialization, Prev: Words in a defun, Up: Top | |
5165 | |
5166 15 Readying a Graph | |
5167 ******************* | |
5168 | |
5169 Our goal is to construct a graph showing the numbers of function | |
5170 definitions of various lengths in the Emacs Lisp sources. | |
5171 | |
5172 As a practical matter, if you were creating a graph, you would probably | |
5173 use a program such as `gnuplot' to do the job. (`gnuplot' is nicely | |
5174 integrated into GNU Emacs.) In this case, however, we create one from | |
5175 scratch, and in the process we will re-acquaint ourselves with some of | |
5176 what we learned before and learn more. | |
5177 | |
5178 In this chapter, we will first write a simple graph printing function. | |
5179 This first definition will be a "prototype", a rapidly written function | |
5180 that enables us to reconnoiter this unknown graph-making territory. We | |
5181 will discover dragons, or find that they are myth. After scouting the | |
5182 terrain, we will feel more confident and enhance the function to label | |
5183 the axes automatically. | |
5184 | |
5185 * Menu: | |
5186 | |
5187 * Columns of a graph:: | |
5188 * graph-body-print:: | |
5189 * recursive-graph-body-print:: | |
5190 * Printed Axes:: | |
5191 * Line Graph Exercise:: | |
5192 | |
5193 | |
5194 File: eintr, Node: Columns of a graph, Next: graph-body-print, Prev: Readying a Graph, Up: Readying a Graph | |
5195 | |
5196 Printing the Columns of a Graph | |
5197 =============================== | |
5198 | |
5199 Since Emacs is designed to be flexible and work with all kinds of | |
5200 terminals, including character-only terminals, the graph will need to | |
5201 be made from one of the `typewriter' symbols. An asterisk will do; as | |
5202 we enhance the graph-printing function, we can make the choice of | |
5203 symbol a user option. | |
5204 | |
5205 We can call this function `graph-body-print'; it will take a | |
5206 `numbers-list' as its only argument. At this stage, we will not label | |
5207 the graph, but only print its body. | |
5208 | |
5209 The `graph-body-print' function inserts a vertical column of asterisks | |
5210 for each element in the `numbers-list'. The height of each column is | |
5211 determined by the value of that element of the `numbers-list'. | |
5212 | |
5213 Inserting columns is a repetitive act; that means that this function can | |
5214 be written either with a `while' loop or recursively. | |
5215 | |
5216 Our first challenge is to discover how to print a column of asterisks. | |
5217 Usually, in Emacs, we print characters onto a screen horizontally, line | |
5218 by line, by typing. We have two routes we can follow: write our own | |
5219 column-insertion function or discover whether one exists in Emacs. | |
5220 | |
5221 To see whether there is one in Emacs, we can use the `M-x apropos' | |
5222 command. This command is like the `C-h a' (`command-apropos') command, | |
5223 except that the latter finds only those functions that are commands. | |
5224 The `M-x apropos' command lists all symbols that match a regular | |
5225 expression, including functions that are not interactive. | |
5226 | |
5227 What we want to look for is some command that prints or inserts | |
5228 columns. Very likely, the name of the function will contain either the | |
5229 word `print' or the word `insert' or the word `column'. Therefore, we | |
5230 can simply type `M-x apropos RET print\|insert\|column RET' and look at | |
5231 the result. On my system, this command once took quite some time, | |
5232 and then produced a list of 79 functions and variables. Now it does | |
5233 not take much time at all and produces a list of 211 functions and | |
5234 variables. Scanning down the list, the only function that looks as if | |
5235 it might do the job is `insert-rectangle'. | |
5236 | |
5237 Indeed, this is the function we want; its documentation says: | |
5238 | |
5239 insert-rectangle: | |
5240 Insert text of RECTANGLE with upper left corner at point. | |
5241 RECTANGLE's first line is inserted at point, | |
5242 its second line is inserted at a point vertically under point, etc. | |
5243 RECTANGLE should be a list of strings. | |
5244 After this command, the mark is at the upper left corner | |
5245 and point is at the lower right corner. | |
5246 | |
5247 We can run a quick test, to make sure it does what we expect of it. | |
5248 | |
5249 Here is the result of placing the cursor after the `insert-rectangle' | |
5250 expression and typing `C-u C-x C-e' (`eval-last-sexp'). The function | |
5251 inserts the strings `"first"', `"second"', and `"third"' at and below | |
5252 point. Also the function returns `nil'. | |
5253 | |
5254 (insert-rectangle '("first" "second" "third"))first | |
5255                                               second | |
5256                                               thirdnil | |
5257 | |
5258 Of course, we won't be inserting the text of the `insert-rectangle' | |
5259 expression itself into the buffer in which we are making the graph, but | |
5260 will call the function from our program. We shall, however, have to | |
5261 make sure that point is in the buffer at the place where the | |
5262 `insert-rectangle' function will insert its column of strings. | |
5263 | |
5264 If you are reading this in Info, you can see how this works by | |
5265 switching to another buffer, such as the `*scratch*' buffer, placing | |
5266 point somewhere in the buffer, typing `M-:', typing the | |
5267 `insert-rectangle' expression into the minibuffer at the prompt, and | |
5268 then typing <RET>. This causes Emacs to evaluate the expression in the | |
5269 minibuffer, but to use as the value of point the position of point in | |
5270 the `*scratch*' buffer. (`M-:' is the keybinding for | |
5271 `eval-expression'. Also, `nil' does not appear in the `*scratch*' | |
5272 buffer since the expression is evaluated in the minibuffer.) | |
5273 | |
5274 We find when we do this that point ends up at the end of the last | |
5275 inserted line--that is to say, this function moves point as a | |
5276 side-effect. If we were to repeat the command, with point at this | |
5277 position, the next insertion would be below and to the right of the | |
5278 previous insertion. We don't want this! If we are going to make a bar | |
5279 graph, the columns need to be beside each other. | |
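The same observation can be made without touching a visible buffer. The
sketch below runs `insert-rectangle' in a temporary buffer and records
both the resulting text and whether point ended up at the end of it. The
variable name `rectangle-demo' is hypothetical.

```elisp
;; `rectangle-demo' is a hypothetical name used only in this sketch.
(setq rectangle-demo
      (with-temp-buffer
        (insert-rectangle '("first" "second" "third"))
        ;; Record the buffer text and whether point is at the end.
        (list (buffer-string)
              (= (point) (point-max)))))

;; rectangle-demo => ("first\nsecond\nthird" t)
```

The second element confirms that point is left after the last inserted
string, which is why the next column cannot simply be inserted from
wherever point happens to be.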
5280 | |
5281 So we discover that each cycle of the column-inserting `while' loop | |
5282 must reposition point to the place we want it, and that place will be | |
5283 at the top, not the bottom, of the column. Moreover, we remember that | |
5284 when we print a graph, we do not expect all the columns to be the same | |
5285 height. This means that the top of each column may be at a different | |
5286 height from the previous one. We cannot simply reposition point to the | |
5287 same line each time, but moved over to the right--or perhaps we can... | |
5288 | |
5289 We are planning to make the columns of the bar graph out of asterisks. | |
5290 The number of asterisks in the column is the number specified by the | |
5291 current element of the `numbers-list'. We need to construct a list of | |
5292 asterisks of the right length for each call to `insert-rectangle'. If | |
5293 this list consists solely of the requisite number of asterisks, then we | |
5294 will have to position point the right number of lines above the base for | |
5295 the graph to print correctly. This could be difficult. | |
5296 | |
5297 Alternatively, if we can figure out some way to pass `insert-rectangle' | |
5298 a list of the same length each time, then we can place point on the | |
5299 same line each time, but move it over one column to the right for each | |
5300 new column. If we do this, however, some of the entries in the list | |
5301 passed to `insert-rectangle' must be blanks rather than asterisks. For | |
5302 example, if the maximum height of the graph is 5, but the height of the | |
5303 column is 3, then `insert-rectangle' requires an argument that looks | |
5304 like this: | |
5305 | |
5306 (" " " " "*" "*" "*") | |
5307 | |
5308 This last proposal is not so difficult, so long as we can determine the | |
5309 column height. There are two ways for us to specify the column height: | |
5310 we can arbitrarily state what it will be, which would work fine for | |
5311 graphs of that height; or we can search through the list of numbers and | |
5312 use the maximum height of the list as the maximum height of the graph. | |
5313 If the latter operation were difficult, then the former procedure would | |
5314 be easiest, but there is a function built into Emacs that determines | |
5315 the maximum of its arguments. We can use that function. The function | |
5316 is called `max' and it returns the largest of all its arguments, which | |
5317 must be numbers. Thus, for example, | |
5318 | |
5319 (max 3 4 6 5 7 3) | |
5320 | |
5321 returns 7. (A corresponding function called `min' returns the smallest | |
5322 of all its arguments.) | |
5323 | |
5324 However, we cannot simply call `max' on the `numbers-list'; the `max' | |
5325 function expects numbers as its argument, not a list of numbers. Thus, | |
5326 the following expression, | |
5327 | |
5328 (max '(3 4 6 5 7 3)) | |
5329 | |
5330 produces the following error message: | |
5331 | |
5332 Wrong type argument: number-or-marker-p, (3 4 6 5 7 3) | |
5333 | |
5334 We need a function that passes a list of arguments to a function. This | |
5335 function is `apply'. This function `applies' its first argument (a | |
5336 function) to its remaining arguments, the last of which may be a list. | |
5337 | |
5338 For example, | |
5339 | |
5340 (apply 'max 3 4 7 3 '(4 8 5)) | |
5341 | |
5342 returns 8. | |
5343 | |
5344 (Incidentally, I don't know how you would learn of this function | |
5345 without a book such as this. It is possible to discover other | |
5346 functions, like `search-forward' or `insert-rectangle', by guessing at | |
5347 a part of their names and then using `apropos'. Even though its base | |
5348 in metaphor is clear--`apply' its first argument to the rest--I doubt a | |
5349 novice would come up with that particular word when using `apropos' or | |
5350 other aid. Of course, I could be wrong; after all, the function was | |
5351 first named by someone who had to invent it.) | |
5352 | |
5353 The second and subsequent arguments to `apply' are optional, so we can | |
5354 use `apply' to call a function and pass the elements of a list to it, | |
5355 like this, which also returns 8: | |
5356 | |
5357 (apply 'max '(4 8 5)) | |
5358 | |
5359 This latter way is how we will use `apply'. The | |
5360 `recursive-lengths-list-many-files' function returns a numbers' list to | |
5361 which we can apply `max' (we could also apply `max' to the sorted | |
5362 numbers' list; it does not matter whether the list is sorted or not). | |
5363 | |
5364 Hence, the operation for finding the maximum height of the graph is | |
5365 this: | |
5366 | |
5367 (setq max-graph-height (apply 'max numbers-list)) | |
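With a concrete list in place of `numbers-list', the operation looks like
this. The names `sample-numbers-list' and `sample-max-graph-height' are
invented for the illustration; the numbers are the ones used with `max'
earlier in this section.

```elisp
;; Hypothetical names, used only in this sketch.
(setq sample-numbers-list '(3 4 6 5 7 3))

;; `apply' spreads the list out as separate arguments to `max'.
(setq sample-max-graph-height (apply 'max sample-numbers-list))

;; sample-max-graph-height => 7
```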
5368 | |
5369 Now we can return to the question of how to create a list of strings | |
5370 for a column of the graph. Told the maximum height of the graph and | |
5371 the number of asterisks that should appear in the column, the function | |
5372 should return a list of strings for the `insert-rectangle' command to | |
5373 insert. | |
5374 | |
5375 Each column is made up of asterisks or blanks. Since the function is | |
5376 passed the value of the height of the column and the number of | |
5377 asterisks in the column, the number of blanks can be found by | |
5378 subtracting the number of asterisks from the height of the column. | |
5379 Given the number of blanks and the number of asterisks, two `while' | |
5380 loops can be used to construct the list: | |
5381 | |
5382 ;;; First version. | |
5383 (defun column-of-graph (max-graph-height actual-height) | |
5384   "Return list of strings that is one column of a graph." | |
5385   (let ((insert-list nil) | |
5386         (number-of-top-blanks | |
5387          (- max-graph-height actual-height))) | |
5388 | |
5389     ;; Fill in asterisks. | |
5390     (while (> actual-height 0) | |
5391       (setq insert-list (cons "*" insert-list)) | |
5392       (setq actual-height (1- actual-height))) | |
5393 | |
5394     ;; Fill in blanks. | |
5395     (while (> number-of-top-blanks 0) | |
5396       (setq insert-list (cons " " insert-list)) | |
5397       (setq number-of-top-blanks | |
5398             (1- number-of-top-blanks))) | |
5399 | |
5400     ;; Return whole list. | |
5401     insert-list)) | |
5402 | |
5403 If you install this function and then evaluate the following expression | |
5404 you will see that it returns the list as desired: | |
5405 | |
5406 (column-of-graph 5 3) | |
5407 | |
5408 returns | |
5409 | |
5410 (" " " " "*" "*" "*") | |
5411 | |
5412 As written, `column-of-graph' contains a major flaw: the symbols used | |
5413 for the blank and for the marked entries in the column are `hard-coded' | |
5414 as a space and asterisk. This is fine for a prototype, but you, or | |
5415 another user, may wish to use other symbols. For example, in testing | |
5416 the graph function, you may want to use a period in place of the | |
5417 space, to make sure the point is being repositioned properly each time | |
5418 the `insert-rectangle' function is called; or you might want to | |
5419 substitute a `+' sign or other symbol for the asterisk. You might even | |
5420 want to make a graph-column that is more than one display column wide. | |
5421 The program should be more flexible. The way to do that is to replace | |
5422 the blank and the asterisk with two variables that we can call | |
5423 `graph-blank' and `graph-symbol' and define those variables separately. | |
5424 | |
5425 Also, the documentation is not well written. These considerations lead | |
5426 us to the second version of the function: | |
5427 | |
5428 (defvar graph-symbol "*" | |
5429   "String used as symbol in graph, usually an asterisk.") | |
5430 | |
5431 (defvar graph-blank " " | |
5432   "String used as blank in graph, usually a blank space. | |
5433 graph-blank must be the same number of columns wide | |
5434 as graph-symbol.") | |
5435 | |
5436 (For an explanation of `defvar', see *Note Initializing a Variable with | |
5437 `defvar': defvar.) | |
5438 | |
5439 ;;; Second version. | |
5440 (defun column-of-graph (max-graph-height actual-height) | |
5441   "Return MAX-GRAPH-HEIGHT strings; ACTUAL-HEIGHT are graph-symbols. | |
5442 The graph-symbols are contiguous entries at the end | |
5443 of the list. | |
5444 The list will be inserted as one column of a graph. | |
5445 The strings are either graph-blank or graph-symbol." | |
5446 | |
5447   (let ((insert-list nil) | |
5448         (number-of-top-blanks | |
5449          (- max-graph-height actual-height))) | |
5450 | |
5451     ;; Fill in `graph-symbols'. | |
5452     (while (> actual-height 0) | |
5453       (setq insert-list (cons graph-symbol insert-list)) | |
5454       (setq actual-height (1- actual-height))) | |
5455 | |
5456     ;; Fill in `graph-blanks'. | |
5457     (while (> number-of-top-blanks 0) | |
5458       (setq insert-list (cons graph-blank insert-list)) | |
5459       (setq number-of-top-blanks | |
5460             (1- number-of-top-blanks))) | |
5461 | |
5462     ;; Return whole list. | |
5463     insert-list)) | |
5464 | |
5465 If we wished, we could rewrite `column-of-graph' a third time to | |
5466 provide optionally for a line graph as well as for a bar graph. This | |
5467 would not be hard to do. One way to think of a line graph is that it | |
5468 is no more than a bar graph in which the part of each bar that is below | |
5469 the top is blank. To construct a column for a line graph, the function | |
5470 first constructs a list of blanks that is one shorter than the value, | |
5471 then it uses `cons' to attach a graph symbol to the list; then it uses | |
5472 `cons' again to attach the `top blanks' to the list. | |
5473 | |
5474 It is easy to see how to write such a function, but since we don't need | |
5475 it, we will not do it. But the job could be done, and if it were done, | |
5476 it would be done with `column-of-graph'. Even more important, it is | |
5477 worth noting that few changes would have to be made anywhere else. The | |
5478 enhancement, if we ever wish to make it, is simple. | |
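For the curious, here is one way the line-graph column described above
might be sketched. It is written as a separate, hypothetical function,
`column-of-line-graph', rather than as a third version of
`column-of-graph', and it is not used anywhere else in this chapter. The
two `defvar' forms repeat the defaults given earlier so that the sketch
can be evaluated on its own.

```elisp
;; Same defaults as the `defvar's earlier in this section.
(defvar graph-symbol "*")
(defvar graph-blank " ")

;; Hypothetical function; an illustration, not part of the program.
(defun column-of-line-graph (max-graph-height actual-height)
  "Return MAX-GRAPH-HEIGHT strings with one graph-symbol at ACTUAL-HEIGHT.
All the other strings in the list are graph-blank."
  (let ((insert-list nil)
        (number-of-lower-blanks (1- actual-height))
        (number-of-top-blanks
         (- max-graph-height actual-height)))

    ;; Blanks below the symbol: one fewer than the value.
    (while (> number-of-lower-blanks 0)
      (setq insert-list (cons graph-blank insert-list))
      (setq number-of-lower-blanks (1- number-of-lower-blanks)))

    ;; The single graph symbol, unless the value is zero.
    (if (> actual-height 0)
        (setq insert-list (cons graph-symbol insert-list)))

    ;; The `top blanks'.
    (while (> number-of-top-blanks 0)
      (setq insert-list (cons graph-blank insert-list))
      (setq number-of-top-blanks (1- number-of-top-blanks)))

    ;; Return whole list.
    insert-list))

(column-of-line-graph 5 3)      ; => (" " " " "*" " " " ")
```

The returned list has the same length as the one `column-of-graph'
returns for the same arguments, which is why little else would need to
change.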
5479 | |
5480 Now, finally, we come to our first actual graph printing function. | |
5481 This prints the body of a graph, not the labels for the vertical and | |
5482 horizontal axes, so we can call this `graph-body-print'. | |
5483 | |
5484 | |
5485 File: eintr, Node: graph-body-print, Next: recursive-graph-body-print, Prev: Columns of a graph, Up: Readying a Graph | |
5486 | |
5487 15.1 The `graph-body-print' Function | |
5488 ==================================== | |
5489 | |
5490 After our preparation in the preceding section, the `graph-body-print' | |
5491 function is straightforward. The function will print column after | |
5492 column of asterisks and blanks, using the elements of a numbers' list | |
5493 to specify the number of asterisks in each column. This is a | |
5494 repetitive act, which means we can use a decrementing `while' loop or | |
5495 recursive function for the job. In this section, we will write the | |
5496 definition using a `while' loop. | |
5497 | |
5498 The `column-of-graph' function requires the height of the graph as an | |
5499 argument, so we should determine and record that as a local variable. | |
5500 | |
5501 This leads us to the following template for the `while' loop version of | |
5502 this function: | |
5503 | |
5504 (defun graph-body-print (numbers-list) | |
5505 "DOCUMENTATION..." | |
5506 (let ((height ... | |
5507 ...)) | |
5508 | |
5509 (while numbers-list | |
5510 INSERT-COLUMNS-AND-REPOSITION-POINT | |
5511 (setq numbers-list (cdr numbers-list))))) | |
5512 | |
5513 We need to fill in the slots of the template. | |
5514 | |
5515 Clearly, we can use the `(apply 'max numbers-list)' expression to | |
5516 determine the height of the graph. | |
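
As a quick check of that expression, here is a minimal sketch using the
same list of numbers that is used to test `graph-body-print' in this
section. The `max' function takes its numbers as separate arguments,
so `apply' is used to hand it the elements of the list.

```elisp
;; `max' wants numbers as separate arguments, not as one list,
;; so `apply' spreads the list out for it.
(apply 'max '(1 2 3 4 6 4 3 5 7 6 5 2 3))   ; the tallest column: 7

;; The call above is equivalent to writing the numbers out by hand:
(max 1 2 3 4 6 4 3 5 7 6 5 2 3)
```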
5517 | |
5518 The `while' loop will cycle through the `numbers-list' one element at a | |
5519 time. As it is shortened by the `(setq numbers-list (cdr | |
5520 numbers-list))' expression, the CAR of each instance of the list is the | |
5521 value of the argument for `column-of-graph'. | |
5522 | |
5523 At each cycle of the `while' loop, the `insert-rectangle' function | |
5524 inserts the list returned by `column-of-graph'. Since the | |
5525 `insert-rectangle' function moves point to the lower right of the | |
5526 inserted rectangle, we need to save the location of point at the time | |
5527 the rectangle is inserted, move back to that position after the | |
5528 rectangle is inserted, and then move horizontally to the next place | |
5529 from which `insert-rectangle' is called. | |
5530 | |
5531 If the inserted columns are one character wide, as they will be if | |
5532 single blanks and asterisks are used, the repositioning command is | |
5533 simply `(forward-char 1)'; however, the width of a column may be | |
5534 greater than one. This means that the repositioning command should be | |
5535 written `(forward-char symbol-width)'. The `symbol-width' itself is | |
5536 the length of a `graph-blank' and can be found using the expression | |
5537 `(length graph-blank)'. The best place to bind the `symbol-width' | |
5538 variable to the value of the width of graph column is in the varlist of | |
5539 the `let' expression. | |
5540 | |
5541 These considerations lead to the following function definition: | |
5542 | |
5543 (defun graph-body-print (numbers-list) | |
5544 "Print a bar graph of the NUMBERS-LIST. | |
5545 The numbers-list consists of the Y-axis values." | |
5546 | |
5547 (let ((height (apply 'max numbers-list)) | |
5548 (symbol-width (length graph-blank)) | |
5549 from-position) | |
5550 | |
5551 (while numbers-list | |
5552 (setq from-position (point)) | |
5553 (insert-rectangle | |
5554 (column-of-graph height (car numbers-list))) | |
5555 (goto-char from-position) | |
5556 (forward-char symbol-width) | |
5557 ;; Draw graph column by column. | |
5558 (sit-for 0) | |
5559 (setq numbers-list (cdr numbers-list))) | |
5560 ;; Place point for X axis labels. | |
5561 (forward-line height) | |
5562 (insert "\n") | |
5563 )) | |
5564 | |
5565 The one unexpected expression in this function is the `(sit-for 0)' | |
5566 expression in the `while' loop. This expression makes the graph | |
5567 printing operation more interesting to watch than it would be | |
5568 otherwise. The expression causes Emacs to `sit' or do nothing for a | |
5569 zero length of time and then redraw the screen. Placed here, it causes | |
5570 Emacs to redraw the screen column by column. Without it, Emacs would | |
5571 not redraw the screen until the function exits. | |
5572 | |
5573 We can test `graph-body-print' with a short list of numbers. | |
5574 | |
5575 1. Install `graph-symbol', `graph-blank', `column-of-graph', which | |
5576 are in *Note Columns of a graph::, and `graph-body-print'. | |
5577 | |
5578 2. Copy the following expression: | |
5579 | |
5580 (graph-body-print '(1 2 3 4 6 4 3 5 7 6 5 2 3)) | |
5581 | |
5582 3. Switch to the `*scratch*' buffer and place the cursor where you | |
5583 want the graph to start. | |
5584 | |
5585 4. Type `M-:' (`eval-expression'). | |
5586 | |
5587 5. Yank the `graph-body-print' expression into the minibuffer with | |
5588 `C-y' (`yank'). | |
5589 | |
5590 6. Press <RET> to evaluate the `graph-body-print' expression. | |
5591 | |
5592 Emacs will print a graph like this: | |
5593 | |
5594 * | |
5595 * ** | |
5596 * **** | |
5597 *** **** | |
5598 ********* * | |
5599 ************ | |
5600 ************* | |
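
If you would rather get the graph back as a string than watch it being
drawn in `*scratch*', the following sketch runs the same code in a
temporary buffer. The name `my-graph-body-as-string' is hypothetical
and not part of the original text. The definitions of `graph-symbol',
`graph-blank', `column-of-graph', and `graph-body-print' are repeated
from the text only so that the example stands alone.

```elisp
(defvar graph-symbol "*" "String used as symbol in graph.")
(defvar graph-blank " " "String used as blank in graph.")

(defun column-of-graph (max-graph-height actual-height)
  "Return MAX-GRAPH-HEIGHT strings; ACTUAL-HEIGHT are graph-symbols."
  (let ((insert-list nil)
        (number-of-top-blanks (- max-graph-height actual-height)))
    ;; Fill in the graph symbols, bottom of the column first.
    (while (> actual-height 0)
      (setq insert-list (cons graph-symbol insert-list))
      (setq actual-height (1- actual-height)))
    ;; Fill in the blanks above the symbols.
    (while (> number-of-top-blanks 0)
      (setq insert-list (cons graph-blank insert-list))
      (setq number-of-top-blanks (1- number-of-top-blanks)))
    insert-list))

(defun graph-body-print (numbers-list)
  "Print a bar graph of the NUMBERS-LIST.
The numbers-list consists of the Y-axis values."
  (let ((height (apply 'max numbers-list))
        (symbol-width (length graph-blank))
        from-position)
    (while numbers-list
      (setq from-position (point))
      (insert-rectangle
       (column-of-graph height (car numbers-list)))
      (goto-char from-position)
      (forward-char symbol-width)
      (sit-for 0)                       ; Draw graph column by column.
      (setq numbers-list (cdr numbers-list)))
    ;; Place point for X axis labels.
    (forward-line height)
    (insert "\n")))

;; Hypothetical helper: draw the graph in a scratch-like temporary
;; buffer and return the buffer contents instead of displaying them.
(defun my-graph-body-as-string (numbers-list)
  "Return the bar graph of NUMBERS-LIST as a single string."
  (with-temp-buffer
    (graph-body-print numbers-list)
    (buffer-string)))

(my-graph-body-as-string '(1 2 3 4 6 4 3 5 7 6 5 2 3))
```

Because the helper returns a string, the result can be inspected or
compared in a test without touching any buffer you are editing.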
5601 | |
5602 | |
5603 File: eintr, Node: recursive-graph-body-print, Next: Printed Axes, Prev: graph-body-print, Up: Readying a Graph | |
5604 | |
5605 15.2 The `recursive-graph-body-print' Function | |
5606 ============================================== | |
5607 | |
5608 The `graph-body-print' function may also be written recursively. The | |
5609 recursive solution is divided into two parts: an outside `wrapper' that | |
5610 uses a `let' expression to determine the values of several variables | |
5611 that need only be found once, such as the maximum height of the graph, | |
5612 and an inside function that is called recursively to print the graph. | |
5613 | |
5614 The `wrapper' is uncomplicated: | |
5615 | |
5616 (defun recursive-graph-body-print (numbers-list) | |
5617 "Print a bar graph of the NUMBERS-LIST. | |
5618 The numbers-list consists of the Y-axis values." | |
5619 (let ((height (apply 'max numbers-list)) | |
5620 (symbol-width (length graph-blank)) | |
5621 from-position) | |
5622 (recursive-graph-body-print-internal | |
5623 numbers-list | |
5624 height | |
5625 symbol-width))) | |
5626 | |
5627 The recursive function is a little more difficult. It has four parts: | |
5628 the `do-again-test', the printing code, the recursive call, and the | |
5629 `next-step-expression'. The `do-again-test' is an `if' expression that | |
5630 determines whether the `numbers-list' contains any remaining elements; | |
5631 if it does, the function prints one column of the graph using the | |
5632 printing code and calls itself again. The function calls itself again | |
5633 according to the value produced by the `next-step-expression' which | |
5634 causes the call to act on a shorter version of the `numbers-list'. | |
5635 | |
5636 (defun recursive-graph-body-print-internal | |
5637 (numbers-list height symbol-width) | |
5638 "Print a bar graph. | |
5639 Used within recursive-graph-body-print function." | |
5640 | |
5641 (if numbers-list | |
5642 (progn | |
5643 (setq from-position (point)) | |
5644 (insert-rectangle | |
5645 (column-of-graph height (car numbers-list))) | |
5646 (goto-char from-position) | |
5647 (forward-char symbol-width) | |
5648 (sit-for 0) ; Draw graph column by column. | |
5649 (recursive-graph-body-print-internal | |
5650 (cdr numbers-list) height symbol-width)))) | |
5651 | |
5652 After installation, this expression can be tested; here is a sample: | |
5653 | |
5654 (recursive-graph-body-print '(3 2 5 6 7 5 3 4 6 4 3 2 1)) | |
5655 | |
5656 Here is what `recursive-graph-body-print' produces: | |
5657 | |
5658 * | |
5659 ** * | |
5660 **** * | |
5661 **** *** | |
5662 * ********* | |
5663 ************ | |
5664 ************* | |
5665 | |
5666 Either of these two functions, `graph-body-print' or | |
5667 `recursive-graph-body-print', creates the body of a graph. | |
5668 | |
5669 | |
5670 File: eintr, Node: Printed Axes, Next: Line Graph Exercise, Prev: recursive-graph-body-print, Up: Readying a Graph | |
5671 | |
5672 15.3 Need for Printed Axes | |
5673 ========================== | |
5674 | |
5675 A graph needs printed axes, so you can orient yourself. For a do-once | |
5676 project, it may be reasonable to draw the axes by hand using Emacs' | |
5677 Picture mode; but a graph drawing function may be used more than once. | |
5678 | |
5679 For this reason, I have written enhancements to the basic | |
5680 `graph-body-print' function that automatically print labels for the | |
5681 horizontal and vertical axes. Since the label printing functions do | |
5682 not contain much new material, I have placed their description in an | |
5683 appendix. *Note A Graph with Labelled Axes: Full Graph. | |
5684 | |
5685 | |
5686 File: eintr, Node: Line Graph Exercise, Prev: Printed Axes, Up: Readying a Graph | |
5687 | |
5688 15.4 Exercise | |
5689 ============= | |
5690 | |
5691 Write a line graph version of the graph printing functions. | |
5692 | |
5693 | |
5694 File: eintr, Node: Emacs Initialization, Next: Debugging, Prev: Readying a Graph, Up: Top | |
5695 | |
5696 16 Your `.emacs' File | |
5697 ********************* | |
5698 | |
5699 "You don't have to like Emacs to like it" - this seemingly paradoxical | |
5700 statement is the secret of GNU Emacs. The plain, `out of the box' | |
5701 Emacs is a generic tool. Most people who use it customize it to suit | |
5702 themselves. | |
5703 | |
5704 GNU Emacs is mostly written in Emacs Lisp; this means that by writing | |
5705 expressions in Emacs Lisp you can change or extend Emacs. | |
5706 | |
5707 * Menu: | |
5708 | |
5709 * Default Configuration:: | |
5710 * Site-wide Init:: | |
5711 * defcustom:: | |
5712 * Beginning a .emacs File:: | |
5713 * Text and Auto-fill:: | |
5714 * Mail Aliases:: | |
5715 * Indent Tabs Mode:: | |
5716 * Keybindings:: | |
5717 * Keymaps:: | |
5718 * Loading Files:: | |
5719 * Autoload:: | |
5720 * Simple Extension:: | |
5721 * X11 Colors:: | |
5722 * Miscellaneous:: | |
5723 * Mode Line:: | |
5724 | |
5725 | |
5726 File: eintr, Node: Default Configuration, Next: Site-wide Init, Prev: Emacs Initialization, Up: Emacs Initialization | |
5727 | |
5728 Emacs' Default Configuration | |
5729 ============================ | |
5730 | |
5731 There are those who appreciate Emacs' default configuration. After | |
5732 all, Emacs starts you in C mode when you edit a C file, starts you in | |
5733 Fortran mode when you edit a Fortran file, and starts you in | |
5734 Fundamental mode when you edit an unadorned file. This all makes | |
5735 sense, if you do not know who is going to use Emacs. Who knows what a | |
5736 person hopes to do with an unadorned file? Fundamental mode is the | |
5737 right default for such a file, just as C mode is the right default for | |
5738 editing C code. (Enough programming languages have syntaxes that | |
5739 enable them to share or nearly share features, so C mode is now | |
5740 provided by CC mode, the `C Collection'.) | |
5741 | |
5742 But when you do know who is going to use Emacs--you, yourself--then it | |
5743 makes sense to customize Emacs. | |
5744 | |
5745 For example, I seldom want Fundamental mode when I edit an otherwise | |
5746 undistinguished file; I want Text mode. This is why I customize Emacs: | |
5747 so it suits me. | |
5748 | |
5749 You can customize and extend Emacs by writing or adapting a `~/.emacs' | |
5750 file. This is your personal initialization file; its contents, written | |
5751 in Emacs Lisp, tell Emacs what to do.(1) | |
5752 | |
5753 A `~/.emacs' file contains Emacs Lisp code. You can write this code | |
5754 yourself; or you can use Emacs' `customize' feature to write the code | |
5755 for you. You can combine your own expressions and auto-written | |
5756 Customize expressions in your `.emacs' file. | |
5757 | |
5758 (I myself prefer to write my own expressions, except for those, | |
5759 particularly fonts, that I find easier to manipulate using the | |
5760 `customize' command. I combine the two methods.) | |
5761 | |
5762 Most of this chapter is about writing expressions yourself. It | |
5763 describes a simple `.emacs' file; for more information, see *Note The | |
5764 Init File: (emacs)Init File, and *Note The Init File: (elisp)Init File. | |
5765 | |
5766 ---------- Footnotes ---------- | |
5767 | |
5768 (1) You may also add `.el' to `~/.emacs' and call it a `~/.emacs.el' | |
5769 file. In the past, you were forbidden to type the extra keystrokes | |
5770 that the name `~/.emacs.el' requires, but now you may. The new format | |
5771 is consistent with the Emacs Lisp file naming conventions; the old | |
5772 format saves typing. | |
5773 | |
5774 | |
5775 File: eintr, Node: Site-wide Init, Next: defcustom, Prev: Default Configuration, Up: Emacs Initialization | |
5776 | |
5777 16.1 Site-wide Initialization Files | |
5778 =================================== | |
5779 | |
5780 In addition to your personal initialization file, Emacs automatically | |
5781 loads various site-wide initialization files, if they exist. These | |
5782 have the same form as your `.emacs' file, but are loaded by everyone. | |
5783 | |
5784 Two site-wide initialization files, `site-load.el' and `site-init.el', | |
5785 are loaded into Emacs and then `dumped' if a `dumped' version of Emacs | |
5786 is created, as is most common. (Dumped copies of Emacs load more | |
5787 quickly. However, once a file is loaded and dumped, a change to it | |
5788 does not lead to a change in Emacs unless you load it yourself or | |
5789 re-dump Emacs. *Note Building Emacs: (elisp)Building Emacs, and the | |
5790 `INSTALL' file.) | |
5791 | |
5792 Three other site-wide initialization files are loaded automatically | |
5793 each time you start Emacs, if they exist. These are `site-start.el', | |
5794 which is loaded _before_ your `.emacs' file, and `default.el', and the | |
5795 terminal type file, which are both loaded _after_ your `.emacs' file. | |
5796 | |
5797 Settings and definitions in your `.emacs' file will overwrite | |
5798 conflicting settings and definitions in a `site-start.el' file, if it | |
5799 exists; but the settings and definitions in a `default.el' or terminal | |
5800 type file will overwrite those in your `.emacs' file. (You can prevent | |
5801 interference from a terminal type file by setting `term-file-prefix' to | |
5802 `nil'. *Note A Simple Extension: Simple Extension.) | |
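
In a `.emacs' file, the `term-file-prefix' setting is a one-line
`setq'. The second expression in this sketch goes beyond what this
section describes: Emacs also provides the variable
`inhibit-default-init', which, when set to a non-`nil' value in your
`.emacs' file, keeps `default.el' from being loaded.

```elisp
;; Do not load a terminal type file after `.emacs'.
(setq term-file-prefix nil)

;; Not mentioned in this section: also skip `default.el'.
(setq inhibit-default-init t)
```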
5803 | |
5804 The `INSTALL' file that comes in the distribution contains descriptions | |
5805 of the `site-init.el' and `site-load.el' files. | |
5806 | |
5807 The `loadup.el', `startup.el', and `loaddefs.el' files control loading. | |
5808 These files are in the `lisp' directory of the Emacs distribution and | |
5809 are worth perusing. | |
5810 | |
5811 The `loaddefs.el' file contains a good many suggestions as to what to | |
5812 put into your own `.emacs' file, or into a site-wide initialization | |
5813 file. | |
5814 | |
5815 | |
5816 File: eintr, Node: defcustom, Next: Beginning a .emacs File, Prev: Site-wide Init, Up: Emacs Initialization | |
5817 | |
5818 16.2 Specifying Variables using `defcustom' | |
5819 =========================================== | |
5820 | |
5821 You can specify variables using `defcustom' so that you and others can | |
5822 then use Emacs' `customize' feature to set their values. (You cannot | |
5823 use `customize' to write function definitions; but you can write | |
5824 `defuns' in your `.emacs' file. Indeed, you can write any Lisp | |
5825 expression in your `.emacs' file.) | |
5826 | |
5827 The `customize' feature depends on the `defcustom' macro. | |
5828 Although you can use `defvar' or `setq' for variables that users set, | |
5829 the `defcustom' macro is designed for the job. | |
5830 | |
5831 You can use your knowledge of `defvar' for writing the first three | |
5832 arguments for `defcustom'. The first argument to `defcustom' is the | |
5833 name of the variable. The second argument is the variable's initial | |
5834 value, if any; and this value is set only if the value has not already | |
5835 been set. The third argument is the documentation. | |
5836 | |
5837 The fourth and subsequent arguments to `defcustom' specify types and | |
5838 options; these are not featured in `defvar'. (These arguments are | |
5839 optional.) | |
5840 | |
5841 Each of these arguments consists of a keyword followed by a value. | |
5842 Each keyword starts with the colon character `:'. | |
5843 | |
5844 For example, the customizable user option variable `text-mode-hook' | |
5845 looks like this: | |
5846 | |
5847 (defcustom text-mode-hook nil | |
5848 "Normal hook run when entering Text mode and many related modes." | |
5849 :type 'hook | |
5850 :options '(turn-on-auto-fill flyspell-mode) | |
5851 :group 'data) | |
5852 | |
5853 The name of the variable is `text-mode-hook'; it has no default value; | |
5854 and its documentation string tells you what it does. | |
5855 | |
5856 The `:type' keyword tells Emacs the kind of data to which | |
5857 `text-mode-hook' should be set and how to display the value in a | |
5858 Customization buffer. | |
5859 | |
5860 The `:options' keyword specifies a suggested list of values for the | |
5861 variable. Currently, you can use `:options' only for a hook. The list | |
5862 is only a suggestion; it is not exclusive; a person who sets the | |
5863 variable may set it to other values; the list shown following the | |
5864 `:options' keyword is intended to offer convenient choices to a user. | |
5865 | |
5866 Finally, the `:group' keyword tells the Emacs Customization command in | |
5867 which group the variable is located. This tells where to find it. | |
5868 | |
5869 For more information, see *Note Writing Customization Definitions: | |
5870 (elisp)Customization. | |
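
To see the same pieces in a definition of your own, here is a minimal
sketch. The group `my-graph' and the option `my-graph-symbol' are
hypothetical names invented for this example; `defgroup' is used so
that the `:group' keyword has a group to point to.

```elisp
;; A hypothetical customization group to hold the option below.
(defgroup my-graph nil
  "Options for a home-made graph printing package."
  :group 'convenience)

;; First three arguments as for `defvar': name, initial value, doc.
;; Then keyword/value pairs that `defvar' does not accept.
(defcustom my-graph-symbol "*"
  "String used to draw the bars of a graph."
  :type 'string          ; how Customize should display and check it
  :group 'my-graph)      ; where Customize should list it
```

After evaluating both expressions, `M-x customize-group <RET> my-graph
<RET>' should list the new option.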
5871 | |
5872 Consider `text-mode-hook' as an example. | |
5873 | |
5874 There are two ways to customize this variable. You can use the | |
5875 customization command or write the appropriate expressions yourself. | |
5876 | |
5877 Using the customization command, you can type: | |
5878 | |
5879 M-x customize | |
5880 | |
5881 and find that the group for editing files of data is called `data'. | |
5882 Enter that group. Text Mode Hook is the first member. You can click | |
5883 on its various options, such as `turn-on-auto-fill', to set the values. | |
5884 After you click on the button to | |
5885 | |
5886 Save for Future Sessions | |
5887 | |
5888 Emacs will write an expression into your `.emacs' file. It will look | |
5889 like this: | |
5890 | |
5891 (custom-set-variables | |
5892 ;; custom-set-variables was added by Custom. | |
5893 ;; If you edit it by hand, you could mess it up, so be careful. | |
5894 ;; Your init file should contain only one such instance. | |
5895 ;; If there is more than one, they won't work right. | |
5896 '(text-mode-hook (quote (turn-on-auto-fill text-mode-hook-identify)))) | |
5897 | |
5898 (The `text-mode-hook-identify' function tells | |
5899 `toggle-text-mode-auto-fill' which buffers are in Text mode. It comes | |
5900 on automatically.) | |
5901 | |
5902 The `custom-set-variables' function works somewhat differently than a | |
5903 `setq'. While I have never learned the differences, I modify the | |
5904 `custom-set-variables' expressions in my `.emacs' file by hand: I make | |
5905 the changes in what appears to me to be a reasonable manner and have | |
5906 not had any problems. Others prefer to use the Customization command | |
5907 and let Emacs do the work for them. | |
5908 | |
5909 Another `custom-set-...' function is `custom-set-faces'. This function | |
5910 sets the various font faces. Over time, I have set a considerable | |
5911 number of faces. Some of the time, I re-set them using `customize'; | |
5912 other times, I simply edit the `custom-set-faces' expression in my | |
5913 `.emacs' file itself. | |
5914 | |
5915 The second way to customize your `text-mode-hook' is to set it yourself | |
5916 in your `.emacs' file using code that has nothing to do with the | |
5917 `custom-set-...' functions. | |
5918 | |
5919 When you do this, and later use `customize', you will see a message | |
5920 that says | |
5921 | |
5922 CHANGED outside Customize; operating on it here may be unreliable. | |
5923 | |
5924 This message is only a warning. If you click on the button to | |
5925 | |
5926 Save for Future Sessions | |
5927 | |
5928 Emacs will write a `custom-set-...' expression near the end of your | |
5929 `.emacs' file that will be evaluated after your hand-written | |
5930 expression. It will, therefore, overrule your hand-written expression. | |
5931 No harm will be done. When you do this, however, be careful to | |
5932 remember which expression is active; if you forget, you may confuse | |
5933 yourself. | |
5934 | |
5935 So long as you remember where the values are set, you will have no | |
5936 trouble. In any event, the values are always set in your | |
5937 initialization file, which is usually called `.emacs'. | |
5938 | |
5939 I myself use `customize' for hardly anything. Mostly, I write | |
5940 expressions myself. | |
5941 | |
5942 Incidentally, `defsubst' defines an inline function. The syntax is | |
5943 just like that of `defun'. `defconst' defines a symbol as a constant. | |
5944 The intent is that neither programs nor users should ever change a | |
5945 value set by `defconst'. | |
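
A short sketch of both forms follows. The names `my-square' and
`my-inches-per-foot' are made up for the example. Note that Emacs
treats `defconst' as a statement of intent; it does not stop a later
`setq' from changing the value.

```elisp
;; An inline function: written exactly like a `defun'.
(defsubst my-square (number)
  "Return NUMBER multiplied by itself."
  (* number number))

;; A constant: name, value, and documentation string.
(defconst my-inches-per-foot 12
  "Number of inches in a foot; not meant to be changed.")

(my-square my-inches-per-foot)          ; 144
```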
5946 | |
5947 | |
5948 File: eintr, Node: Beginning a .emacs File, Next: Text and Auto-fill, Prev: defcustom, Up: Emacs Initialization | |
5949 | |
5950 16.3 Beginning a `.emacs' File | |
5951 ============================== | |
5952 | |
5953 When you start Emacs, it loads your `.emacs' file unless you tell it | |
5954 not to by specifying `-q' on the command line. (The `emacs -q' command | |
5955 gives you a plain, out-of-the-box Emacs.) | |
5956 | |
5957 A `.emacs' file contains Lisp expressions. Often, these are no more | |
5958 than expressions to set values; sometimes they are function definitions. | |
5959 | |
5960 *Note The Init File `~/.emacs': (emacs)Init File, for a short | |
5961 description of initialization files. | |
5962 | |
5963 This chapter goes over some of the same ground, but is a walk among | |
5964 extracts from a complete, long-used `.emacs' file--my own. | |
5965 | |
5966 The first part of the file consists of comments: reminders to myself. | |
5967 By now, of course, I remember these things, but when I started, I did | |
5968 not. | |
5969 | |
5970 ;;;; Bob's .emacs file | |
5971 ; Robert J. Chassell | |
5972 ; 26 September 1985 | |
5973 | |
5974 Look at that date! I started this file a long time ago. I have been | |
5975 adding to it ever since. | |
5976 | |
5977 ; Each section in this file is introduced by a | |
5978 ; line beginning with four semicolons; and each | |
5979 ; entry is introduced by a line beginning with | |
5980 ; three semicolons. | |
5981 | |
5982 This describes the usual conventions for comments in Emacs Lisp. | |
5983 Everything on a line that follows a semicolon is a comment. Two, | |
5984 three, and four semicolons are used as section and subsection markers. | |
5985 (*Note Comments: (elisp)Comments, for more about comments.) | |
5986 | |
5987 ;;;; The Help Key | |
5988 ; Control-h is the help key; | |
5989 ; after typing control-h, type a letter to | |
5990 ; indicate the subject about which you want help. | |
5991 ; For an explanation of the help facility, | |
5992 ; type control-h two times in a row. | |
5993 | |
5994 Just remember: type `C-h' two times for help. | |
5995 | |
5996 ; To find out about any mode, type control-h m | |
5997 ; while in that mode. For example, to find out | |
5998 ; about mail mode, enter mail mode and then type | |
5999 ; control-h m. | |
6000 | |
6001 `Mode help', as I call this, is very helpful. Usually, it tells you | |
6002 all you need to know. | |
6003 | |
6004 Of course, you don't need to include comments like these in your | |
6005 `.emacs' file. I included them in mine because I kept forgetting about | |
6006 Mode help or the conventions for comments--but I was able to remember | |
6007 to look here to remind myself. | |
6008 | |
6009 | |
6010 File: eintr, Node: Text and Auto-fill, Next: Mail Aliases, Prev: Beginning a .emacs File, Up: Emacs Initialization | |
6011 | |
6012 16.4 Text and Auto Fill Mode | |
6013 ============================ | |
6014 | |
6015 Now we come to the part that `turns on' Text mode and Auto Fill mode. | |
6016 | |
6017 ;;; Text mode and Auto Fill mode | |
6018 ; The next two lines put Emacs into Text mode | |
6019 ; and Auto Fill mode, and are for writers who | |
6020 ; want to start writing prose rather than code. | |
6021 | |
6022 (setq default-major-mode 'text-mode) | |
6023 (add-hook 'text-mode-hook 'turn-on-auto-fill) | |
6024 | |
6025 Here is the first part of this `.emacs' file that does something | |
6026 besides remind a forgetful human! | |
6027 | |
6028 The first of the two lines in parentheses tells Emacs to turn on Text | |
6029 mode when you find a file, _unless_ that file should go into some other | |
6030 mode, such as C mode. | |
6031 | |
6032 When Emacs reads a file, it looks at the extension to the file name, if | |
6033 any. (The extension is the part that comes after a `.'.) If the file | |
6034 ends with a `.c' or `.h' extension then Emacs turns on C mode. Also, | |
6035 Emacs looks at the first nonblank line of the file; if the line says | |
6036 `-*- C -*-', Emacs turns on C mode. Emacs possesses a list of | |
6037 extensions and specifications that it uses automatically. In addition, | |
6038 Emacs looks near the last page for a per-buffer, "local variables | |
6039 list", if any. | |
6040 | |
6041 *Note How Major Modes are Chosen: (emacs)Choosing Modes. | |
6042 | |
6043 *Note Local Variables in Files: (emacs)File Variables. | |
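
The list of extensions that Emacs consults is kept in the variable
`auto-mode-alist'. As a minimal sketch, this is how an entry could be
added from a `.emacs' file; the `.notes' extension is made up for the
example.

```elisp
;; Open files whose names end in `.notes' in Text mode.
;; Each entry pairs a regular expression with a mode function;
;; the trailing \\' anchors the match at the end of the file name.
(add-to-list 'auto-mode-alist '("\\.notes\\'" . text-mode))
```

A single file can make the same request for itself by putting
`-*- mode: text -*-' on its first line.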
6044 | |
6045 Now, back to the `.emacs' file. | |
6046 | |
6047 Here is the line again; how does it work? | |
6048 | |
6049 (setq default-major-mode 'text-mode) | |
6050 | |
6051 This line is a short, but complete Emacs Lisp expression. | |
6052 | |
6053 We are already familiar with `setq'. It sets the following variable, | |
6054 `default-major-mode', to the subsequent value, which is `text-mode'. | |
6055 The single quote mark before `text-mode' tells Emacs to deal directly | |
6056 with the `text-mode' symbol, not with whatever it might stand for. | |
6057 *Note Setting the Value of a Variable: set & setq, for a reminder of | |
6058 how `setq' works. The main point is that there is no difference | |
6059 between the procedure you use to set a value in your `.emacs' file and | |
6060 the procedure you use anywhere else in Emacs. | |
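
A caution for readers on versions of Emacs newer than the one this
edition describes: `default-major-mode' was later declared obsolete and
eventually removed. In those versions, the usual replacement sets the
default value of the `major-mode' variable instead. A sketch:

```elisp
;; Newer Emacs: make Text mode the default for new buffers.
;; `setq-default' sets the value used by buffers that have no
;; mode of their own chosen by file name or file contents.
(setq-default major-mode 'text-mode)
```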
6061 | |
6062 Here is the next line: | |
6063 | |
6064 (add-hook 'text-mode-hook 'turn-on-auto-fill) | |
6065 | |
6066 In this line, the `add-hook' command adds `turn-on-auto-fill' to the | |
6067 `text-mode-hook' variable. | |
6068 | |
6069 `turn-on-auto-fill' is the name of a program that, you guessed it, | |
6070 turns on Auto Fill mode. | |
6071 | |
6072 Every time Emacs turns on Text mode, Emacs runs the commands `hooked' | |
6073 onto Text mode. So every time Emacs turns on Text mode, Emacs also | |
6074 turns on Auto Fill mode. | |
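
You can hook your own functions onto Text mode in the same way. In
this sketch, `my-text-mode-setup' is a hypothetical function and the
value 72 is an arbitrary choice; the temporary buffer is there only to
show that the function runs when Text mode is turned on.

```elisp
;; A hypothetical function to run each time Text mode starts.
(defun my-text-mode-setup ()
  "Use a narrower fill column in Text mode buffers."
  (setq fill-column 72))

(add-hook 'text-mode-hook 'my-text-mode-setup)

;; Turning on Text mode runs everything on `text-mode-hook'.
(with-temp-buffer
  (text-mode)
  fill-column)                          ; 72 in this buffer
```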
6075 | |
6076 In brief, the first line causes Emacs to enter Text mode when you edit a | |
6077 file, unless the file name extension, a first non-blank line, or local | |
6078 variables tell Emacs otherwise. | |
6079 | |
6080 Text mode, among other actions, sets the syntax table to work | |
6081 conveniently for writers. In Text mode, Emacs considers an apostrophe | |
6082 as part of a word like a letter; but Emacs does not consider a period | |
6083 or a space as part of a word. Thus, `M-f' moves you over `it's'. On | |
6084 the other hand, in C mode, `M-f' stops just after the `t' of `it's'. | |
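
The effect of the syntax table can be checked from Lisp. In this
sketch, `my-point-after-forward-word' is a hypothetical helper, and
Fundamental mode stands in for C mode to keep the example light; the
assumption is that its standard syntax table, like C mode's, does not
treat an apostrophe as part of a word.

```elisp
(defun my-point-after-forward-word (mode-function text)
  "Return point after moving one word forward over TEXT.
TEXT is inserted in a temporary buffer that is put into the
mode named by MODE-FUNCTION, which supplies the syntax table."
  (with-temp-buffer
    (insert text)
    (funcall mode-function)
    (goto-char (point-min))
    (forward-word 1)                    ; what `M-f' does
    (point)))

;; Text mode: the apostrophe is part of the word, so point ends
;; up past the whole of "it's", at position 5.
(my-point-after-forward-word 'text-mode "it's")

;; Fundamental mode: point should stop sooner, after the "t".
(my-point-after-forward-word 'fundamental-mode "it's")
```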
6085 | |
6086 The second line causes Emacs to turn on Auto Fill mode when it turns on | |
6087 Text mode. In Auto Fill mode, Emacs automatically breaks a line that | |
6088 is too wide and brings the excessively wide part of the line down to | |
6089 the next line. Emacs breaks lines between words, not within them. | |
6090 | |
6091 When Auto Fill mode is turned off, lines continue to the right as you | |
6092 type them. Depending on how you set the value of `truncate-lines', the | |
6093 words you type either disappear off the right side of the screen, or | |
6094 else are shown, in a rather ugly and unreadable manner, as a | |
6095 continuation line on the screen. | |
6096 | |
6097 In addition, in this part of my `.emacs' file, I tell the Emacs fill | |
6098 commands to insert two spaces after a colon: | |
6099 | |
6100 (setq colon-double-space t) | |
6101 | |
6102 | |
6103 File: eintr, Node: Mail Aliases, Next: Indent Tabs Mode, Prev: Text and Auto-fill, Up: Emacs Initialization | |
6104 | |
6105 16.5 Mail Aliases | |
6106 ================= | |
6107 | |
6108 Here is a `setq' that `turns on' mail aliases, along with more | |
6109 reminders. | |
6110 | |
6111 ;;; Mail mode | |
6112 ; To enter mail mode, type `C-x m' | |
6113 ; To enter RMAIL (for reading mail), | |
6114 ; type `M-x rmail' | |
6115 | |
6116 (setq mail-aliases t) | |
6117 | |
6118 This `setq' command sets the value of the variable `mail-aliases' to | |
6119 `t'. Since `t' means true, the line says, in effect, "Yes, use mail | |
6120 aliases." | |
6121 | |
6122 Mail aliases are convenient short names for long email addresses or for | |
6123 lists of email addresses. The file where you keep your `aliases' is | |
6124 `~/.mailrc'. You write an alias like this: | |
6125 | |
6126 alias geo george@foobar.wiz.edu | |
6127 | |
6128 When you write a message to George, address it to `geo'; the mailer | |
6129 will automatically expand `geo' to the full address. | |
6130 | |
6131 | |
6132 File: eintr, Node: Indent Tabs Mode, Next: Keybindings, Prev: Mail Aliases, Up: Emacs Initialization | |
6133 | |
6134 16.6 Indent Tabs Mode | |
6135 ===================== | |
6136 | |
6137 By default, Emacs inserts tabs in place of multiple spaces when it | |
6138 formats a region. (For example, you might indent many lines of text | |
6139 all at once with the `indent-region' command.) Tabs look fine on a | |
6140 terminal or with ordinary printing, but they produce badly indented | |
6141 output when you use TeX or Texinfo since TeX ignores tabs. | |
6142 | |
6143 The following turns off Indent Tabs mode: | |
6144 | |
6145 ;;; Prevent Extraneous Tabs | |
6146 (setq-default indent-tabs-mode nil) | |
6147 | |
6148 Note that this line uses `setq-default' rather than the `setq' command | |
6149 that we have seen before. The `setq-default' command sets values only | |
6150 in buffers that do not have their own local values for the variable. | |
6151 | |
6152 *Note Tabs vs. Spaces: (emacs)Just Spaces. | |
6153 | |
6154 *Note Local Variables in Files: (emacs)File Variables. | |
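
The difference between the two commands can be seen in a temporary
buffer. This is only a sketch; it relies on `indent-tabs-mode' being
one of the variables that becomes local to a buffer whenever it is set
there with `setq'.

```elisp
;; Set the value for every buffer that has no value of its own.
(setq-default indent-tabs-mode nil)

(with-temp-buffer
  ;; A plain `setq' changes the value in this buffer only.
  (setq indent-tabs-mode t)
  (list indent-tabs-mode                      ; this buffer: t
        (default-value 'indent-tabs-mode)))   ; elsewhere: nil
```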
6155 | |
6156 | |
6157 File: eintr, Node: Keybindings, Next: Keymaps, Prev: Indent Tabs Mode, Up: Emacs Initialization | |
6158 | |
6159 16.7 Some Keybindings | |
6160 ===================== | |
6161 | |
6162 Now for some personal keybindings: | |
6163 | |
6164 ;;; Compare windows | |
6165 (global-set-key "\C-cw" 'compare-windows) | |
6166 | |
6167 `compare-windows' is a nifty command that compares the text in your | |
6168 current window with text in the next window. It makes the comparison | |
6169 by starting at point in each window, moving over text in each window as | |
6170 far as they match. I use this command all the time. | |
6171 | |
6172 This also shows how to set a key globally, for all modes. | |
6173 | |
6174 The command is `global-set-key'. It is followed by the keybinding. In | |
6175 a `.emacs' file, the keybinding is written as shown: `\C-c' stands for | |
6176 `control-c', which means `press the control key and the `c' key at the | |
6177 same time'. The `w' means `press the `w' key'. The keybinding is | |
6178 surrounded by double quotation marks. In documentation, you would | |
6179 write this as `C-c w'. (If you were binding a <META> key, such as | |
6180 `M-c', rather than a <CTRL> key, you would write `\M-c'. *Note | |
6181 Rebinding Keys in Your Init File: (emacs)Init Rebinding, for details.) | |
6182 | |
6183 The command invoked by the keys is `compare-windows'. Note that | |
6184 `compare-windows' is preceded by a single quote; otherwise, Emacs would | |
6185 first try to evaluate the symbol to determine its value. | |
6186 | |
6187 These three things, the double quotation marks, the backslash before | |
6188 the `C', and the single quote mark are necessary parts of keybinding | |
6189 that I tend to forget. Fortunately, I have come to remember that I | |
6190 should look at my existing `.emacs' file, and adapt what is there. | |
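
If the escapes are hard to remember, the standard `kbd' function
accepts the same notation that documentation uses.  `kbd' is not used
elsewhere in this chapter; the sketch below simply repeats the
`compare-windows' binding in both notations and checks that they name
the same key sequence.

```elisp
;; The string form used above.
(global-set-key "\C-cw" 'compare-windows)

;; The same binding written in documentation notation.
(global-set-key (kbd "C-c w") 'compare-windows)

;; Both notations describe the same key sequence.
(equal (kbd "C-c w") "\C-cw")          ; t

;; Convert a key sequence back to documentation notation.
(key-description "\C-cw")              ; "C-c w"
```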
6191 | |
6192 As for the keybinding itself: `C-c w'. This combines the prefix key, | |
6193 `C-c', with a single character, in this case, `w'. This set of keys, | |
6194 `C-c' followed by a single character, is strictly reserved for | |
6195 individuals' own use. (I call these `own' keys, since these are for my | |
6196 own use.) You should always be able to create such a keybinding for | |
6197 your own use without stomping on someone else's keybinding. If you | |
6198 ever write an extension to Emacs, please avoid taking any of these keys | |
6199 for public use. Create a key like `C-c C-w' instead. Otherwise, we | |
6200 will run out of `own' keys. | |
6201 | |
6202 Here is another keybinding, with a comment: | |
6203 | |
6204 ;;; Keybinding for `occur' | |
6205 ; I use occur a lot, so let's bind it to a key: | |
6206 (global-set-key "\C-co" 'occur) | |
6207 | |
6208 The `occur' command shows all the lines in the current buffer that | |
6209 contain a match for a regular expression. Matching lines are shown in | |
6210 a buffer called `*Occur*'. That buffer serves as a menu to jump to | |
6211 occurrences. | |
6212 | |
6213 Here is how to unbind a key, so it does not work: | |
6214 | |
6215 ;;; Unbind `C-x f' | |
6216 (global-unset-key "\C-xf") | |
6217 | |
6218 There is a reason for this unbinding: I found I inadvertently typed | |
6219 `C-x f' when I meant to type `C-x C-f'. Rather than find a file, as I | |
6220 intended, I accidentally set the width for filled text, almost always | |
6221 to a width I did not want. Since I hardly ever reset my default width, | |
6222 I simply unbound the key. | |
6223 | |
6224 The following rebinds an existing key: | |
6225 | |
6226 ;;; Rebind `C-x C-b' for `buffer-menu' | |
6227 (global-set-key "\C-x\C-b" 'buffer-menu) | |
6228 | |
6229 By default, `C-x C-b' runs the `list-buffers' command. This command | |
6230 lists your buffers in _another_ window. Since I almost always want to | |
6231 do something in that window, I prefer the `buffer-menu' command, which | |
6232 not only lists the buffers, but moves point into that window. | |
6233 | |
6234 | |
6235 File: eintr, Node: Keymaps, Next: Loading Files, Prev: Keybindings, Up: Emacs Initialization | |
6236 | |
6237 16.8 Keymaps | |
6238 ============ | |
6239 | |
6240 Emacs uses "keymaps" to record which keys call which commands. When | |
6241 you use `global-set-key' to set the keybinding for a single command in | |
6242 all parts of Emacs, you are specifying the keybinding in | |
6243 `current-global-map'. | |
6244 | |
6245 Specific modes, such as C mode or Text mode, have their own keymaps; | |
6246 the mode-specific keymaps override the global map that is shared by all | |
6247 buffers. | |
6248 | |
6249 The `global-set-key' function binds, or rebinds, the global keymap. | |
6250 For example, the following binds the key `C-x C-b' to the function | |
6251 `buffer-menu': | |
6252 | |
6253 (global-set-key "\C-x\C-b" 'buffer-menu) | |
6254 | |
6255 Mode-specific keymaps are bound using the `define-key' function, which | |
6256 takes a specific keymap as an argument, as well as the key and the | |
6257 command. For example, my `.emacs' file contains the following | |
6258 expression to bind the `texinfo-insert-@group' command to `C-c C-c g': | |
6259 | |
6260 (define-key texinfo-mode-map "\C-c\C-cg" 'texinfo-insert-@group) | |
6261 | |
6262 The `texinfo-insert-@group' function itself is a little extension to | |
6263 Texinfo mode that inserts `@group' into a Texinfo file. I use this | |
6264 command all the time and prefer to type the three strokes `C-c C-c g' | |
6265 rather than the six strokes `@ g r o u p'. (`@group' and its matching | |
6266 `@end group' are commands that keep all enclosed text together on one | |
6267 page; many multi-line examples in this book are surrounded by `@group | |
6268 ... @end group'.) | |
6269 | |
6270 Here is the `texinfo-insert-@group' function definition: | |
6271 | |
6272 (defun texinfo-insert-@group () | |
6273 "Insert the string @group in a Texinfo buffer." | |
6274 (interactive) | |
6275 (beginning-of-line) | |
6276 (insert "@group\n")) | |
6277 | |
6278 (Of course, I could have used Abbrev mode to save typing, rather than | |
6279 write a function to insert a word; but I prefer key strokes consistent | |
6280 with other Texinfo mode key bindings.) | |
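
To see that a binding made with `define-key' lives only in the keymap
you name, here is a small sketch.  The keymap `my-demo-mode-map' and
the command `my-demo-command' are hypothetical names invented for this
illustration; they do not belong to any real mode.

```elisp
;; Hypothetical keymap and command names, for illustration only.
(defvar my-demo-mode-map (make-sparse-keymap)
  "Keymap for an imaginary mode.")

(defun my-demo-command ()
  "Do nothing; stands in for a mode-specific command."
  (interactive))

;; Bind `C-c C-d' in the demonstration map only.
(define-key my-demo-mode-map "\C-c\C-d" 'my-demo-command)

;; Look the key up in the map that was just changed.
(lookup-key my-demo-mode-map "\C-c\C-d")   ; my-demo-command

;; The global map is untouched by the `define-key' call above.
(lookup-key (current-global-map) "\C-c\C-d")
```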
6281 | |
6282 You will see numerous `define-key' expressions in `loaddefs.el' as well | |
6283 as in the various mode libraries, such as `cc-mode.el' and | |
6284 `lisp-mode.el'. | |
6285 | |
6286 *Note Customizing Key Bindings: (emacs)Key Bindings, and *Note Keymaps: | |
6287 (elisp)Keymaps, for more information about keymaps. | |
6288 | |
6289 | |
6290 File: eintr, Node: Loading Files, Next: Autoload, Prev: Keymaps, Up: Emacs Initialization | |
6291 | |
6292 16.9 Loading Files | |
6293 ================== | |
6294 | |
6295 Many people in the GNU Emacs community have written extensions to | |
6296 Emacs. As time goes by, these extensions are often included in new | |
6297 releases. For example, the Calendar and Diary packages are now part of | |
6298 the standard GNU Emacs, as is Calc. | |
6299 | |
6300 You can use a `load' command to evaluate a complete file and thereby | |
6301 install all the functions and variables in the file into Emacs. For | |
6302 example: | |
6303 | |
6304 (load "~/emacs/slowsplit") | |
6305 | |
6306 This evaluates, i.e., loads, the `slowsplit.el' file or, if it exists, | |
6307 the faster, byte compiled `slowsplit.elc' file from the `emacs' | |
6308 sub-directory of your home directory. The file contains the function | |
6309 `split-window-quietly', which John Robinson wrote in 1989. | |
6310 | |
6311 The `split-window-quietly' function splits a window with the minimum of | |
6312 redisplay. I installed it in 1989 because it worked well with the slow | |
6313 1200 baud terminals I was then using. Nowadays, I only occasionally | |
6314 come across such a slow connection, but I continue to use the function | |
6315 because I like the way it leaves the bottom half of a buffer in the | |
6316 lower of the new windows and the top half in the upper window. | |
6317 | |
6318 To replace the key binding for the default `split-window-vertically', | |
6319 you must also unset that key and bind the keys to | |
6320 `split-window-quietly', like this: | |
6321 | |
6322 (global-unset-key "\C-x2") | |
6323 (global-set-key "\C-x2" 'split-window-quietly) | |
6324 | |
6325 If you load many extensions, as I do, then instead of specifying the | |
6326 exact location of the extension file, as shown above, you can specify | |
6327 that directory as part of Emacs' `load-path'. Then, when Emacs loads a | |
6328 file, it will search that directory as well as its default list of | |
6329 directories. (The default list is specified in `paths.h' when Emacs is | |
6330 built.) | |
6331 | |
6332 The following command adds your `~/emacs' directory to the existing | |
6333 load path: | |
6334 | |
6335 ;;; Emacs Load Path | |
6336 (setq load-path (cons "~/emacs" load-path)) | |
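
A possible alternative, offered only as a sketch, is the standard
`add-to-list' function.  It adds the directory only when it is not
already in the list, so evaluating your `.emacs' file a second time
does not create a duplicate entry.

```elisp
;; Add the directory only if it is not already present.
(add-to-list 'load-path "~/emacs")

;; Evaluating the form again leaves `load-path' unchanged.
(add-to-list 'load-path "~/emacs")
```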
6337 | |
6338 Incidentally, `load-library' is an interactive interface to the `load' | |
6339 function. The complete function looks like this: | |
6340 | |
6341 (defun load-library (library) | |
6342 "Load the library named LIBRARY. | |
6343 This is an interface to the function `load'." | |
6344 (interactive | |
6345 (list (completing-read "Load library: " | |
6346 'locate-file-completion | |
6347 (cons load-path (get-load-suffixes))))) | |
6348 (load library)) | |
6349 | |
6350 The name of the function, `load-library', comes from the use of | |
6351 `library' as a conventional synonym for `file'. The source for the | |
6352 `load-library' command is in the `files.el' library. | |
6353 | |
6354 Another interactive command that does a slightly different job is | |
6355 `load-file'. *Note Libraries of Lisp Code for Emacs: (emacs)Lisp | |
6356 Libraries, for information on the distinction between `load-library' | |
6357 and this command. | |
6358 | |
6359 | |
6360 File: eintr, Node: Autoload, Next: Simple Extension, Prev: Loading Files, Up: Emacs Initialization | |
6361 | |
6362 16.10 Autoloading | |
6363 ================= | |
6364 | |
6365 Instead of installing a function by loading the file that contains it, | |
6366 or by evaluating the function definition, you can make the function | |
6367 available but not actually install it until it is first called. This | |
6368 is called "autoloading". | |
6369 | |
6370 When you execute an autoloaded function, Emacs automatically evaluates | |
6371 the file that contains the definition, and then calls the function. | |
6372 | |
6373 Emacs starts quicker with autoloaded functions, since their libraries | |
6374 are not loaded right away; but you need to wait a moment when you first | |
6375 use such a function, while its containing file is evaluated. | |
6376 | |
6377 Rarely used functions are frequently autoloaded. The `loaddefs.el' | |
6378 library contains hundreds of autoloaded functions, from `bookmark-set' | |
6379 to `wordstar-mode'. Of course, you may come to use a `rare' function | |
6380 frequently. When you do, you should load that function's file with a | |
6381 `load' expression in your `.emacs' file. | |
6382 | |
6383 In my `.emacs' file for Emacs version 22, I load 14 libraries that | |
6384 contain functions that would otherwise be autoloaded. (Actually, it | |
6385 would have been better to include these files in my `dumped' Emacs, but | |
6386 I forgot. *Note Building Emacs: (elisp)Building Emacs, and the | |
6387 `INSTALL' file for more about dumping.) | |
6388 | |
6389 You may also want to include autoloaded expressions in your `.emacs' | |
6390 file. `autoload' is a built-in function that takes up to five | |
6391 arguments, the final three of which are optional. The first argument | |
6392 is the name of the function to be autoloaded; the second is the name of | |
6393 the file to be loaded. The third argument is documentation for the | |
6394 function, and the fourth tells whether the function can be called | |
6395 interactively. The fifth argument tells what type of | |
6396 object--`autoload' can handle a keymap or macro as well as a function | |
6397 (the default is a function). | |
6398 | |
6399 Here is a typical example: | |
6400 | |
6401 (autoload 'html-helper-mode | |
6402 "html-helper-mode" "Edit HTML documents" t) | |
6403 | |
6404 (`html-helper-mode' is an alternative to `html-mode', which is a | |
6405 standard part of the distribution). | |
6406 | |
6407 This expression autoloads the `html-helper-mode' function. It takes it | |
6408 from the `html-helper-mode.el' file (or from the byte compiled file | |
6409 `html-helper-mode.elc', if it exists). The file must be located in a | |
6410 directory specified by `load-path'. The documentation says that this | |
6411 is a mode to help you edit documents written in the HyperText Markup | |
6412 Language. You can call this mode interactively by typing `M-x | |
6413 html-helper-mode'. (You need to duplicate the function's regular | |
6414 documentation in the autoload expression because the regular function | |
6415 is not yet loaded, so its documentation is not available.) | |
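
You can check that an autoloaded function is known to Emacs before
its file has been read.  In the following sketch the command
`my-demo-lib-command' and the file `my-demo-lib' are hypothetical; no
such file needs to exist for these expressions to evaluate, because
nothing is loaded until the command is called.  `autoloadp' is assumed
to be available (Emacs 24.3 and later).

```elisp
;; Hypothetical command and file names, for illustration only.
(autoload 'my-demo-lib-command
  "my-demo-lib" "Run the imaginary demo command." t)

;; The symbol is now defined as a function...
(fboundp 'my-demo-lib-command)

;; ...but its definition is only an autoload placeholder.
(autoloadp (symbol-function 'my-demo-lib-command))
```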
6416 | |
6417 *Note Autoload: (elisp)Autoload, for more information. | |
6418 | |
6419 | |
6420 File: eintr, Node: Simple Extension, Next: X11 Colors, Prev: Autoload, Up: Emacs Initialization | |
6421 | |
6422 16.11 A Simple Extension: `line-to-top-of-window' | |
6423 ================================================= | |
6424 | |
6425 Here is a simple extension to Emacs that moves the line point is on to | |
6426 the top of the window. I use this all the time, to make text easier to | |
6427 read. | |
6428 | |
6429 You can put the following code into a separate file and then load it | |
6430 from your `.emacs' file, or you can include it within your `.emacs' | |
6431 file. | |
6432 | |
6433 Here is the definition: | |
6434 | |
6435 ;;; Line to top of window; | |
6436 ;;; replace three keystroke sequence C-u 0 C-l | |
6437 (defun line-to-top-of-window () | |
6438 "Move the line point is on to top of window." | |
6439 (interactive) | |
6440 (recenter 0)) | |
6441 | |
6442 Now for the keybinding. | |
6443 | |
6444 Nowadays, function keys as well as mouse button events and non-ASCII | |
6445 characters are written within square brackets, without quotation marks. | |
6446 (In Emacs version 18 and before, you had to write different function | |
6447 key bindings for each different make of terminal.) | |
6448 | |
6449 I bind `line-to-top-of-window' to my <F6> function key like this: | |
6450 | |
6451 (global-set-key [f6] 'line-to-top-of-window) | |
6452 | |
6453 For more information, see *Note Rebinding Keys in Your Init File: | |
6454 (emacs)Init Rebinding. | |
6455 | |
6456 If you run two versions of GNU Emacs, such as versions 21 and 22, and | |
6457 use one `.emacs' file, you can select which code to evaluate with the | |
6458 following conditional: | |
6459 | |
6460 (cond | |
6461 ((string-equal (number-to-string 21) (substring (emacs-version) 10 12)) | |
6462 ;; evaluate version 21 code | |
6463 ( ... )) | |
6464 ((string-equal (number-to-string 22) (substring (emacs-version) 10 12)) | |
6465 ;; evaluate version 22 code | |
6466 ( ... ))) | |
6467 | |
6468 For example, in contrast to version 20, version 21 blinks its cursor by | |
6469 default. I hate such blinking, as well as some other features in | |
6470 version 21, so I placed the following in my `.emacs' file(1): | |
6471 | |
6472 (if (string-equal "21" (substring (emacs-version) 10 12)) | |
6473 (progn | |
6474 (blink-cursor-mode 0) | |
6475 ;; Insert newline when you press `C-n' (next-line) | |
6476 ;; at the end of the buffer | |
6477 (setq next-line-add-newlines t) | |
6478 ;; Turn on image viewing | |
6479 (auto-image-file-mode t) | |
6480 ;; Turn on menu bar (this bar has text) | |
6481 ;; (Use numeric argument to turn on) | |
6482 (menu-bar-mode 1) | |
6483 ;; Turn off tool bar (this bar has icons) | |
6484 ;; (Use numeric argument to turn on) | |
6485 (tool-bar-mode -1) | |
6486 ;; Turn off tooltip mode for tool bar | |
6487 ;; (This mode causes icon explanations to pop up) | |
6488 ;; (Use numeric argument to turn on) | |
6489 (tooltip-mode -1) | |
6490 ;; If tooltips turned on, make tips appear promptly | |
6491 (setq tooltip-delay 0.1) ; default is one second | |
6492 )) | |
6493 | |
6494 (You will note that instead of typing `(number-to-string 21)', I | |
6495 decided to save typing and wrote `21' as a string, `"21"', rather than | |
6496 convert it from an integer to a string. In this instance, this | |
6497 expression is better than the longer, but more general | |
6498 `(number-to-string 21)'. However, if you do not know ahead of time | |
6499 what type of information will be returned, then the `number-to-string' | |
6500 function will be needed.) | |
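
The substring positions used in the version tests depend on the exact
layout of the string that `emacs-version' returns.  As an alternative
sketch, not taken from the original `.emacs' file, you can compare the
integer in the standard variable `emacs-major-version'.  The `message'
calls are placeholders for whatever version-specific code you need.

```elisp
;; `emacs-major-version' holds the major version as an integer.
(cond
 ((= emacs-major-version 21)
  ;; evaluate version 21 code
  (message "Running version 21 code"))
 ((>= emacs-major-version 22)
  ;; evaluate code for version 22 and later
  (message "Running code for version %d" emacs-major-version)))
```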
6501 | |
6502 ---------- Footnotes ---------- | |
6503 | |
6504 (1) When I start instances of Emacs that do not load my `.emacs' file | |
6505 or any site file, I also turn off blinking: | |
6506 | |
6507 emacs -q --no-site-file -eval '(blink-cursor-mode -1)' | |
6508 | |
6509 Or nowadays, using an even more sophisticated set of options, | |
6510 | |
6511 emacs -Q -D | |
6512 | |
6513 | |
6514 File: eintr, Node: X11 Colors, Next: Miscellaneous, Prev: Simple Extension, Up: Emacs Initialization | |
6515 | |
6516 16.12 X11 Colors | |
6517 ================ | |
6518 | |
6519 You can specify colors when you use Emacs with the MIT X Windowing | |
6520 system. | |
6521 | |
6522 I dislike the default colors and specify my own. | |
6523 | |
6524 Here are the expressions in my `.emacs' file that set values: | |
6525 | |
6526 ;; Set cursor color | |
6527 (set-cursor-color "white") | |
6528 | |
6529 ;; Set mouse color | |
6530 (set-mouse-color "white") | |
6531 | |
6532 ;; Set foreground and background | |
6533 (set-foreground-color "white") | |
6534 (set-background-color "darkblue") | |
6535 | |
6536 ;;; Set highlighting colors for isearch and drag | |
6537 (set-face-foreground 'highlight "white") | |
6538 (set-face-background 'highlight "blue") | |
6539 | |
6540 (set-face-foreground 'region "cyan") | |
6541 (set-face-background 'region "blue") | |
6542 | |
6543 (set-face-foreground 'secondary-selection "skyblue") | |
6544 (set-face-background 'secondary-selection "darkblue") | |
6545 | |
6546 ;; Set calendar highlighting colors | |
6547 (setq calendar-load-hook | |
6548 '(lambda () | |
6549 (set-face-foreground 'diary-face "skyblue") | |
6550 (set-face-background 'holiday-face "slate blue") | |
6551 (set-face-foreground 'holiday-face "white"))) | |
6552 | |
6553 The various shades of blue soothe my eye and prevent me from seeing the | |
6554 screen flicker. | |
6555 | |
6556 Alternatively, I could have set my specifications in various X | |
6557 initialization files. For example, I could set the foreground, | |
6558 background, cursor, and pointer (i.e., mouse) colors in my | |
6559 `~/.Xresources' file like this: | |
6560 | |
6561 Emacs*foreground: white | |
6562 Emacs*background: darkblue | |
6563 Emacs*cursorColor: white | |
6564 Emacs*pointerColor: white | |
6565 | |
6566 In any event, since it is not part of Emacs, I set the root color of my | |
6567 X window in my `~/.xinitrc' file, like this(1): | |
6568 | |
6569 xsetroot -solid Navy -fg white & | |
6570 | |
6571 ---------- Footnotes ---------- | |
6572 | |
6573 (1) I also run more modern window managers, such as Enlightenment, | |
6574 Gnome, or KDE; in those cases, I often specify an image rather than a | |
6575 plain color. | |
6576 | |
6577 | |
6578 File: eintr, Node: Miscellaneous, Next: Mode Line, Prev: X11 Colors, Up: Emacs Initialization | |
6579 | |
6580 16.13 Miscellaneous Settings for a `.emacs' File | |
6581 ================================================ | |
6582 | |
6583 Here are a few miscellaneous settings: | |
6584 | |
6585 - Set the shape and color of the mouse cursor: | |
6586 | |
6587 ; Cursor shapes are defined in | |
6588 ; `/usr/include/X11/cursorfont.h'; | |
6589 ; for example, the `target' cursor is number 128; | |
6590 ; the `top_left_arrow' cursor is number 132. | |
6591 | |
6592 (let ((mpointer (x-get-resource "*mpointer" | |
6593 "*emacs*mpointer"))) | |
6594 ;; If you have not set your mouse pointer | |
6595 ;; then set it, otherwise leave as is: | |
6596 (if (eq mpointer nil) | |
6597 (setq mpointer "132")) ; top_left_arrow | |
6598 (setq x-pointer-shape (string-to-number mpointer)) | |
6599 (set-mouse-color "white")) | |
6600 | |
6601 - Or you can set the values of a variety of features in an alist, | |
6602 like this: | |
6603 | |
6604 (setq-default | |
6605 default-frame-alist | |
6606 '((cursor-color . "white") | |
6607 (mouse-color . "white") | |
6608 (foreground-color . "white") | |
6609 (background-color . "DodgerBlue4") | |
6610 ;; (cursor-type . bar) | |
6611 (cursor-type . box) | |
6612 (tool-bar-lines . 0) | |
6613 (menu-bar-lines . 1) | |
6614 (width . 80) | |
6615 (height . 58) | |
6616 (font . | |
6617 "-Misc-Fixed-Medium-R-Normal--20-200-75-75-C-100-ISO8859-1") | |
6618 )) | |
6619 | |
6620 - Convert `<CTRL>-h' into <DEL> and <DEL> into `<CTRL>-h'. | |
6621 (Some older keyboards needed this, although I have not seen the | |
6622 problem recently.) | |
6623 | |
6624 ;; Translate `C-h' to <DEL>. | |
6625 ; (keyboard-translate ?\C-h ?\C-?) | |
6626 | |
6627 ;; Translate <DEL> to `C-h'. | |
6628 (keyboard-translate ?\C-? ?\C-h) | |
6629 | |
6630 - Turn off a blinking cursor! | |
6631 | |
6632 (if (fboundp 'blink-cursor-mode) | |
6633 (blink-cursor-mode -1)) | |
6634 | |
6635 or start GNU Emacs with the command `emacs -nbc'. | |
6636 | |
6637 - Ignore case when using `grep' | |
6638 `-n' Prefix each line of output with line number | |
6639 `-i' Ignore case distinctions | |
6640 `-e' Protect patterns beginning with a hyphen character, `-' | |
6641 | |
6642 (setq grep-command "grep -n -i -e ") | |
6643 | |
6644 - Find an existing buffer, even if it has a different name | |
6645 This avoids problems with symbolic links. | |
6646 | |
6647 (setq find-file-existing-other-name t) | |
6648 | |
6649 - Set your language environment and default input method | |
6650 | |
6651 (set-language-environment "latin-1") | |
6652 ;; Remember you can enable or disable multilingual text input | |
6653 ;; with the `toggle-input-method' (C-\) command | |
6654 (setq default-input-method "latin-1-prefix") | |
6655 | |
6656 If you want to write with Chinese `GB' characters, set this | |
6657 instead: | |
6658 | |
6659 (set-language-environment "Chinese-GB") | |
6660 (setq default-input-method "chinese-tonepy") | |
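
The `fboundp' test in the blinking-cursor setting above generalizes to
any minor mode command that may be missing from an older Emacs.  Here
is a small sketch; the helper name `my-turn-off-if-present' is
hypothetical and is not part of Emacs.

```elisp
(defun my-turn-off-if-present (mode)
  "Turn off minor mode MODE if its function is defined.
Return t if MODE was called, nil otherwise."
  (when (fboundp mode)
    (funcall mode -1)
    t))

;; Safe even in an Emacs that lacks this command.
(my-turn-off-if-present 'blink-cursor-mode)
```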
6661 | |
6662 Fixing Unpleasant Key Bindings | |
6663 .............................. | |
6664 | |
6665 Some systems bind keys unpleasantly. Sometimes, for example, the | |
6666 <CTRL> key appears in an awkward spot rather than at the far left of | |
6667 the home row. | |
6668 | |
6669 Usually, when people fix these sorts of keybindings, they do not change | |
6670 their `~/.emacs' file. Instead, they bind the proper keys on their | |
6671 consoles with the `loadkeys' or `install-keymap' commands in their boot | |
6672 script and then include `xmodmap' commands in their `.xinitrc' or | |
6673 `.Xsession' file for X Windows. | |
6674 | |
6675 For a boot script: | |
6676 | |
6677 loadkeys /usr/share/keymaps/i386/qwerty/emacs2.kmap.gz | |
6678 | |
6679 or | |
6680 | |
6681 install-keymap emacs2 | |
6682 | |
6683 For a `.xinitrc' or `.Xsession' file when the <Caps Lock> key is at the | |
6684 far left of the home row: | |
6685 | |
6686 # Bind the key labeled `Caps Lock' to `Control' | |
6687 # (Such a broken user interface suggests that keyboard manufacturers | |
6688 # think that computers are typewriters from 1885.) | |
6689 | |
6690 xmodmap -e "clear Lock" | |
6691 xmodmap -e "add Control = Caps_Lock" | |
6692 | |
6693 In a `.xinitrc' or `.Xsession' file, to convert an <ALT> key to a | |
6694 <META> key: | |
6695 | |
6696 # Some ill designed keyboards have a key labeled ALT and no Meta | |
6697 xmodmap -e "keysym Alt_L = Meta_L Alt_L" | |
6698 | |
6699 | |
6700 File: eintr, Node: Mode Line, Prev: Miscellaneous, Up: Emacs Initialization | |
6701 | |
6702 16.14 A Modified Mode Line | |
6703 ========================== | |
6704 | |
6705 Finally, a feature I really like: a modified mode line. | |
6706 | |
6707 When I work over a network, I forget which machine I am using. Also, I | |
6708 tend to lose track of where I am, and which line point is on. | |
6709 | |
6710 So I reset my mode line to look like this: | |
6711 | |
6712 -:-- foo.texi rattlesnake:/home/bob/ Line 1 (Texinfo Fill) Top | |
6713 | |
6714 I am visiting a file called `foo.texi', on my machine `rattlesnake' in | |
6715 my `/home/bob' directory. I am on line 1, in Texinfo mode, and am at the | |
6716 top of the buffer. | |
6717 | |
6718 My `.emacs' file has a section that looks like this: | |
6719 | |
6720 ;; Set a Mode Line that tells me which machine, which directory, | |
6721 ;; and which line I am on, plus the other customary information. | |
6722 (setq default-mode-line-format | |
6723 (quote | |
6724 (#("-" 0 1 | |
6725 (help-echo | |
6726 "mouse-1: select window, mouse-2: delete others ...")) | |
6727 mode-line-mule-info | |
6728 mode-line-modified | |
6729 mode-line-frame-identification | |
6730 " " | |
6731 mode-line-buffer-identification | |
6732 " " | |
6733 (:eval (substring | |
6734 (system-name) 0 (string-match "\\..+" (system-name)))) | |
6735 ":" | |
6736 default-directory | |
6737 #(" " 0 1 | |
6738 (help-echo | |
6739 "mouse-1: select window, mouse-2: delete others ...")) | |
6740 (line-number-mode " Line %l ") | |
6741 global-mode-string | |
6742 #(" %[(" 0 6 | |
6743 (help-echo | |
6744 "mouse-1: select window, mouse-2: delete others ...")) | |
6745 (:eval (mode-line-mode-name)) | |
6746 mode-line-process | |
6747 minor-mode-alist | |
6748 #("%n" 0 2 (help-echo "mouse-2: widen" local-map (keymap ...))) | |
6749 ")%] " | |
6750 (-3 . "%P") | |
6751 ;; "-%-" | |
6752 ))) | |
6753 | |
6754 Here, I redefine the default mode line. Most of the parts are from the | |
6755 original; but I make a few changes. I set the _default_ mode line | |
6756 format so as to permit various modes, such as Info, to override it. | |
6757 | |
6758 Many elements in the list are self-explanatory: `mode-line-modified' is | |
6759 a variable that tells whether the buffer has been modified, `mode-name' | |
6760 tells the name of the mode, and so on. However, the format looks | |
6761 complicated because of two features we have not discussed. | |
6762 | |
6763 The first string in the mode line is a dash or hyphen, `-'. In the old | |
6764 days, it would have been specified simply as `"-"'. But nowadays, | |
6765 Emacs can add properties to a string, such as highlighting or, as in | |
6766 this case, a help feature. If you place your mouse cursor over the | |
6767 hyphen, some help information appears. (By default, you must wait | |
6768 seven-tenths of a second before the information appears. You can | |
6769 change that timing by changing the value of `tooltip-delay'.) | |
6770 | |
6771 The new string format has a special syntax: | |
6772 | |
6773 #("-" 0 1 (help-echo "mouse-1: select window, ...")) | |
6774 | |
6775 The `#(' begins a list. The first element of the list is the string | |
6776 itself, just one `-'. The second and third elements specify the range | |
6777 over which the fourth element applies. A range starts _after_ a | |
6778 character, so a zero means the range starts just before the first | |
6779 character; a 1 means that the range ends just after the first | |
6780 character. The fourth element is the property for the range. It | |
6781 consists of a property list: a property name, in this case, | |
6782 `help-echo', followed by a value, in this case, a string. The second, | |
6783 third, and fourth elements of this new string format can be repeated. | |
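
To experiment with this syntax outside the mode line, you can read a
property back with `get-text-property', or build the same kind of
string with `propertize'.  In this sketch the variable `my-demo-dash'
and the help text "demo tip" are made up for illustration.

```elisp
;; A one-character string carrying a `help-echo' property.
(defvar my-demo-dash #("-" 0 1 (help-echo "demo tip"))
  "Hypothetical string with a text property, for illustration.")

;; Read the property back from position 0.
(get-text-property 0 'help-echo my-demo-dash)      ; "demo tip"

;; `propertize' attaches the same property without the #( syntax.
(get-text-property 0 'help-echo
                   (propertize "-" 'help-echo "demo tip"))
```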
6784 | |
6785 *Note Text Properties: (elisp)Text Properties, and see *Note Mode Line | |
6786 Format: (elisp)Mode Line Format, for more information. | |
6787 | |
6788 `mode-line-buffer-identification' displays the current buffer name. It | |
6789 is a list beginning `(#("%12b" 0 4 ...'. The `#(' begins the list. | |
6790 | |
6791 The `"%12b"' displays the current buffer name, using the `buffer-name' | |
6792 function with which we are familiar; the `12' specifies the maximum | |
6793 number of characters that will be displayed. When a name has fewer | |
6794 characters, whitespace is added to fill out to this number. (Buffer | |
6795 names can and often should be longer than 12 characters; this length | |
6796 works well in a typical 80 column wide window.) | |
6797 | |
6798 `:eval' was a new feature in GNU Emacs version 21. It says to evaluate | |
6799 the following form and use the result as a string to display. In this | |
6800 case, the expression displays the first component of the full system | |
6801 name. The end of the first component is a `.' (`period'), so I use the | |
6802 `string-match' function to tell me the length of the first component. | |
6803 The substring from the zeroth character to that length is the name of | |
6804 the machine. | |
6805 | |
6806 This is the expression: | |
6807 | |
6808 (:eval (substring | |
6809 (system-name) 0 (string-match "\\..+" (system-name)))) | |
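
You can try the same `substring' and `string-match' combination on a
literal name rather than on your own machine's name.  The function
name `my-short-host-name' and the host name used below are
hypothetical, chosen only for this sketch.

```elisp
(defun my-short-host-name (full-name)
  "Return FULL-NAME up to, but not including, its first period."
  (substring full-name 0 (string-match "\\..+" full-name)))

(my-short-host-name "rattlesnake.example.org")   ; "rattlesnake"

;; With no period, `string-match' returns nil and `substring'
;; returns the whole name.
(my-short-host-name "rattlesnake")               ; "rattlesnake"
```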
6810 | |
6811 `%[' and `%]' cause a pair of square brackets to appear for each | |
6812 recursive editing level. `%n' says `Narrow' when narrowing is in | |
6813 effect. `%P' tells you the percentage of the buffer that is above the | |
6814 bottom of the window, or `Top', `Bottom', or `All'. (A lower case `p' | |
6815 tells you the percentage above the _top_ of the window.) `%-' inserts | |
6816 enough dashes to fill out the line. | |
6817 | |
6818 Remember, "You don't have to like Emacs to like it" -- your own Emacs | |
6819 can have different colors, different commands, and different keys than | |
6820 a default Emacs. | |
6821 | |
6822 On the other hand, if you want to bring up a plain `out of the box' | |
6823 Emacs, with no customization, type: | |
6824 | |
6825 emacs -q | |
6826 | |
6827 This will start an Emacs that does _not_ load your `~/.emacs' | |
6828 initialization file. A plain, default Emacs. Nothing more. | |
6829 | |
6830 | |
6831 File: eintr, Node: Debugging, Next: Conclusion, Prev: Emacs Initialization, Up: Top | |
6832 | |
6833 17 Debugging | |
6834 ************ | |
6835 | |
6836 GNU Emacs has two debuggers, `debug' and `edebug'. The first is built | |
6837 into the internals of Emacs and is always with you; the second requires | |
6838 that you instrument a function before you can use it. | |
6839 | |
6840 Both debuggers are described extensively in *Note Debugging Lisp | |
6841 Programs: (elisp)Debugging. In this chapter, I will walk through a | |
6842 short example of each. | |
6843 | |
6844 * Menu: | |
6845 | |
6846 * debug:: | |
6847 * debug-on-entry:: | |
6848 * debug-on-quit:: | |
6849 * edebug:: | |
6850 * Debugging Exercises:: | |
6851 | |
6852 | |
6853 File: eintr, Node: debug, Next: debug-on-entry, Prev: Debugging, Up: Debugging | |
6854 | |
6855 17.1 `debug' | |
6856 ============ | |
6857 | |
6858 Suppose you have written a function definition that is intended to | |
6859 return the sum of the numbers 1 through a given number. (This is the | |
6860 `triangle' function discussed earlier. *Note Example with Decrementing | |
6861 Counter: Decrementing Example, for a discussion.) | |
6862 | |
6863 However, your function definition has a bug. You have mistyped `1=' | |
6864 for `1-'. Here is the broken definition: | |
6865 | |
6866 (defun triangle-bugged (number) | |
6867 "Return sum of numbers 1 through NUMBER inclusive." | |
6868 (let ((total 0)) | |
6869 (while (> number 0) | |
6870 (setq total (+ total number)) | |
6871 (setq number (1= number))) ; Error here. | |
6872 total)) | |
6873 | |
6874 If you are reading this in Info, you can evaluate this definition in | |
6875 the normal fashion. You will see `triangle-bugged' appear in the echo | |
6876 area. | |
6877 | |
6878 Now evaluate the `triangle-bugged' function with an argument of 4: | |
6879 | |
6880 (triangle-bugged 4) | |
6881 | |
6882 In GNU Emacs version 21, you will create and enter a `*Backtrace*' | |
6883 buffer that says: | |
6884 | |
6885 | |
6886 ---------- Buffer: *Backtrace* ---------- | |
6887 Debugger entered--Lisp error: (void-function 1=) | |
6888 (1= number) | |
6889 (setq number (1= number)) | |
6890 (while (> number 0) (setq total (+ total number)) | |
6891 (setq number (1= number))) | |
6892 (let ((total 0)) (while (> number 0) (setq total ...) | |
6893 (setq number ...)) total) | |
6894 triangle-bugged(4) | |
6895 eval((triangle-bugged 4)) | |
6896 eval-last-sexp-1(nil) | |
6897 eval-last-sexp(nil) | |
6898 call-interactively(eval-last-sexp) | |
6899 ---------- Buffer: *Backtrace* ---------- | |
6900 | |
6901 (I have reformatted this example slightly; the debugger does not fold | |
6902 long lines. As usual, you can quit the debugger by typing `q' in the | |
6903 `*Backtrace*' buffer.) | |
6904 | |
6905 In practice, for a bug as simple as this, the `Lisp error' line will | |
6906 tell you what you need to know to correct the definition. The function | |
6907 `1=' is `void'. | |
6908 | |
6909 However, suppose you are not quite certain what is going on. You can | |
6910 read the complete backtrace. | |
6911 | |
6912 In this case, you need to run GNU Emacs 22, which automatically starts | |
6913 the debugger that puts you in the `*Backtrace*' buffer; or else, you | |
6914 need to start the debugger manually as described below. | |
6915 | |
6916 Read the `*Backtrace*' buffer from the bottom up; it tells you what | |
6917 Emacs did that led to the error. Emacs made an interactive call to | |
6918 `C-x C-e' (`eval-last-sexp'), which led to the evaluation of the | |
6919 `triangle-bugged' expression. Each line above tells you what the Lisp | |
6920 interpreter evaluated next. | |
6921 | |
6922 The third line from the top of the buffer is | |
6923 | |
6924 (setq number (1= number)) | |
6925 | |
6926 Emacs tried to evaluate this expression; in order to do so, it tried to | |
6927 evaluate the inner expression shown on the second line from the top: | |
6928 | |
6929 (1= number) | |
6930 | |
6931 This is where the error occurred; as the top line says: | |
6932 | |
6933 Debugger entered--Lisp error: (void-function 1=) | |
6934 | |
6935 You can correct the mistake, re-evaluate the function definition, and | |
6936 then run your test again. | |
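
As a minimal sketch of the corrected definition, assuming the only mistake is the mistyped `1=', you can rewrite the function with `1-'. The name `triangle-fixed' is hypothetical; it is used here only so that the bugged definition stays available for the following sections.

```elisp
(defun triangle-fixed (number)
  "Return sum of numbers 1 through NUMBER inclusive."
  (let ((total 0))
    (while (> number 0)
      (setq total (+ total number))
      (setq number (1- number)))        ; `1-' subtracts one from NUMBER.
    total))

(triangle-fixed 4)
```

Evaluating the last expression should return 10 in the echo area instead of entering the debugger.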
6937 | |
6938 | |
6939 File: eintr, Node: debug-on-entry, Next: debug-on-quit, Prev: debug, Up: Debugging | |
6940 | |
6941 17.2 `debug-on-entry' | |
6942 ===================== | |
6943 | |
6944 GNU Emacs 22 starts the debugger automatically when your function has | |
6945 an error. | |
6946 | |
6947 Incidentally, you can start the debugger manually for all versions of | |
6948 Emacs; the advantage is that the debugger runs even if you do not have | |
6949 a bug in your code. Sometimes your code will be free of bugs! | |
6950 | |
6951 You can enter the debugger when you call the function by calling | |
6952 `debug-on-entry'. | |
6953 | |
6954 Type: | |
6955 | |
6956 M-x debug-on-entry RET triangle-bugged RET | |
6957 | |
6958 Now, evaluate the following: | |
6959 | |
6960 (triangle-bugged 5) | |
6961 | |
6962 All versions of Emacs will create a `*Backtrace*' buffer and tell you | |
6963 that it is beginning to evaluate the `triangle-bugged' function: | |
6964 | |
6965 ---------- Buffer: *Backtrace* ---------- | |
6966 Debugger entered--entering a function: | |
6967 * triangle-bugged(5) | |
6968 eval((triangle-bugged 5)) | |
6969 eval-last-sexp-1(nil) | |
6970 eval-last-sexp(nil) | |
6971 call-interactively(eval-last-sexp) | |
6972 ---------- Buffer: *Backtrace* ---------- | |
6973 | |
6974 In the `*Backtrace*' buffer, type `d'. Emacs will evaluate the first | |
6975 expression in `triangle-bugged'; the buffer will look like this: | |
6976 | |
6977 ---------- Buffer: *Backtrace* ---------- | |
6978 Debugger entered--beginning evaluation of function call form: | |
6979 * (let ((total 0)) (while (> number 0) (setq total ...) | |
6980 (setq number ...)) total) | |
6981 * triangle-bugged(5) | |
6982 eval((triangle-bugged 5)) | |
6983 eval-last-sexp-1(nil) | |
6984 eval-last-sexp(nil) | |
6985 call-interactively(eval-last-sexp) | |
6986 ---------- Buffer: *Backtrace* ---------- | |
6987 | |
6988 Now, type `d' again, eight times, slowly. Each time you type `d', | |
6989 Emacs will evaluate another expression in the function definition. | |
6990 | |
6991 Eventually, the buffer will look like this: | |
6992 | |
6993 ---------- Buffer: *Backtrace* ---------- | |
6994 Debugger entered--beginning evaluation of function call form: | |
6995 * (setq number (1= number)) | |
6996 * (while (> number 0) (setq total (+ total number)) | |
6997 (setq number (1= number))) | |
6998 * (let ((total 0)) (while (> number 0) (setq total ...) | |
6999 (setq number ...)) total) | |
7000 * triangle-bugged(5) | |
7001 eval((triangle-bugged 5)) | |
7002 eval-last-sexp-1(nil) | |
7003 eval-last-sexp(nil) | |
7004 call-interactively(eval-last-sexp) | |
7005 ---------- Buffer: *Backtrace* ---------- | |
7006 | |
7007 Finally, after you type `d' two more times, Emacs will reach the error, | |
7008 and the top two lines of the `*Backtrace*' buffer will look like this: | |
7009 | |
7010 ---------- Buffer: *Backtrace* ---------- | |
7011 Debugger entered--Lisp error: (void-function 1=) | |
7012 * (1= number) | |
7013 ... | |
7014 ---------- Buffer: *Backtrace* ---------- | |
7015 | |
7016 By typing `d', you were able to step through the function. | |
7017 | |
7018 You can quit a `*Backtrace*' buffer by typing `q' in it; this quits the | |
7019 trace, but does not cancel `debug-on-entry'. | |
7020 | |
7021 To cancel the effect of `debug-on-entry', call `cancel-debug-on-entry' | |
7022 with the name of the function, like this: | |
7023 | |
7024 M-x cancel-debug-on-entry RET triangle-bugged RET | |
7025 | |
7026 (If you are reading this in Info, cancel `debug-on-entry' now.) | |
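
Both commands can also be called from Lisp with a quoted function name, which is convenient when you are working in the `*scratch*' buffer. The following is a sketch only; `my-demo-triangle' is a hypothetical, bug-free function defined here so that the example does not depend on any other definition.

```elisp
(defun my-demo-triangle (number)
  "Return sum of numbers 1 through NUMBER inclusive."
  (let ((total 0))
    (while (> number 0)
      (setq total (+ total number))
      (setq number (1- number)))
    total))

;; Ask for the debugger each time `my-demo-triangle' is called.
(debug-on-entry 'my-demo-triangle)

;; Stop asking for the debugger for that function.
(cancel-debug-on-entry 'my-demo-triangle)
```

If you evaluate `(my-demo-triangle 3)' between the two calls, you should enter the `*Backtrace*' buffer; after the second call, the function runs normally again.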
7027 | |
7028 | |
7029 File: eintr, Node: debug-on-quit, Next: edebug, Prev: debug-on-entry, Up: Debugging | |
7030 | |
7031 17.3 `debug-on-quit' and `(debug)' | |
7032 ================================== | |
7033 | |
7034 In addition to setting `debug-on-error' or calling `debug-on-entry', | |
7035 there are two other ways to start `debug'. | |
7036 | |
7037 You can start `debug' whenever you type `C-g' (`keyboard-quit') by | |
7038 setting the variable `debug-on-quit' to `t'. This is useful for | |
7039 debugging infinite loops. | |
7040 | |
7041 Or, you can insert a line that says `(debug)' into your code where you | |
7042 want the debugger to start, like this: | |
7043 | |
7044 (defun triangle-bugged (number) | |
7045 "Return sum of numbers 1 through NUMBER inclusive." | |
7046 (let ((total 0)) | |
7047 (while (> number 0) | |
7048 (setq total (+ total number)) | |
7049 (debug) ; Start debugger. | |
7050 (setq number (1= number))) ; Error here. | |
7051 total)) | |
7052 | |
7053 The `debug' function is described in detail in *Note The Lisp Debugger: | |
7054 (elisp)Debugger. | |
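
As a small sketch of the first approach, you can turn `debug-on-quit' on before running the code you suspect, and turn it off again afterwards. Which code you run in between is up to you; nothing here is specific to `triangle-bugged'.

```elisp
;; Enter the debugger whenever `C-g' is typed.
(setq debug-on-quit t)

;; ... run the code that seems to loop forever, then type `C-g' ...

;; Restore the default behavior afterwards.
(setq debug-on-quit nil)
```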
7055 | |
7056 | |
7057 File: eintr, Node: edebug, Next: Debugging Exercises, Prev: debug-on-quit, Up: Debugging | |
7058 | |
7059 17.4 The `edebug' Source Level Debugger | |
7060 ======================================= | |
7061 | |
7062 Edebug is a source level debugger. Edebug normally displays the source | |
7063 of the code you are debugging, with an arrow at the left that shows | |
7064 which line you are currently executing. | |
7065 | |
7066 You can walk through the execution of a function, line by line, or run | |
7067 quickly until reaching a "breakpoint" where execution stops. | |
7068 | |
7069 Edebug is described in *Note Edebug: (elisp)edebug. | |
7070 | |
7071 Here is a bugged function definition for `triangle-recursively'. *Note | |
7072 Recursion in place of a counter: Recursive triangle function, for a | |
7073 review of it. | |
7074 | |
7075 (defun triangle-recursively-bugged (number) | |
7076 "Return sum of numbers 1 through NUMBER inclusive. | |
7077 Uses recursion." | |
7078 (if (= number 1) | |
7079 1 | |
7080 (+ number | |
7081 (triangle-recursively-bugged | |
7082 (1= number))))) ; Error here. | |
7083 | |
7084 Normally, you would install this definition by positioning your cursor | |
7085 after the function's closing parenthesis and typing `C-x C-e' | |
7086 (`eval-last-sexp') or else by positioning your cursor within the | |
7087 definition and typing `C-M-x' (`eval-defun'). (By default, the | |
7088 `eval-defun' command works only in Emacs Lisp mode or in Lisp | |
7089 Interactive mode.) | |
7090 | |
7091 However, to prepare this function definition for Edebug, you must first | |
7092 "instrument" the code using a different command. You can do this by | |
7093 positioning your cursor within the definition and typing | |
7094 | |
7095 M-x edebug-defun RET | |
7096 | |
7097 This will cause Emacs to load Edebug automatically if it is not already | |
7098 loaded, and properly instrument the function. | |
7099 | |
7100 After instrumenting the function, place your cursor after the following | |
7101 expression and type `C-x C-e' (`eval-last-sexp'): | |
7102 | |
7103 (triangle-recursively-bugged 3) | |
7104 | |
7105 Emacs will jump back to the source for `triangle-recursively-bugged' | |
7106 and position the cursor at the beginning of the `if' line of the | |
7107 function. Also, you will see an arrowhead at the left hand side of | |
7108 that line. The arrowhead marks the line where the function is | |
7109 executing. (In the following examples, we show the arrowhead with | |
7110 `=>'; in a windowing system, you may see the arrowhead as a solid | |
7111 triangle in the window `fringe'.) | |
7112 | |
7113 =>-!-(if (= number 1) | |
7114 | |
7115 In the example, the location of point is displayed as `-!-' (in a | |
7116 printed book, it is displayed with a five pointed star). | |
7117 | |
7118 If you now press <SPC>, point will move to the next expression to be | |
7119 executed; the line will look like this: | |
7120 | |
7121 =>(if -!-(= number 1) | |
7122 | |
7123 As you continue to press <SPC>, point will move from expression to | |
7124 expression. At the same time, whenever an expression returns a value, | |
7125 that value will be displayed in the echo area. For example, after you | |
7126 move point past `number', you will see the following: | |
7127 | |
7128 Result: 3 (#o3, #x3, ?\C-c) | |
7129 | |
7130 This means the value of `number' is 3, which is octal three, | |
7131 hexadecimal three, and ASCII `control-c' (the third letter of the | |
7132 alphabet, in case you need to know this information). | |
7133 | |
7134 You can continue moving through the code until you reach the line with | |
7135 the error. Before evaluation, that line looks like this: | |
7136 | |
7137 => -!-(1= number))))) ; Error here. | |
7138 | |
7139 When you press <SPC> once again, you will produce an error message that | |
7140 says: | |
7141 | |
7142 Symbol's function definition is void: 1= | |
7143 | |
7144 This is the bug. | |
7145 | |
7146 Press `q' to quit Edebug. | |
7147 | |
7148 To remove instrumentation from a function definition, simply | |
7149 re-evaluate it with a command that does not instrument it. For | |
7150 example, you could place your cursor after the definition's closing | |
7151 parenthesis and type `C-x C-e'. | |
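
Once you have found the bug, you can evaluate a corrected definition in the usual way, which also leaves it uninstrumented. This is a sketch that assumes `1=' was meant to be `1-'; the name `triangle-recursively-fixed' is hypothetical.

```elisp
(defun triangle-recursively-fixed (number)
  "Return sum of numbers 1 through NUMBER inclusive.
Uses recursion."
  (if (= number 1)
      1
    (+ number
       (triangle-recursively-fixed
        (1- number)))))                 ; `1-' in place of the mistyped `1='.

(triangle-recursively-fixed 3)
```

Evaluating the last expression should return 6.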
7152 | |
7153 Edebug does a great deal more than walk with you through a function. | |
7154 You can set it so it races through on its own, stopping only at an | |
7155 error or at specified stopping points; you can cause it to display the | |
7156 changing values of various expressions; you can find out how many times | |
7157 a function is called, and more. | |
7158 | |
7159 Edebug is described in *Note Edebug: (elisp)edebug. | |
7160 | |
7161 | |
7162 File: eintr, Node: Debugging Exercises, Prev: edebug, Up: Debugging | |
7163 | |
7164 17.5 Debugging Exercises | |
7165 ======================== | |
7166 | |
7167 * Install the `count-words-region' function and then cause it to | |
7168 enter the built-in debugger when you call it. Run the command on a | |
7169 region containing two words. You will need to press `d' a | |
7170 remarkable number of times. On your system, is a `hook' called | |
7171 after the command finishes? (For information on hooks, see *Note | |
7172 Command Loop Overview: (elisp)Command Overview.) | |
7173 | |
7174 * Copy `count-words-region' into the `*scratch*' buffer, instrument | |
7175 the function for Edebug, and walk through its execution. The | |
7176 function does not need to have a bug, although you can introduce | |
7177 one if you wish. If the function lacks a bug, the walk-through | |
7178 completes without problems. | |
7179 | |
7180 * While running Edebug, type `?' to see a list of all the Edebug | |
7181 commands. (The `global-edebug-prefix' is usually `C-x X', i.e. | |
7182 `<CTRL>-x' followed by an upper case `X'; use this prefix for | |
7183 commands made outside of the Edebug debugging buffer.) | |
7184 | |
7185 * In the Edebug debugging buffer, use the `p' | |
7186 (`edebug-bounce-point') command to see where in the region the | |
7187 `count-words-region' is working. | |
7188 | |
7189 * Move point to some spot further down the function and then type the | |
7190 `h' (`edebug-goto-here') command to jump to that location. | |
7191 | |
7192 * Use the `t' (`edebug-trace-mode') command to cause Edebug to walk | |
7193 through the function on its own; use an upper case `T' for | |
7194 `edebug-Trace-fast-mode'. | |
7195 | |
7196 * Set a breakpoint, then run Edebug in Trace mode until it reaches | |
7197 the stopping point. | |
7198 | |
7199 | |
7200 File: eintr, Node: Conclusion, Next: the-the, Prev: Debugging, Up: Top | |
7201 | |
7202 18 Conclusion | |
7203 ************* | |
7204 | |
7205 We have now reached the end of this Introduction. You have now learned | |
7206 enough about programming in Emacs Lisp to set values, to write simple | |
7207 `.emacs' files for yourself and your friends, and write simple | |
7208 customizations and extensions to Emacs. | |
7209 | |
7210 This is a place to stop. Or, if you wish, you can now go onward, and | |
7211 teach yourself. | |
7212 | |
7213 You have learned some of the basic nuts and bolts of programming. But | |
7214 only some. There are a great many more brackets and hinges that are | |
7215 easy to use that we have not touched. | |
7216 | |
7217 A path you can follow right now lies among the sources to GNU Emacs and | |
7218 in *Note The GNU Emacs Lisp Reference Manual: (elisp)Top. | |
7219 | |
7220 The Emacs Lisp sources are an adventure. When you read the sources and | |
7221 come across a function or expression that is unfamiliar, you need to | |
7222 figure out or find out what it does. | |
7223 | |
7224 Go to the Reference Manual. It is a thorough, complete, and fairly | |
7225 easy-to-read description of Emacs Lisp. It is written not only for | |
7226 experts, but for people who know what you know. (The `Reference | |
7227 Manual' comes with the standard GNU Emacs distribution. Like this | |
7228 introduction, it comes as a Texinfo source file, so you can read it | |
7229 on-line and as a typeset, printed book.) | |
7230 | |
7231 Go to the other on-line help that is part of GNU Emacs: the on-line | |
7232 documentation for all functions and variables, and `find-tag', the | |
7233 program that takes you to sources. | |
7234 | |
7235 Here is an example of how I explore the sources. Because of its name, | |
7236 `simple.el' is the file I looked at first, a long time ago. As it | |
7237 happens, some of the functions in `simple.el' are complicated, or at | |
7238 least look complicated at first sight. The `open-line' function, for | |
7239 example, looks complicated. | |
7240 | |
7241 You may want to walk through this function slowly, as we did with the | |
7242 `forward-sentence' function. (*Note The `forward-sentence' function: | |
7243 forward-sentence.) Or you may want to skip that function and look at | |
7244 another, such as `split-line'. You don't need to read all the | |
7245 functions. According to `count-words-in-defun', the `split-line' | |
7246 function contains 102 words and symbols. | |
7247 | |
7248 Even though it is short, `split-line' contains expressions we have not | |
7249 studied: `skip-chars-forward', `indent-to', `current-column' and | |
7250 `insert-and-inherit'. | |
7251 | |
7252 Consider the `skip-chars-forward' function. (It is part of the | |
7253 function definition for `back-to-indentation', which is shown in *Note | |
7254 Review: Review.) | |
7255 | |
7256 In GNU Emacs, you can find out more about `skip-chars-forward' by | |
7257 typing `C-h f' (`describe-function') and the name of the function. | |
7258 This gives you the function documentation. | |
7259 | |
7260 You may be able to guess what is done by a well named function such as | |
7261 `indent-to'; or you can look it up, too. Incidentally, the | |
7262 `describe-function' function itself is in `help.el'; it is one of those | |
7263 long, but decipherable functions. You can look up `describe-function' | |
7264 using the `C-h f' command! | |
7265 | |
7266 In this instance, since the code is Lisp, the `*Help*' buffer contains | |
7267 the name of the library containing the function's source. You can put | |
7268 point over the name of the library and press the RET key, which in this | |
7269 situation is bound to `help-follow', and be taken directly to the | |
7270 source, in the same way as `M-.' (`find-tag'). | |
7271 | |
7272 The definition for `describe-function' illustrates how to customize the | |
7273 `interactive' expression without using the standard character codes; | |
7274 and it shows how to create a temporary buffer. | |
7275 | |
7276 (The `indent-to' function is written in C rather than Emacs Lisp; it is | |
7277 a `built-in' function. `help-follow' takes you to its source as does | |
7278 `find-tag', when properly set up.) | |
7279 | |
7280 You can look at a function's source using `find-tag', which is bound to | |
7281 `M-.' Finally, you can find out what the Reference Manual has to say | |
7282 by visiting the manual in Info, and typing `i' (`Info-index') and the | |
7283 name of the function, or by looking up the function in the index to a | |
7284 printed copy of the manual. | |
7285 | |
7286 Similarly, you can find out what is meant by `insert-and-inherit'. | |
7287 | |
7288 Other interesting source files include `paragraphs.el', `loaddefs.el', | |
7289 and `loadup.el'. The `paragraphs.el' file includes short, easily | |
7290 understood functions as well as longer ones. The `loaddefs.el' file | |
7291 contains the many standard autoloads and many keymaps. I have never | |
7292 looked at it all; only at parts. `loadup.el' is the file that loads | |
7293 the standard parts of Emacs; it tells you a great deal about how Emacs | |
7294 is built. (*Note Building Emacs: (elisp)Building Emacs, for more about | |
7295 building.) | |
7296 | |
7297 As I said, you have learned some nuts and bolts; however, and very | |
7298 importantly, we have hardly touched major aspects of programming; I | |
7299 have said nothing about how to sort information, except to use the | |
7300 predefined `sort' function; I have said nothing about how to store | |
7301 information, except to use variables and lists; I have said nothing | |
7302 about how to write programs that write programs. These are topics for | |
7303 another, and different kind of book, a different kind of learning. | |
7304 | |
7305 What you have done is learn enough for much practical work with GNU | |
7306 Emacs. What you have done is get started. This is the end of a | |
7307 beginning. | |
7308 | |
7309 | |
7310 File: eintr, Node: the-the, Next: Kill Ring, Prev: Conclusion, Up: Top | |
7311 | |
7312 Appendix A The `the-the' Function | |
7313 ********************************* | |
7314 | |
7315 Sometimes when you you write text, you duplicate words--as with "you | |
7316 you" near the beginning of this sentence. I find that most frequently, | |
7317 I duplicate "the"; hence, I call the function for detecting duplicated | |
7318 words, `the-the'. | |
7319 | |
7320 As a first step, you could use the following regular expression to | |
7321 search for duplicates: | |
7322 | |
7323 \\(\\w+[ \t\n]+\\)\\1 | |
7324 | |
7325 This regexp matches one or more word-constituent characters followed by | |
7326 one or more spaces, tabs, or newlines. However, it does not detect | |
7327 duplicated words on different lines, since the ending of the first | |
7328 word, the end of the line, is different from the ending of the second | |
7329 word, a space. (For more information about regular expressions, see | |
7330 *Note Regular Expression Searches: Regexp Search, as well as *Note | |
7331 Syntax of Regular Expressions: (emacs)Regexps, and *Note Regular | |
7332 Expressions: (elisp)Regular Expressions.) | |
7333 | |
7334 You might try searching just for duplicated word-constituent characters | |
7335 but that does not work since the pattern detects doubles such as the | |
7336 two occurrences of `th' in `with the'. | |
7337 | |
7338 Another possible regexp searches for word-constituent characters | |
7339 followed by non-word-constituent characters, reduplicated. Here, | |
7340 `\\w+' matches one or more word-constituent characters and `\\W*' | |
7341 matches zero or more non-word-constituent characters. | |
7342 | |
7343 \\(\\(\\w+\\)\\W*\\)\\1 | |
7344 | |
7345 Again, not useful. | |
7346 | |
7347 Here is the pattern that I use. It is not perfect, but good enough. | |
7348 `\\b' matches the empty string, provided it is at the beginning or end | |
7349 of a word; `[^@ \n\t]+' matches one or more occurrences of any | |
7350 characters that are _not_ an @-sign, space, newline, or tab. | |
7351 | |
7352 \\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b | |
7353 | |
7354 One can write more complicated expressions, but I found that this | |
7355 expression is good enough, so I use it. | |
7356 | |
7357 Here is the `the-the' function, as I include it in my `.emacs' file, | |
7358 along with a handy global key binding: | |
7359 | |
7360 (defun the-the () | |
7361 "Search forward for for a duplicated word." | |
7362 (interactive) | |
7363 (message "Searching for for duplicated words ...") | |
7364 (push-mark) | |
7365 ;; This regexp is not perfect | |
7366 ;; but is fairly good over all: | |
7367 (if (re-search-forward | |
7368 "\\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b" nil 'move) | |
7369 (message "Found duplicated word.") | |
7370 (message "End of buffer"))) | |
7371 | |
7372 ;; Bind `the-the' to C-c \ | |
7373 (global-set-key "\C-c\\" 'the-the) | |
7374 | |
7375 | |
7376 Here is test text: | |
7377 | |
7378 one two two three four five | |
7379 five six seven | |
7380 | |
7381 You can substitute the other regular expressions shown above in the | |
7382 function definition and try each of them on this list. | |
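
If you would rather compare the patterns without moving point, here is a minimal sketch that applies two of them to the test text held in a string. The variable names beginning with `my-' are hypothetical. The sketch uses `string-match' rather than `re-search-forward', so it only reports the index at which a match starts.

```elisp
(defvar my-test-text "one two two three four five\nfive six seven"
  "The test text, with a newline between the two lines.")

(defvar my-first-regexp "\\(\\w+[ \t\n]+\\)\\1"
  "The first regexp: a word plus whitespace, repeated exactly.")

(defvar my-the-the-regexp "\\b\\([^@ \n\t]+\\)[ \n\t]+\\1\\b"
  "The regexp used in `the-the'.")

(list
 ;; Both patterns find "two two" on the first line.
 (string-match my-first-regexp my-test-text)
 (string-match my-the-the-regexp my-test-text)
 ;; Search again, starting just after "two two" (index 11).
 (string-match my-first-regexp my-test-text 11)
 (string-match my-the-the-regexp my-test-text 11))
```

The list should be `(4 4 nil 23)': both patterns find "two two" at index 4, but only the `the-the' pattern finds the "five" that is repeated across the line break, at index 23.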
7383 | |
7384 | |
7385 File: eintr, Node: Kill Ring, Next: Full Graph, Prev: the-the, Up: Top | |
7386 | |
7387 Appendix B Handling the Kill Ring | |
7388 ********************************* | |
7389 | |
7390 The kill ring is a list that is transformed into a ring by the workings | |
7391 of the `current-kill' function. The `yank' and `yank-pop' commands use | |
7392 the `current-kill' function. | |
7393 | |
7394 This appendix describes the `current-kill' function as well as both the | |
7395 `yank' and the `yank-pop' commands, but first, consider the workings of | |
7396 the kill ring. | |
7397 | |
7398 The kill ring has a default maximum length of sixty items; this number | |
7399 is too large for an explanation. Instead, set it to four. Please | |
7400 evaluate the following: | |
7401 | |
7402 (setq old-kill-ring-max kill-ring-max) | |
7403 (setq kill-ring-max 4) | |
7404 | |
7405 Then, please copy each line of the following indented example into the | |
7406 kill ring. You may kill each line with `C-k' or mark it and copy it | |
7407 with `M-w'. | |
7408 | |
7409 (In a read-only buffer, such as the `*info*' buffer, the kill command, | |
7410 `C-k' (`kill-line'), will not remove the text, merely copy it to the | |
7411 kill ring. However, your machine may beep at you. (`kill-line' calls | |
7412 `kill-region'.) Alternatively, for silence, you may copy the region of | |
7413 each line with the `M-w' (`kill-ring-save') command. You must mark | |
7414 each line for this command to succeed, but it does not matter at which | |
7415 end you put point or mark.) | |
7416 | |
7417 Please copy the lines in order, so that five elements attempt to fill | |
7418 the kill ring: | |
7419 | |
7420 first some text | |
7421 second piece of text | |
7422 third line | |
7423 fourth line of text | |
7424 fifth bit of text | |
7425 | |
7426 Then find the value of `kill-ring' by evaluating | |
7427 | |
7428 kill-ring | |
7429 | |
7430 It is: | |
7431 | |
7432 ("fifth bit of text" "fourth line of text" | |
7433 "third line" "second piece of text") | |
7434 | |
7435 The first element, `first some text', was dropped. | |
7436 | |
7437 To return to the old value for the length of the kill ring, evaluate: | |
7438 | |
7439 (setq kill-ring-max old-kill-ring-max) | |
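
The same experiment can be sketched in Lisp without disturbing your real kill ring, by binding the relevant variables with `let' and calling `kill-new' directly. The function name `my-fill-small-kill-ring' is hypothetical, and binding `interprogram-cut-function' to `nil' is an assumption made so that the example leaves the window system's selection alone.

```elisp
(defun my-fill-small-kill-ring ()
  "Push five strings onto a temporary kill ring of length four.
Return the resulting temporary kill ring."
  (let ((kill-ring nil)
        (kill-ring-yank-pointer nil)
        (kill-ring-max 4)
        ;; Assumption: nil here keeps `kill-new' away from the
        ;; window system's selection.
        (interprogram-cut-function nil))
    (dolist (text '("first some text"
                    "second piece of text"
                    "third line"
                    "fourth line of text"
                    "fifth bit of text"))
      (kill-new text))
    kill-ring))

(my-fill-small-kill-ring)
```

The value returned should be the same four-element list shown above, with `first some text' dropped.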
7440 | |
7441 * Menu: | |
7442 | |
7443 * current-kill:: | |
7444 * yank:: | |
7445 * yank-pop:: | |
7446 * ring file:: | |
7447 | |
7448 | |
7449 File: eintr, Node: current-kill, Next: yank, Prev: Kill Ring, Up: Kill Ring | |
7450 | |
7451 B.1 The `current-kill' Function | |
7452 =============================== | |
7453 | |
7454 The `current-kill' function changes the element in the kill ring to | |
7455 which `kill-ring-yank-pointer' points. (Also, the `kill-new' function | |
7456 sets `kill-ring-yank-pointer' to point to the latest element of the | |
7457 kill ring.) | |
7458 | |
7459 The `current-kill' function is used by `yank' and by `yank-pop'. Here | |
7460 is the code for `current-kill': | |
7461 | |
7462 (defun current-kill (n &optional do-not-move) | |
7463 "Rotate the yanking point by N places, and then return that kill. | |
7464 If N is zero, `interprogram-paste-function' is set, and calling it | |
7465 returns a string, then that string is added to the front of the | |
7466 kill ring and returned as the latest kill. | |
7467 If optional arg DO-NOT-MOVE is non-nil, then don't actually move the | |
7468 yanking point; just return the Nth kill forward." | |
7469 (let ((interprogram-paste (and (= n 0) | |
7470 interprogram-paste-function | |
7471 (funcall interprogram-paste-function)))) | |
7472 (if interprogram-paste | |
7473 (progn | |
7474 ;; Disable the interprogram cut function when we add the new | |
7475 ;; text to the kill ring, so Emacs doesn't try to own the | |
7476 ;; selection, with identical text. | |
7477 (let ((interprogram-cut-function nil)) | |
7478 (kill-new interprogram-paste)) | |
7479 interprogram-paste) | |
7480 (or kill-ring (error "Kill ring is empty")) | |
7481 (let ((ARGth-kill-element | |
7482 (nthcdr (mod (- n (length kill-ring-yank-pointer)) | |
7483 (length kill-ring)) | |
7484 kill-ring))) | |
7485 (or do-not-move | |
7486 (setq kill-ring-yank-pointer ARGth-kill-element)) | |
7487 (car ARGth-kill-element))))) | |
7488 | |
7489 In addition, the `kill-new' function sets `kill-ring-yank-pointer' to | |
7490 the latest element of the kill ring. And indirectly so does | |
7491 `kill-append', since it calls `kill-new'. In addition, `kill-region' | |
7492 and `kill-line' call the `kill-new' function. | |
7493 | |
7494 Here is the line in `kill-new', which is explained in *Note The | |
7495 `kill-new' function: kill-new function. | |
7496 | |
7497 (setq kill-ring-yank-pointer kill-ring) | |
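
The `nthcdr' and `mod' arithmetic in `current-kill' is what makes the list behave like a ring. You can try that arithmetic on its own with the following sketch, in which `my-nth-kill' is a hypothetical helper that takes the ring and the pointer as ordinary arguments instead of reading `kill-ring' and `kill-ring-yank-pointer'.

```elisp
(defun my-nth-kill (n ring pointer)
  "Return the tail of RING that is N places forward of POINTER.
RING is a list standing in for the kill ring; POINTER is one of its
tails, standing in for `kill-ring-yank-pointer'."
  (nthcdr (mod (- n (length pointer))
               (length ring))
          ring))

(let ((ring '("fifth bit of text" "fourth line of text"
              "third line" "second piece of text")))
  (list (car (my-nth-kill 0 ring ring))    ; stay at the front
        (car (my-nth-kill 1 ring ring))    ; one place forward
        (car (my-nth-kill 4 ring ring))))  ; four places: back at the front
```

With the pointer at the front of a four-element ring, the list should be `("fifth bit of text" "fourth line of text" "fifth bit of text")': moving four places forward wraps around to where it started.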
7498 | |
7499 * Menu: | |
7500 | |
7501 * Understanding current-kill:: | |
7502 |