comparison lispref/strings.texi @ 22138:d4ac295a98b3
*** empty log message ***
author | Richard M. Stallman <rms@gnu.org> |
---|---|
date | Tue, 19 May 1998 03:45:57 +0000 |
parents | 90da2489c498 |
children | 40089afa2b1d |
22137:2b0e6a1e7fb9 | 22138:d4ac295a98b3 |
---|---|
26 * Predicates for Strings:: Testing whether an object is a string or char. | 26 * Predicates for Strings:: Testing whether an object is a string or char. |
27 * Creating Strings:: Functions to allocate new strings. | 27 * Creating Strings:: Functions to allocate new strings. |
28 * Modifying Strings:: Altering the contents of an existing string. | 28 * Modifying Strings:: Altering the contents of an existing string. |
29 * Text Comparison:: Comparing characters or strings. | 29 * Text Comparison:: Comparing characters or strings. |
30 * String Conversion:: Converting characters or strings and vice versa. | 30 * String Conversion:: Converting characters or strings and vice versa. |
31 * Formatting Strings:: @code{format}: Emacs's analog of @code{printf}. | 31 * Formatting Strings:: @code{format}: Emacs's analogue of @code{printf}. |
32 * Case Conversion:: Case conversion functions. | 32 * Case Conversion:: Case conversion functions. |
33 * Case Tables:: Customizing case conversion. | 33 * Case Tables:: Customizing case conversion. |
34 @end menu | 34 @end menu |
35 | 35 |
36 @node String Basics | 36 @node String Basics |
95 | 95 |
96 For more information about general sequence and array predicates, | 96 For more information about general sequence and array predicates, |
97 see @ref{Sequences Arrays Vectors}, and @ref{Arrays}. | 97 see @ref{Sequences Arrays Vectors}, and @ref{Arrays}. |
98 | 98 |
99 @defun stringp object | 99 @defun stringp object |
100 This function returns @code{t} if @var{object} is a string, @code{nil} | 100 This function returns @code{t} if @var{object} is a string, @code{nil} |
101 otherwise. | 101 otherwise. |
102 @end defun | 102 @end defun |
103 | 103 |
104 @defun char-or-string-p object | 104 @defun char-or-string-p object |
105 This function returns @code{t} if @var{object} is a string or a | 105 This function returns @code{t} if @var{object} is a string or a |
106 character (i.e., an integer), @code{nil} otherwise. | 106 character (i.e., an integer), @code{nil} otherwise. |
107 @end defun | 107 @end defun |
108 | 108 |
109 @node Creating Strings | 109 @node Creating Strings |
110 @section Creating Strings | 110 @section Creating Strings |
111 | 111 |
112 The following functions create strings, either from scratch, or by | 112 The following functions create strings, either from scratch, or by |
113 putting strings together, or by taking them apart. | 113 putting strings together, or by taking them apart. |
114 | 114 |
115 @defun make-string count character | 115 @defun make-string count character |
116 This function returns a string made up of @var{count} repetitions of | 116 This function returns a string made up of @var{count} repetitions of |
117 @var{character}. If @var{count} is negative, an error is signaled. | 117 @var{character}. If @var{count} is negative, an error is signaled. |
118 | 118 |
119 @example | 119 @example |
120 (make-string 5 ?x) | 120 (make-string 5 ?x) |
121 @result{} "xxxxx" | 121 @result{} "xxxxx" |
126 Other functions to compare with this one include @code{char-to-string} | 126 Other functions to compare with this one include @code{char-to-string} |
127 (@pxref{String Conversion}), @code{make-vector} (@pxref{Vectors}), and | 127 (@pxref{String Conversion}), @code{make-vector} (@pxref{Vectors}), and |
128 @code{make-list} (@pxref{Building Lists}). | 128 @code{make-list} (@pxref{Building Lists}). |
129 @end defun | 129 @end defun |
130 | 130 |
131 @defun string &rest characters | |
131 @tindex string | 132 @tindex string |
132 @defun string &rest characters | |
133 This returns a string containing the characters @var{characters}. | 133 This returns a string containing the characters @var{characters}. |
134 | 134 |
135 @example | 135 @example |
136 (string ?a ?b ?c) | 136 (string ?a ?b ?c) |
137 @result{} "abc" | 137 @result{} "abc" |
230 returns an empty string. | 230 returns an empty string. |
231 | 231 |
232 @example | 232 @example |
233 (concat "abc" "-def") | 233 (concat "abc" "-def") |
234 @result{} "abc-def" | 234 @result{} "abc-def" |
235 (concat "abc" (list 120 (+ 256 121)) [122]) | 235 (concat "abc" (list 120 121) [122]) |
236 @result{} "abcxyz" | 236 @result{} "abcxyz" |
237 ;; @r{@code{nil} is an empty sequence.} | 237 ;; @r{@code{nil} is an empty sequence.} |
238 (concat "abc" nil "-def") | 238 (concat "abc" nil "-def") |
239 @result{} "abc-def" | 239 @result{} "abc-def" |
240 (concat "The " "quick brown " "fox.") | 240 (concat "The " "quick brown " "fox.") |
242 (concat) | 242 (concat) |
243 @result{} "" | 243 @result{} "" |
244 @end example | 244 @end example |
245 | 245 |
246 @noindent | 246 @noindent |
247 The second example above shows how characters stored in strings are | |
248 taken modulo 256. In other words, each character in the string is | |
249 stored in one byte. | |
250 | |
251 The @code{concat} function always constructs a new string that is | 247 The @code{concat} function always constructs a new string that is |
252 not @code{eq} to any existing string. | 248 not @code{eq} to any existing string. |
253 | 249 |
254 When an argument is an integer (not a sequence of integers), it is | 250 When an argument is an integer (not a sequence of integers), it is |
255 converted to a string of digits making up the decimal printed | 251 converted to a string of digits making up the decimal printed |
272 description of @code{mapconcat} in @ref{Mapping Functions}, | 268 description of @code{mapconcat} in @ref{Mapping Functions}, |
273 @code{vconcat} in @ref{Vectors}, and @code{append} in @ref{Building | 269 @code{vconcat} in @ref{Vectors}, and @code{append} in @ref{Building |
274 Lists}. | 270 Lists}. |
275 @end defun | 271 @end defun |
276 | 272 |
273 @defun split-string string separators | |
277 @tindex split-string | 274 @tindex split-string |
278 @defun split-string string separators | |
279 Split @var{string} into substrings in between matches for the regular | 275 Split @var{string} into substrings in between matches for the regular |
280 expression @var{separators}. Each match for @var{separators} defines a | 276 expression @var{separators}. Each match for @var{separators} defines a |
281 splitting point; the substrings between the splitting points are made | 277 splitting point; the substrings between the splitting points are made |
282 into a list, which is the value. If @var{separators} is @code{nil} (or | 278 into a list, which is the value. If @var{separators} is @code{nil} (or |
283 omitted), the default is @code{"[ \f\t\n\r\v]+"}. | 279 omitted), the default is @code{"[ \f\t\n\r\v]+"}. |
320 needs a different number of bytes from the character already present at | 316 needs a different number of bytes from the character already present at |
321 that index, @code{aset} signals an error. | 317 that index, @code{aset} signals an error. |
322 | 318 |
323 A more powerful function is @code{store-substring}: | 319 A more powerful function is @code{store-substring}: |
324 | 320 |
321 @defun store-substring string idx obj | |
325 @tindex store-substring | 322 @tindex store-substring |
326 @defun store-substring string idx obj | |
327 This function alters part of the contents of the string @var{string}, by | 323 This function alters part of the contents of the string @var{string}, by |
328 storing @var{obj} starting at index @var{idx}. The argument @var{obj} | 324 storing @var{obj} starting at index @var{idx}. The argument @var{obj} |
329 may be either a character or a (smaller) string. | 325 may be either a character or a (smaller) string. |
330 | 326 |
331 Since it is impossible to change the length of an existing string, it is | 327 Since it is impossible to change the length of an existing string, it is |
432 | 428 |
433 @defun string-lessp string1 string2 | 429 @defun string-lessp string1 string2 |
434 @code{string-lessp} is another name for @code{string<}. | 430 @code{string-lessp} is another name for @code{string<}. |
435 @end defun | 431 @end defun |
436 | 432 |
433 @defun compare-strings string1 start1 end1 string2 start2 end2 &optional ignore-case | |
434 @tindex compare-strings | |
435 This function compares a specified part of @var{string1} with a | |
436 specified part of @var{string2}. The specified part of @var{string1} | |
437 runs from index @var{start1} up to index @var{end1} (default, the end of | |
438 the string). The specified part of @var{string2} runs from index | |
439 @var{start2} up to index @var{end2} (default, the end of the string). | |
440 | |
441 The strings are both converted to multibyte for the comparison | |
442 (@pxref{Text Representations}) so that a unibyte string can be usefully | |
443 compared with a multibyte string. If @var{ignore-case} is | |
444 non-@code{nil}, then case is ignored as well. | |
445 | |
446 If the specified portions of the two strings match, the value is | |
447 @code{t}. Otherwise, the value is an integer which indicates how many | |
448 leading characters agree, and which string is less. Its absolute value | |
449 is one plus the number of characters that agree at the beginning of the | |
450 two strings. The sign is negative if @var{string1} (or its specified | |
451 portion) is less. | |
452 @end defun | |
453 | |
454 @defun assoc-ignore-case key alist | |
455 @tindex assoc-ignore-case | |
456 This function works like @code{assoc}, except that @var{key} must be a | |
457 string, and comparison is done using @code{compare-strings}. | |
458 Case differences are ignored in this comparison. | |
459 @end defun | |
460 | |
461 @defun assoc-ignore-representation key alist | |
462 @tindex assoc-ignore-representation | |
463 This function works like @code{assoc}, except that @var{key} must be a | |
464 string, and comparison is done using @code{compare-strings}. | |
465 Case differences are significant. | |
466 @end defun | |
467 | |
437 See also @code{compare-buffer-substrings} in @ref{Comparing Text}, for | 468 See also @code{compare-buffer-substrings} in @ref{Comparing Text}, for |
438 a way to compare text in buffers. The function @code{string-match}, | 469 a way to compare text in buffers. The function @code{string-match}, |
439 which matches a regular expression against a string, can be used | 470 which matches a regular expression against a string, can be used |
440 for a kind of string comparison; see @ref{Regexp Search}. | 471 for a kind of string comparison; see @ref{Regexp Search}. |
441 | 472 |
507 @code{int-to-string} is a semi-obsolete alias for this function. | 538 @code{int-to-string} is a semi-obsolete alias for this function. |
508 | 539 |
509 See also the function @code{format} in @ref{Formatting Strings}. | 540 See also the function @code{format} in @ref{Formatting Strings}. |
510 @end defun | 541 @end defun |
511 | 542 |
512 @defun string-to-number string base | 543 @defun string-to-number string &optional base |
513 @cindex string to number | 544 @cindex string to number |
514 This function returns the numeric value of the characters in | 545 This function returns the numeric value of the characters in |
515 @var{string}. If @var{base} is non-@code{nil}, integers are converted | 546 @var{string}. If @var{base} is non-@code{nil}, integers are converted |
516 in that base. If @var{base} is @code{nil}, then base ten is used. | 547 in that base. If @var{base} is @code{nil}, then base ten is used. |
517 Floating point conversion always uses base ten; we have not implemented | 548 Floating point conversion always uses base ten; we have not implemented |
520 | 551 |
521 The parsing skips spaces and tabs at the beginning of @var{string}, then | 552 The parsing skips spaces and tabs at the beginning of @var{string}, then |
522 reads as much of @var{string} as it can interpret as a number. (On some | 553 reads as much of @var{string} as it can interpret as a number. (On some |
523 systems it ignores other whitespace at the beginning, not just spaces | 554 systems it ignores other whitespace at the beginning, not just spaces |
524 and tabs.) If the first character after the ignored whitespace is not a | 555 and tabs.) If the first character after the ignored whitespace is not a |
525 digit or a minus sign, this function returns 0. | 556 digit or a plus or minus sign, this function returns 0. |
526 | 557 |
527 @example | 558 @example |
528 (string-to-number "256") | 559 (string-to-number "256") |
529 @result{} 256 | 560 @result{} 256 |
530 (string-to-number "25 is a perfect square.") | 561 (string-to-number "25 is a perfect square.") |
598 uses the first such value, the second format specification uses the | 629 uses the first such value, the second format specification uses the |
599 second such value, and so on. Any extra format specifications (those | 630 second such value, and so on. Any extra format specifications (those |
600 for which there are no corresponding values) cause unpredictable | 631 for which there are no corresponding values) cause unpredictable |
601 behavior. Any extra values to be formatted are ignored. | 632 behavior. Any extra values to be formatted are ignored. |
602 | 633 |
603 Certain format specifications require values of particular types. | 634 Certain format specifications require values of particular types. If |
604 However, no error is signaled if the value actually supplied fails to | 635 you supply a value that doesn't fit the requirements, an error is |
605 have the expected type. Instead, the output is likely to be | 636 signaled. |
606 meaningless. | |
607 | 637 |
608 Here is a table of valid format specifications: | 638 Here is a table of valid format specifications: |
609 | 639 |
610 @table @samp | 640 @table @samp |
611 @item %s | 641 @item %s |
650 Replace the specification with the decimal-point notation for a floating | 680 Replace the specification with the decimal-point notation for a floating |
651 point number. | 681 point number. |
652 | 682 |
653 @item %g | 683 @item %g |
654 Replace the specification with notation for a floating point number, | 684 Replace the specification with notation for a floating point number, |
655 using either exponential notation or decimal-point notation whichever | 685 using either exponential notation or decimal-point notation, whichever |
656 is shorter. | 686 is shorter. |
657 | 687 |
658 @item %% | 688 @item %% |
659 A single @samp{%} is placed in the string. This format specification is | 689 A single @samp{%} is placed in the string. This format specification is |
660 unusual in that it does not use a value. For example, @code{(format "%% | 690 unusual in that it does not use a value. For example, @code{(format "%% |
739 @cindex lower case | 769 @cindex lower case |
740 @cindex character case | 770 @cindex character case |
741 @cindex case conversion in Lisp | 771 @cindex case conversion in Lisp |
742 | 772 |
743 The character case functions change the case of single characters or | 773 The character case functions change the case of single characters or |
744 of the contents of strings. The functions convert only alphabetic | 774 of the contents of strings. The functions normally convert only |
745 characters (the letters @samp{A} through @samp{Z} and @samp{a} through | 775 alphabetic characters (the letters @samp{A} through @samp{Z} and |
746 @samp{z}); other characters are not altered. The functions do not | 776 @samp{a} through @samp{z}, as well as non-ASCII letters); other |
747 modify the strings that are passed to them as arguments. | 777 characters are not altered. (You can specify a different case |
778 conversion mapping by specifying a case table---@pxref{Case Tables}.) | |
779 | |
780 These functions do not modify the strings that are passed to them as | |
781 arguments. | |
748 | 782 |
749 The examples below use the characters @samp{X} and @samp{x} which have | 783 The examples below use the characters @samp{X} and @samp{x} which have |
750 @sc{ASCII} codes 88 and 120 respectively. | 784 @sc{ASCII} codes 88 and 120 respectively. |
751 | 785 |
752 @defun downcase string-or-char | 786 @defun downcase string-or-char |
821 @end defun | 855 @end defun |
822 | 856 |
823 @defun upcase-initials string | 857 @defun upcase-initials string |
824 This function capitalizes the initials of the words in @var{string}. | 858 This function capitalizes the initials of the words in @var{string}. |
825 without altering any letters other than the initials. It returns a new | 859 without altering any letters other than the initials. It returns a new |
826 string whose contents are a copy of @var{string-or-char}, in which each | 860 string whose contents are a copy of @var{string}, in which each word has |
827 word has been converted to upper case. | 861 been converted to upper case. |
828 | 862 |
829 The definition of a word is any sequence of consecutive characters that | 863 The definition of a word is any sequence of consecutive characters that |
830 are assigned to the word constituent syntax class in the current syntax | 864 are assigned to the word constituent syntax class in the current syntax |
831 table (@xref{Syntax Class Table}). | 865 table (@xref{Syntax Class Table}). |
832 | 866 |
835 (upcase-initials "The CAT in the hAt") | 869 (upcase-initials "The CAT in the hAt") |
836 @result{} "The CAT In The HAt" | 870 @result{} "The CAT In The HAt" |
837 @end group | 871 @end group |
838 @end example | 872 @end example |
839 @end defun | 873 @end defun |
874 | |
875 @xref{Text Comparison}, for functions that compare strings; some of | |
876 them ignore case differences, or can optionally ignore case differences. | |
840 | 877 |
841 @node Case Tables | 878 @node Case Tables |
842 @section The Case Table | 879 @section The Case Table |
843 | 880 |
844 You can customize case conversion by installing a special @dfn{case | 881 You can customize case conversion by installing a special @dfn{case |
858 @item upcase | 895 @item upcase |
859 The upcase table maps each character into the corresponding upper | 896 The upcase table maps each character into the corresponding upper |
860 case character. | 897 case character. |
861 @item canonicalize | 898 @item canonicalize |
862 The canonicalize table maps all of a set of case-related characters | 899 The canonicalize table maps all of a set of case-related characters |
863 into some one of them. | 900 into a particular member of that set. |
864 @item equivalences | 901 @item equivalences |
865 The equivalences table maps each of a set of case-related characters | 902 The equivalences table maps each one of a set of case-related characters |
866 into the next one in that set. | 903 into the next character in that set. |
867 @end table | 904 @end table |
868 | 905 |
869 In simple cases, all you need to specify is the mapping to lower-case; | 906 In simple cases, all you need to specify is the mapping to lower-case; |
870 the three related tables will be calculated automatically from that one. | 907 the three related tables will be calculated automatically from that one. |
871 | 908 |