comparison lispref/strings.texi @ 22138:d4ac295a98b3

*** empty log message ***
author Richard M. Stallman <rms@gnu.org>
date Tue, 19 May 1998 03:45:57 +0000
parents 90da2489c498
children 40089afa2b1d
22137:2b0e6a1e7fb9 22138:d4ac295a98b3
26 * Predicates for Strings:: Testing whether an object is a string or char. 26 * Predicates for Strings:: Testing whether an object is a string or char.
27 * Creating Strings:: Functions to allocate new strings. 27 * Creating Strings:: Functions to allocate new strings.
28 * Modifying Strings:: Altering the contents of an existing string. 28 * Modifying Strings:: Altering the contents of an existing string.
29 * Text Comparison:: Comparing characters or strings. 29 * Text Comparison:: Comparing characters or strings.
30 * String Conversion:: Converting characters or strings and vice versa. 30 * String Conversion:: Converting characters or strings and vice versa.
31 * Formatting Strings:: @code{format}: Emacs's analog of @code{printf}. 31 * Formatting Strings:: @code{format}: Emacs's analogue of @code{printf}.
32 * Case Conversion:: Case conversion functions. 32 * Case Conversion:: Case conversion functions.
33 * Case Tables:: Customizing case conversion. 33 * Case Tables:: Customizing case conversion.
34 @end menu 34 @end menu
35 35
36 @node String Basics 36 @node String Basics
95 95
96 For more information about general sequence and array predicates, 96 For more information about general sequence and array predicates,
97 see @ref{Sequences Arrays Vectors}, and @ref{Arrays}. 97 see @ref{Sequences Arrays Vectors}, and @ref{Arrays}.
98 98
99 @defun stringp object 99 @defun stringp object
100 This function returns @code{t} if @var{object} is a string, @code{nil} 100 This function returns @code{t} if @var{object} is a string, @code{nil}
101 otherwise. 101 otherwise.
102 @end defun 102 @end defun
103 103
104 @defun char-or-string-p object 104 @defun char-or-string-p object
105 This function returns @code{t} if @var{object} is a string or a 105 This function returns @code{t} if @var{object} is a string or a
106 character (i.e., an integer), @code{nil} otherwise. 106 character (i.e., an integer), @code{nil} otherwise.
107 @end defun 107 @end defun
108 108
109 @node Creating Strings 109 @node Creating Strings
110 @section Creating Strings 110 @section Creating Strings
111 111
112 The following functions create strings, either from scratch, or by 112 The following functions create strings, either from scratch, or by
113 putting strings together, or by taking them apart. 113 putting strings together, or by taking them apart.
114 114
115 @defun make-string count character 115 @defun make-string count character
116 This function returns a string made up of @var{count} repetitions of 116 This function returns a string made up of @var{count} repetitions of
117 @var{character}. If @var{count} is negative, an error is signaled. 117 @var{character}. If @var{count} is negative, an error is signaled.
118 118
119 @example 119 @example
120 (make-string 5 ?x) 120 (make-string 5 ?x)
121 @result{} "xxxxx" 121 @result{} "xxxxx"
126 Other functions to compare with this one include @code{char-to-string} 126 Other functions to compare with this one include @code{char-to-string}
127 (@pxref{String Conversion}), @code{make-vector} (@pxref{Vectors}), and 127 (@pxref{String Conversion}), @code{make-vector} (@pxref{Vectors}), and
128 @code{make-list} (@pxref{Building Lists}). 128 @code{make-list} (@pxref{Building Lists}).
129 @end defun 129 @end defun
130 130
131 @defun string &rest characters
131 @tindex string 132 @tindex string
132 @defun string &rest characters
133 This returns a string containing the characters @var{characters}. 133 This returns a string containing the characters @var{characters}.
134 134
135 @example 135 @example
136 (string ?a ?b ?c) 136 (string ?a ?b ?c)
137 @result{} "abc" 137 @result{} "abc"
230 returns an empty string. 230 returns an empty string.
231 231
232 @example 232 @example
233 (concat "abc" "-def") 233 (concat "abc" "-def")
234 @result{} "abc-def" 234 @result{} "abc-def"
235 (concat "abc" (list 120 (+ 256 121)) [122]) 235 (concat "abc" (list 120 121) [122])
236 @result{} "abcxyz" 236 @result{} "abcxyz"
237 ;; @r{@code{nil} is an empty sequence.} 237 ;; @r{@code{nil} is an empty sequence.}
238 (concat "abc" nil "-def") 238 (concat "abc" nil "-def")
239 @result{} "abc-def" 239 @result{} "abc-def"
240 (concat "The " "quick brown " "fox.") 240 (concat "The " "quick brown " "fox.")
242 (concat) 242 (concat)
243 @result{} "" 243 @result{} ""
244 @end example 244 @end example
245 245
246 @noindent 246 @noindent
247 The second example above shows how characters stored in strings are
248 taken modulo 256. In other words, each character in the string is
249 stored in one byte.
250
251 The @code{concat} function always constructs a new string that is 247 The @code{concat} function always constructs a new string that is
252 not @code{eq} to any existing string. 248 not @code{eq} to any existing string.
253 249
254 When an argument is an integer (not a sequence of integers), it is 250 When an argument is an integer (not a sequence of integers), it is
255 converted to a string of digits making up the decimal printed 251 converted to a string of digits making up the decimal printed
272 description of @code{mapconcat} in @ref{Mapping Functions}, 268 description of @code{mapconcat} in @ref{Mapping Functions},
273 @code{vconcat} in @ref{Vectors}, and @code{append} in @ref{Building 269 @code{vconcat} in @ref{Vectors}, and @code{append} in @ref{Building
274 Lists}. 270 Lists}.
275 @end defun 271 @end defun
276 272
273 @defun split-string string separators
277 @tindex split-string 274 @tindex split-string
278 @defun split-string string separators
279 Split @var{string} into substrings in between matches for the regular 275 Split @var{string} into substrings in between matches for the regular
280 expression @var{separators}. Each match for @var{separators} defines a 276 expression @var{separators}. Each match for @var{separators} defines a
281 splitting point; the substrings between the splitting points are made 277 splitting point; the substrings between the splitting points are made
282 into a list, which is the value. If @var{separators} is @code{nil} (or 278 into a list, which is the value. If @var{separators} is @code{nil} (or
283 omitted), the default is @code{"[ \f\t\n\r\v]+"}. 279 omitted), the default is @code{"[ \f\t\n\r\v]+"}.
320 needs a different number of bytes from the character already present at 316 needs a different number of bytes from the character already present at
321 that index, @code{aset} signals an error. 317 that index, @code{aset} signals an error.
322 318
323 A more powerful function is @code{store-substring}: 319 A more powerful function is @code{store-substring}:
324 320
321 @defun store-substring string idx obj
325 @tindex store-substring 322 @tindex store-substring
326 @defun store-substring string idx obj
327 This function alters part of the contents of the string @var{string}, by 323 This function alters part of the contents of the string @var{string}, by
328 storing @var{obj} starting at index @var{idx}. The argument @var{obj} 324 storing @var{obj} starting at index @var{idx}. The argument @var{obj}
329 may be either a character or a (smaller) string. 325 may be either a character or a (smaller) string.
330 326
331 Since it is impossible to change the length of an existing string, it is 327 Since it is impossible to change the length of an existing string, it is
432 428
433 @defun string-lessp string1 string2 429 @defun string-lessp string1 string2
434 @code{string-lessp} is another name for @code{string<}. 430 @code{string-lessp} is another name for @code{string<}.
435 @end defun 431 @end defun
436 432
433 @defun compare-strings string1 start1 end1 string2 start2 end2 &optional ignore-case
434 @tindex compare-strings
435 This function compares a specified part of @var{string1} with a
436 specified part of @var{string2}. The specified part of @var{string1}
437 runs from index @var{start1} up to index @var{end1} (default, the end of
438 the string). The specified part of @var{string2} runs from index
439 @var{start2} up to index @var{end2} (default, the end of the string).
440
441 The strings are both converted to multibyte for the comparison
442 (@pxref{Text Representations}) so that a unibyte string can be usefully
443 compared with a multibyte string. If @var{ignore-case} is
444 non-@code{nil}, then case is ignored as well.
445
446 If the specified portions of the two strings match, the value is
447 @code{t}. Otherwise, the value is an integer which indicates how many
448 leading characters agree, and which string is less. Its absolute value
449 is one plus the number of characters that agree at the beginning of the
450 two strings. The sign is negative if @var{string1} (or its specified
451 portion) is less.
452 @end defun
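An illustrative sketch, not part of revision 22138: the return-value rule that the description above gives for @code{compare-strings} can be worked through as follows. The argument strings are invented for illustration, and the results are derived from the stated rule (absolute value is one plus the number of agreeing leading characters, negative when @var{string1} is less), not taken from the changeset.

@example
;; @r{The first three characters agree and @samp{d} sorts before @samp{x},}
;; @r{so the value is -(3 + 1).}
(compare-strings "abcdef" 0 nil "abcxyz" 0 nil)
     @result{} -4
;; @r{Swapping the two strings flips the sign.}
(compare-strings "abcxyz" 0 nil "abcdef" 0 nil)
     @result{} 4
;; @r{With @var{ignore-case} non-@code{nil}, the two portions match.}
(compare-strings "Hello" 0 nil "hello" 0 nil t)
     @result{} t
@end example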
453
454 @defun assoc-ignore-case key alist
455 @tindex assoc-ignore-case
456 This function works like @code{assoc}, except that @var{key} must be a
457 string, and comparison is done using @code{compare-strings}.
458 Case differences are ignored in this comparison.
459 @end defun
460
461 @defun assoc-ignore-representation key alist
462 @tindex assoc-ignore-representation
463 This function works like @code{assoc}, except that @var{key} must be a
464 string, and comparison is done using @code{compare-strings}.
465 Case differences are significant.
466 @end defun
467
437 See also @code{compare-buffer-substrings} in @ref{Comparing Text}, for 468 See also @code{compare-buffer-substrings} in @ref{Comparing Text}, for
438 a way to compare text in buffers. The function @code{string-match}, 469 a way to compare text in buffers. The function @code{string-match},
439 which matches a regular expression against a string, can be used 470 which matches a regular expression against a string, can be used
440 for a kind of string comparison; see @ref{Regexp Search}. 471 for a kind of string comparison; see @ref{Regexp Search}.
441 472
507 @code{int-to-string} is a semi-obsolete alias for this function. 538 @code{int-to-string} is a semi-obsolete alias for this function.
508 539
509 See also the function @code{format} in @ref{Formatting Strings}. 540 See also the function @code{format} in @ref{Formatting Strings}.
510 @end defun 541 @end defun
511 542
512 @defun string-to-number string base 543 @defun string-to-number string &optional base
513 @cindex string to number 544 @cindex string to number
514 This function returns the numeric value of the characters in 545 This function returns the numeric value of the characters in
515 @var{string}. If @var{base} is non-@code{nil}, integers are converted 546 @var{string}. If @var{base} is non-@code{nil}, integers are converted
516 in that base. If @var{base} is @code{nil}, then base ten is used. 547 in that base. If @var{base} is @code{nil}, then base ten is used.
517 Floating point conversion always uses base ten; we have not implemented 548 Floating point conversion always uses base ten; we have not implemented
520 551
521 The parsing skips spaces and tabs at the beginning of @var{string}, then 552 The parsing skips spaces and tabs at the beginning of @var{string}, then
522 reads as much of @var{string} as it can interpret as a number. (On some 553 reads as much of @var{string} as it can interpret as a number. (On some
523 systems it ignores other whitespace at the beginning, not just spaces 554 systems it ignores other whitespace at the beginning, not just spaces
524 and tabs.) If the first character after the ignored whitespace is not a 555 and tabs.) If the first character after the ignored whitespace is not a
525 digit or a minus sign, this function returns 0. 556 digit or a plus or minus sign, this function returns 0.
526 557
527 @example 558 @example
528 (string-to-number "256") 559 (string-to-number "256")
529 @result{} 256 560 @result{} 256
530 (string-to-number "25 is a perfect square.") 561 (string-to-number "25 is a perfect square.")
598 uses the first such value, the second format specification uses the 629 uses the first such value, the second format specification uses the
599 second such value, and so on. Any extra format specifications (those 630 second such value, and so on. Any extra format specifications (those
600 for which there are no corresponding values) cause unpredictable 631 for which there are no corresponding values) cause unpredictable
601 behavior. Any extra values to be formatted are ignored. 632 behavior. Any extra values to be formatted are ignored.
602 633
603 Certain format specifications require values of particular types. 634 Certain format specifications require values of particular types. If
604 However, no error is signaled if the value actually supplied fails to 635 you supply a value that doesn't fit the requirements, an error is
605 have the expected type. Instead, the output is likely to be 636 signaled.
606 meaningless.
607 637
608 Here is a table of valid format specifications: 638 Here is a table of valid format specifications:
609 639
610 @table @samp 640 @table @samp
611 @item %s 641 @item %s
650 Replace the specification with the decimal-point notation for a floating 680 Replace the specification with the decimal-point notation for a floating
651 point number. 681 point number.
652 682
653 @item %g 683 @item %g
654 Replace the specification with notation for a floating point number, 684 Replace the specification with notation for a floating point number,
655 using either exponential notation or decimal-point notation whichever 685 using either exponential notation or decimal-point notation, whichever
656 is shorter. 686 is shorter.
657 687
658 @item %% 688 @item %%
659 A single @samp{%} is placed in the string. This format specification is 689 A single @samp{%} is placed in the string. This format specification is
660 unusual in that it does not use a value. For example, @code{(format "%% 690 unusual in that it does not use a value. For example, @code{(format "%%
739 @cindex lower case 769 @cindex lower case
740 @cindex character case 770 @cindex character case
741 @cindex case conversion in Lisp 771 @cindex case conversion in Lisp
742 772
743 The character case functions change the case of single characters or 773 The character case functions change the case of single characters or
744 of the contents of strings. The functions convert only alphabetic 774 of the contents of strings. The functions normally convert only
745 characters (the letters @samp{A} through @samp{Z} and @samp{a} through 775 alphabetic characters (the letters @samp{A} through @samp{Z} and
746 @samp{z}); other characters are not altered. The functions do not 776 @samp{a} through @samp{z}, as well as non-ASCII letters); other
747 modify the strings that are passed to them as arguments. 777 characters are not altered. (You can specify a different case
778 conversion mapping by specifying a case table---@pxref{Case Tables}.)
779
780 These functions do not modify the strings that are passed to them as
781 arguments.
748 782
749 The examples below use the characters @samp{X} and @samp{x} which have 783 The examples below use the characters @samp{X} and @samp{x} which have
750 @sc{ASCII} codes 88 and 120 respectively. 784 @sc{ASCII} codes 88 and 120 respectively.
751 785
752 @defun downcase string-or-char 786 @defun downcase string-or-char
821 @end defun 855 @end defun
822 856
823 @defun upcase-initials string 857 @defun upcase-initials string
824 This function capitalizes the initials of the words in @var{string}. 858 This function capitalizes the initials of the words in @var{string}.
825 without altering any letters other than the initials. It returns a new 859 without altering any letters other than the initials. It returns a new
826 string whose contents are a copy of @var{string-or-char}, in which each 860 string whose contents are a copy of @var{string}, in which each word has
827 word has been converted to upper case. 861 been converted to upper case.
828 862
829 The definition of a word is any sequence of consecutive characters that 863 The definition of a word is any sequence of consecutive characters that
830 are assigned to the word constituent syntax class in the current syntax 864 are assigned to the word constituent syntax class in the current syntax
831 table (@xref{Syntax Class Table}). 865 table (@xref{Syntax Class Table}).
832 866
835 (upcase-initials "The CAT in the hAt") 869 (upcase-initials "The CAT in the hAt")
836 @result{} "The CAT In The HAt" 870 @result{} "The CAT In The HAt"
837 @end group 871 @end group
838 @end example 872 @end example
839 @end defun 873 @end defun
874
875 @xref{Text Comparison}, for functions that compare strings; some of
876 them ignore case differences, or can optionally ignore case differences.
840 877
841 @node Case Tables 878 @node Case Tables
842 @section The Case Table 879 @section The Case Table
843 880
844 You can customize case conversion by installing a special @dfn{case 881 You can customize case conversion by installing a special @dfn{case
858 @item upcase 895 @item upcase
859 The upcase table maps each character into the corresponding upper 896 The upcase table maps each character into the corresponding upper
860 case character. 897 case character.
861 @item canonicalize 898 @item canonicalize
862 The canonicalize table maps all of a set of case-related characters 899 The canonicalize table maps all of a set of case-related characters
863 into some one of them. 900 into a particular member of that set.
864 @item equivalences 901 @item equivalences
865 The equivalences table maps each of a set of case-related characters 902 The equivalences table maps each one of a set of case-related characters
866 into the next one in that set. 903 into the next character in that set.
867 @end table 904 @end table
868 905
869 In simple cases, all you need to specify is the mapping to lower-case; 906 In simple cases, all you need to specify is the mapping to lower-case;
870 the three related tables will be calculated automatically from that one. 907 the three related tables will be calculated automatically from that one.
871 908